Logging rm's
I wanted a way to log every file deletion on a Linux machine. Not just when someone runs rm, but also when the delete is done by a script or a compiled program.
The neat way to do this on Linux is to write a small shared library that overrides the unlink and unlinkat functions from the C library. Once the library is built, you can make any dynamically linked binary that uses libc load it by setting the LD_PRELOAD environment variable. This variable tells the dynamic linker to load the library before anything else, so its definitions take precedence over libc's.
Below is a little program that wraps the unlinkat function from the C library:
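/* Must come before the includes so that <dlfcn.h> exposes RTLD_NEXT */
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

/* Pointer to the real unlinkat function in the C library */
static int (*true_unlinkat)(int, const char *, int);

/* Find the next definition of unlinkat after this library (normally libc's) */
static void initialize(void)
{
    if (true_unlinkat == NULL)
        true_unlinkat = (int (*)(int, const char *, int)) dlsym(RTLD_NEXT, "unlinkat");
}

/* Replacement unlinkat: log the path, then hand off to the real function */
int unlinkat(int dirfd, const char *path, int flags)
{
    initialize();
    fprintf(stderr, "Unlink called with %s\n", path);
    return (*true_unlinkat)(dirfd, path, flags);
}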
Save the above code into a file called collector_test.c and compile it with the command below:
> gcc -o collector_test.so -fPIC -m64 -shared collector_test.c -ldl
The -fPIC option generates position-independent code, -m64 compiles for a 64-bit architecture, -shared produces a shared library, and -ldl links against libdl, which provides dlsym.
Finally, to run a test you can do the following:
> touch /tmp/testfile.log
> /usr/bin/env LD_PRELOAD=./collector_test.so rm -fv /tmp/testfile.log
Unlink called with /tmp/testfile.log
removed `/tmp/testfile.log'
You can now see the "Unlink called ..." log message being printed, which means our custom library is intercepting the call to unlinkat.
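The library above wraps only unlinkat, which is what rm ended up calling in this test. Programs that call unlink directly would need a second wrapper. A minimal sketch following the same pattern might look like the code below. The name true_unlink and the "collector:" message prefix are made up for illustration, and the RTLD_NEXT fallback assumes glibc's value.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>

/* Fallback in case <dlfcn.h> was already included without _GNU_SOURCE.
 * Assumption: this matches the value glibc uses. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1L)
#endif

/* Pointer to the real unlink function, resolved on first use */
static int (*true_unlink)(const char *);

/* Replacement unlink: log the path, then call the real function */
int unlink(const char *path)
{
    if (true_unlink == NULL) {
        true_unlink = (int (*)(const char *)) dlsym(RTLD_NEXT, "unlink");
        if (true_unlink == NULL) {
            const char *err = dlerror();
            fprintf(stderr, "collector: cannot resolve unlink: %s\n",
                    err ? err : "unknown error");
            errno = ENOSYS;
            return -1;
        }
    }
    fprintf(stderr, "Unlink called with %s\n", path ? path : "(null)");
    return true_unlink(path);
}
```

This could be appended to collector_test.c and built with the same gcc command. The lazy lookup is not synchronized, but racing threads would all store the same pointer, so that is probably acceptable for a logging experiment.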
This simple concept can be extended to override and log all kinds of other C library calls, including the wrappers around Linux system calls. This kind of interception makes it possible to add instrumentation even to closed-source compiled programs, as long as they are dynamically linked against the system C library.
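One detail worth capturing when extending the unlinkat wrapper is the directory: when the path argument is not absolute it is interpreted relative to dirfd, so the path alone can be ambiguous in the log. Below is a rough sketch of a helper that recovers the directory name on Linux by reading the /proc/self/fd symlink. describe_dirfd is a hypothetical name, not part of any library, and the approach assumes /proc is mounted.

```c
#define _GNU_SOURCE
#include <fcntl.h>   /* AT_FDCWD */
#include <stdio.h>   /* snprintf */
#include <unistd.h>  /* getcwd, readlink */

/* Hypothetical helper: write the directory that dirfd refers to into buf.
 * Returns 0 on success and -1 on failure. */
int describe_dirfd(int dirfd, char *buf, size_t len)
{
    char link[64];
    ssize_t n;

    if (len == 0)
        return -1;

    /* AT_FDCWD means "relative to the current working directory" */
    if (dirfd == AT_FDCWD)
        return getcwd(buf, len) != NULL ? 0 : -1;

    /* On Linux, /proc/self/fd/N is a symlink to whatever fd N has open */
    snprintf(link, sizeof link, "/proc/self/fd/%d", dirfd);
    n = readlink(link, buf, len - 1);
    if (n < 0)
        return -1;

    buf[n] = '\0';   /* readlink does not terminate the string */
    return 0;
}
```

Inside the wrapper this would only need to be called when path does not start with '/', since absolute paths ignore dirfd.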
I'm currently working on a test project that uses this technique to mine some very interesting data about data usage patterns. Look for a post about that at a later date.