Memory Leak Detection - Continued

Well, it has certainly been a while since I last posted. Sorry about that. I was recently checking the analytics for this site, and noticed that my previous post has had quite a few referrals from Google. I feel I would be doing a disservice if I did not post an update, as the original post was merely an academic exercise.  There is most certainly a better way to achieve the same goal.

Specifically, glibc provides a series of memory allocation hooks which can be used to implement systems similar to that in my previous post. I was able to implement a similar system in a fraction of the code using the aforementioned hooks.

Overall, these hooks are a great tool to have in your tool belt when debugging memory allocations and deallocations.

Memory Leak Detection

Whenever I work on C code of reasonable complexity, I find Valgrind indispensable for memory debugging and leak detection. It is all too easy to miss a free() somewhere, especially in fairly complex code.

This weekend, as I was playing around with some LD_PRELOAD tricks to gather statistics on memory allocations, I realized I already had half of a rudimentary memory leak detector written. I didn't have any plans, aside from watching football, so I decided to follow through and see what I could come up with. It is interesting how quickly a working prototype could be hacked together. Sure, it is no Valgrind, but it works.

I started by creating a shared library (loadable through LD_PRELOAD) that wrapped three functions: malloc(), free(), and exit(). Whenever one of these functions is called, the allocation or deallocation is tracked as it happens. I had a relatively optimized binary tree implementation lying around, so I leveraged that, though a hash table would probably be better suited to this scenario. The code (with many implementation details omitted for brevity) looks something like this:

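    #define _GNU_SOURCE   /* for RTLD_NEXT */
    #include <dlfcn.h>
    #include <stddef.h>

    void *malloc(size_t size) {
        /* Look up the next malloc() in link order, i.e. the real one. */
        void *(*real_malloc)(size_t) = NULL;
        real_malloc = dlsym(RTLD_NEXT, "malloc");
        ...
        return real_malloc(size);
    }

    void free(void *ptr) {
        /* free() returns void, so the function pointer type must as well. */
        void (*real_free)(void *) = NULL;
        real_free = dlsym(RTLD_NEXT, "free");
        ...
        real_free(ptr);
    }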
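For the curious, the bookkeeping elided above could be sketched roughly as follows. This is not the code I actually used: it swaps the binary tree for a small fixed-size hash table with linear probing, and the names track_alloc(), track_free(), report_leaks(), and TRACK_SLOTS are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TRACK_SLOTS 1024   /* hypothetical fixed capacity; must be a power of two */

struct track_entry {
    void  *ptr;    /* NULL = never used, TOMBSTONE = freed slot */
    size_t size;
};

static struct track_entry table[TRACK_SLOTS];
static size_t live_count;
static char tombstone_marker;
#define TOMBSTONE ((void *)&tombstone_marker)

/* Map a pointer to its starting slot (allocations are usually 16-byte aligned). */
static size_t slot_for(const void *ptr) {
    return (size_t)(((uintptr_t)ptr >> 4) & (TRACK_SLOTS - 1));
}

/* Record an allocation; meant to be called from the malloc() wrapper. */
static void track_alloc(void *ptr, size_t size) {
    if (ptr == NULL)
        return;
    assert(live_count < TRACK_SLOTS);   /* sketch only: the table never grows */
    size_t i = slot_for(ptr);
    while (table[i].ptr != NULL && table[i].ptr != TOMBSTONE)
        i = (i + 1) & (TRACK_SLOTS - 1);
    table[i].ptr = ptr;
    table[i].size = size;
    live_count++;
}

/* Forget an allocation; returns 1 if ptr was being tracked, 0 otherwise. */
static int track_free(void *ptr) {
    if (ptr == NULL)
        return 0;
    size_t i = slot_for(ptr);
    for (size_t probes = 0; probes < TRACK_SLOTS; probes++) {
        if (table[i].ptr == NULL)
            return 0;                   /* end of the probe chain */
        if (table[i].ptr == ptr) {
            table[i].ptr = TOMBSTONE;   /* keep the chain intact for later lookups */
            live_count--;
            return 1;
        }
        i = (i + 1) & (TRACK_SLOTS - 1);
    }
    return 0;
}

/* Print every allocation still live; meant for the exit() wrapper.
   Returns the number of leaked blocks. */
static size_t report_leaks(FILE *out) {
    size_t leaks = 0;
    for (size_t i = 0; i < TRACK_SLOTS; i++) {
        if (table[i].ptr != NULL && table[i].ptr != TOMBSTONE) {
            fprintf(out, "leak: %zu bytes at %p\n", table[i].size, table[i].ptr);
            leaks++;
        }
    }
    return leaks;
}
```

In the wrappers, track_alloc() would go right after the call to real_malloc() and track_free() right before real_free(). One caveat: stdio may allocate memory itself, so a real library would likely need to guard against re-entering its own wrappers while reporting.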

The major problem is that it will alert you to the existence of a memory leak without providing any details on where the leak occurred. Fortunately, libc supports the generation of backtraces, so it is just a matter of tracking the backtraces along with the allocation information. It should be simple to add this feature -- but when is anything as easy as it initially seems?

It turns out that the libc backtrace implementation has some serious problems when called from a wrapped implementation of malloc() -- backtrace() itself calls malloc(), which calls our wrapper again, leading to unbounded recursion. Instead of detecting memory leaks, we just started the application down the path to a segfault. Great.
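As an aside, a common way to break this kind of recursion is a per-thread guard flag, so that allocations made by the tracker itself pass through untracked. It is not the route I took, and the sketch below only simulates the call pattern: capture_trace() and wrapped_alloc() are hypothetical stand-ins for backtrace() and the malloc() wrapper, not real library calls.

```c
#include <assert.h>
#include <stddef.h>

static _Thread_local int in_tracker;   /* nonzero while our bookkeeping runs */
static size_t tracked_calls;           /* allocations we recorded */
static size_t passthrough_calls;       /* nested allocations left untracked */

static void wrapped_alloc(void);

/* Hypothetical stand-in for backtrace(): it allocates internally. */
static void capture_trace(void) {
    assert(in_tracker);                /* only ever called with the guard held */
    wrapped_alloc();
}

/* Hypothetical stand-in for the malloc() wrapper. */
static void wrapped_alloc(void) {
    if (in_tracker) {
        /* We were called by our own bookkeeping: skip tracking so the
           recursion stops here. */
        passthrough_calls++;
        return;
    }
    in_tracker = 1;
    capture_trace();                   /* re-enters wrapped_alloc() once */
    tracked_calls++;
    in_tracker = 0;
}
```

The flag is thread-local so that one thread doing bookkeeping does not hide allocations made by another.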

Back to square one. Fortunately, it is quite easy to write our own backtrace implementation. Below is a simplified version that prints information only about the calling function (one level up the stack):

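    /* Requires _GNU_SOURCE, <dlfcn.h>, and <stdio.h>. */
    void print_stacktrace(void) {
        Dl_info info;   /* on the stack, so no malloc() inside a malloc() wrapper */
        void *addr = __builtin_return_address(1);
        if (dladdr(addr, &info) == 0) {
            printf("?? [%p]\n", addr);
            return;
        }
        /* dli_sname is NULL when no symbol matches the address. */
        printf("%s(%s) [%p]\n", info.dli_fname,
               info.dli_sname ? info.dli_sname : "??", info.dli_saddr);
    }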

Overall, I think things worked out pretty well. I can't say it will be very useful, but in some simple cases it might be of help. In any event, it was a nice way to spend a Sunday morning, and I got to learn a few new tricks.

Get Me Out of Here

Have you ever been dreading a meeting, social event, or gathering and strategically planned a well-timed call to make an exit? I am sure we have all been there, some even having tried it out. It was with that idea that I began working on a small project.

I heard about Twilio after John Britton's brilliant live-coded demo at the New York Tech Meetup. I must say I am quite impressed with the service. I can definitely envision some cool things being built on top of Twilio. I have a few interesting ideas myself!
 
I signed up for a developer trial account and formulated a small project to cut my teeth on. I don't have a name for it yet, but I have affectionately been referring to it as "Get Me Out of Here". The bulk of the functionality is up and running, though I need to implement a number verification scheme. I don't want to be woken up in the middle of the night because someone thought it would be funny to send a call to my number at 3:30 in the morning. That would be less than ideal. With Twilio, number verification should be a snap.
 
Overall, I am really impressed. I look forward to leveraging the service in more of my projects.

Confessions of a serial project starter

I have way too many half-finished projects. There, I said it. The first step to recovery is admitting you have a problem, right? I can't say that I have ever started a project with the intention of not finishing it, but more often than I care to admit, that seems to be the outcome. I have set a goal for myself to finish up a few of the projects I have lying around before starting anything new. I think of it as a non-New Year's resolution.

With my new goal in mind, I began to think about why, exactly, so many of my projects meet the same fate. Sure, most are working in the sense that they fulfill my needs, but they have a considerable way to go before I would consider them for release. I think the following list of reasons (read: excuses) is at the heart of most of my abandoned projects.
  1. Most of the projects I start are in response to a need I have, or to generally make my life easier. In that sense, I am the primary audience, and when a project gets to a working state I often call it a day. Sure, it is great that the project satisfies my needs, but it sure would be great to release it, given the chance that others will find it useful. You may say that it is likely that no one will find it of interest, or the project will go largely unnoticed, but there is an equal or greater chance that others will find it useful. I would rather have the code out there than collecting dust on my hard drive.

  2. More often than not, my problem simply comes down to time. Not having enough time, to be specific. Believe me, if I could find a way to get a few more hours out of the day, I gladly would. Sadly, I don't foresee the common day suddenly getting longer, and cutting back on sleep doesn't sit well with me. I learned long ago that my body requires adequate rest. With a shortage of time, sacrifices will be made, and in my case it is usually some personal projects on the chopping block.

  3. At any given time, I may have hundreds of project ideas begging to be turned into code, all vying for my time. I may even go so far as to say I have an overabundance of ideas. As I said before, many of the projects I work on are scratching my own itch, so to speak. As a software engineer, I can't help but think of solutions to problems that I can bang out with a fun new project. If I took on every project idea I have, well, let's just say my wife wouldn't be too pleased when she doesn't see me for months on end. I simply can't act on every idea I get, though I am certainly guilty of spreading myself too thin.

  4. Fun. The real fun to be had is in the initial design, development of the architecture, and solving the challenging technical problems. Sure, the challenging work is the fun part, but of what use is it if the project stagnates because the less challenging and mundane tasks never get done?

With that now out in the open, I hope to become better about completing the projects I work on.

Back to Blogging - The Software Problem

I began seriously thinking about blogging again a few days ago. I have been on vacation in Maine with my wife Anna, who happened to come down with a sickness of unknown variety. Being unable to speak above a whisper, she has taken to speaking with me through charades. Naturally, this has led to some pretty awkward conversations.

Anywho, since Anna has been recuperating, I have found myself with a lot of free time on my hands. I decided I would start blogging again. I have been blogging in one form or another for quite some time, but I generally lose focus after a few months, and the blog is left to stagnate. Hopefully I have dropped some bad habits since my last serious attempt at blogging.
 
With the decision to continue blogging fresh in my mind, I was left with only one decision -- which piece of blogging software is right for me? Should I go the hosted route and create a blog on Blogger? I already have a web server and would prefer to manage everything myself, so I quickly ruled that out. In the past I have used Movable Type, which was nice, but the last time I used it the templating engine was a complete nightmare. WordPress seems to be the most popular, though I have never been much of a fan -- and it doesn't exactly have a stellar security track record, either.
 
Since I had a bit of time on my hands, I figured why not write a quick blog engine myself. And besides, I had a web framework lying around from when I was learning WSGI, so I may as well put that to use. In an hour or two, I had a simple blog up and running with the basic features you would expect from blogging software.
 
Now I have two problems on my hands -- keep the blog up to date, and also maintain the system used to manage the blog. I don't anticipate there to be much maintenance required for the software (famous last words, right?), so I don't think I have dug myself in too deeply.
 
All in all, I can't say that I would recommend writing something yourself when there are many fantastic systems out there, but it was a fun way to spend an hour or two.

About

I write software and play music. I am usually not very good about keeping a blog up to date, but I hope to change that.
