Programming: GC better than Malloc?

There's an article at IBM saying that garbage collection (like what Java does) is better than C++ and malloc. The article is a little deceiving and incorrect. It's a little ironic, but I did learn a few years ago that GC can be better than allocating on the heap or even the stack. When it comes to allocating memory, the stack is the king of speed. The only problem is that the stack has lots of limitations, which is why in C++ we use the heap. But if you follow what the memory manager does on the heap, you can see that it takes a fair bit of time. The trick here is to use a pool, if possible (and it usually is possible). A pool can be quite fast.
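To show why a pool is fast, here is a minimal sketch of the idea: a fixed block of slots threaded onto a free list, so allocation and release are each just a couple of pointer moves instead of a trip through the general-purpose heap allocator. The names (`Pool`, `allocate`, `release`) are mine, not from any particular library.

```cpp
#include <cstddef>

// A fixed-size pool for objects of type T. All slots are carved out of one
// array up front and linked into a free list; allocate() pops a slot and
// release() pushes it back. No heap traffic after construction.
template <typename T, std::size_t N>
class Pool {
    union Slot {
        T value;
        Slot* next;
        Slot() {}
        ~Slot() {}
    };
    Slot slots_[N];
    Slot* free_ = nullptr;

public:
    Pool() {
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < N; ++i) {
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }
    T* allocate() {
        if (!free_) return nullptr;   // pool exhausted
        Slot* s = free_;
        free_ = s->next;
        return &s->value;             // caller constructs in place
    }
    void release(T* p) {
        // The value is the first member of the union, so the addresses match.
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }
};
```

Because a released slot goes straight back onto the free list, the very next allocation reuses it, which is also why pools play nicely with CPU caches.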
Another trick used in C++ is reference counting. It looks pretty fast, but it turns out that any bookkeeping at all is too much when it comes to memory allocation. Also, with reference counting you can forget to count correctly, which happens a lot with COM, for example. Python, my favorite language, uses reference counting, but it's all hidden, so that problem doesn't occur; however, it is probably slightly slower than Java when it comes to memory allocation.
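In modern C++ the standard way to get reference counting without the manual count-keeping mistakes is `std::shared_ptr`; a small sketch of the count moving as handles are copied and dropped. The cost the article alludes to is exactly this: every copy and destruction touches a (typically atomic) counter.

```cpp
#include <memory>

// Hand back a reference-counted int. make_shared allocates the object and
// its counter in one block; the count starts at 1 for the returned handle.
std::shared_ptr<int> make_counted() {
    return std::make_shared<int>(42);
}
```

Every copy of the handle bumps the count and every destruction drops it; when it hits zero the object is freed immediately, which is the "steady" behavior contrasted with GC later in the post.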
My problem with Java is that every little thing creates an object in its memory which needs to be garbage collected. What you find, sometimes, is that your super-powered machine with 10 gigs of memory performs worse with Java than your 1 gig test machine. Why is that?
It's because Java sees that it has lots of memory, so it lets it fill up. Everything is hunky-dory until the memory gets full, and now it has 10 gigs of garbage to work through. This is a case where adding more memory makes things worse. This is the vicious little secret that Java has that nobody mentions.
With 10 gigs of memory you will probably find that reference counting provides better results for your server. It's slower, but it's steady, and you know how much memory the system is actually using. With Java, if you bump up the heap to use all physical memory, you don't really know how much is being used. And when you fill it up, it can be a disaster. What can happen is that your server is being used consistently, not a lot, but enough for it to decide that there's no idle time to perform a good garbage collection. So the memory fills and fills. Then at your peak you get so many allocations that you run out of memory, and it is forced to do some garbage collection at the worst possible time.
The trick here is the same as in C++: use a pool of objects for your server objects, especially your big objects that hold a lot of items. In this way you can reuse a lot of the memory instead of leaving it to garbage collection. There's no real downside, unless you forget to put an old object back in the pool, in which case you have a memory leak, in Java. Yes, Java sacrifices certainty for speed. You don't know you have a problem until it's too late. Your test machines tell you nothing. Adding more memory just makes things worse. The nice thing with Java is that nobody might notice anything is wrong until after you leave, if you are lucky.
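Since the trick is the same as in C++, here is a minimal C++ sketch of the object-pool idea for big server objects; in Java the shape would be the same. The names (`ObjectPool`, `acquire`, `release`, `idle`) are illustrative. `acquire()` hands back a recycled object when one is idle, otherwise builds a fresh one; forgetting to `release()` is exactly the leak described above, because the object can never be reused.

```cpp
#include <memory>
#include <vector>

// Pool of reusable objects. Released objects sit in free_ until the next
// acquire() hands them out again, instead of being left for the allocator
// (or, in Java, the garbage collector) to reclaim.
template <typename T>
class ObjectPool {
    std::vector<std::unique_ptr<T>> free_;

public:
    std::unique_ptr<T> acquire() {
        if (free_.empty())
            return std::make_unique<T>();   // nothing idle: build a new one
        auto obj = std::move(free_.back()); // recycle the most recent return
        free_.pop_back();
        return obj;
    }
    void release(std::unique_ptr<T> obj) {
        free_.push_back(std::move(obj));    // forget this call and you leak
    }
    std::size_t idle() const { return free_.size(); }
};
```

Note that `acquire()` hands back the object as-is; a real server would reset its state before reuse, or you will serve one client's data to another.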

