Sunday, July 04, 2010

Weekend at the Zoo(Keeper)

I've written about Apache ZooKeeper before, but I had never actually tried it. Only today did I get a chance to play with it.

The ZooKeeper recipes really piqued my curiosity. So after spending a few hours reading the docs, I decided to give it a try. My interest was purely the performance side of it. ZK makes it very clear in the docs that it excels under read-heavy workloads. And the more replicated servers you add, the better it gets. They were not kidding.

I have my test code here. Keep in mind that this is a simple test, perhaps even a micro-benchmark. It does not even have the minimum of 3 servers for a quorum. My tests were run on a new (2010) laptop with 4 hyper-threads, some simple Xms/Xmx JVM settings, and everything else left at the out-of-the-box defaults. This is by no means a representative test. There are official numbers on the ZK wiki with tests run on a real server-class machine. You should have a look at those too.

Well, what can I say - it is a little slow. Even writing messages of a few bytes takes a while. Granted, each write in a loop requires a network call. So, if I write 1000 messages, it requires 1000 remote/network calls. On the plus side, CreateMode.PERSISTENT_SEQUENTIAL is very handy, like an RDBMS auto-generated ID column.
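To make the auto-generated-id comparison concrete, here is a small sketch of how sequential names come out. It does not talk to ZooKeeper at all: SequentialNamer is a made-up, in-memory stand-in, and the "/queue/msg-" prefix is just an example. The one thing taken from the ZooKeeper docs is the suffix format, a 10-digit zero-padded counter (%010d). Starting the counter at 0 is my assumption for a fresh parent node.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for a parent znode handing out sequential child names. */
class SequentialNamer {
    // Assumption: a fresh parent starts counting at 0.
    private final AtomicInteger counter = new AtomicInteger();

    /** Appends a 10-digit, zero-padded counter to the prefix, like a sequential create does. */
    String nextName(String prefix) {
        return prefix + String.format("%010d", counter.getAndIncrement());
    }

    public static void main(String[] args) {
        SequentialNamer namer = new SequentialNamer();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            names.add(namer.nextName("/queue/msg-"));
        }
        // Zero padding means plain string sorting keeps creation order.
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        System.out.println("first:  " + names.get(0));
        System.out.println("last:   " + names.get(names.size() - 1));
        System.out.println("sorted order equals creation order: " + sorted.equals(names));
    }
}
```

Against a real server, each of those names would come back from its own create() call, which is where the one-round-trip-per-message cost comes from.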

I would've liked a few more batch-oriented calls like getDataForChildren() and createIfAbsent(), instead of having to make 2 calls: first to find out the child names and then to get the actual data. But hey, I'm just trying to shoehorn it into the wrong use case.
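Here is roughly what the getDataForChildren() I was wishing for looks like when you build it yourself. This is only a sketch: NodeStore is a hypothetical two-method interface I made up to stand in for the client's getChildren and getData calls, and FakeStore is an in-memory fake so the example runs without a server.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ChildDataReader {
    /** Hypothetical minimal view of the two client calls involved (not the real ZooKeeper API). */
    interface NodeStore {
        List<String> getChildren(String path);
        byte[] getData(String path);
    }

    /** One call to list the children, then one call per child to fetch its data. */
    static Map<String, byte[]> getDataForChildren(NodeStore store, String parent) {
        Map<String, byte[]> result = new TreeMap<>();
        for (String child : store.getChildren(parent)) {
            result.put(child, store.getData(parent + "/" + child));
        }
        return result;
    }

    /** In-memory fake keyed by full path, only here so the sketch can run. */
    static class FakeStore implements NodeStore {
        private final Map<String, byte[]> nodes = new TreeMap<>();

        void put(String path, String value) {
            nodes.put(path, value.getBytes(StandardCharsets.UTF_8));
        }

        public List<String> getChildren(String path) {
            String prefix = path + "/";
            List<String> children = new ArrayList<>();
            for (String key : nodes.keySet()) {
                // Direct children only: the prefix matches and no further '/' follows.
                if (key.startsWith(prefix) && key.indexOf('/', prefix.length()) < 0) {
                    children.add(key.substring(prefix.length()));
                }
            }
            return children;
        }

        public byte[] getData(String path) {
            return nodes.get(path);
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.put("/queue/msg-0000000000", "hello");
        store.put("/queue/msg-0000000001", "world");
        for (Map.Entry<String, byte[]> e : getDataForChildren(store, "/queue").entrySet()) {
            System.out.println(e.getKey() + " = " + new String(e.getValue(), StandardCharsets.UTF_8));
        }
    }
}
```

With N children that is N + 1 calls, which is exactly the chattiness I was complaining about. Against a real server you would also have to handle a child disappearing between the listing and the read, which the fake above ignores.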

This is the simple test, and the sample console output is further below. You can always get the full code from my Gist repo:

Console output: