Thursday, February 25, 2010

Big data, BigSheets and other stories

Blogging, re-blogging, tweeting and re-tweeting seem to me like a way of "recycling karmic debt". So, here are some blogs from around the world talking about scalable storage. If nothing else, Web 2.0 companies have at least bequeathed us a large variety of excellent storage solutions.

High Performance Scalable Data Stores is a well-written, quick summary of the state of the art. (Via:

Nice to see IBM leveraging Hadoop and family. At least they don't seem to be suffering from NIH (Not Invented Here) syndrome - IBM BigSheets.

HBase vs Cassandra: why we moved is interesting if you are just starting, but remember to read the comments.

I liked this too - Data-Intensive Text Processing with MapReduce. It's funny how academicians like to express everything as Gamma, Sigma, Theta ... Especially funny when you see they've taken the Java source code and converted it to "thesis-y" language. It's usually the other way round. But it's a good read.