Hadoop is today an industry-standard piece of software for Big Data, and this book is the industry-standard book for Hadoop. It can take you from no knowledge of Hadoop and Big Data to a thorough understanding of Hadoop and its usage.
The book is split into 16 chapters and 3 appendices, for a total of 628 pages of content. That makes ~33 pages per chapter or appendix, so it’s easy to read and to find what you need. This is very important because, even if it’s possible, this kind of book is rarely read cover to cover.
The author, Tom White, does not limit himself to Hadoop itself but also helps the reader understand the Hadoop ecosystem, devoting a full chapter to each of several of its products: Pig, Hive, HBase, ZooKeeper and Sqoop.
The last chapter is a “Case Studies” chapter. It shows Hadoop being used in situations you might have thought were difficult or impossible for Hadoop to handle.
At first I was perplexed by the author’s choice to constantly use the same example (weather data), but after a couple of chapters I saw the value this choice created.
The only downside I’ve seen in this book is that, even though the latest edition (the 3rd at the time of writing) was published in May 2012, a lot of the statistical data is outdated, citing figures and sources from 2006-2008. I would recommend this book to anyone who is interested in Hadoop and the Big Data world.
You can find the book on the O’Reilly website.
Disclaimer: I received a free electronic copy of this book as part of the O’Reilly Blogger Program.