How do we train ourselves on Big Data and Hadoop technologies without spending a lot of money?
Many of us are asking this very same question because learning how to program and develop for the Big Data platform can lead to very lucrative career opportunities!
Think about the huge skills gap in the marketplace for Big Data programmers and developers.
According to McKinsey & Company, by 2018 the United States alone is likely to “face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”.
There are some top-quality training courses available in the market, led by qualified instructors. However, if you want to save some money and prefer self-learning, you still have some excellent options.
Many people even prefer self-directed training for Hadoop and Big Data, because it leaves more room to explore.
Okay, so where should we look?
I’ve listed my ten favorite resources below that you can use for your self-paced training.
The first five are websites. Please spend some time on each of these and choose one that will work for you.
Then I’ve listed five books. The books are not free, but at least one or two are still worth buying, as they will be invaluable throughout your study. If you want to sit for a Hadoop certification exam or actively work on projects, some of these books will definitely come in handy.
Let’s take a look at the list now:
- Big Data University – This is an online educational site which offers very good free online courses for beginners. These courses are self-paced and very well structured. Many of them include hands-on exercises that you can do in the cloud or on your own PC.
- Cloudera – Follow Cloudera and you can teach yourself a lot about Big Data. As one of the most popular and respected Big Data service and training providers, Cloudera offers excellent documentation with very good coverage of the Big Data technologies implemented in CDH (Cloudera’s Distribution Including Apache Hadoop). In addition, Cloudera University offers free video training sessions on the Apache Hadoop ecosystem and Big Data analytics. Use these free training videos to get started or as a refresher.
- Hortonworks – Why not try Hortonworks? Go to the documentation page here. I also recommend watching the video tutorials from Hortonworks University to understand the use cases. Note that Cloudera and Hortonworks both offer quick-start virtual machines (VMs) for Hadoop. Using the VMs, you can install and run a single-node Hadoop cluster on your PC in no time! For more details, refer to my earlier post here.
- Yahoo Hadoop tutorial – Yahoo provides a very comprehensive free tutorial in a structured format.
- Apache Hadoop website – Finally, don’t forget the Apache project website for Hadoop. Though it is not exactly a tutorial, you may need to come back here several times while working with these technologies.
- Hadoop: The Definitive Guide – By Tom White (3rd Edition): This may be the best Hadoop book available in the market now. If you want to sit for a Hadoop certification exam, this book is a must-read.
- Hadoop in Action – By Chuck Lam: A very good book to start your journey with Hadoop and MapReduce programming. It’s easy to read, with excellent examples. However, the book is a bit dated now because it only covers Hadoop version 0.20.
- Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics – By Bill Franks: This is not a technical book. It is intended more for executives, with a nice, broad, high-level take on Big Data and related topics. An excellent read for all of us.
- Big Data: Principles and Best Practices of Scalable Realtime Data Systems – By Nathan Marz and James Warren: This book is available in an early access edition for now. It’s aimed more at solution architects and application owners working to build a Big Data solution.
- Hadoop Operations – By Eric Sammer: This is a very good book on day-to-day operations for both Hadoop developers and administrators.
There are certainly lots of great videos on the internet covering almost every component of the Big Data stack. Just do a YouTube search and see! However, today I’ve limited myself to popular websites and books only.
One thing I’ve learned in my career is that learning never stops! You can jump-start with any of these five websites. For a deep dive, buy one or more of the books as needed. Happy learning!
I’ll be very happy to update the list with your suggestions and recommendations.