Quick lessons in using MapReduce


The tools and techniques behind big data confuse a lot of people, including some who use or buy them, and especially those trying to follow big data debates without a working knowledge of what makes the whole thing tick. The following quick lessons on MapReduce should help almost anyone understand this technology behind Hadoop. I will share more short, clear explanations of other big data tools and techniques as I come across them.

IBM (NYSE: IBM) has a short YouTube video explaining what MapReduce is, and a MapRAcademy YouTube video explains the flow of a MapReduce program and what you need to know to use it. There is also a good five-part YouTube tutorial on MapReduce fundamentals by Lynn Langit that you'll likely find very useful.
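For readers who prefer code to video, here is a minimal sketch of the flow of a MapReduce program, using the classic word-count exercise. It runs in plain Python on one machine and is not taken from any of the videos mentioned here; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration. A real Hadoop job spreads the same three steps across many machines.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one result (here, a sum)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Chain the three phases together on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    lines = ["big data is big", "data about data"]
    print(word_count(lines))
```

The point of the split is that the map and reduce steps only ever look at one record or one key at a time, which is what lets a cluster run them in parallel.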

A MapReduce explanatory video made by Jesse Anderson, one of the instructors at Cloudera University, is easy for most people to follow because he uses playing cards to illustrate the concepts. There is a $9.00 fee for each of the three episodes; Episode 1 is the one I am referencing here.

Anderson would also like to clear up three common myths surrounding MapReduce: that it is Java only, that you need to be a math whiz to use it, and that it's hard to get started. All three, he says, are untrue.

"I've had all sorts of students in class with varying levels of math skills," he told me.  "You can get value from MapReduce without having to use complex mathematical models.  There are various use cases for MapReduce and not all of them require an advanced math degree."

If you think MapReduce is Java-only territory, watch this screencast showing how to use Hadoop MapReduce with languages like Python, Ruby and Perl. But again, there is a $9.00 fee for this episode (Episode 3).
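To give a rough idea of what non-Java MapReduce can look like, here is a small Python sketch in the style used by Hadoop Streaming, the Hadoop utility that lets any program act as a mapper or reducer by reading lines and writing tab-separated key/value lines. This is my own illustration, not the code from the screencast. The function names and the local sorted() call that stands in for Hadoop's sort step are assumptions made so the example runs on a laptop.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, the text format a streaming mapper writes."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    """Sum the counts for each word; the input must already be sorted by key."""
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in sorted_lines
        if line.strip()  # skip blank lines defensively
    )
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__":
    # Local stand-in for a real job: sorted() plays the role of the
    # framework's sort between the map and reduce steps.
    sample = ["Big data is big", "data about data"]
    for out in reducer(sorted(mapper(sample))):
        print(out)
```

In an actual streaming job, the mapper and reducer would typically live in separate scripts that read from standard input and print to standard output, with Hadoop handling the sorting and the distribution across machines.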

For those who want a chance to play around with MapReduce, check out Cloudera's QuickStart VM.

If you have made or know of excellent instructional videos or tutorials on any big data topic, please share them below or send me an email. Since big data is central to everyone's life now, it's important that everyone understands how it works--or at least understands the basics. - Pam
