Big Data Mastery with Hadoop Bundle

362 Enrolled
8 Courses & 44 Hours
You save 91%

What's Included

Taming Big Data with MapReduce & Hadoop
  • Certification included
  • Experience level required: All levels
  • Access 56 lectures & 5 hours of content 24/7
  • Length of time users can access this course: Lifetime

Course Curriculum

56 Lessons (5h)

  • Introduction
  • Getting Started
    Installing Enthought Canopy
    Installing MRJob
    Downloading the MovieLens Data Set
    Run Your First MapReduce Job
  • Understanding MapReduce
    MapReduce Basic Concepts
    Walkthrough of Rating Histogram Code
    Understanding How MapReduce Scales / Distributed Computing
    Average Friends by Age Example: Part 1
    Average Friends by Age Example: Part 2
    Minimum Temperature By Location Example
    Maximum Temperature By Location Example
    Word Frequency in a Book Example
    Making the Word Frequency Mapper Better with Regular Expressions
    Sorting the Word Frequency Results Using Multi-Stage MapReduce Jobs
    Activity: Design a Mapper and Reducer for Total Spent by Customer
    Activity: Write Code for Total Spent by Customer
    Compare Your Code to Mine. Activity: Sort Results by Amount Spent
    Compare your Code to Mine for Sorted Results.
  • Advanced MapReduce Examples
    Example: Most Popular Movie
    Including Ancillary Lookup Data in the Example
    Example: Most Popular Superhero, Part 1
    Example: Most Popular Superhero, Part 2
    Example: Degrees of Separation: Concepts
    Degrees of Separation: Preprocessing the Data
    Degrees of Separation: Code Walkthrough
    Degrees of Separation: Running and Analyzing the Results
    Example: Similar Movies Based on Ratings: Concepts
    Similar Movies: Code Walkthrough
    Similar Movies: Running and Analyzing the Results
    Learning Activity: Improving our Movie Similarities MapReduce Job
  • Using Hadoop and Elastic MapReduce
    Fundamental Concepts of Hadoop
    The Hadoop Distributed File System (HDFS)
    Apache YARN
    Hadoop Streaming: How Hadoop Runs your Python Code
    Setting Up Your Amazon Elastic MapReduce Account
    Linking Your EMR Account with MRJob
    Exercise: Run Movie Recommendations on Elastic MapReduce
    Analyze the Results of Your EMR Job
  • Advanced Hadoop and EMR
    Distributed Computing Fundamentals
    Activity: Running Movie Similarities on Four Machines
    Analyzing the Results of the 4-Machine Job
    Troubleshooting Hadoop Jobs with EMR and MRJob, Part 1
    Troubleshooting Hadoop Jobs, Part 2
    Analyzing One Million Movie Ratings Across 16 Machines, Part 1
    Analyzing One Million Movie Ratings Across 16 Machines, Part 2
  • Other Hadoop Technologies
    Introducing Apache Hive
    Introducing Apache Pig
    Apache Spark: Concepts
    Spark Example: Part 1
    Spark Example: Part 2
  • Where to Go from Here
    New Lecture

Taming Big Data with MapReduce & Hadoop

Sundog Software

Frank Kane spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis. This course is hosted by StackSkills, the premier eLearning destination for discovering top-shelf courses on everything from coding to business to fitness, and beyond!


Big data is hot, and data management and analytics skills are your ticket to a fast-growing, lucrative career. This course will quickly teach you two technologies fundamental to big data: MapReduce and Hadoop. Learn and master the art of framing data analysis problems as MapReduce problems with more than 10 hands-on examples. Write, analyze, and run real code along with the instructor, both on your own system and in the cloud using Amazon's Elastic MapReduce service. By the end of the course, you'll have a solid grasp of data management concepts.

  • Learn the concepts of MapReduce to analyze big sets of data with 56 lectures & 5.5 hours of content
  • Run MapReduce jobs quickly using Python & MRJob
  • Translate complex analysis problems into multi-stage MapReduce jobs
  • Scale up to larger data sets using Amazon's Elastic MapReduce service
  • Understand how Hadoop distributes MapReduce across computing clusters
  • Complete projects to get hands-on experience: analyze social media data, movie ratings & more
  • Learn about other Hadoop technologies, like Hive, Pig & Spark
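
To give a rough feel for what "framing a problem as MapReduce" means, here is a minimal sketch of a two-stage word-frequency job, similar in spirit to the word frequency lessons in the curriculum. It is not course material and does not use MRJob or Hadoop: it simulates the map, shuffle, and reduce steps locally in plain Python, and every function name in it (`map_words`, `shuffle`, `reduce_counts`, `word_frequency`) is a hypothetical choice made for illustration.

```python
import re
from collections import defaultdict

# Matches runs of word characters and apostrophes (illustrative choice).
WORD_RE = re.compile(r"[\w']+")


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in one line of text."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def shuffle(pairs):
    """Group values by key, standing in for what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reducer: sum the 1s emitted for a single word."""
    return word, sum(counts)


def word_frequency(lines):
    """Stage 1 counts words; stage 2 sorts by descending count, then by word."""
    mapped = (pair for line in lines for pair in map_words(line))
    counted = [reduce_counts(w, c) for w, c in shuffle(mapped).items()]
    return sorted(counted, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ["the quick brown fox", "The lazy dog and the fox"]
    for word, count in word_frequency(sample):
        print(word, count)
```

On a real cluster the mapper and reducer would run in parallel across machines and the sort would typically be a second job; the local version above only shows how the problem is split into those pieces.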


Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels


  • Unredeemed licenses can be returned for store credit within 30 days of purchase. Once your license is redeemed, all sales are final.