The Ultimate Hands-On Hadoop – Tame your Big Data!


Learn all the buzzwords! And install Hadoop :
[Activity] Introduction, and install Hadoop on your desktop!
Hadoop Overview and History
Overview of the Hadoop Ecosystem
Tips for Using This Course

Using Hadoop's Core: HDFS and MapReduce :
HDFS: What it is, and how it works
[Activity] Install the MovieLens dataset into HDFS using the Ambari UI
[Activity] Install the MovieLens dataset into HDFS using the command line
MapReduce: What it is, and how it works
How MapReduce distributes processing
MapReduce example: Break down movie ratings by rating score
[Activity] Installing Python, MRJob, and nano
[Activity] Code up the ratings histogram MapReduce job and run it
[Exercise] Rank movies by their popularity
[Activity] Check your results against mine!

Programming Hadoop with Pig :
Introducing Ambari
Introducing Pig
Example: Find the oldest movie with a 5-star rating using Pig
[Activity] Find old 5-star movies with Pig
More Pig Latin
[Exercise] Find the most-rated one-star movie
Pig Challenge: Compare Your Results to Mine!

Programming Hadoop with Spark :
Why Spark?
The Resilient Distributed Dataset (RDD)
[Activity] Find the movie with the lowest average rating - with RDDs
Datasets and Spark 2.0
[Activity] Find the movie with the lowest average rating - with DataFrames
[Activity] Movie recommendations with MLlib
[Exercise] Filter the lowest-rated movies by number of ratings
[Activity] Check your results against mine!

Using relational data stores with Hadoop :
What is Hive?
[Activity] Use Hive to find the most popular movie
How Hive works
[Exercise] Use Hive to find the movie with the highest average rating
Compare your solution to mine.
Integrating MySQL with Hadoop
[Activity] Install MySQL and import our movie data
[Activity] Use Sqoop to import data from MySQL to HDFS/Hive
[Activity] Use Sqoop to export data from Hadoop to MySQL

Using non-relational data stores with Hadoop :
Why NoSQL?
What is HBase?
[Activity] Import movie ratings into HBase
[Activity] Use HBase with Pig to import data at scale.
Cassandra overview
[Activity] Installing Cassandra
[Activity] Write Spark output into Cassandra
MongoDB overview
[Activity] Install MongoDB, and integrate Spark with MongoDB
[Activity] Using the MongoDB shell
Choosing a database technology
[Exercise] Choose a database for a given problem

Querying your Data Interactively :
Overview of Drill
[Activity] Setting up Drill
[Activity] Querying across multiple databases with Drill
Overview of Phoenix
[Activity] Install Phoenix and query HBase with it
[Activity] Integrate Phoenix with Pig
Overview of Presto
[Activity] Install Presto, and query Hive with it.
[Activity] Query both Cassandra and Hive using Presto.

Managing your Cluster :
YARN explained
Tez explained
[Activity] Use Hive on Tez and measure the performance benefit
Mesos explained
ZooKeeper explained
[Activity] Simulating a failing master with ZooKeeper
Oozie explained
[Activity] Set up a simple Oozie workflow
Zeppelin overview
[Activity] Use Zeppelin to analyze movie ratings, part 1
[Activity] Use Zeppelin to analyze movie ratings, part 2
Hue overview
Other technologies worth mentioning

Feeding Data to your Cluster :
Kafka explained
[Activity] Setting up Kafka, and publishing some data.
[Activity] Publishing web logs with Kafka
Flume explained
[Activity] Set up Flume and publish logs with it.
[Activity] Set up Flume to monitor a directory and store its data in HDFS

Analyzing Streams of Data :
Spark Streaming: Introduction
[Activity] Analyze web logs published with Flume using Spark Streaming
[Exercise] Monitor Flume-published logs for errors in real time
Exercise solution: Aggregating HTTP access codes with Spark Streaming
Apache Storm: Introduction
[Activity] Count words with Storm
Flink: An Overview
[Activity] Counting words with Flink

Designing Real-World Systems :
The Best of the Rest
Review: How the pieces fit together
Understanding your requirements
Sample application: consume webserver logs and keep track of top-sellers
Sample application: serving movie recommendations to a website
[Exercise] Design a system to report web sessions per day
Exercise solution: Design a system to count daily sessions

Learning More :
Books and online resources
Bonus lecture: Discounts on my other big data / data science courses!
