Apache Spark Essential Training

Introducing this tutorial


What you should know before watching this course
Using the exercise files

1. Introducing Apache Spark

Understanding Spark
Origins of Spark
Overview of Spark components
Where Spark shines
Overview of Databricks
Introduction to notebooks and PySpark

2. Analyzing Data in Spark

Understanding data interfaces
Working with text files
Loading CSV data into DataFrames
Exploring data in DataFrames
Saving your results

3. Using Spark SQL to Analyze Data

Creating tables
Querying data with Spark SQL
Visualizing data in Databricks notebooks

4. Running Machine Learning Algorithms Using MLlib

Introduction to machine learning with Spark
Preparing data for machine learning
Building a linear regression model
Evaluating a linear regression model
Visualizing a linear regression model

5. Real-Time Data Analysis with Spark Streaming

Introduction to streaming analytics
Streaming context setup
Querying streaming data

6. Connecting BI Tools to Spark

Setting up Spark locally
Connecting Jupyter notebooks to Spark
Other connection options


Next steps