Advanced Analytics and Real-Time Data Processing in Apache Spark

Video Introducing this tutorial

The Course Overview
Introducing Spark Streaming
Streaming Context
Processing Streaming Data
Use Cases
Spark Streaming Word Count Hands-On
Spark Streaming - Understanding Master URL
Integrating Spark Streaming with Apache Kafka
mapWithState Operation
Transform and Window Operation
Join and Output Operations
Output Operations - Saving Results to Kafka Sink
Handling Time in High Velocity Streams
Connecting External Systems That Work with an At-Least-Once Guarantee - Deduplication
Building a Streaming Application - Handling Events That Are Not in Order
Filtering Bots from Stream of Page View Events
Introducing Machine Learning with Spark
Feature Extraction and Transformation
Transforming Text into Vector of Numbers - ML Bag-of-Words Technique
Logistic Regression
Model Evaluation
Implementing GMM in Apache Spark
Principal Component Analysis and Distributing the Singular Value Decomposition (SVD)
Collaborative Filtering - Building Recommendation Engine
Introducing Spark GraphX - How to Represent a Graph?
Limitations of Graph-Parallel System - Why Spark GraphX?
Importing GraphX
Create a Graph Using GraphX and Property Graph
List of Operators
Perform Graph Operations Using GraphX
Triplet View
Perform Subgraph Operations
Neighbourhood Aggregations - Collecting Neighbours
Counting Degree of Vertex
Caching and Uncaching
Vertex and Edge RDD
Structural Operators - Connected Components
Introduction to SparkR and How It's Used
Setting Up from RStudio
Creating Spark DataFrames from Data Sources
SparkDataFrame Operations - Grouping, Aggregation
Run a Given Function on a Large Dataset Using dapply or dapplyCollect
Run a Given Function on a Large Dataset Grouping by Input Column(s) Using gapply or gapplyCollect
Run Local R Functions Distributed Using spark.lapply
Running SQL Queries from SparkR
PageRank Using Spark GraphX
Sending Real-Time Notification When a User Wants to Buy a Product on the E-Commerce Site

