Apache Spark with Scala - Hands On with Big Data!

An Apache Spark tutorial with 20+ hands-on Scala examples of analyzing large data sets, on your desktop or on Hadoop!

  • Sundog Education by Frank Kane
  • Rating: 4.55 (16,302 reviews)
  • 9 hrs
  • 69 lectures
  • Udemy

What will you learn?

  • Develop distributed code using the Scala programming language
  • Transform structured data using SparkSQL, DataSets, and DataFrames
  • Frame big data analysis problems as Apache Spark scripts
  • Optimize Spark jobs through partitioning, caching, and other techniques
  • Build, deploy, and run Spark scripts on Hadoop clusters
  • Process continual streams of data with Spark Streaming
  • Traverse and analyze graph structures using GraphX
  • Analyze massive data sets with Machine Learning on Spark

Your trainer

Sundog Education by Frank Kane

Sundog Education's mission is to make highly valuable career skills in big data, data science, and machine learning accessible to everyone in the world. Our consortium of expert instructors shares our knowledge in these emerging fields with you, at prices anyone can afford.

Sundog Education is led by Frank Kane and owned by Frank's company, Sundog Software LLC. Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

69 lessons

Easy to follow lectures and videos covering subject details.

9 hours

This course includes 9 hours of video material. Watch on-demand, anytime, anywhere.

Certificate of Completion

You will earn a Certificate of Completion at the end of this course.

Course curriculum

  • Udemy 101: Getting the Most From This Course (02:10)
  • Alternate download link for the ml-100k dataset (00:07)
  • WARNING: DO NOT INSTALL JAVA 16 IN THE NEXT LECTURE (00:07)
  • Introduction, and installing the course materials, IntelliJ, and Scala (15:54)
  • Introduction to Apache Spark (14:26)
  • Spark Basics (5 questions)
  • Important note (00:24)
  • [Activity] Scala Basics (25:58)
  • [Exercise] Flow Control in Scala (09:28)
  • [Exercise] Functions in Scala (09:08)
  • [Exercise] Data Structures in Scala (22:28)
  • The Resilient Distributed Dataset (11:30)
  • Ratings Histogram Example (11:26)
  • Spark Internals (01:59)
  • Key / Value RDD's, and the Average Friends by Age example (10:42)
  • [Activity] Running the Average Friends by Age Example (04:51)
  • Filtering RDD's, and the Minimum Temperature by Location Example (05:54)
  • [Activity] Running the Minimum Temperature Example, and Modifying it for Maximum (11:35)
  • [Activity] Counting Word Occurrences using Flatmap() (05:46)
  • [Activity] Improving the Word Count Script with Regular Expressions (03:44)
  • [Activity] Sorting the Word Count Results (06:35)
  • [Exercise] Find the Total Amount Spent by Customer (04:30)
  • [Exercise] Check your Results, and Sort Them by Total Amount Spent (05:09)
  • Check Your Results and Implementation Against Mine (03:00)
  • Quiz: RDD's (5 questions)
  • Introduction to SparkSQL (09:44)
  • [Activity] Using SparkSQL (07:05)
  • [Activity] Using DataSets (08:33)
  • [Exercise] Implement the "Friends by Age" example using DataSets (02:40)
  • Exercise Solution: Friends by Age, with Datasets. (07:22)
  • [Activity] Word Count example, using Datasets (10:37)
  • [Activity] Revisiting the Minimum Temperature example, with Datasets (09:00)
  • [Exercise] Implement the "Total Spent by Customer" problem with Datasets (02:10)
  • Exercise Solution: Total Spent by Customer with Datasets (06:28)
  • Quiz: SparkSQL (5 questions)
  • [Activity] Find the Most Popular Movie (05:24)
  • [Activity] Use Broadcast Variables to Display Movie Names (11:19)
  • [Activity] Find the Most Popular Superhero in a Social Graph (12:18)
  • [Exercise] Find the Most Obscure Superheroes (05:14)
  • Exercise Solution: Find the Most Obscure Superheroes (06:44)
  • Superhero Degrees of Separation: Introducing Breadth-First Search (07:14)
  • Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark (07:59)
  • [Activity] Superhero Degrees of Separation: Review the code, and run it! (12:55)
  • Item-Based Collaborative Filtering in Spark, cache(), and persist() (07:59)
  • [Activity] Running the Similar Movies Script using Spark's Cluster Manager (14:48)
  • [Exercise] Improve the Quality of Similar Movies (03:54)
  • [Activity] Using spark-submit to run Spark driver scripts (11:43)
  • [Activity] Packaging driver scripts with SBT (15:06)
  • [Exercise] Package a Script with SBT and Run it Locally with spark-submit (02:04)
  • Exercise solution: Using SBT and spark-submit (09:04)
  • Introducing Amazon Elastic MapReduce (07:11)
  • Creating Similar Movies from One Million Ratings on EMR (11:33)
  • Partitioning (04:18)
  • Best Practices for Running on a Cluster (06:25)
  • Troubleshooting, and Managing Dependencies (10:59)
  • Quiz: Spark on a Cluster (5 questions)
  • Introducing MLLib (09:56)
  • [Activity] Using MLLib to Produce Movie Recommendations (12:42)
  • Linear Regression with MLLib (06:58)
  • [Activity] Running a Linear Regression with Spark (07:47)
  • [Exercise] Predict Real Estate Values with Decision Trees in Spark (04:56)
  • Exercise Solution: Predicting Real Estate with Decision Trees in Spark (05:47)
  • Quiz: Spark ML (5 questions)
  • The DStream API for Spark Streaming (11:28)
  • [Activity] Real-time Monitoring of the Most Popular Hashtags on Twitter (08:51)
  • Structured Streaming (04:03)
  • [Activity] Using Structured Streaming for real-time log analysis (05:33)
  • [Exercise] Windowed Operations with Structured Streaming (06:04)
  • Exercise Solution: Top URL's in a 30-second Window (05:44)
  • Quiz: Spark Streaming (5 questions)
  • GraphX, Pregel, and Breadth-First-Search with Pregel. (06:51)
  • Using the Pregel API with Spark GraphX (04:29)
  • [Activity] Superhero Degrees of Separation using GraphX (07:07)
  • Learning More, and Career Tips (04:15)
  • Bonus Lecture: More courses to explore! (01:07)