Scala and Spark for Big Data and Machine Learning
Learn the latest Big Data technology - Spark and Scala, including Spark 2.0 DataFrames!
What you will learn
- Use Scala for Programming
- Use Spark 2.0 DataFrames to read and manipulate data
- Use Spark to Process Large Datasets
- Understand how to use Spark on AWS and Databricks
Your trainer
Jose Portilla
Jose Marcial Portilla has a BS and MS in Mechanical Engineering from Santa Clara University and years of experience as a professional instructor and trainer for data science and programming. He has publications and patents in various fields such as microfluidics, materials science, and data science technologies. Over the course of his career he has developed a skill set in analyzing data, and he hopes to use his experience in teaching and data science to help other people learn the power of programming and the ability to analyze data and present it in clear and beautiful visualizations. Currently he works as the Head of Data Science for Pierian Data Inc. and provides in-person data science and Python programming training courses to employees working at top companies, including General Electric, Cigna, The New York Times, Credit Suisse, McKinsey and many more. Feel free to contact him on LinkedIn for more information on in-person training sessions or group training sessions in Las Vegas, NV.
80 lessons
Easy-to-follow lectures and videos covering the subject in detail.
10 hours
This course includes hours of video material. Watch on-demand, anytime, anywhere.
Certificate of Completion
You will earn a Certificate of Completion at the end of this course.
Course curriculum
- Introduction (02:28)
- Course FAQs (00:13)
- Scala and Spark Overview (10:40)
- ScalaIDE Overview (02:51)
- Computer Set-up Time! (00:17)
- Windows Introduction (00:40)
- Quick Note about Windows Installation (00:17)
- Windows Scala and Spark Installation (12:09)
- Atom Windows Installation (09:30)
- Terminal Exercise (00:35)
- Mac OS Installation and Setup (09:57)
- Installing Scala and Spark on Linux (Ubuntu) (12:49)
- Arithmetic and Numbers (07:00)
- Values and Variables (07:49)
- Booleans and Comparison Operators (02:11)
- Strings and Basic Regex (12:48)
- Tuples (02:35)
- Scala Basics - Assessment Test Exercises (00:38)
- Scala Basics Assessment Test Questions (00:25)
- Scala Basics - Assessment Test Solutions (05:53)
- Intro to Collections (00:47)
- Lists (08:28)
- Arrays (03:48)
- Sets (06:02)
- Maps (07:18)
- Collections - Assessment Test Exercise (00:30)
- Scala Collections Assessment Test (00:26)
- Collections Assessment Test - Solutions (06:14)
- Flow Control (08:35)
- For Loops (05:57)
- While Loops (05:55)
- Functions (12:45)
- Scala Programming Exercises (02:33)
- Scala Programming Exercises - Solutions (15:24)
- Quick Note for Windows Users! (00:39)
- Introduction to Spark DataFrames (06:29)
- DataFrames Overview (18:12)
- Spark DataFrame Operations (16:23)
- GroupBy and Aggregate Functions (10:53)
- Missing Data (13:16)
- Date and Timestamps (09:53)
- Quick Note on DataFrame Project (00:11)
- DataFrame Project Exercises (01:34)
- DataFrame Project - Solutions (20:20)
- Introduction to Machine Learning (06:50)
- Machine Learning with Spark (11:50)
- IntelliJ IDEA Installation Overview (11:08)
- Introduction to Linear Regression (06:14)
- Introduction to Regression Section (01:07)
- Linear Regression Documentation Example (08:29)
- Alternate Linear Regression Data CSV File (00:17)
- Linear Regression Walkthrough Part 1 (16:40)
- Linear Regression Walkthrough Part 2 (07:26)
- Linear Regression Exercise Project (02:32)
- Linear Regression Project Solutions (16:56)
- Introduction to Classification (12:42)
- Classification Documentation Example (07:39)
- Spark Classification - Logistic Regression Example - Part 1 (15:49)
- Spark Classification - Logistic Regression Example - Part 2 (21:40)
- Logistic Regression Project Exercise (01:52)
- Classification Project Solutions (15:16)
- Model Evaluation Overview (10:23)
- Spark Model Evaluation - Documentation Example (21:32)
- Spark - Model Evaluation - Regression Example (23:16)
- Introduction to Clustering with Spark (01:37)
- KMeans Theory Lecture (05:05)
- Note on KMeans (00:08)
- Example of KMeans with Spark (07:15)
- Clustering Project Exercise Overview (03:42)
- Clustering Project Exercises - Solutions (10:31)
- PCA Theory Overview (03:13)
- PCA with Spark - Documentation Example (06:00)
- PCA with Spark - Project Exercise (03:06)
- PCA Spark Exercise - Solutions (10:39)
- Databricks Overview (17:24)
- Introduction to Spark Recommendation Systems (04:03)
- Spark Recommender System Implementation (13:35)
- Zeppelin Notebooks on AWS Elastic MapReduce (19:57)
- So what's next? (00:49)
- Bonus Lecture: (00:10)