Cleaning Data with PySpark
Learn how to clean data with Apache Spark in Python.
Course Description
Working with data is tricky – working with millions or even billions of rows is worse.
What You’ll Learn
DataFrame details
A review of DataFrame fundamentals and the importance of data cleaning.
Improving performance
Make data cleaning tasks run faster or use fewer resources.
Manipulating DataFrames in the real world
A look at various techniques to modify the contents of DataFrames in Spark.
Complex processing and data pipelines
Learn how to process complex real-world data using Spark and the basics of pipelines.