PySpark for Data Science - II: Statistics for Big Data

  • Dive into the world of big data processing with PySpark, the Python library for Apache Spark.
  • Learn how to process, analyze, and derive insights from massive datasets using Python’s user-friendly
    interface.
  • Elevate your data skills with PySpark. Dive deep into distributed data processing, machine learning,
    streaming, and more to navigate the vast oceans of big data.

Created by Selva Prabhakaran

  • English

  • English Captions

What you will learn

01 Introduction to PySpark

02 PySpark Statistics

03 PySpark Data Cleaning and Processing

04 PySpark MLlib Models

Course Curriculum

Requirements

  • Basics of Python
  • Foundational knowledge of Data Science
  • High school maths

Who should attend this course?

  • Data Science Aspirants

  • Data Science Professionals

  • Software/Data engineers interested in quantitative analysis

  • Professionals working with large datasets

  • Data analysts, economists, researchers

About the course

You will learn the following skills by the end of the course:

  • LightGBM
  • XGBoost
  • Random Forest
  • Decision Tree
  • Logistic Regression
  • Hyperparameter Tuning
  • Feature Importance
  • Confusion Matrix
  • ROC AUC
  • Concordance and Discordance
  • Precision Recall Curve
  • Capture Rates and Gains
  • Feature Engineering
  • Label Encoding
  • Frequency Encoding
  • Chi-Square test
  • ANOVA test
  • Exploratory Data Analysis
  • Memory Optimization
  • Data Preprocessing

Instructor

Selva Prabhakaran, Principal Data Scientist

My name is Selva, and I am super excited to mentor you on this project!

I head the Data Science team at a global Fortune 500 company, and over my 10 years in data science I’ve deployed 20+ global products. I’m also the Founder & Chief Author of Machine Learning Plus, which has over 4M annual readers.

I specialize in covering the in-depth intuition and maths behind each concept and algorithm. Based on requests from my existing students, I’ve put together this series of courses and projects with detailed explanations, much like on-the-job experience. I hope you love it!

  • 4.8+ Instructor rating

  • 200+ reviews

  • 75K+ students

  • 40+ Courses

machinelearningplus 2024