Correlations, Association & Hypothesis Testing (with Python)
Learn how to measure and quantify associations between variables using Python. This course covers the foundation of hypothesis testing and explores the strength of associations in statistical analysis and machine learning. Suitable for both junior analysts and experienced data scientists, the course includes practical sessions and quizzes to reinforce learning. By the end, you’ll have a clear understanding of statistical metrics, hypothesis tests, and when to use them. Perfect for anyone interested in data analysis at all levels.
What you’ll learn
- Measure and quantify the strength and statistical significance of associations.
- Apply these methods to real-world datasets using Python.
- Exploratory analysis of associations between variables/features.
- The foundation and essence of hypothesis testing for associations between variables/features.
Exploring and assessing the strength of associations between variables/features plays a fundamental role in statistical analysis and machine learning.
All the applications in the course are implemented in Python. This course overlaps with my other course “Correlations, Associations and Hypothesis Testing (with R)”.
I decided to create this course after leading many data science projects and coming across many data scientists struggling with the fundamentals of association between variables/features and hypothesis testing.
This course will be beneficial to junior analysts as well as to more experienced data scientists. In particular:
If you are an aspiring or junior data analyst/scientist, this course will help you build the right foundation at an early stage of your career.
If you are an experienced data scientist, this course will help you revisit and improve your understanding of how associations between variables/features are assessed.
The course is divided into three main sections.
The first section looks at the assessment and quantification of associations between numerical variables.
The second section focusses on the assessment of associations between categorical variables.
The third section covers the assessment of associations between numerical and categorical variables.
Each section discusses a number of statistical metrics for associations between variables and then builds statistical hypothesis tests to assess whether these associations are statistically significant.
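To give a flavour of the three settings, here is a minimal sketch that computes one test statistic for each: Pearson's correlation (numerical vs. numerical), the chi-squared statistic (categorical vs. categorical) and the one-way ANOVA F statistic (numerical vs. categorical). The function names and the toy data are illustrative assumptions, not material from the course, and the sketch uses only the Python standard library, so it stops at the test statistics rather than computing p-values.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two numerical samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a numerical variable split by category."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data (made up for illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print("Pearson r:", r)
print("t statistic:", correlation_t_statistic(r, len(x)))
print("Chi-squared:", chi_squared_statistic([[10, 20], [20, 10]]))
print("ANOVA F:", anova_f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

In practice you would typically rely on a library such as SciPy, whose `scipy.stats` module provides `pearsonr`, `chi2_contingency` and `f_oneway`, all of which also return p-values.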
There are practical sessions throughout the course, where you will see how to implement the methods discussed (using Python) and perform various hypothesis tests on real-world datasets. You will also learn how to interpret the results in a broader context.
In addition, a quiz is included at the end of each section. These quizzes help you consolidate the main concepts covered in the course.
By the end of the course, you will have a clear and coherent understanding of covariances, correlations, the t-test, the chi-squared test, ANOVA, the F-test, and much more. In particular, you will know when to use these tests and how to check that their underlying assumptions are satisfied.
Who this course is for:
- Anyone interested in data analysis.
- All levels