Introduction to Data Science – Python
Data science is an interdisciplinary field, grounded in mathematics, statistics and computing, that aims to extract knowledge and insights from both structured and unstructured data. This expanding domain lets organizations address numerous business use cases, but so many technologies and solutions are now on the market that companies struggle to navigate the ecosystem.
As the leading programming language for data science, Python opens up countless possibilities for data scientists, and mastering its wide-ranging capabilities is key. This course is a comprehensive introduction to this world and will help you harness Python’s power through machine learning algorithms.
1 - Presentation of data science tools
Presentation and use of Jupyter, Python, NumPy, Matplotlib, pandas and the scikit-learn (sklearn) environment
2 - Description and implementation of common Machine Learning algorithms
Supervised learning: classification with k-nearest neighbors (k-NN), Bayes classifiers, SVM, decision trees and ensemble classifiers; linear and nonlinear regression. Unsupervised learning: clustering with K-Means, hierarchical agglomerative clustering (HAC) and SVDD.
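As an illustration of the first classifier in this list, here is a minimal from-scratch sketch of k-nearest-neighbors classification. The function name `knn_predict` and the toy data are made up for this example and are not taken from the course material; in practice one would more likely use scikit-learn's `KNeighborsClassifier`.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and return the most common label among them.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated groups of 2-D points.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_low = knn_predict(train_points, train_labels, (1.1, 1.0))
prediction_high = knn_predict(train_points, train_labels, (5.1, 5.0))
```

An odd `k` is used here so that a two-class vote cannot end in a tie.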
3 - Discovery and application of the main concepts of Machine Learning
Overfitting and underfitting, splitting datasets, cross-validation, the curse of dimensionality, model selection, imbalanced datasets, regularization, etc.
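To make dataset splitting and cross-validation concrete, the sketch below builds k-fold splits by hand and scores a deliberately simple 1-nearest-neighbor model on each held-out fold. The helper names (`k_fold_splits`, `nearest_neighbor`) and the toy data are assumptions made for this example; scikit-learn offers `KFold` and `cross_val_score` for real projects.

```python
import math
import random


def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_indices in enumerate(folds):
        train_indices = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_indices, test_indices


def nearest_neighbor(train_points, train_labels, query):
    """1-nearest-neighbor prediction, used here only as a stand-in model."""
    best = min(range(len(train_points)),
               key=lambda i: math.dist(train_points[i], query))
    return train_labels[best]


# Toy data (hypothetical): two well-separated groups of five points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3), (0.8, 0.9),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.3), (4.9, 4.7)]
labels = ["a"] * 5 + ["b"] * 5

fold_accuracies = []
for train_idx, test_idx in k_fold_splits(len(points), n_folds=5):
    train_x = [points[i] for i in train_idx]
    train_y = [labels[i] for i in train_idx]
    # Each fold is scored only on samples the model did not see during "training".
    correct = sum(
        nearest_neighbor(train_x, train_y, points[i]) == labels[i]
        for i in test_idx
    )
    fold_accuracies.append(correct / len(test_idx))

mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

Averaging the per-fold scores gives a less optimistic estimate than scoring on the training data, which is the usual way to detect overfitting.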
4 - Summing-up exercise
5 - A few good practices in data science projects
Objectives
- Understanding the fundamental principles of data science
- Discovering the main families of algorithms
- Applying these algorithms with Python libraries
Prerequisites
- Prior knowledge in mathematics
- Prior knowledge in programming