- Week 1: Data acquisition: Web scraping, Calling Internet APIs
- Week 2: Linear Regression: Multivariate linear regression, Polynomial regression, Regularization (Lasso, Ridge), Cross validation, Train-Test split, MAE, MSE
- Week 3: Classification 1: Logistic regression, Accuracy, Confusion Matrix, Precision, Recall, F1-score
- Week 4: Classification 2: KNN Classifier, Decision Trees
- Week 5: Clustering: K-Means, Hierarchical clustering, Dendrogram
- Week 6: Association Rules: Association rule mining, Apriori algorithm
- Week 7: Recommender systems: User-User Collaborative Filtering (from scratch and using Surprise library), Mean-centered cosine similarity, Precision and Recall at rank k, Precision-recall curve
- Week 8: Text analytics: Text preparation (Tokenization, Lemmatization, Stopwords), Text representation (Bag of Words, TF-IDF), Text structure (Dependency Parsing, Entity recognition), Text similarity (cosine similarity)
- Week 9: Text analytics 2: Text embeddings, Bag of Words, TF-IDF, Word2vec, application to text classification
- OpenML
- Hugging Face
- 75 Public datasets for Machine Learning
- https://github.com/fivethirtyeight/data
For the project, you will have to work with Git and GitHub. The following documentation may be useful: