avoytkiv/README.md

Andrii Voitkiv - portfolio

About

This is a repository to showcase skills, share projects and track my progress in Data Science / Machine Learning related topics.

Table of contents

Study projects

This section links to my GitHub repositories containing code and Jupyter notebooks I created while taking courses.

Master of Data Science and Analytics

This is a 12-month master's program at the University of Calgary, Canada. For more details, go to repo...
List of courses:

Pytorch fundamentals

Code: go to repo...
Status: In progress

NLP Cook Dishes Project

Code: go to repo...
Description: This project is an in-depth exploration of various NLP models with the purpose of generating text based on a dataset of recipes.
Skills in focus: N-gram language models, Neural language models (RNN-LSTM, convolutional), Sampling strategies, Language model evaluation
Status: Completed in November 2023
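
As a rough illustration of the skills listed above (not code from the project repository), here is a minimal sketch of a bigram language model with add-alpha smoothing, temperature sampling, and perplexity as an evaluation metric. The toy corpus, the function names, and the choice of smoothing are assumptions made for this example, and the sketch does not handle out-of-vocabulary tokens.

```python
import math
import random
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical sentence boundary markers


def train_bigram(sentences):
    """Count bigrams: counts[prev][nxt] = times nxt follows prev."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev, vocab, alpha=1.0):
    """Add-alpha smoothed distribution P(next | prev) over vocab."""
    total = sum(counts[prev].values()) + alpha * len(vocab)
    return {tok: (counts[prev][tok] + alpha) / total for tok in vocab}


def sample(counts, vocab, temperature=1.0, max_len=20, rng=None):
    """Generate text token by token with temperature-scaled sampling."""
    rng = rng or random.Random()
    prev, out = BOS, []
    for _ in range(max_len):
        probs = next_token_probs(counts, prev, vocab)
        tokens = list(probs)
        # Lower temperature sharpens the distribution, higher flattens it.
        weights = [probs[t] ** (1.0 / temperature) for t in tokens]
        prev = rng.choices(tokens, weights=weights, k=1)[0]
        if prev == EOS:
            break
        out.append(prev)
    return " ".join(out)


def perplexity(counts, vocab, sentences):
    """Exponential of the average negative log-likelihood per token."""
    log_prob, n = 0.0, 0
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            log_prob += math.log(next_token_probs(counts, prev, vocab)[nxt])
            n += 1
    return math.exp(-log_prob / n)


# Toy recipe-like corpus, invented for this example.
corpus = ["mix flour and sugar", "mix butter and sugar", "bake the cake"]
counts = train_bigram(corpus)
vocab = sorted({tok for s in corpus for tok in s.split()} | {EOS})

print(sample(counts, vocab, temperature=0.7, rng=random.Random(0)))
print(perplexity(counts, vocab, ["mix flour and sugar"]))
```

A sentence whose bigrams were seen in training should get a lower perplexity than the same words in an unseen order, which is one simple sanity check for a language model.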

Portfolio end-to-end projects

This section lists projects, with a brief description of the technology stack used in each.

Client Segmentation for targeted marketing in a credit union

Code: go to repo...
Presentation: go to google slides...
Industry: Banking and Finance
Description: The focus of the project was to build a highly flexible and automated ML pipeline for running experiments. The best model is then deployed to an app through a series of automated workflows.
Skills in focus: Clustering, Model selection, Data and model versioning, Experimentation, CI/CD pipelines
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv, Hydra
  • Data Management: DVC (Data Version Control), AWS S3
  • DS and ML: scikit-learn (PCA, clustering algorithms), Keras (autoencoder)
  • Continuous Integration: GitHub Actions, CML, AWS EC2
  • Continuous Deployment: FastAPI, Heroku

Results: The resulting client segments help the credit union make better decisions about how to reach out to different groups of clients.
Status: Completed in August 2023.
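
To give a feel for the clustering and model selection steps named above, here is a minimal sketch using scikit-learn: it reduces features with PCA, fits k-means for several cluster counts, and picks the count with the highest silhouette score. The synthetic data, the choice of k-means, and the silhouette criterion are assumptions for illustration only; the project's actual data, algorithms, and selection metric are in the repository.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for client features (assumption: real data not shown here).
centers = [[0, 0, 0, 0], [10, 10, 10, 10], [-10, 10, -10, 10]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)

# Scale features, then project onto two principal components.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# Model selection: compare candidate cluster counts by silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_reduced)
    scores[k] = silhouette_score(X_reduced, labels)

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

In an automated experiment pipeline, a loop like this would typically be parameterized (for example through a config file) so that the algorithm and its hyperparameters can be swapped without code changes.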

Predicting job salary

Code: go to repo...
Description: From the Kaggle competition brief: "Adzuna wants to build a prediction engine for the salary of any UK job ad, so they can make huge improvements in the experience of users searching for jobs, and help employers and jobseekers figure out the market worth of different positions."
Data: A large dataset (hundreds of thousands of records), mostly unstructured text with a few structured data fields.
Skills in focus: Regression, Tokenization, Categorical Vectorization, Neural Networks, OOP, ML Pipeline (Azure CLI), Components (Azure CLI), Deployment
Tools:

  • Environment: GitHub Codespaces, devcontainer, conda, Azure CLI, Azure ML Studio
  • DS and ML: PyTorch, scikit-learn

Status: Completed in September 2023.

Predicting diabetes on Azure ML with GitHub Actions

Code: go to repo...
Industry: Healthcare
Description:
Skills in focus: Logistic Regression, CI/CD pipelines, Linting, Testing, Package and Register the Model
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv
  • Data Management: Azure ML Datastore
  • DS and ML: scikit-learn (Logistic regression)
  • Continuous Integration: GitHub Actions, Azure ML Resources (Job, Compute, Environment), flake8, pytest
  • Continuous Deployment: MLflow

Results: An automated workflow that is triggered when a new model is registered and deploys the newly registered model to the production environment.
Status: Completed in October 2023.

Fine-Tuning-LLM-with-SkyPilot-and-DVC

Code: go to repo...
Description: Fine-tune a foundation LLM for sentiment classification of hotel reviews, running in the cloud on GPUs.
Skills in focus: Text classification, LLM fine-tuning, Infrastructure provisioning, Checkpointing
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv
  • Infrastructure Management: SkyPilot
  • DS and ML: Transformer, PyTorch
  • Continuous ML: DVC, Weights and Biases

Results: A cost-optimized cloud setup for fine-tuning an LLM with continuous machine learning.
Status: Completed in October 2023.

Contacts
