This repository contains the material needed to follow the demo from the presentation "Using MLflow at low cost in EC2", given at the AWS Community Day Colombia event.
The repository is structured into various sections for the installation and configuration of AWS services, following the architecture outlined below:
[Demo Guide in English - still in progress; in the meantime you can use the one in Spanish]
- Data science teams looking to take their first steps on AWS and enhance ML processes
- Individuals with a Data Scientist role
- Individuals with a Data Engineer role
- Individuals with an ML Engineer role
- Any other role seeking practical experience for tracking parameters and artifacts related to managing machine learning models.
- Introduction to the AWS architecture used to generate the MLflow Tracking server
- Introduction to MLflow
- Interaction with MLflow from an instance in a private network that uses a load balancer
Note: Read the following before you begin!
An analytics team wants to start implementing an MLOps culture. Therefore, along with their technical leader, they are going to set up a tracking server using MLflow on AWS infrastructure, as per their organization's directives to use this cloud platform.
The technical leader of the analytics team also wants to consolidate, in one place, the experiments that team members have been running in Jupyter notebooks on their personal computers.
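As a rough illustration of the scenario above, the following sketch shows how a notebook on a team member's computer might send a run to a shared tracking server. It is not taken from the demo notebooks: the helper names `build_tracking_uri` and `log_demo_run`, the host name, and the experiment name are hypothetical, and the `mlflow` package is assumed to be installed (`pip install mlflow`) in the notebook environment.

```python
from typing import Dict, Optional


def build_tracking_uri(host: str, port: Optional[int] = None, scheme: str = "http") -> str:
    """Build the tracking server URL from a host name (hypothetical helper).

    `host` would typically be the DNS name of the load balancer in front of
    the MLflow server; the value used in the demo is not assumed here.
    """
    netloc = host if port is None else f"{host}:{port}"
    return f"{scheme}://{netloc}"


def log_demo_run(
    tracking_uri: str,
    experiment_name: str,
    params: Dict[str, str],
    metrics: Dict[str, float],
) -> str:
    """Log one run to the remote tracking server and return its run ID."""
    # Imported here so the URL helper above works even without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)   # point the client at the shared server
    mlflow.set_experiment(experiment_name)  # created on the server if it does not exist
    with mlflow.start_run() as run:
        mlflow.log_params(params)           # e.g. hyperparameters
        mlflow.log_metrics(metrics)         # e.g. evaluation scores
        return run.info.run_id


# Example call from a notebook cell (placeholder host and values):
# uri = build_tracking_uri("my-mlflow-alb.example.com")
# log_demo_run(uri, "demo-experiment", {"max_depth": "5"}, {"rmse": 0.42})
```

Keeping the server address in one helper means every notebook points at the same tracking server, so runs from different team members end up in the same place.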
In this Workshop/Demo, we will discuss: Using MLflow at low cost on an EC2 instance.
Goals | See Goals section above |
---|---|
What You'll Learn | Using the AWS services EC2 and S3 to deploy the open-source MLflow software |
What You'll Need | AWS account with free tier |
Duration | 1 hour |
Topics | EC2, EC2 systemctl usage, S3, MLflow |
Slides in Spanish | PowerPoint |
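To make the EC2 and S3 combination from the table more concrete, here is a minimal sketch of how the `mlflow server` command line could be assembled for an instance that stores artifacts in S3. The function name, the bucket name, and the SQLite backend store are assumptions made for illustration, not the demo's actual configuration. A command like this is what a systemd unit managed with `systemctl` would typically run.

```python
import shlex
from typing import List


def build_mlflow_server_command(
    artifact_bucket: str,
    backend_store_uri: str = "sqlite:///mlflow.db",  # assumption: local SQLite metadata store
    host: str = "0.0.0.0",                           # listen on all interfaces of the instance
    port: int = 5000,                                # MLflow's default port
) -> List[str]:
    """Return the argument list for starting an MLflow tracking server."""
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,
        "--default-artifact-root", f"s3://{artifact_bucket}",
        "--host", host,
        "--port", str(port),
    ]


# "my-mlflow-artifacts" is a placeholder bucket name.
command = build_mlflow_server_command("my-mlflow-artifacts")
print(shlex.join(command))
```

Building the command as a list keeps each flag and value separate, so it can be passed to `subprocess.run` directly or printed with `shlex.join` and pasted into a service definition.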
- An AWS account is required
- It is recommended to create a role from the root account that will be used to provision the AWS services for the demo, including:
  - VPCs
  - Security Group
  - EC2 Instance
  - S3 Buckets
  - Target Group
  - Load Balancer
- Pre-Workshop Checklist
- Introduction
- Part 1: VPC Setup
- Part 2.0: Role Configuration, EC2 Instance, and S3 Bucket Setup
- Part 2.1: EC2 Instance Configuration for MLflow
- Part 2.2: Elastic IP Configuration for Public EC2 Instance. Subsequently, Generating AMI for Transition to Private Network
- Part 3: Load Balancer Configuration to Establish MLflow Server Domain
- Part 4: Introduction to MLflow and Usage from Jupyter Notebooks
- Tips
Note: This is the repository structure from the root

README.md
> README with instructions for using this repository

aws_configuration
> Folder containing EC2 permission policies in JSON format

notebooks_demo
> Jupyter Notebooks containing use cases for the MLflow Tracking server deployed on AWS

data
> Files for use in the use cases within notebooks_demo

docs
> Supplementary information about the repository
Contributions to this repository are welcome! If you'd like to contribute, please follow these guidelines:
- Fork the Repository: Click the "Fork" button at the top right corner of this repository to create your copy.
- Make Changes: Create a new branch on your fork, make changes or additions to the materials, and commit your changes.
- Submit a Pull Request: Once you're satisfied with your changes, submit a pull request. Be sure to provide a clear and concise description of your changes.
- Review and Collaborate: Collaborators will review your pull request, provide feedback, and merge it into the main repository if everything looks good.
Please follow good coding practices, and ensure that your contributions align with the purpose of this repository.
This repository is licensed under the MIT License. Please review the license before using or contributing to this repository.
- Jeico Percy: Offering support and expertise in validating AWS best practices for this exercise.
- Ana Maria Lopez: Providing feedback on the material for this talk.
- Diego Marulanda: Attending my presentation and providing feedback.
- Juanita Herrera: Attending my presentation and helping achieve the best results.