✨ Build MLflow Tracking Server for MLOps Discovery #4275
Comments
Hi team, you might not be able to answer this right away, but for our own MLOps work and planning it would be really good to know the timescales over which this might be deliverable. Even an indication of when you could start to explore it, whether that's days/weeks/months away, would help. Thank you.
Hi @Ed-Bajo could we set some timescales for this? I'm working with the Probation and Electronic Monitoring team and we'd like it to be available for testing and use soon. Is the end of June a feasible timescale to deliver to? Thanks. Michael
This feature would be extremely useful for the BOLD AI for Linked Data team. We currently have no good way to track ML experiments, and this would be a great step towards industry best practice. We'd like to see it as soon as possible, as we are a time-limited programme.
10/06/24 summary:

11/06/24 summary:

12/06/24 summary:
Moving to blocked while the way forward is discussed with Analytical Platform Product Management
Notes:
Solution one: users set their own artifact location when creating an experiment

One solution is that users define their own artifact location in MLflow at the create-experiment level (https://mlflow.org/docs/latest/rest-api.html#create-experiment), meaning they can direct artifacts to be stored in their own buckets anyway. However, I'm not sure how this works with access between MLflow and that bucket; I will test this with the running server and see what error it gives.

Solution two: a wrapper and the AP Control Panel can be used to create experiments and assign S3 permissions

There seems to be some circularity brewing with the process, in that:

If step 1 can somehow be done using their alpha user name, then we need a way of making sure that when they make a new experiment, it is linked back to their alpha user name for the S3 permissions. A solution might be to force users to create experiments through the AP Control Panel, which would call the MLflow API (https://mlflow.org/docs/latest/rest-api.html#create-experiment) on their behalf, instead of them creating experiments through code or the UI (although I'm not sure how we can really prevent this), since the API wrapper could then also set the S3 permissions at the artifact level at the same time. A sketch of the create-experiment call follows below.
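For reference, a minimal sketch of what solution one looks like from a user's point of view, using the standard MLflow Python client. The tracking URI, experiment name, and bucket below are placeholders, not real AP resources:

```python
import mlflow

# Placeholder tracking server URL (an assumption, not the real AP endpoint)
mlflow.set_tracking_uri("https://<mlflow-tracking-server>")

# artifact_location is part of the standard create-experiment API: artifacts
# for runs in this experiment are written under this prefix, so whichever
# principal writes artifacts (client or server, depending on setup) needs
# write access to that bucket.
experiment_id = mlflow.create_experiment(
    name="my-team-experiment",                                # illustrative name
    artifact_location="s3://alpha-my-team-bucket/mlflow",     # illustrative bucket
)
```

This is exactly the access question raised above: the experiment can point anywhere, but the S3 permissions still have to be granted to something.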
@gfowler-moj is going to put a session in to review the way forward around authentication/permissions management
Outcome of meeting:
I have created a group in Control Panel (analytical-platform-mlflow-admins), added @mshodge and @PriyaBasker23, and created 3 artefact buckets:
TODO:
alpha-analytical-platform-mlflow-development updated with the below JSON:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::alpha-analytical-platform-mlflow-development",
        "arn:aws:s3:::alpha-analytical-platform-mlflow-development/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    },
    {
      "Sid": "AllowAnalyticalPlatformMLflow",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::381491960855:role/mlflow20240610161705974000000002"
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::alpha-analytical-platform-mlflow-development",
        "arn:aws:s3:::alpha-analytical-platform-mlflow-development/*"
      ]
    }
  ]
}
```

MLflow is running again, but needs testing.
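For reference, a hedged sketch of how a policy like the one above could be applied programmatically with boto3. It assumes the JSON is saved locally as policy.json and that the caller has s3:PutBucketPolicy on the bucket; this is not the actual deployment code:

```python
import json
import boto3

BUCKET = "alpha-analytical-platform-mlflow-development"

# Load the bucket policy shown above from a local file (assumed filename)
with open("policy.json") as f:
    policy = json.load(f)

# put_bucket_policy expects the policy document as a JSON string
s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```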
MLflow deployed to APC; follow-on FR raised to create a role for mutating permissions: #4593
Describe the feature request.
Implement a fully managed MLflow tracking server on the AWS platform to help in the discovery of machine learning operations (MLOps) within MOJ.
Details:
Backend Store: Utilise Amazon RDS to store MLflow metadata and logs securely.
Artifact Storage: Use Amazon S3 for storage of machine learning models and artifacts.
Tracking Server: Deploy an EC2 instance or Docker container to host the MLflow tracking server, enabling remote access.
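For context, a minimal sketch of the standard MLflow CLI launch for this architecture (RDS backend store plus S3 artifact root). The RDS endpoint, credentials, and bucket are placeholders, not real deployment values:

```
mlflow server \
  --backend-store-uri postgresql://mlflow:<password>@<rds-endpoint>:5432/mlflow \
  --default-artifact-root s3://<artifact-bucket>/mlflow \
  --host 0.0.0.0 \
  --port 5000
```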
Other Requirements
Detailed information is available at https://github.com/moj-analytical-services/mlops/blob/main/docs/mlflow/mlflow_tracking_server.md
Describe the context
MLflow Tracking is a component of the MLflow platform that enables data scientists and machine learning engineers to track and log experiments during the model development process. With MLflow Tracking, users can easily record parameters, metrics, and output files from their machine learning experiments, making it easier to organize and compare different approaches. It provides a centralized location to store experiment results, allowing for efficient collaboration and reproducibility. MLflow Tracking also offers a user-friendly interface for visualizing experiment results, enabling users to gain insights into model performance and make informed decisions about model improvements.
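To make that concrete, here is a small illustrative example of the client-side workflow described above, using the standard MLflow Python API. The experiment name, parameter, and metric values are made up, and the tracking URI is a placeholder:

```python
import mlflow

mlflow.set_tracking_uri("https://<mlflow-tracking-server>")  # placeholder URL
mlflow.set_experiment("demo-experiment")                     # illustrative experiment

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)  # record an experiment parameter
    mlflow.log_metric("rmse", 0.78)          # record a result metric
    mlflow.log_artifact("model.pkl")         # upload an output file to the artifact
                                             # store (assumes model.pkl exists locally)
```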
Value / Purpose
This configuration will enable data scientists to centralise their experimental data, streamlining access to experiments for all team members. It will also make it easier for data scientists to integrate and test MLflow from their existing projects within Visual Studio, using the Analytical Platform.
User Types
Data Scientist