Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Option to set the tracking URI for MLflowCallback. #29032

Merged
merged 4 commits into from
Feb 16, 2024
Merged

Feature: Option to set the tracking URI for MLflowCallback. #29032

merged 4 commits into from
Feb 16, 2024

Conversation

seanswyi
Copy link
Contributor

@seanswyi seanswyi commented Feb 15, 2024

What does this PR do?

Previously, the MLflowCallback was only able to set MLflow experiments or runs. This PR adds the option to also set the tracking URI.

Fixes #28961

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding! Changes LGTM

Third-party integrations are maintained by their original contributors, rather than the transformers team. So let's get a review from @noise-field

@seanswyi
Copy link
Contributor Author

Sounds good, thanks for the feedback @amyeroberts! I just made another minor change in the docstring and committed it: the default value of MLFLOW_TRACKING_URI should be an empty string rather than None.

@seanswyi
Copy link
Contributor Author

Is there any way to rerun tests? The failed tests_torch seems to be a timeout-related issue and wasn't there before my most recent commit. If passing all tests is absolutely necessary regardless of whether or not my code is related to it, then I feel like a simple rerun may solve this.

Some tests failed!

============================= FAILURES SHORT STACK =============================
_ LayoutLMTokenizationTest.test_add_tokens [type (microsoft/layoutlm-base-uncased)] _


self = <huggingface_hub.utils._http.UniqueRequestIdAdapter object at 0x7f75b9f8c730>
request = <PreparedRequest [GET]>, stream = True
timeout = Timeout(connect=10, read=10, total=None), verify = True, cert = None
proxies = OrderedDict([('no', '127.0.0.1,localhost,circleci-internal-outer-build-agent')])

        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
        ...
        except (_SSLError, _HTTPError) as e:
            if isinstance(e, _SSLError):
                # This branch is for urllib3 versions earlier than v1.22
                raise SSLError(e, request=request)
            elif isinstance(e, ReadTimeoutError):
>               raise ReadTimeout(e, request=request)
E               requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: e8e1debd-e8f7-45eb-b23d-fa63400a0c0b)')

../.pyenv/versions/3.8.12/lib/python3.8/site-packages/requests/adapters.py:532: ReadTimeout




Exited with code exit status 255

@amyeroberts
Copy link
Collaborator

@seanswyi I can re-run tests for you. I've just set tests_torch run off again

Copy link
Contributor

@muellerzr muellerzr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts
Copy link
Collaborator

@seanswyi As everything is passing and both @muellerzr and I approve, let's just merge and if @noise-field has any comments we can do a follow-up PR.

Thanks again for adding!

@amyeroberts amyeroberts merged commit 161fe42 into huggingface:main Feb 16, 2024
21 checks passed
@harupy
Copy link
Contributor

harupy commented Feb 19, 2024

I'm an MLflow maintainer. Found this PR while I'm investigating https://github.com/mlflow-automation/mlflow/actions/runs/7949278234/job/21702921711#step:15:2462. What if we run this code like this?

import mlflow
import os

assert "MLFLOW_TRACKING_URI" not in os.environ
mlflow.set_tracking_uri("sqlite:///my.db")

# train
...

# Since MLFLOW_TRACKING_URI is not set, the tracking URI is set to an empty string, which is undesired in this case.

@seanswyi
Copy link
Contributor Author

@harupy Could you elaborate a bit on why you think that would be better? The reason I wrote the initial PR the way I did is because to me it made sense to allow the user to keep MLFLOW_TRACKING_URI as an environment variable along with the other MLflow-related environment variables such as MLFLOW_RUN_NAME. I also left the default value as an empty string because that seemed to be consistent with MLflow's documentation as well (https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri). I feel like the snippet you provided takes away that control from the user, as it actually seems to be enforcing the user keep the tracking URI to a hardcoded value of "sqlite:///my.db".

Please let me know if I'm misunderstanding or missing something. I haven't been able to fully look into the failed tests you provided but it seems like there's an issue between this PR and the autologging module?

@harupy
Copy link
Contributor

harupy commented Feb 19, 2024

@seanswyi

I feel like the snippet you provided takes away that control from the user, as it actually seems to be enforcing the user keep the tracking URI to a hardcoded value of "sqlite:///my.db".

True, but there may be users who use the code like my snippet (we do something similar in our tests, no MLFLOW_TRACKING_URI environment variable, just call mlflow.set_tracking_uri with a temporary sqlite file). Can we call set_tracking_uri only when MLFLOW_TRACKING_URI exists?

@seanswyi
Copy link
Contributor Author

@harupy If you believe that only calling the function when the environment variable exists is better then how about changing the current code to something like this?

if "MLFLOW_TRACKING_URI" in os.environ:
    self._tracking_uri = os.environ["MLFLOW_TRACKING_URI"]
    logger.debug(f"MLflow tracking URI is set to {self._tracking_uri}")
    self._ml_flow.set_tracking_uri(self._tracking_uri)
else:
    logger.debug(f"Environment variable `MLFLOW_TRACKING_URI` is not provided and therefore will not be explicitly set.")

Just curious, is there any reason why a temporary SQLite DB is used? It seems like the documentation would suggest that setting a tracking URI isn't necessary and that the data will just be stored locally at ./mlruns.

@harupy
Copy link
Contributor

harupy commented Feb 19, 2024

Yeah that looks good to me.

is there any reason why a temporary SQLite DB is used?

  • To clean up runs and experiments after each test invocation.
  • To run tests faster.

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Feb 19, 2024
…ace#29032)

* Added option to set tracking URI for MLflowCallback.

* Added option to set tracking URI for MLflowCallback.

* Changed  to  in docstring.
@njbrake
Copy link
Contributor

njbrake commented Feb 26, 2024

Hi! I'm here because this PR broke my code :) I think there might be a compatibility break here?

In my code I call mlflow.set_tracking_uri(my_mlflow_tracking_uri) , but I never set the MLFLOW_TRACKING_URI env var. Is there a way to change this logic so that it checks if the tracking_uri was already set in the mlflow library? In my case it appears that the code

        self._tracking_uri = os.getenv("MLFLOW_TRACKING_URI", "")
        self._experiment_name = os.getenv("MLFLOW_EXPERIMENT_NAME", None)
        self._flatten_params = os.getenv("MLFLOW_FLATTEN_PARAMS", "FALSE").upper() in ENV_VARS_TRUE_VALUES
        self._run_id = os.getenv("MLFLOW_RUN_ID", None)
        logger.debug(
            f"MLflow experiment_name={self._experiment_name}, run_name={args.run_name}, nested={self._nested_run},"
            f" tags={self._nested_run}"
            f" tags={self._nested_run}, tracking_uri={self._tracking_uri}"
        )

means that the tracking_uri is getting overriden to be "" because I didn't set the env var.

I can change my implementation to be compatible with the change that was merged, but others may have similar issues if their implementation was similar to mine.

@seanswyi
Copy link
Contributor Author

Hi @njbrake! Thanks for the comment, that's something that I missed earlier. The MLflow API seems to provide a mlflow.get_tracking_uri() and mlflow.is_tracking_uri_set() function. It would probably be better to check before initializing self._tracking_uri.

Maybe I could incorporate this in the other PR (#29096) that addresses @harupy's previous comment as well? (cc. @amyeroberts)

@amyeroberts
Copy link
Collaborator

@seanswyi Yes please!

harupy added a commit to harupy/mlflow that referenced this pull request Mar 14, 2024
harupy added a commit to mlflow/mlflow that referenced this pull request Mar 14, 2024
chenmoneygithub pushed a commit to chenmoneygithub/mlflow that referenced this pull request Mar 18, 2024
artjen pushed a commit to artjen/mlflow that referenced this pull request Mar 26, 2024
itazap pushed a commit that referenced this pull request May 14, 2024
* Added option to set tracking URI for MLflowCallback.

* Added option to set tracking URI for MLflowCallback.

* Changed  to  in docstring.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to set the tracking URI for MLflowCallback.
6 participants