deterministic and related flags don't guarantee same result on different machines #6683
Thanks for using LightGBM. This is similar to many other discussions here; you might find some of those useful.
We know this is an area of confusion with LightGBM. I have some ideas to improve that but haven't put them into writing yet; apologies. I'll try to help you here.
Those are not sufficient to make the training output deterministic. At a minimum, you should also set a seed (the example below uses the `seed` parameter). Here's a minimal, reproducible example that produces identical results for me (Python 3.10):

```python
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000, n_features=5, n_informative=5, random_state=123)

params = {
    "deterministic": True,
    "force_row_wise": True,
    "n_jobs": 1,
    "n_estimators": 10,
    "seed": 708,
}

mod1 = lgb.LGBMRegressor(**params).fit(X, y)
mod1_str = mod1.booster_.model_to_string()

mod2 = lgb.LGBMRegressor(**params).fit(X, y)
mod2_str = mod2.booster_.model_to_string()

mod3 = lgb.LGBMRegressor(**params).fit(X, y)
mod3_str = mod3.booster_.model_to_string()

assert mod1_str == mod2_str
assert mod2_str == mod3_str
```

If you can modify that in a way that still shows some non-determinism, we'd be happy to investigate further.
If the answer to any of those is "no", that could explain why you're seeing different results.
It's possible. There are multiple ways, but in general they could be summarized as "numerical precision". Feature values that are different before standardization could be identical once forced into the [-1, 1] range, which reduces the cardinality of the feature. Even if that cardinality isn't changed, using very, very small floating point numbers can lead to non-deterministic results in any multi-threaded code paths that involve operations like multiplication and division. I know that you said you're using `num_threads=1` for LightGBM, but the transformations upstream of it may still run multi-threaded code.
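To illustrate the "numerical precision" point: floating-point addition is not associative, so any computation whose accumulation order can differ between machines (e.g. parallel reductions, or differently-vectorized math libraries) need not produce bit-identical sums. A minimal, self-contained demonstration (not LightGBM-specific):

```python
# Floating-point addition is order-dependent: summing the same three
# values in two different orders gives two different results.
vals = [1e20, 1.0, -1e20]

# Left-to-right: 1.0 is absorbed into 1e20 (below its ulp), then cancelled.
left_to_right = (vals[0] + vals[1]) + vals[2]

# Reordered: the large values cancel first, so 1.0 survives.
reordered = (vals[0] + vals[2]) + vals[1]

print(left_to_right)  # 0.0
print(reordered)      # 1.0
```

Cross-machine differences in `.predict()` outputs usually come from exactly this kind of order- or library-dependent rounding, not from anything stochastic.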
James, thank you for your detailed response.
Apologies for this omission in my original post. I also set `random_state` in these runs; I simply forgot to include it, as it was a given in my mind.
This is currently the reason that seems most likely to me. I've been experimenting with a "rounding" transformer which rounds the features after the cyclical transformation. My first test with it gave me identical results on both machines I'm testing on. We'll see if I get the same results in the CI/CD pipeline run by GitHub. I'll post an update here once I learn more.
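For concreteness, the rounding step I'm experimenting with looks roughly like the sketch below (the `RoundingTransformer` name and the `decimals` default are illustrative, not my actual pipeline code): rounding collapses tiny cross-platform differences in sine/cosine outputs to the same value before LightGBM bins the feature.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RoundingTransformer(BaseEstimator, TransformerMixin):
    """Round features to a fixed number of decimals so that tiny
    cross-platform floating-point differences collapse to identical
    values before they reach the model."""

    def __init__(self, decimals=6):
        self.decimals = decimals

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn.
        return self

    def transform(self, X):
        return np.round(np.asarray(X, dtype=np.float64), self.decimals)
```

Placed between the cyclical transformation and the `LGBMRegressor` in the pipeline, this makes the model's inputs bit-identical across machines as long as the pre-rounding values differ by less than half a unit in the last rounded decimal.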
edit: there was an important typo in my code sample above. I've corrected it and re-run, and confirmed it still produces identical results across multiple runs.
Ok, sounds good.
I'm using the `LGBMRegressor` via the scikit-learn API. I'm having issues in that some models give me different results when calling `.predict()` in my Docker environment on my local Mac machine versus the same Docker environment on an AWS EC2 instance. This is despite the model using `deterministic=True`, `force_row_wise=True`, and `num_threads=1`.

First question: is it expected that, even with these flags set, results might differ across machines? Under the deterministic section of the docs, I see the following bullet point:
This makes it seem like maybe this is expected behavior, although I had hoped that running in a Docker environment would make it reproducible. The problem, of course, is that as I write tests for my code base, I can't guarantee that tests which pass locally will also pass in CI/CD. If this is expected behavior, how are people including LightGBM code in test suites that don't run on the same hardware?
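For context, my tests currently compare predictions exactly. The kind of workaround I'm considering is a tolerance-based comparison instead, sketched below (the helper name is illustrative; it just wraps `numpy.testing.assert_allclose`):

```python
import numpy as np


def assert_predictions_close(expected, actual, rtol=1e-6, atol=1e-9):
    # Compare model outputs up to a relative/absolute tolerance rather
    # than bit-for-bit, so the test survives small cross-platform
    # floating-point drift while still catching real regressions.
    np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
```

This passes when predictions differ only in the last few bits, and still fails loudly on any meaningful change.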
If this is not expected behavior, is there a data or model setup that might not be covered by setting these flags? Before the LGBMRegressor, I have a pipeline that applies various data transformations. Purely by guessing and checking, I found that removing a `CyclicalFeatures` transformation (https://feature-engine.trainindata.com/en/1.7.x/api_doc/creation/CyclicalFeatures.html#feature_engine.creation.CyclicalFeatures) from the pipeline gave me reproducible results between my local machine and the EC2 box. This transformation isn't doing anything stochastic; it simply maps a feature into sine and cosine representations. Is there a reason why mapping a feature to the -1 to 1 range would introduce non-deterministic behavior?

I have a minimal example which includes data, a saved pipeline, and a driver script. If useful, I could relabel the data to remove any sensitive information and provide it, along with a minimal working Docker environment, but I wanted to ask the above questions first.
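For reference, the sine/cosine encoding that `CyclicalFeatures` applies can be sketched as follows (my own simplified version; feature_engine's actual implementation may differ in details):

```python
import numpy as np


def cyclical_encode(x, period):
    # Map a cyclical feature (e.g. hour of day with period=24) onto the
    # unit circle, so values near the period boundary (23 and 0) end up
    # close together in the (sin, cos) representation.
    angle = 2.0 * np.pi * np.asarray(x, dtype=np.float64) / period
    return np.sin(angle), np.cos(angle)
```

Nothing here is stochastic, but `np.sin`/`np.cos` ultimately call the platform's libm, whose last-bit rounding can differ between architectures, which is why the outputs may not be bit-identical across machines.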
Thanks.