add exception handling and retry on loading train batch #194

Merged: 9 commits, Mar 23, 2020

Conversation

@AnnaKwa AnnaKwa commented Mar 19, 2020

This allows model training to proceed by skipping a batch when loading it runs into data quality or read issues. It also saves the actual number of batches used in the final copy of the config.
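As a rough illustration of the behavior described above (not the code in this PR), the sketch below skips batches that fail to load and records how many were actually used. The names `load_batches`, `batch_loaders`, and `num_batches_used` are hypothetical, and the config is assumed to be a plain dict.

```python
import logging

logger = logging.getLogger(__name__)


def load_batches(batch_loaders, config):
    """Load each batch, skipping any that fail, and record how many were used.

    `batch_loaders` is a hypothetical list of zero-argument callables that
    each return one training batch; `config` is a plain dict standing in for
    the training configuration.
    """
    batches = []
    for i, load in enumerate(batch_loaders):
        try:
            batches.append(load())
        except (ValueError, OSError) as err:
            # Data quality or read problem: log it and move on to the next batch.
            logger.warning("Skipping batch %d: %s", i, err)
    # Store the number of batches actually used alongside the original settings.
    used_config = dict(config, num_batches_used=len(batches))
    return batches, used_config


def _bad_batch():
    # Stand-in for a batch that fails a data quality check.
    raise ValueError("NaN found in training data")


batches, used_config = load_batches(
    [lambda: [1.0, 2.0], _bad_batch, lambda: [3.0, 4.0]],
    {"num_batches": 3},
)
```

Here the second loader fails, so two of the three requested batches are kept and the saved config copy reflects that count.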

@oliverwm1 oliverwm1 left a comment

Looks good, thanks for adding the logging statements!

I found the exception handling in `_create_training_batch_with_retries` a bit hard to follow. What do you think about just retrying no matter what the exact `ValueError` is?
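A minimal sketch of that simpler approach, retrying on any `ValueError` without inspecting its message. This is an assumption about what the suggestion could look like, not the function in the PR; `create_batch_with_retries`, `create_batch`, and `max_retries` are hypothetical names.

```python
import logging

logger = logging.getLogger(__name__)


def create_batch_with_retries(create_batch, max_retries=3):
    """Call `create_batch`, retrying on any ValueError up to `max_retries` times."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return create_batch()
        except ValueError as err:
            # Retry regardless of which ValueError was raised.
            last_error = err
            logger.warning("Attempt %d of %d failed: %s", attempt, max_retries, err)
    # All attempts failed: surface the last error as the cause.
    raise ValueError(f"Batch creation failed after {max_retries} attempts") from last_error


calls = {"n": 0}


def flaky():
    # Hypothetical loader that fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("bad read")
    return "batch"


result = create_batch_with_retries(flaky, max_retries=3)
```

Catching the exception type alone keeps the control flow short, at the cost of also retrying errors that will never succeed on a second attempt.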

fv3net/regression/sklearn/train.py (outdated, resolved)
fv3net/regression/dataset_handler.py (resolved)

AnnaKwa commented Mar 20, 2020

Thanks for the review, @oliverwm1. I left a comment about the retry exception; let me know what you think. Ready for re-review.

@AnnaKwa AnnaKwa requested a review from oliverwm1 March 20, 2020 23:10

@oliverwm1 oliverwm1 left a comment

LGTM!

@AnnaKwa AnnaKwa merged commit db72139 into master Mar 23, 2020
@AnnaKwa AnnaKwa deleted the fix/ml_training_read_errors branch March 23, 2020 18:18
spencerkclark pushed a commit that referenced this pull request May 7, 2021