Handle S3 connection failure #555

Open
wants to merge 2 commits into
base: master
from

Conversation

ramanathan106

.flush() in _EventLoggerThread creates a new connection each time it is called. If the connection fluctuates, boto3.client('s3', endpoint_url) throws an error. Because that error is not handled, the thread hangs, and once the queue is full the training hangs as well. The added try block prevents the thread from getting stuck: instead, it waits for the connection to come back. Because the retry runs in a while loop, training won't resume until the connection is re-established. The connection variable makes sure the message is printed only once.
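The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the diff from this PR: `flush_with_retry`, `flush_fn`, `retry_delay`, `sleep`, and `log` are hypothetical names, and catching a broad `Exception` is an assumption, since the description does not say which exception type is raised.

```python
import time


def flush_with_retry(flush_fn, retry_delay=1.0, sleep=time.sleep, log=print):
    """Call flush_fn until it succeeds; report only the first failure.

    Hypothetical helper: flush_fn stands in for the writer's flush(),
    which may raise while the S3 endpoint is unreachable.
    Returns the number of attempts that were needed.
    """
    connected = True  # flips to False after the first failed attempt
    attempts = 0
    while True:
        attempts += 1
        try:
            flush_fn()
            return attempts
        except Exception as exc:  # assumption: exact exception type unknown
            if connected:
                # Log once, not on every retry.
                log("S3 connection failed, waiting to retry: %s" % exc)
                connected = False
            sleep(retry_delay)
```

As in the description, the loop blocks the caller until a flush succeeds, so an outage pauses training rather than crashing the logger thread. A bounded retry count or a backoff would be a different design choice from the one described here.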

@codecov-io

codecov-io commented Feb 13, 2020

Codecov Report

Merging #555 into master will decrease coverage by 0.2%.
The diff coverage is 41.66%.


@@            Coverage Diff             @@
##           master     #555      +/-   ##
==========================================
- Coverage   88.68%   88.47%   -0.21%     
==========================================
  Files          38       38              
  Lines        2774     2785      +11     
==========================================
+ Hits         2460     2464       +4     
- Misses        314      321       +7

| Impacted Files | Coverage Δ | |
|---|---|---|
| tensorboardX/event_file_writer.py | 89.65% <41.66%> (-5.59%) | ⬇️ |


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d877dfe...cddad18. Read the comment docs.

This was referenced Oct 12, 2020

2 participants