Merge pull request #87 from coralnet/s3-transfer-no-threads
S3 downloads are now always performed in the main thread
StephenChan authored Jan 23, 2024
2 parents 1f48c31 + 1a8b912 commit cb0b427
Showing 2 changed files with 9 additions and 1 deletion.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,8 @@

- `ImageFeatures` with `valid_rowcol=False` are no longer supported for training. For now they are still supported for classification.

+- S3 downloads are now always performed in the main thread, to prevent `RuntimeError: cannot schedule new futures after interpreter shutdown`.
+
## 0.7.0

- `TrainClassifierMsg` labels arguments have changed. Instead of `train_labels` and `val_labels`, it now takes a single argument `labels`, which is a `TrainingTaskLabels` object (basically a set of 3 `ImageLabels` objects: training set, reference set, and validation set).
8 changes: 7 additions & 1 deletion spacer/storage.py
@@ -15,6 +15,7 @@
 import urllib.request

 import botocore.exceptions
+from boto3.s3.transfer import TransferConfig
 from PIL import Image
 from sklearn.calibration import CalibratedClassifierCV
 from sklearn.linear_model import SGDClassifier
@@ -99,6 +100,10 @@ class S3Storage(Storage):

     def __init__(self, bucketname: str):
         self.bucketname = bucketname
+        # Prevent `RuntimeError: cannot schedule new futures after
+        # interpreter shutdown`.
+        # Based on https://github.com/etianen/django-s3-storage/pull/136
+        self.transfer_config = TransferConfig(use_threads=False)

     def store(self, key: str, stream: BytesIO):
         s3 = config.get_s3_conn()
@@ -107,7 +112,8 @@ def store(self, key: str, stream: BytesIO):
     def load(self, key: str):
         s3 = config.get_s3_conn()
         stream = BytesIO()
-        s3.Object(self.bucketname, key).download_fileobj(stream)
+        s3.Object(self.bucketname, key).download_fileobj(
+            stream, Config=self.transfer_config)
         return stream

     def delete(self, key: str) -> None:
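
Note on the change: `TransferConfig(use_threads=False)` tells boto3 to run all transfer logic in the calling thread instead of handing it to a thread pool, so `ThreadPoolExecutor.submit()` is never called during interpreter shutdown. The stdlib-only sketch below illustrates that difference. `InlineExecutor` and `current_thread_name` are hypothetical names made up for this illustration; this is not the code boto3 or s3transfer actually uses.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class InlineExecutor:
    """Hypothetical stand-in for a non-threaded executor.

    submit() runs the task immediately in the calling thread and
    returns an already-completed Future, so no pool is involved.
    """

    def submit(self, fn, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def current_thread_name():
    """Report which thread the task actually ran on."""
    return threading.current_thread().name


# Threaded: the task is scheduled onto a pool worker thread.
with ThreadPoolExecutor(max_workers=1, thread_name_prefix="worker") as pool:
    threaded_name = pool.submit(current_thread_name).result()

# Inline: the task runs in whichever thread called submit().
inline_name = InlineExecutor().submit(current_thread_name).result()
caller_name = threading.current_thread().name

print("threaded task ran on:", threaded_name)
print("inline task ran on:", inline_name)
```

The trade-off is that a single-threaded download cannot fetch parts of a large object concurrently, which this commit accepts in exchange for avoiding the shutdown error.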