add S3 data to existing package #2180
Codecov Report
@@ Coverage Diff @@
## master #2180 +/- ##
==========================================
+ Coverage 47.42% 47.96% +0.54%
==========================================
Files 441 441
Lines 21138 21357 +219
Branches 2436 2436
==========================================
+ Hits 10024 10244 +220
+ Misses 10206 10205 -1
Partials 908 908
@@ -120,11 +127,51 @@
}

PACKAGE_CREATE_SCHEMA = {
I feel like I've seen this before. Does it belong in lambda_shared?
You could have seen that in the registry 🙂,
def validator(data):
    ex = next(iter_errors(data), None)
    if ex is not None:
        raise ApiException(HTTPStatus.BAD_REQUEST, ex.message)
More useful if we send all errors back. This only sends the first one back.
Yes, but that's how it currently works and there is no reason to change it right now.
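For reference, the difference between the two behaviours discussed here can be sketched as follows. This is only an illustration, not the PR's code: `make_validator`, `report_all`, `ToyError` and `toy_iter_errors` are hypothetical names, `ApiException` is a stand-in for the class used in the diff, and `iter_errors` is assumed to be any callable that yields error objects with a `.message` attribute.

```python
from http import HTTPStatus


class ApiException(Exception):
    """Hypothetical stand-in for the ApiException used in the diff."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


def make_validator(iter_errors, report_all=False):
    """Build a validator from a callable yielding objects with `.message`."""

    def validator(data):
        errors = iter(iter_errors(data))
        if report_all:
            # Reviewer's suggestion: collect every error message.
            messages = [ex.message for ex in errors]
        else:
            # Current behaviour: stop at the first error.
            first = next(errors, None)
            messages = [] if first is None else [first.message]
        if messages:
            raise ApiException(HTTPStatus.BAD_REQUEST, "; ".join(messages))

    return validator


class ToyError:
    """Toy error object, only for this illustration."""

    def __init__(self, message):
        self.message = message


def toy_iter_errors(data):
    """Toy checker: yields one error per missing required key."""
    for key in ("name", "registry"):
        if key not in data:
            yield ToyError(f"{key!r} is a required property")
```

With `report_all=True` an empty payload produces a single 400 response listing both missing keys, whereas the default reports only the first.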
def inner(f):
    @functools.wraps(f)
    def wrapper(request):
        version_id = request.data
Is version_id really the only thing in the data string? What about unversioned buckets?
def large_request_handler(request_type):
    user_request_key = f'user-requests/{request_type}'
I don't understand how this is thread safe at all. What happens if multiple users issue concurrent requests of the same type? Won't they be reading/deleting the same file in the TemporaryFile loop below?
Users write requests to the same key, but they write to a versioned bucket that we create via CloudFormation, so each request gets its own ID (the version ID of the object). A single Lambda instance processes only one request at a time.
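The idea that a shared key plus a per-write version ID keeps concurrent requests apart can be illustrated with a small in-memory model. This is a sketch under assumptions, not the real S3 API and not the PR's code: `ToyVersionedBucket` and `handle_request` are hypothetical, and real S3 version IDs are opaque strings rather than counters.

```python
import itertools
from collections import defaultdict


class ToyVersionedBucket:
    """In-memory stand-in for a versioned bucket (not the real S3 API)."""

    def __init__(self):
        self._objects = defaultdict(dict)
        self._counter = itertools.count(1)

    def put(self, key, body):
        # Every write to the same key gets a fresh version ID.
        version_id = f"v{next(self._counter)}"
        self._objects[key][version_id] = body
        return version_id

    def get(self, key, version_id):
        return self._objects[key][version_id]

    def delete(self, key, version_id):
        # Deletes only the given version; other versions are untouched.
        del self._objects[key][version_id]


def handle_request(bucket, request_type, version_id):
    """Read and delete exactly one version of the shared request key."""
    key = f"user-requests/{request_type}"
    body = bucket.get(key, version_id)
    bucket.delete(key, version_id)
    return body
```

Two users writing to `user-requests/create-package` (a made-up request type) receive different version IDs, so handling one request neither reads nor deletes the other's payload.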
def create_package(request):
    json_iterator = map(json.JSONDecoder().decode, (line.decode() for line in request.stream))

    data = next(json_iterator)
I'd suggest calling this first_line or something because the first line is special. data makes more sense if you're going to iterate with it.
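A minimal sketch of that naming, assuming the request body is newline-delimited JSON whose first line is a header and whose remaining lines are entries. `split_header_and_entries` and the field names in the usage note are hypothetical and not taken from the PR.

```python
import io
import json


def split_header_and_entries(stream):
    """Return (first_line, entries) from a JSON-lines byte stream.

    The first line is parsed eagerly because it is special; the
    remaining lines are parsed lazily as they are iterated.
    """
    lines = (json.loads(line) for line in stream if line.strip())
    first_line = next(lines, None)
    if first_line is None:
        raise ValueError("request body is empty")
    return first_line, lines
```

For example, `split_header_and_entries(io.BytesIO(b'{"name": "user/pkg"}\n{"logical_key": "a.csv"}\n'))` returns the header dictionary and a generator over the single entry.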
@@ -15,6 +15,8 @@

PACKAGE_INDEX_SUFFIX = "_packages"

LAMBDA_TMP_SPACE = 512 * 2 ** 20
AWS says 512 MB not 512 MiB
Suggested change:
- LAMBDA_TMP_SPACE = 512 * 2 ** 20
+ LAMBDA_TMP_SPACE = 512_000_000
I tried to run shutil.disk_usage('/tmp/') and got this:
usage(total=551346176, used=892928, free=538333184)
In [23]: 538333184 / 2 ** 20
Out[23]: 513.39453125
I think it might make sense to drop that check and simply try to preallocate space using os.posix_fallocate() to fail early.
os.posix_fallocate() fails with PermissionError in Lambda 😞.
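For context on the two candidate constants and on a check that does not need preallocation, here is a rough sketch. `has_free_space` is a hypothetical helper, not something from the PR, and a free-space check like this is only advisory because other writes can consume the space afterwards.

```python
import shutil
import tempfile

MB = 10 ** 6   # decimal megabyte
MiB = 2 ** 20  # binary mebibyte

# The two values discussed in this thread differ by roughly 24.9 MB.
LIMIT_MIB = 512 * MiB  # 536_870_912 bytes
LIMIT_MB = 512 * MB    # 512_000_000 bytes


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` reports enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    tmp = tempfile.gettempdir()
    print(LIMIT_MIB - LIMIT_MB, has_free_space(tmp, LIMIT_MB))
```

The free value quoted above (538333184 bytes) is above both limits, so either constant would have passed such a check on that instance.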
Co-authored-by: Aneesh Karve <[email protected]>
* master: (54 commits)
  Use stable nginx version for catalog image (#2182)
  Ability to add S3 folders / files to package (#2171)
  lambda for adding S3 data to existing package (#2180)
  use github tarball for faster installation (#2181)
  Bump py from 1.7.0 to 1.10.0 in /lambdas/es/indexer (#2176)
  Bump py from 1.8.0 to 1.10.0 in /lambdas/s3select (#2177)
  Bump py from 1.8.0 to 1.10.0 in /lambdas/thumbnail (#2178)
  Allow unicode characters for package routes by allowing any character (#2179)
  Additional NotFoundPage scoped to Bucket (#2175)
  Docs: fix catalog config path (#2168)
  rework pkgpush auth (#2170)
  Use AWS credentials for directory package and copy package submit (#2172)
  Document package push limitations in catalog [ci skip] (#2161)
  Preview warnings accordion (#2167)
  tweak warning text (#2169)
  Copy tweaks (#2164)
  Don't crash pkgselect for empty manifests (#2147)
  add codecov config (#2155)
  Simplify warning messages for package name (#2134)
  Move package API requests to one file, consolidate naming and internal API (#2154)
  ...
Description
See #2171 for frontend part.
TODO