s3sync fails with few files saying NoSuchKey: The specified key does not exist. #26

Open
prudhvigodithi opened this issue Nov 1, 2019 · 13 comments
Labels
bug Something isn't working wontfix This will not be worked on

Comments

@prudhvigodithi

prudhvigodithi commented Nov 1, 2019

s3sync works for some S3 files, but for others it fails with NoSuchKey: The specified key does not exist. Is there something to fix in the S3 permissions, or am I missing an s3sync option?

error log:
ERRO[0000] Sync error: pipeline step: 1 (LoadObjData) failed with error: object: content/directpath/health-check/abd/details/js.properties sync error: NoSuchKey: The specified key does not exist.
status code: 404, request id: xxxxxxxxxxxxxx, host id: v5dwefwefwefmkwmfkwa98y79790978ZKpjUNCAkKhE+8697fqbdqtdgbxbx0=, terminating

@prudhvigodithi prudhvigodithi changed the title s3sync to work with host IAM role. s3sync fails with few files saying NoSuchKey: The specified key does not exist. Nov 1, 2019
@larrabee
Owner

larrabee commented Nov 2, 2019

Hello.
Please check the permissions of the file (content/directpath/health-check/abd/details/js.properties) in the S3 Console. I think your user has no access to this file.
You can also try to download this file via s3cmd.

Another possibility is that the file was deleted after it was listed.
You can skip such files with the option --on-fail skipmissing
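
A minimal sketch of what a per-object "skip missing" policy means, assuming a generic fetch function. This is not s3sync's implementation: `NoSuchKey`, `sync_objects`, `fetch`, the in-memory `store`, and the `"fatal"` default are hypothetical names used only for illustration. Only the `skipmissing` value comes from the option mentioned above.

```python
class NoSuchKey(Exception):
    """Stand-in for the 404 NoSuchKey error an S3 client would raise."""


def sync_objects(keys, fetch, on_fail="fatal"):
    """Fetch each listed key; skip or abort when a key no longer exists.

    on_fail="fatal" (hypothetical default) re-raises the first NoSuchKey.
    on_fail="skipmissing" records the key as skipped and continues.
    """
    copied, skipped = [], []
    for key in keys:
        try:
            fetch(key)
        except NoSuchKey:
            if on_fail == "skipmissing":
                skipped.append(key)
                continue
            raise
        copied.append(key)
    return copied, skipped


# In-memory stand-in for a bucket: "b.bytes" was listed but is gone
# by the time it is fetched.
store = {"a.properties": b"x", "c.bytes": b""}


def fetch(key):
    try:
        return store[key]
    except KeyError:
        raise NoSuchKey(key)


copied, skipped = sync_objects(
    ["a.properties", "b.bytes", "c.bytes"], fetch, on_fail="skipmissing"
)
print("copied:", copied)
print("skipped:", skipped)
```

With a policy like this, only keys that are really missing should be skipped. If every object under a prefix is skipped, the requested key names themselves are worth checking.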

@prudhvigodithi
Author

prudhvigodithi commented Nov 2, 2019

Hey, thanks for getting back to me. I tried the skipmissing option, but it skips the entire content of that folder. I checked the permissions from the S3 console and see read object: yes, read object permission: yes, and write object permission: yes. I also tried the plain aws s3 sync command, which works and syncs to the local directory.

WARN[0001] Skip missing object: content/00d9ffsdfsdfmkwckscwd16/15719586868768.bytes
WARN[0001] Skip missing object: content/00a329qqhqwdqdqkdkqdiqd2/1mcwkmfwkm2135.properties

Testing with s4cmd works fine.

With s3sync, I added the following options:

--s3-retry 3 --s3-acl private -p -w 128 -f skipmissing --debug

and got the following error:

DEBU[0000] Pipeline step: ListSource finished
DEBU[0000] S3 obj content downloading request failed with error: NoSuchKey: The specified key does not exist.
status code: 404, request id: XXXXXXX, host id: XXXXXXX

However, in the same S3 bucket, syncing another folder works:

INFO[0000] Starting sync
DEBU[0000] Listing bucket finished
DEBU[0000] Pipeline step: ListSource finished
DEBU[0000] Pipeline step: LoadObjData finished
DEBU[0000] Pipeline step: ACLUpdater finished
DEBU[0000] Pipeline step: UploadObj finished
DEBU[0000] Pipeline step: Terminator finished
DEBU[0000] All pipeline steps finished
DEBU[0000] Pipeline terminated
INFO[0000] 0 ListSource: Input: 0; Output: 3 (14 obj/sec); Errors: 0
INFO[0000] 1 LoadObjData: Input: 3; Output: 3 (14 obj/sec); Errors: 0
INFO[0000] 2 ACLUpdater: Input: 3; Output: 3 (14 obj/sec); Errors: 0
INFO[0000] 3 UploadObj: Input: 3; Output: 3 (14 obj/sec); Errors: 0
INFO[0000] 4 Terminator: Input: 3; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] Duration: 218.480329ms
INFO[0000] Sync Done

s3sync -version

VersionId: 2.9, commit: 3f7a732, built at: 2019-10-01T08:00:37Z

@prudhvigodithi
Author

prudhvigodithi commented Nov 4, 2019

Hey, this looks close to this issue:
#24

s3cmd info gives the following output:
ERROR: S3 error: 404 (Not Found)

However s3cmd sync works for that directory and files inside it.

s3cmd ls and aws s3 ls lists all the directories and files inside it.

@larrabee
Owner

larrabee commented Nov 5, 2019

Issue #24 is related to another bug, which has been fixed.
It's very strange that s3cmd info fails. Can you post the full s3cmd command line?

@prudhvigodithi
Author

prudhvigodithi commented Nov 5, 2019

s3cmd info output:

s3cmd info s3://mytests3syncbucket
s3://mytests3syncbucket/ (bucket):
Location: us-east-1
Payer: BucketOwner
Expiration Rule: all objects in this bucket will expire in '
Policy: none
CORS: none
ACL: AWS-s3syncbucket: FULL_CONTROL

s3cmd info output with object:

s3cmd info s3://mytests3syncbucket//content
ERROR: S3 error: 404 (Not Found)

Below is the output of s3cmd sync for the same file that fails with s3sync.

s3cmd sync s3://mytests3syncbucket//content/directpath/014c1784/data/test-data-1023
download: 's3://mytests3syncbucket//content/directpath/014c1784/data/test-data-1023/0.bytes' -> '/data/014c1784/data/test-data-1023/0.bytes' [1 of 2]
0 of 0 0% in 0s 0.00 B/s done
download: 's3://mytests3syncbucket//content/directpath/014c1784/data/test-data-1023/0.properties' -> '/data/014c1784/data/test-data-1023/0.properties' [2 of 2]
327 of 327 100% in 0s 5.51 kB/s done
Done. Downloaded 327 bytes in 1.0 seconds, 327.00 B/s.

s3sync failed with the following error:

s3sync s3://mytests3syncbucket/data/ --s3-retry 3 --s3-acl private -p -w 128

INFO[0000] Starting sync
ERRO[0000] Sync error: pipeline step: 1 (LoadObjData) failed with error: object: content/directpath/014c1784/data/test-data-1023/0.bytes sync error: NoSuchKey: The specified key does not exist.
status code: 404, request id: 5776977DF8, host id: S39P6rO+1AgQKp0tfkwfkw gjhuhb6588yb86nAwxzhh24=, terminating
INFO[0000] 0 ListSource: Input: 0; Output: 1000 (1952 obj/sec); Errors: 0
INFO[0000] 1 LoadObjData: Input: 1000; Output: 0 (0 obj/sec); Errors: 7
INFO[0000] 2 ACLUpdater: Input: 0; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] 3 UploadObj: Input: 0; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] 4 Terminator: Input: 0; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] Duration: 512.565058ms
ERRO[0000] Sync Failed
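
For anyone triaging similar runs, the per-step summary lines above can be parsed to find the first pipeline step that reported errors. This is a rough sketch that assumes the log format stays exactly as shown in this thread. `STEP_RE`, `parse_step_stats`, and `first_failing_step` are hypothetical helpers, not part of s3sync.

```python
import re

# Matches lines such as:
# INFO[0000] 1 LoadObjData: Input: 1000; Output: 0 (0 obj/sec); Errors: 7
STEP_RE = re.compile(
    r"^INFO\[\d+\]\s+(?P<index>\d+)\s+(?P<step>\w+):\s+"
    r"Input:\s+(?P<input>\d+);\s+"
    r"Output:\s+(?P<output>\d+)\s+\(\d+ obj/sec\);\s+"
    r"Errors:\s+(?P<errors>\d+)\s*$"
)


def parse_step_stats(log_text):
    """Return one dict per pipeline-step summary line found in the log."""
    stats = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            stats.append({
                "index": int(match.group("index")),
                "step": match.group("step"),
                "input": int(match.group("input")),
                "output": int(match.group("output")),
                "errors": int(match.group("errors")),
            })
    return stats


def first_failing_step(stats):
    """Return the first step with a non-zero error count, or None."""
    for entry in stats:
        if entry["errors"] > 0:
            return entry
    return None


# Summary lines copied from the failed run above.
sample_log = """\
INFO[0000] 0 ListSource: Input: 0; Output: 1000 (1952 obj/sec); Errors: 0
INFO[0000] 1 LoadObjData: Input: 1000; Output: 0 (0 obj/sec); Errors: 7
INFO[0000] 2 ACLUpdater: Input: 0; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] 3 UploadObj: Input: 0; Output: 0 (0 obj/sec); Errors: 0
INFO[0000] 4 Terminator: Input: 0; Output: 0 (0 obj/sec); Errors: 0
"""

print(first_failing_step(parse_step_stats(sample_log)))
```

In the failed run, listing succeeded (1000 objects out of ListSource) and the errors start at LoadObjData, so the keys are being listed but cannot be fetched under the names requested.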

@prudhvigodithi
Author

prudhvigodithi commented Nov 5, 2019

It seems s3sync is not picking up files under prefixes containing "//", such as s3://mytests3syncbucket//content. My data sits in the bucket under an empty-named folder and then under the content folder, so the resulting path is s3://mytests3syncbucket//content, with '//' before the content folder. When I moved the same file directly under s3://mytests3syncbucket, s3sync worked fine, but it failed when the file was under s3://mytests3syncbucket//content. So I'm assuming it throws NoSuchKey for anything under the // folder; correct me if I'm wrong. I have tested this across multiple files and folders: those under // did not work, and those in /a/b/c format did.
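
To make the double-slash hypothesis concrete: S3 object keys are opaque strings, so a key that starts with a slash (`/content/...`) names a different object than `content/...`. The sketch below is not s3sync code. `split_s3_url` and `split_s3_url_normalizing` are hypothetical helpers that only show how a client that collapses empty path segments would end up requesting a key that does not exist. Note that the key printed in the error log above has no leading slash, which would be consistent with this, though the thread does not confirm it.

```python
def split_s3_url(url):
    """Split s3://bucket/key into (bucket, key) without touching the key."""
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError("not an s3:// URL: " + url)
    bucket, _, key = url[len(prefix):].partition("/")
    return bucket, key


def split_s3_url_normalizing(url):
    """Same split, but drop empty path segments, as a URL normalizer might."""
    bucket, key = split_s3_url(url)
    return bucket, "/".join(part for part in key.split("/") if part)


url = "s3://mytests3syncbucket//content/directpath/014c1784/data/test-data-1023/0.bytes"

_, exact_key = split_s3_url(url)
_, normalized_key = split_s3_url_normalizing(url)

print(repr(exact_key))       # key as stored: begins with "/"
print(repr(normalized_key))  # key after normalization: leading "/" is gone
print(exact_key == normalized_key)
```

If the stored key is the first form and the client requests the second, a 404 NoSuchKey is the expected response even though listing shows the object.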

@larrabee
Owner

larrabee commented Nov 5, 2019

It's a known issue with double slashes. It was fixed in the latest commit, but a build with that fix had not been released. I have created a new version with this fix; please try version 2.10 and let me know.

@prudhvigodithi
Author

Hey, thanks a lot for that release. I downloaded it and tried the s3sync commands, but I still get the same NoSuchKey error.

s3sync -version
VersionId: 2.10, commit: e0a4585, built at: 2019-11-05T15:54:51Z

@larrabee
Owner

larrabee commented Nov 6, 2019

I committed a bugfix to the debug branch; you can build it. I'm not sure this bugfix should be committed to the master branch, because it may have strange effects in other scenarios.

@prudhvigodithi
Author

Hey, sorry for the late update: it's working now. However, when I try to run more than 128 processes, it breaks. Also, I suppose this double slash bug exists in the aws cli itself?

@larrabee
Owner

Hello.

  1. Did it fail with an out-of-memory error? If yes, you can increase swap.
  2. I don't know. Double-slash objects are created by an incorrect client that does not normalize the object URL.

@larrabee larrabee added bug Something isn't working wontfix This will not be worked on labels Jan 15, 2020
@kannanvr

@larrabee, generally this issue occurs when the source and destination bucket users are different.
If we add the bucket permission for access from the source to the destination user, we can avoid this issue.

@larrabee
Owner

@kannanvr, hello. I don't think these issues are related. Can you create a new issue and provide the full command line (without keys)?
