Document how to configure dynamodb lock client #1091
Comments
@MrPowers this would probably be a good thing to blog about once the conflict resolution is improved. Concurrent writes are definitely something you can't do with plain Parquet tables. 😉
Let me know if I can help with this, we'll need this feature. 🙂
@wjones127 - feel free to assign me to this issue. I will be happy to create the docs when #593 is finished.
Hi folks, is it possible to have a draft document first so that everyone can start to try it out and provide feedback?
I'm looking for the documentation on how to set up the LockClient in Python as well.
In a different project, the table schema of the DynamoDB table is documented: https://github.com/delta-io/kafka-delta-ingest#writing-to-s3
(The same schema is documented in
However, the Python documentation does not cover this.
@danielgafni: I went with storage_options, and it worked well with deltalake 0.13.0.
Thanks.
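For anyone else looking for a concrete starting point, here is a minimal sketch of that storage_options approach. It assumes a DynamoDB table named delta_log, credentials picked up from the environment, and a placeholder bucket path and region; adjust all of these to your setup.

```python
import pandas as pd
from deltalake import write_deltalake

# Option keys as documented for recent deltalake releases; values here are assumptions.
storage_options = {
    "AWS_S3_LOCKING_PROVIDER": "dynamodb",   # use DynamoDB for commit locking
    "DELTA_DYNAMO_TABLE_NAME": "delta_log",  # commit/lock table name (assumed)
    "AWS_REGION": "us-east-1",               # placeholder region
}

df = pd.DataFrame({"id": [1, 2, 3]})
write_deltalake(
    "s3://my-bucket/my-table",  # hypothetical table path
    df,
    mode="append",
    storage_options=storage_options,
)
```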
I think it's also worth documenting the required permissions to work with a Delta Lake table stored on AWS S3. In my case, I needed:
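The exact list from that comment isn't preserved above. As a rough, illustrative sketch only — the action set, role name, policy name, and ARNs below are assumptions, not an authoritative list — an inline policy covering both S3 and DynamoDB could be attached like this:

```python
import json

import boto3

# Illustrative permissions only; none of these names come from the thread above.
# Trim or extend to match your actual bucket, table paths, and lock table.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListBucket",
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",    # hypothetical bucket
                "arn:aws:s3:::my-bucket/*",
            ],
        },
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:PutItem",
                "dynamodb:Query",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem",
            ],
            "Resource": "arn:aws:dynamodb:*:*:table/delta_log",
        },
    ],
}

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="my-delta-writer-role",      # hypothetical role
    PolicyName="delta-lake-s3-dynamodb",  # hypothetical policy name
    PolicyDocument=json.dumps(policy),
)
```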
@wjones127 when using an S3-compatible storage (other than AWS S3), one might have one set of access and secret keys for the storage and another set for DynamoDB. In this case, how can these two pairs of keys be provided separately, so that one is used for storage and the other for DynamoDB?
@ale-rinaldi would you mind adding this info to our docs?
@ion-elgreco you are not referring to this, right? Right now we have a real use case for what I described above (using different credentials for S3 and DynamoDB), and I created #2287 for it. If it is already supported, it would be great to have the documentation clarify it. Otherwise, we need to accommodate a separate set of credentials for DynamoDB to unblock decoupling DynamoDB from S3.
@ion-elgreco of course! I opened #2393
# Description
This documents the required AWS permissions on S3 and DynamoDB to interact with Delta Lake tables.
# Related Issue(s)
- mentions #1091
Experiencing some issues that may be related to this. I set up a DynamoDB table using the following command:
And when running the following example, I receive the following error...
Looking at the policies assigned to my AWS account, it seems that I have all the permissions/policies that have been discussed above. Not sure what I am missing.
In the published documentation they specify the following `aws dynamodb create-table` command:

```bash
aws dynamodb create-table \
    --table-name delta_log \
    --attribute-definitions AttributeName=tablePath,AttributeType=S AttributeName=fileName,AttributeType=S \
    --key-schema AttributeName=tablePath,KeyType=HASH AttributeName=fileName,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
```
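For anyone who would rather provision the table from Python than the AWS CLI, here is a minimal boto3 sketch of the same table definition; the region is a placeholder and the schema mirrors the command above.

```python
import boto3

# Create the commit/lock table with the same schema as the CLI command above.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")  # placeholder region
dynamodb.create_table(
    TableName="delta_log",
    AttributeDefinitions=[
        {"AttributeName": "tablePath", "AttributeType": "S"},
        {"AttributeName": "fileName", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "tablePath", "KeyType": "HASH"},
        {"AttributeName": "fileName", "KeyType": "RANGE"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
```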
Thank you @dhirschfeld, this solved my issue. |
Description
Although we have an error message telling users to configure the lock client if they want concurrent writes with S3, we don't have any documentation on how to do that. We should also provide general advice on concurrency, like not mixing different connectors in concurrent writers. See conversation: https://delta-users.slack.com/archives/C013LCAEB98/p1674435354811639
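As a rough illustration of the concurrent-write scenario described above, here is a sketch with several writers appending to the same S3 table; the bucket path is hypothetical and the locking-related option keys are assumed to be the same ones mentioned earlier in this thread.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd
from deltalake import write_deltalake

storage_options = {
    "AWS_S3_LOCKING_PROVIDER": "dynamodb",
    "DELTA_DYNAMO_TABLE_NAME": "delta_log",  # assumed commit/lock table name
}

def append_batch(i: int) -> None:
    df = pd.DataFrame({"batch": [i], "value": [i * 10]})
    write_deltalake(
        "s3://my-bucket/my-table",  # hypothetical table path
        df,
        mode="append",
        storage_options=storage_options,
    )

# Several writers appending concurrently; the DynamoDB-backed lock is what
# allows these commits to be coordinated safely on S3.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(append_batch, range(4)))
```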
Use Case
Related Issue(s)
We probably shouldn't do this until we improve the conflict resolution, though. #593