Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First version of DynamoDB Store #144

Merged
merged 8 commits into from
Oct 4, 2013
Merged

First version of DynamoDB Store #144

merged 8 commits into from
Oct 4, 2013

Conversation

rweald
Copy link
Contributor

@rweald rweald commented Sep 16, 2013

I modeled storehaus-dynamodb off storehaus-memcache which uses a base client Store[String, ChannelBuffer].

There is a base store that extends Store[String, AttributeValue] where AttributeValue is the AWS SDK specific data type that you use for requests to Dynamo.

DynamoStringStore and DynamoLongStore use ConversionStore to provide a more user friendly, native type interface to DynamoDB. DynamoLongStore is also mergeable and can be used for incrementing atomic counters in DynamoDB.

Still needs documentation which I will add, but I wanted to get expose my work for some feedback.

I could also pull the Bijections/Injections out of this codebase and into the Bijections codebase but I didn't want to block this pull request waiting on a version bump in Bijection. I figured I could submit a separate pull request to Bijection and then remove them from this code base once the new version of Bijection has been published.

I modeled storehaus-dynamodb off storehaus-memcache which uses
a base client  Store[String, ChannelBuffer].

There is a base store that extends Store[String, AttributeValue]
where AttributeValue is the AWS SDK specific data type that you
use for requests to Dynamo.

DynamoStringStore and DynamoLongStore use ConversionStore
to provide a more user friendly, native type interface to
DynamoDB. DynamoLongStore is also mergeable and can be used
for incrementing atomic counters in DynamoDB.
@rweald rweald mentioned this pull request Sep 16, 2013
@rubanm
Copy link
Contributor

rubanm commented Sep 17, 2013

DynamoDB supports batched gets. It might be better to use this for multiGet than deferring to default Store implementation, though the API has a key limit of 100 in the best case.

@rubanm
Copy link
Contributor

rubanm commented Sep 17, 2013

Not sure if you are considering this for your use-case - DynamoDB also lets you optionally set reads or writes to be strongly consistent. (This is more of an FYI and I'm not sure if it is worthwhile exposing this option via the store at this point.)

@rweald
Copy link
Contributor Author

rweald commented Sep 17, 2013

@rubanm thanks for the comments. I was planning on adding the more efficient batch get and put operation using the amazon batch API in a second enhancement pull request. Kind of wanted to keep this initial PR smaller and focused on the core client.

I was having the same internal debate about if/how to expose the option for strong consistency. I was thinking about potentially exposing it as a constructor option and then providing a helper object StronglyConsistentDynamoStore that would create a DyanamoStore with stronglyConsistent = true. That being said I am personally always unsure about exposing strong consistency in the dynamo client as it feels like somewhat of a land mine because of the substantial increase in cost associated with doing strongly consistent operations.

Map(underlying.valueColumn -> attributeUpdateValue).as[JMap[String, AttributeValueUpdate]]
)

Future {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this block?

You can use a FuturePool to make this async, or they have an async API, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brain fart on my part, too much time working on Play apps. Should have been a FuturePool and not a regular Future. I took a look at their Async client initially but it is based on Java futures which are pretty gross and hard to work with in Scala. The Injection from java future to twitter future that behaves as expected could get a little messy.

My personal preference would be to continue to use the blocking dynamo client and just dispatch the blocking operations to a FuturePool which will give us a non-blocking Store that plays nicely with Scala and Twitter Futures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Maybe just add a comment to document that that option has been explored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rubanm good point, will do.

@johnynek
Copy link
Collaborator

Consistency in the constructor SGTM.

Mostly concerned about making sure it isn't blocking.

@avibryant
Copy link

The just-released DynamoDB Local server might be helpful for tests: http://aws.typepad.com/aws/2013/09/dynamodb-local-for-desktop-development.html

@rweald
Copy link
Contributor Author

rweald commented Sep 18, 2013

Yeah that local server will be very helpful for testing.

I've held off on committing test up to this point because I didn't want to create a dependency on the real dynamo api in testing because it would incur monetary cost. In theory you could also get some level of testing by injecting a mock dynamo client but I was unsure of your feelings on dependency injection and testing with mocks. Also you don't get a ton of safety since you are creating artificial responses that are often subtly different than the response objects that AWS returns.

@@ -0,0 +1,20 @@
package com.twitter.storehaus.dynamodb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't forget copyright headers please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caniszczyk can you please point me to the appropriate copyright headers. Also should those headers be in every file or just certain files?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rweald

See any source file... but something like this..
/*

  • Copyright 2013 Twitter Inc.
    *
  • Licensed under the Apache License, Version 2.0 (the "License"); you may
  • not use this file except in compliance with the License. You may obtain
  • a copy of the License at
    *
  • http://www.apache.org/licenses/LICENSE-2.0
    
  • Unless required by applicable law or agreed to in writing, software
  • distributed under the License is distributed on an "AS IS" BASIS,
  • WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  • See the License for the specific language governing permissions and
  • limitations under the License.
    */

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rweald
Copy link
Contributor Author

rweald commented Sep 18, 2013

Thanks again for the fast feedback. I'll make the improvements later today.

This method is a bit more resistant to errors and was suggested
by @johnynek
The default Dynamo client uses blocking requests which means
we need to execute the api calls on background threads so as
to not block the main thread.

At this point it uses a FuturePool backed by a
 hard coded fixed size thread pool. Work can be done in the
future to make the size of the threadpool configurable
{

//TODO: make this pool size configurable
val apiRequestFuturePool = FuturePool(Executors.newFixedThreadPool(4))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a constructor parameter, and then handle the defaults with apply methods in the DynamoStore.

In my view, using the processor count is a better default:
val processors = Runtime.getRuntime().availableProcessors;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You read my mind. I was just working on doing the constructor argument this afternoon.

Good point about using processor count as a more sane default.

@johnynek
Copy link
Collaborator

using the client from more than one thread is only okay if there clients are thread safe. Can you link to the docs that say that is okay?

@rweald
Copy link
Contributor Author

rweald commented Sep 23, 2013

There is no explicit mention of thread safety in the DynamoDBClient class docs themselves. However, there is a class called DynamoDBMapper that takes a client as a constructor argument and is explicitly called out as being thread-safe. In addition there is a specific note about a single method in the DynamoDBClient.setEndpoint that is not thread safe which leads me to believe the rest of the class is thread safe. There are also several references in their docs to doing batch operations in multiple threads in parallel which also indicates thread safety. Apologies for the crappy javadoc link but Amazon's docs don't appear to support class links 😦 http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/index.html

I wish there was a more explicit callout in the docs for DynamoDBClient but given all the other evidence and a quick glance through their code I am fairly confident it is thread safe.

johnynek added a commit that referenced this pull request Oct 4, 2013
@johnynek johnynek merged commit e22bb9f into twitter:develop Oct 4, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants