Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DynamoDB documentClient support #1223

Closed
fboudra opened this issue May 30, 2020 · 69 comments · Fixed by #2097
Closed

DynamoDB documentClient support #1223

fboudra opened this issue May 30, 2020 · 69 comments · Fixed by #2097
Assignees
Labels
feature-request New feature or enhancement. May require GitHub community feedback. High Priority

Comments

@fboudra
Copy link

fboudra commented May 30, 2020

Is your feature request related to a problem? Please describe.
Migrate existing code using documentClient to v3
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB/DocumentClient.html

I'm also using dynamodb-toolbox on some projects (it relies on documentClient ):
https://github.com/jeremydaly/dynamodb-toolbox

Describe the solution you'd like
ability to use the convenient documentClient API with v3

Describe alternatives you've considered

  • go back to primitive calls like client putitem/getitem
  • stick to v2 and don't migrate
@fboudra fboudra added the feature-request New feature or enhancement. May require GitHub community feedback. label May 30, 2020
@russell-dot-js
Copy link
Contributor

russell-dot-js commented Jul 22, 2020

I'm with you @fboudra, looking forward to DocumentClient!
However, this is a duplicate of #288

Maybe you or I could take a stab at porting over DocumentClient 🤔

@mhart
Copy link
Contributor

mhart commented Sep 15, 2020

It's a duplicate, but the other issue was closed so I'm assuming this is the best place to follow progress of this?

@AllanFly120 is there somewhere else tracking this?

@antstanley
Copy link

Hey folks, for those struggling with this I've created an npm package to convert JSON to DynamoDB AttributeValue type schema JSON and back again.

I'm using it today in a small production project.

I've created two packages, one a CommonJS module, and the other a ES module. You can find them, with docs here.

It's a fork and update from the DocumentClient converter.js code.

CommonJS module - dynamodb-type-marshall https://www.npmjs.com/package/dynamodb-type-marshall

ES module - @esmodule/dynamodb-type-marshall https://www.npmjs.com/package/@esmodule/dynamodb-type-marshall

Source code here: https://github.com/senzodev/dynamodb-type-marshall

@trivikr
Copy link
Member

trivikr commented Sep 18, 2020

Thanks @antstanley for creating a package for marshalling and unmarshalling.

We've implemented marshaller (convertToAttr) in #1531, and aiming to push it in gamma.10 launch before 9/25
A PR for unmarshaller (convertToNative) will be posted soon.

@trivikr
Copy link
Member

trivikr commented Sep 29, 2020

The support for marshall and unmarshall operations was released in gamma.10 release, and verified with the code below:

Versions tested on:

    "@aws-sdk/client-dynamodb": "1.0.0-gamma.10",
    "@aws-sdk/util-dynamodb": "1.0.0-gamma.1"
Code
const { DynamoDB } = require("@aws-sdk/client-dynamodb");
const { marshall, unmarshall } = require("@aws-sdk/util-dynamodb");

const { deepStrictEqual } = require("assert");

(async () => {
  const region = "us-west-2";

  // Table with Partition key named `HashKey`
  const TableName = `test-util-dynamodb`;
  const HashKey = "hashKey";
  const input = {
    HashKey,
    BinaryAttribute: Uint8Array.from("12345"),
    BoolAttribute: true,
    BoolSetAttribute: new Set([Uint8Array.from("1"), Uint8Array.from("2")]),
    ListAttribute: [1, "two", false],
    MapAttribute: { foo: "bar" },
    NumAttribute: 1,
    NumSetAttribute: new Set([1, 2, 3, 4, 5]),
    NullAttribute: null,
    StrAttribute: "one",
    StrSetAttribute: new Set(["one", "two", "three"]),
  };
  const Item = marshall(input);
  console.log({ putItem: input });

  const client = new DynamoDB({
    region,
    // logger: { ...console, debug: undefined },
  });
  await client.putItem({ TableName, Item });
  const response = await client.getItem({
    TableName,
    Key: marshall({ HashKey }),
  });
  console.log({ getItem: unmarshall(response.Item) });

  deepStrictEqual(input, unmarshall(response.Item));
})();
Output
{
  putItem: {
    HashKey: 'hashKey',
    BinaryAttribute: Uint8Array(5) [ 1, 2, 3, 4, 5 ],
    BoolAttribute: true,
    BoolSetAttribute: Set { [Uint8Array], [Uint8Array] },
    ListAttribute: [ 1, 'two', false ],
    MapAttribute: { foo: 'bar' },
    NumAttribute: 1,
    NumSetAttribute: Set { 1, 2, 3, 4, 5 },
    NullAttribute: null,
    StrAttribute: 'one',
    StrSetAttribute: Set { 'one', 'two', 'three' }
  }
}
{
  getItem: {
    ListAttribute: [ 1, 'two', false ],
    MapAttribute: { foo: 'bar' },
    BinaryAttribute: Uint8Array(5) [ 1, 2, 3, 4, 5 ],
    StrAttribute: 'one',
    BoolSetAttribute: Set { [Uint8Array], [Uint8Array] },
    NumSetAttribute: Set { 5, 4, 3, 2, 1 },
    BoolAttribute: true,
    StrSetAttribute: Set { 'one', 'three', 'two' },
    HashKey: 'hashKey',
    NumAttribute: 1,
    NullAttribute: null
  }
}

If you come across any bugs, do report them by creating an issue: https://github.com/aws/aws-sdk-js-v3/issues/new?assignees=&labels=bug+report&template=---bug-report.md&title=

@antstanley
Copy link

Awesome! Will give it a go today.

@trivikr
Copy link
Member

trivikr commented Oct 29, 2020

Closing as the core operations, i.e. marshall/unmarshall, were released in gamma.10 release.
Currently we are not planning to provide support for a separate DocumentClient.

@trivikr trivikr closed this as completed Oct 29, 2020
@mhart
Copy link
Contributor

mhart commented Oct 29, 2020

@trivikr will you be providing guidance on how to migrate from DocumentClient? Would be good to see how easy it is to go from DocumentClient to marshall/unmarshall

@trivikr
Copy link
Member

trivikr commented Oct 30, 2020

The utility operations marshall and unmarshall are exposed in @aws-sdk/util-dynamodb, and here’s a code showing how to use them:

const { DynamoDB } = require("@aws-sdk/client-dynamodb");
const { marshall } = require("@aws-sdk/util-dynamodb");

const client = new DynamoDB(clientParams);
const putParams = {
  TableName: "Table",
  Item: marshall({
    HashKey: "hashKey",
    NumAttribute: 1,
    BoolAttribute: true,
    ListAttribute: [1, "two", false],
    MapAttribute: { foo: "bar" },
    NullAttribute: null,
  }),
};

await client.putItem(putParams);

const getParams = {
  TableName: "Table",
  Key: marshall({
    HashKey: "hashKey",
  }),
};

const { Item } = await client.getItem(getParams);
unmarshall(Item);

The examples are also provided in README at: https://github.com/aws/aws-sdk-js-v3/tree/v1.0.0-rc.3/packages/util-dynamodb
We're working on updating the Developer Guide with this example.

@stang-tgs
Copy link

Does it not make sense for a DynamoDB JavaScript library to provide the DocumentClient functionality? I mean, under what circumstances in JavaScript/TypeScript would NOT use native JS objects and would want to use the marshalled DynamoDB types? I'd hazard a guess to say almost never. So, not providing a DocumentClient seems to be a straight annoyance for most SDK users, because, chances are, marshal() and unmarshall() is going to be sprinkled before/after almost every database operation where data types need to be specified/used - I can't imagine why a JS/TS developer would ever need to operate on a native, marshalled DynamoDB data model in JS.

I've started to port our existing TS code from aws sdk v2 to v3 where we use DocumentClient. Fortunately, we have an abstraction layer, so we are able to marshall/unmarshall there without affecting calling code. Unfortunately, we have to intercept and manually sprinkle in marshall/unmarshall everywhere now, which I suspect many developers porting would have to do as well.

I'm still going through the motions of doing the marshalling/unmarshalling - but my 2 cents is that it's an oversight and disservice by not providing a DocumentClient. If the .NET/Java library required developers to manually marshall/unmarshall, I'm quite sure everyone's heads will explode 😉.

@trivikr
Copy link
Member

trivikr commented Dec 16, 2020

Reopening the issue to explore addition of DocumentClient.

@trivikr trivikr reopened this Dec 16, 2020
@reni11111
Copy link

+1
Saying you don't need DocumentClient because u can marshall and unmarshall by yourself is the same as saying you don't need .map .reduce .filter .find because you can do a for loop and do it manually.

@eryon
Copy link

eryon commented Jan 1, 2021

I don't really know how safe this is, but I'm now using this to auto-marshall:

  client.middlewareStack.add(
    (next) => async (args) =>
      next({
        ...args,
        input: Object.entries(args.input).reduce(
          (acc, [key, value]) => ({
            ...acc,
            [key]: ['ExpressionAttributeValues', 'Key', 'Item'].includes(key)
              ? marshall(value, { convertEmptyValues: true })
              : value
          }),
          {}
        )
      }),
    {
      name: 'autoMarshall',
      priority: 'high',
      step: 'initialize'
    }
  );
  client.middlewareStack.add(
    (next) => async (args) => {
      const result = await next(args);

      return {
        ...result,
        output: {
          ...result.output,
          Item: result.output.Item && unmarshall(result.output.Item),
          Items: result.output.Items && result.output.Items.map((i) => unmarshall(i))
        }
      };
    },
    {
      name: 'autoUnmarshall',
      priority: 'high',
      step: 'deserialize'
    }
  );

where client is a new DynamoDBClient and this is in a common utility function.

@mattiLeBlanc
Copy link

Glad to have the marshall/unmarshall option. However, the V2 DocumentClient was easier to work with. So I would also love to see it migrated to V3.

@ghost

This comment has been minimized.

@ghost

This comment has been minimized.

@yaquawa
Copy link

yaquawa commented Feb 7, 2021

any updates on this??

@trivikr
Copy link
Member

trivikr commented Feb 9, 2021

Updates in @aws-sdk/util-dynamodb:

We're not planning to support a separate DocumentClient in v3 as of now.
Are there any specific features you're missing in @aws-sdk/util-dynamodb? Please create an issue to submit a request.

@mattiLeBlanc
Copy link

@trivikr I added one suggestion for the util-dynamodb here: #2009

@cadam11
Copy link

cadam11 commented Feb 9, 2021

@trivikr The downside of having to manually marshall/unmarshall is that the user now has to keep track of which params need to be marshalled/unmarshalled. IMO, that's too much internal architecture knowledge to require from consumers.

If there's to be no DocumentClient in v3, how about adding a first-party middleware for auto-marshall/unmarshall similar to what @eryon proposed? @aws-sdk/middleware-automarshall or something?

@trivikr
Copy link
Member

trivikr commented Feb 9, 2021

If there's to be no DocumentClient in v3, how about adding a first-party middleware for auto-marshall/unmarshall similar to what @eryon proposed? @aws-sdk/middleware-automarshall or something?

A separate DocumentClient was not written for v3 mainly because of the difference in types: AttributeValue from DynamoDB, and NativeAttributeValue for native JavaScript types. We will have to emit redundant models for DynamoDB operations which use AttributeValue or a separate client, and this work would have to be codegen as DynamoDB service may decide to introduce new operations which use AttributeValue.

We also evaluated automatically detecting whether customer code has sent an attribute value or native JavaScript types based on the actual data. But it had multiple edge cases. Another option we evaluated is allowing customer to configure whether DynamoDB records or native JavaScript values are passed in an operation. But that option also had some edge cases, and added overhead for customer to understand the configuration. I can provide specific details if required.

After a week long design discussion, we decided that the utility functions in @aws-sdk/util-dynamodb would be the best solution. It would be difficult to switch to in the short-term, but would be beneficial in the long term. The developer resources can be diverted for designing a higher level library like DynamoDB Datamapper instead.

@cadam11 Do you have a suggestion for a middleware @aws-sdk/middleware-automarshall which can work without changing DynamoDB input/output type definitions?

@cadam11
Copy link

cadam11 commented Feb 10, 2021

I guess it depends on how strict you are about

without changing DynamoDB input/output type definitions

At the very least, I'd expect that for a middleware to work there would need to be some (non-breaking) additions to the existing types. E.g., Item would have to allow an alternative to a map of AttributeValues– I'm imagining some use of generics to provide a return type for GetItem, say.

Wish I had more time to dig into this interesting problem. In any case, I'm still a fan of v3 sdk.

@kyeotic
Copy link

kyeotic commented Feb 11, 2021

@trivikr That doesn't make a whole lot of sense. Your first paragraph lays out the difficulty in doing this work, and your penultimate paragraph explains how you decided that its better that every consumer of your library do that work instead.

That's what a library is for: to do the difficult work that everyone needs once. You recognize that it's awkward for the consumer of the unmarshalled data to work with the underlying types. That's the reason that you should do it!

@trivikr
Copy link
Member

trivikr commented Feb 11, 2021

The first paragraph explains difficulty in doing this work when it comes to being an SDK owner. This involves writing and maintaining a separate DocumentClient and emitting mostly redundant type definitions.

The penultimate paragraph explains how easy it for customer to just use marshall/unmarshall functions which takes native JavaScript types as inputs, and return DynamoDB records as output. This way:

  • Customer can pass continue passing JavaScript types which they used for previous DocumentClient.
  • Customers don't have to make additional choice between DynamoDB client and DocumentClient.

Can you create a separate feature request on what specific difficulties you're facing when moving from DynamoDB DocumentClient in v2 to utilities in v3?

I'm writing a blog post on DynamoDB utilities, and aiming to publish in a month. I'll include some scripts to ease migration based on the problems faced.

@kyeotic
Copy link

kyeotic commented Feb 11, 2021

The difficulty is writing the same marhsall/unmarshall code and maintaining those "mostly redundant types" either in every app that uses the aws sdk or in our own library that provides the same.

This is a JavaScript library that doesn't work with JavaScript objects! What JavaScript app wants to interact with those DynamoDB marshalled objects? I'm guessing next to none. Why not abstract away the transport format in the library that interacts with the transport layer?!

@mobsense
Copy link

Just FYI: Previously I said it would be really helpful to have DocumentClient supported for use in our OneTable library. However, after digging deeper, we were able to support the V3 SDK pretty easily without this and we just released an update of OneTable with V3 support.

The marshall / unmarshall library did most of the work and we were able to maintain 100% API compatibility for OneTable users after initialization of the library. Users have 2 code lines of difference in using V2 or V3 of the SDK. All smiles here.

So while DocumentClient support doesn't matter for us now, it do believe that for many developers, having a compatibility shim so they don't need to rewrite their DynamoDB apps that use DocumentClient would be worthwhile.

@G-Rath
Copy link

G-Rath commented Feb 19, 2021

If it helps, here's a type-based Marshall:

import { AttributeValue, DynamoDB } from '@aws-sdk/client-dynamodb';

type GenericSMember<S extends string> = AttributeValue.SMember & { S: S };

type MarshallProperty<T> = T extends string
  ? GenericSMember<T>
  : T extends number
  ? AttributeValue.NMember
  : T extends boolean
  ? AttributeValue.BOOLMember
  : T extends Uint8Array
  ? AttributeValue.BMember
  : T extends Set<string>
  ? AttributeValue.SSMember
  : T extends Set<number>
  ? AttributeValue.NSMember
  : T extends Set<Uint8Array>
  ? AttributeValue.BSMember
  : T extends null
  ? AttributeValue.NULLMember
  : T extends Array<infer TItems>
  ? MarshallProperty<TItems>
  : T extends Record<infer TKeys, unknown>
  ? MarshallProperty<TKeys>
  : never;

type Marshall<T extends object> = { [K in keyof T]: MarshallProperty<T[K]> };

Example usage:

interface PullRequestOutcome {
  pk: 'PullRequest';
  sk: `${string}${string}#${string}${string}`;
  host: 'github' | 'bitbucket';
  conclusion: 'merged' | 'declined';
  openedAt: string;
  closedAt: string;
}

const Item: Marshall<PullRequestOutcome> = {
  pk: { S: '' },
  sk: { S: 'my-project#12' },
  host: { S: 'github' },
  conclusion: { S: 'declined' },
  openedAt: { S: '' },
  closedAt: { S: '' }
};

image

It should work fine for all things except for tuples (as they can't be preserved properly, so they'll just become Array).

I've also got an Unmarshall, but that's less accurate:

type UnmarshallProperty<T> = T extends GenericSMember<infer S>
  ? S
  : T extends AttributeValue.NMember
  ? number
  : T extends AttributeValue.BOOLMember
  ? boolean
  : T extends AttributeValue.BMember
  ? Uint8Array
  : T extends AttributeValue.SSMember
  ? Set<string>
  : T extends AttributeValue.NSMember
  ? Set<number>
  : T extends AttributeValue.NULLMember
  ? null
  : T extends Record<string, unknown>
  ? UnmarshallProperty<T>
  : never;

type Unmarshall<T extends object> = {
  [K in keyof T]: UnmarshallProperty<T[K]>;
};

declare const unmarshall: <T extends { [key: string]: AttributeValue }>(
  item: T
) => Unmarshall<T>;

const item = unmarshall({
  pk: { S: 'PullRequest' as const },
  sk: { S: 'my-project#12' },
  host: { S: 'github' },
  conclusion: { S: 'declined' },
  openedAt: { S: '' },
  closedAt: { S: '' },
});

@trivikr
Copy link
Member

trivikr commented Feb 21, 2021

Update: After the design discussion within the team, a PoC for DocumentClient is posted at #2062

This is a manually written PoC. In the final PR, the operations are going to be code generated and not manually written.

The modular DynamoDBDocumentClient can be created as follows:

  const client = new DynamoDBClient({});
  const ddbDocClient = DynamoDBDocumentClient.from(client);
  await ddbDocClient.send(
    new PutCommand({
      TableName,
      Item: { id: "1", content: "content from DynamoDBDocumentClient" },
    })
  );

The v2 backward-compatible DocumentClient can be created as follows:

  const client = new DynamoDBClient({});
  const ddbDoc = DynamoDBDocument.from(client);
  await ddbDoc.put({
    TableName,
    Item: { id: "2", content: "content from DynamoDBDocument" },
  });

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

Update: The existing marshalling/unmarshalling code for BatchGet/BatchWrite is pretty complex. The code would be black box to users, but they may go through it while debugging.

You can view the current implementation for posting suggestions/comments at https://github.com/trivikr/aws-sdk-js-v3/blob/058c2078a078a8f1e3d27a34b10955f4a27752df/lib/lib-dynamodb/src/commands/BatchWriteCommand.ts

The PoC PR can be viewed at #2062

Tagging @kyeotic as they had attempted writing marshall/unmarshall for batchWrite earlier in #1223 (comment)

@kyeotic
Copy link

kyeotic commented Feb 24, 2021

@trivikr That implementation looks pretty heavy, memory and computation wise. Using an object spread in a reduce, like this one

UnprocessedItems: Object.entries(data.output.UnprocessedItems).reduce(
    (acc: any, [key, value]: [string, any]) => ({
      ...acc,
      [key]: value.map((writeItem: any) => ({
        ...writeItem,
        ...(writeItem.PutRequest && {
          PutRequest: unmarshall(
            writeItem.PutRequest,
            configuration.translateConfiguration?.unmarshallOptions
          ),
        }),
        ...(writeItem.DeleteRequest && {
          DeleteRequest: unmarshall(
            writeItem.DeleteRequest,
            configuration.translateConfiguration?.unmarshallOptions
          ),
        }),
      })),
    }),
    {}
  ),
}),

Results in a copy of the param object for every PutRequest/DeleteRequest, each of which is simply re-spread and discarded except for the last.

I prefer the map method, which transforms the object without so much waste.

response.UnprocessedItems &&
    Object.fromEntries(
      Object.entries(response.UnprocessedItems).map(
        ([table, requests]: [string, WriteRequest[]]) => [
          table,
          requests.map((request) => ({
            PutRequest: request.PutRequest &&
              request.PutRequest.Item && {
                Item: this.unmarshall(request.PutRequest.Item),
              },
            DeleteRequest: request.DeleteRequest &&
              request.DeleteRequest.Key && {
                Key: this.unmarshall(request.DeleteRequest.Key),
              },
          })),
        ]
      )
    )

As far as complexity, I think that's all the more reason to do it. This code was very hard to get right, at least in Typescript, and it's fully generic. I'd rather that happen here in SDK, with 1000 eyes on it, than ad hoc in 1000 invisible apps.

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

That implementation looks pretty heavy, memory and computation wise. Using an object spread in a reduce.

Results in a copy of the param object for every PutRequest/DeleteRequest, each of which is simply re-spread and discarded except for the last.

I agree. The Object.fromEntries was not used as TypeScript expects it to be available at runtime, and we target ES5 to support old versions of browsers. Discussions in microsoft/TypeScript#32803 (comment)

As far as complexity, I think that's all the more reason to do it.

Yup, the AWS SDK for JavaScript team is committed to providing support for DocumentClient. I'm posting updates on this issue to get feedback while DocumentClient gets implemented.

@kyeotic
Copy link

kyeotic commented Feb 24, 2021

That's fair, though its easy to write a basic polyfill for fromEntries, even a bespoke one for this case. The main point is that reduce+spread is wasteful and should be avoided.

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

The main point is that reduce+spread is wasteful and should be avoided.

It's true that reduce+spread is wasteful, but we'll continue to use it because of the following reasons:

  • The marshalling/unmarshalling is not in the hot path and is called just once per service call.
  • The performance of spread has significantly improved in Chrome 72 (V8 7.2), which is also used in Node.js 12.x
  • The code used by spread is arguably cleaner and less susceptible to bugs as it creates a new object.

Benchmark used https://www.measurethat.net/Benchmarks/Show/11806/0/reduce-vs-map-fromentries-reduce-spread

@mhart
Copy link
Contributor

mhart commented Feb 24, 2021

Echoing @kyeotic's concerns here – we've seen this pattern have a noticeable negative effect in Node.js 12.x Lambda environment. It's used by some GraphQL libraries and was adding hundreds of milliseconds to our cold starts. We had to fork and rewrite those components to use a more imperative approach – it's really not much more code and it had a noticeable effect.

When you say "once per service call" do you mean for every DynamoDB call? If there are thousands of these a second, won't that have an impact?

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

When you say "once per service call" do you mean for every DynamoDB call?

Yes. The hot path of the SDK.

If there are thousands of these a second, won't that have an impact?

It would have an impact, as marshalling/unmarshalling would then be in the hot path of the application.

It's used by some GraphQL libraries and was adding hundreds of milliseconds to our cold starts. We had to fork and rewrite those components to use a more imperative approach – it's really not much more code and it had a noticeable effect.

Is this issue documented somewhere?

I agree that it's not much code, as seen in this experimental commit trivikr@1ad4dff

@mhart
Copy link
Contributor

mhart commented Feb 24, 2021

Just digging up the code now – don't think we created an issue for it.

Here are some of the pieces we had to change:

https://github.com/maticzav/graphql-shield/blob/258e9489589f2551a2f9507ede290fa740a097ac/src/generator.ts#L128-L180

https://github.com/maticzav/graphql-shield/blob/258e9489589f2551a2f9507ede290fa740a097ac/src/generator.ts#L202-L215

https://github.com/maticzav/graphql-shield/blob/258e9489589f2551a2f9507ede290fa740a097ac/src/generator.ts#L257-L273

We just rewrote it to use for loops, settings keys on a single object.

I believe the main issue was the GC cutting in all the time to clean up short-lived objects. We have a fairly big GraphQL schema, so it may be that others haven't seen the same impact we have.

@mhart
Copy link
Contributor

mhart commented Feb 24, 2021

Was a while ago (like, 9-12 mths ish 😊), but I think we used flamegraphs generated by https://github.com/davidmarkclements/0x to track this code down as one of the main offenders

@kyeotic
Copy link

kyeotic commented Feb 24, 2021

It's true that reduce+spread is wasteful, but we'll continue to use it because of the following reasons:

  • The marshalling/unmarshalling is not in the hot path and is called just once per service call.

Uh, what? How is "once-per-service-call" not in the hot path? What could possibly be more hot than "every single call to Dynamo"?

  • The performance of spread has significantly improved in Chrome 72 (V8 7.2), which is also used in Node.js 12.x

Improved, but it is still measurably worse than using a map followed by a single object construction.

  • The code used by spread is arguably cleaner and less susceptible to bugs as it creates a new object.

Map also creates new objects, and does not mutate the original. I don't this this argument holds water.

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

How is "once-per-service-call" not in the hot path? What could possibly be more hot than "every single call to Dynamo"?

I was referring to hot path of the SDK, comparing to calling once per service call to several times per service call.
It would be in the hot path of the application, if it makes several calls like the specific example @mhart mentioned.

Map also creates new objects, and does not mutate the original. I don't this this argument holds water.

This comparison was between reduce without spread vs reduce with spread.

From the output of the benchmark shared, we'll have to use reduce without spread over fromEntries because of the following reasons:

  • reduce without spread is more performant than fromEntries.
  • fromEntries is not available in old browsers.

@trivikr
Copy link
Member

trivikr commented Feb 24, 2021

Update: I experimented with using reduce without spread, but it caused some type conversion issues.
If there are performance concerns from benchmarks, it'll be picked up. It won't affect the API exposed by DocumentClient.

Update: The PR posted on 3/2 at #2097 removes reduce and updates object keys.

@trivikr
Copy link
Member

trivikr commented Mar 2, 2021

The Pull Request for modular version of DynamoDB Document Client is ready for review at #2097.
The full client version would be added by Wed 3/3.

There have been extensive design discussions and proof of concepts shared for Document client implementation over the last month. We expect to merge this code and release Document Client with v3.8.0 either on Thursday 3/4 or Friday 3/5.

Tagging active commenters from this issue for notifying: @kyeotic @cadam11 @mdesousa @stang-tgs @fboudra @mattiLeBlanc @rati-dzidziguri @mobsense @russell-dot-js @mhart @antstanley

@rati-dzidziguri
Copy link

@trivikr, after looking into the PR itself, I can only say it is clean, easy to understand, and follows the overall vision of AWS SDK V3. So thank you and the team for all this work, and I hope to see this out soon.

@stang-tgs
Copy link

@trivikr - awesome that this solution is in place, and elegantly so. That is, having the code programmatically generated and this DocumentClient being even better than the first (better native types support, with Set, BigInt, etc).

The whole process of this back and forth and you and the Amazon team being receptive has been great and proves that Open Source just helps makes things better, faster.

Cheers!

@fboudra
Copy link
Author

fboudra commented Mar 3, 2021

@trivikr the road has been long, a lot of bumps, but at the end, it reached expectations.

@serg06
Copy link

serg06 commented Mar 14, 2021

Update: After the design discussion within the team, a PoC for DocumentClient is posted at #2062

This is a manually written PoC. In the final PR, the operations are going to be code generated and not manually written.

The modular DynamoDBDocumentClient can be created as follows:

  const client = new DynamoDBClient({});
  const ddbDocClient = DynamoDBDocumentClient.from(client);
  await ddbDocClient.send(
    new PutCommand({
      TableName,
      Item: { id: "1", content: "content from DynamoDBDocumentClient" },
    })
  );

The v2 backward-compatible DocumentClient can be created as follows:

  const client = new DynamoDBClient({});
  const ddbDoc = DynamoDBDocument.from(client);
  await ddbDoc.put({
    TableName,
    Item: { id: "2", content: "content from DynamoDBDocument" },
  });

Thank you, this helped A LOT.

Here's an example UpdateCommand and GetCommand to match:

// First do yarn add "@aws-sdk/client-dynamodb"
// Then do yarn add "@aws-sdk/lib-dynamodb"
const ddb = require('@aws-sdk/client-dynamodb')
const {DynamoDBDocumentClient, PutCommand, UpdateCommand, GetCommand} = require("@aws-sdk/lib-dynamodb");

const client = new ddb.DynamoDBClient({region: "us-east-2"});

async function getUserSettings(name) {
    let command = new GetCommand({
        TableName: "userSettings",
        Key: {
            id: name
        }
    });

    const ddbDocClient = DynamoDBDocumentClient.from(client);
    try {
        let tmp = await ddbDocClient.send(command);
        if (tmp.Item === undefined) {
            // Not found
            return tmp.Item;
        }
        return tmp.Item;
    } catch (e) {
        throw e;
    }

    return undefined;
}

async function updateUserSettings(name, settings) {
    let updates = [];
    updateParams = {};

    let validKeys = [
        'enabled_voices',
        'enabled_alerts',
        'max_length'
    ];

    for (let key of validKeys) {
        if (key in settings) {
            updates.push(`${key}=:${key}`)
            updateParams[`:${key}`] = settings[key];
        }
    }

    let key = 'last_updated';
    updates.push(`${key}=:${key}`)
    updateParams[`:${key}`] = new Date().toISOString();

    if (updates.length === 0) {
        return
    }

    let expression = 'set ';
    expression += updates.join(', ');

    let command = new UpdateCommand({
        TableName: "userSettings",
        Key: {
            id: name
        },
        UpdateExpression: expression,
        ExpressionAttributeValues: updateParams
    });

    const ddbDocClient = DynamoDBDocumentClient.from(client);
    try {
        await ddbDocClient.send(command);
    } catch (e) {
        throw e;
    }
}

@serg06
Copy link

serg06 commented Mar 14, 2021

BTW how would you guys suggest validating incoming data before storing/updating it in the DB?

@yaquawa
Copy link

yaquawa commented Mar 26, 2021

I've ended up with creating a lib for the workaround,
plus, super strong type hinting feature(if you are using TypeScript)
I'm happy with this now 🤟🏼🤟🏼

https://github.com/yaquawa/dynamo-crud

@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.