@jeffbski-rga/aws-s3-multipart-copy

Fork of aws-s3-multipart-copy to fix issues with broken dependencies from snyk

Also includes code from https://github.com/spencer-jacobs/aws-s3-multipart-copy which switched to using AWS SDK V3.

Wraps aws-sdk with a multipart-copy manager, in order to provide an easy way to copy large objects from one bucket to another in aws-s3. The module manages the copy parts order and bytes range according to the size of the object and the desired copy part size. It speeds up the multipart copying process by sending multiple copy-part requests simultaneously.

This fork allows you to provide the exact AWS SDK V3 version that you want to use in the createDeps function rather than requiring this library to continuously be updated. The output deps awsClientDeps are then provided to the CopyMultipart constructor. This structure also makes it easy to mock out awsClientDeps for testing.
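For illustration, the part ordering and byte-range splitting described above can be sketched as follows. This is a hypothetical helper for explanation only, not the library's internal code:

```javascript
// Hypothetical sketch of how byte ranges for copy parts can be derived
// from an object size and a desired part size. For illustration only;
// this is not the library's actual internal implementation.
function computePartRanges(objectSize, partSize) {
    const ranges = [];
    for (let start = 0; start < objectSize; start += partSize) {
        // S3's CopySourceRange is inclusive: "bytes=start-end"
        const end = Math.min(start + partSize, objectSize) - 1;
        ranges.push(`bytes=${start}-${end}`);
    }
    return ranges;
}

// A 70MB object with 50MB parts yields one full part and one smaller remainder part.
console.log(computePartRanges(70000000, 50000000));
// prints: [ 'bytes=0-49999999', 'bytes=50000000-69999999' ]
```

Each range becomes one UploadPartCopy request, and the library issues several of these requests concurrently.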

** The package supports aws-sdk version '2006-03-01' and above.

** The package supports Node.js version 8 and above.


Installing

npm install @jeffbski-rga/aws-s3-multipart-copy

Getting Started

See the docs below for full details and parameter options. This is a quick overview of how to use the library.

Common JS

const { createDeps, CopyMultipart } = require("@jeffbski-rga/aws-s3-multipart-copy");
const {
    S3Client,
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
} = require("@aws-sdk/client-s3");

const awsClientS3Commands = {
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
};
const awsClientDeps = createDeps(awsClientS3Commands);

const s3Client = new S3Client({ region: 'us-east-1' });
const params = { /* copyObjectMultipartParams here */ };
const copyMultipart = new CopyMultipart({ s3Client, awsClientDeps, params });
copyMultipart.done()
  .then((result) => {
    console.log(result);
  })
  .catch((err) => {
    // handle error
  });

ESM

import { createDeps, CopyMultipart } from "@jeffbski-rga/aws-s3-multipart-copy";
import {
    S3Client,
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
} from "@aws-sdk/client-s3";

const awsClientS3Commands = {
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
};
const awsClientDeps = createDeps(awsClientS3Commands);

const s3Client = new S3Client({ region: 'us-east-1' });
const params = { /* copyObjectMultipartParams here */ };
const copyMultipart = new CopyMultipart({ s3Client, awsClientDeps, params });
await copyMultipart.done();

createDeps

aws-s3-multipart-copy is built on the aws-sdk, so createDeps creates a flattened dependency object of async functions that perform the S3 commands.

If resilience is desired, these S3 functions can be wrapped to retry on certain types of errors.

const {
    S3Client,
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
} = require("@aws-sdk/client-s3");

const awsClientS3Commands = {
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand
};

const awsClientDeps = createDeps(awsClientS3Commands);
/*
{
  s3CreateMultipartUpload: (s: s3Client, p: CreateMultipartUploadCommandInput, h: HttpHandlerOptions) => Promise<CreateMultipartUploadCommandOutput>;
  s3UploadPartCopy: (s: s3Client, p: UploadPartCopyCommandInput, h: HttpHandlerOptions) => Promise<UploadPartCopyCommandOutput>;
  s3AbortMultipartUpload: (s: s3Client, p: AbortMultipartUploadCommandInput, h: HttpHandlerOptions) => Promise<AbortMultipartUploadCommandOutput>;
  s3ListParts: (s: s3Client, p: ListPartsCommandInput) => Promise<ListPartsCommandOutput>;
  s3CompleteMultipartUpload: (s: s3Client, p: CompleteMultipartUploadCommandInput, h: HttpHandlerOptions) => Promise<CompleteMultipartUploadCommandOutput>;
}
*/
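As a sketch of the retry idea mentioned above, any of the createDeps functions can be wrapped before being handed to CopyMultipart. The `withRetry` helper and the error predicate below are assumptions for illustration, not part of the library:

```javascript
// Hypothetical retry wrapper for one of the createDeps functions.
// withRetry and the shouldRetry predicate are illustrations, not library API.
function withRetry(fn, { attempts = 3, shouldRetry = () => true } = {}) {
    return async (...args) => {
        let lastErr;
        for (let i = 0; i < attempts; i++) {
            try {
                return await fn(...args);
            } catch (err) {
                lastErr = err;
                if (!shouldRetry(err)) throw err;
            }
        }
        throw lastErr;
    };
}

// Usage sketch (assumes awsClientDeps from createDeps is in scope):
// const resilientDeps = {
//     ...awsClientDeps,
//     s3UploadPartCopy: withRetry(awsClientDeps.s3UploadPartCopy, {
//         attempts: 3,
//         // assumption: only retry throttling-style errors
//         shouldRetry: (err) => err.name === 'SlowDown'
//     })
// };
```

The wrapped object keeps the same shape as the createDeps output, so it can be passed to the CopyMultipart constructor unchanged.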

Example

const { createDeps, CopyMultipart } = require('@jeffbski-rga/aws-s3-multipart-copy');

const {
    S3Client,
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand,
} = require("@aws-sdk/client-s3");

const awsClientS3Commands = {
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand,
};
const awsClientDeps = createDeps(awsClientS3Commands);

CopyMultipart

Create a new instance of the CopyMultipart class to prepare for use.

** Object size for multipart copy must be at least 5MB.

The constructor receives a single options object with the properties described below.

Request parameters

  • s3Client: Object (mandatory) : S3Client - an S3Client instance
  • awsClientDeps: Object (mandatory) : CreateDepsOutput - deps created by the createDeps call; these are the functions that perform the side-effecting S3 commands and logging
  • params: Object (mandatory) : CopyObjectMultipartOptions - keys inside this object must be as specified below
    • source_bucket: String (mandatory) - The bucket that holds the object you wish to copy
    • object_key: String (mandatory) - The full path (including the name of the object) to the object you wish to copy
    • destination_bucket: String (mandatory) - The bucket that you wish to copy to
    • copied_object_name: String (mandatory) - The full path (including the name of the object) for the copied object in the destination bucket
    • object_size: Integer (mandatory) - The size of the object you wish to copy, in bytes
    • copy_part_size_bytes: Integer (optional) - The size of each copy part in the process; if not passed, it defaults to 50MB. This value must be between 5MB and 5GB. ** If the object size does not divide evenly by the desired part size, the last part will be smaller or larger (depending on the remainder)
    • copied_object_permissions: String (optional) - The permissions to be given for the copied object, as specified in the aws s3 ACL docs; if not passed, it defaults to 'private'
    • expiration_period: Integer/Date (optional) - A number (milliseconds) or Date indicating how long the copied object will remain in the destination before it is deleted; if not passed, the object has no expiration period
    • content_type: String (optional) - A standard MIME type describing the format of the object data
    • metadata: Object (optional) - A map of metadata to store with the object in S3
    • cache_control: String (optional) - Specifies caching behavior along the request/reply chain
    • storage_class: String (optional) - Specifies the storage class for the copied object. The valid values are specified in the aws s3 docs. When unset, the class will be 'STANDARD'
  • logger: Object (optional) - A logger with info and error methods; defaults to a null logger (no logging)
  • requestContext: String (optional) - Logged in every log message; if not passed, it remains undefined
  • abortController: AbortController (optional) - An AbortController instance which can be used to abort a copy
  • maxConcurrentParts: Integer (optional) - Controls how many concurrent part copies are used; defaults to 4
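The abortController option relies on the standard AbortController/AbortSignal mechanism. The sketch below demonstrates that mechanism on its own, independent of this library; the `copyState` variable is just a stand-in for in-flight copy work:

```javascript
// Minimal sketch of the standard AbortController pattern that the
// abortController option is built on. This demonstrates the mechanism only;
// it does not call the library.
const controller = new AbortController();
const { signal } = controller;

// A long-running copy would check signal.aborted between parts, or listen
// for the 'abort' event to stop in-flight work.
let copyState = 'running';
signal.addEventListener('abort', () => {
    copyState = 'aborted';
});

// Aborting from elsewhere (a timeout, a user action, ...) flips the signal.
controller.abort();
console.log(signal.aborted, copyState); // prints: true aborted
```

With this library, you would instead pass the controller as the abortController option to the CopyMultipart constructor and call controller.abort() to cancel the copy.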

Response

  • A successful result might hold any of the following keys as specified in aws s3 completeMultipartUpload docs

    • Location — (String)
    • Bucket — (String)
    • Key — (String)
    • Expiration — (String) If the object expiration is configured, this will contain the expiration date (expiry-date) and rule ID (rule-id). The value of rule-id is URL encoded.
    • ETag — (String) Entity tag of the object.
    • ServerSideEncryption — (String) The Server-side encryption algorithm used when storing this object in S3 (e.g., AES256, aws:kms). Possible values include:
      • "AES256"
      • "aws:kms"
    • VersionId — (String) Version of the object.
    • SSEKMSKeyId — (String) If present, specifies the ID of the AWS Key Management Service (KMS) master encryption key that was used for the object.
    • RequestCharged — (String) If present, indicates that the requester was successfully charged for the request. Possible values include:
      • "requester"
  • In case a multipart copy fails, three scenarios are possible:

    • The copy is aborted and the copy parts are deleted from s3 - the promise returned by done() rejects
    • The abort procedure passes but the copy parts are not deleted from s3 - the promise rejects
    • The abort procedure fails and the copy parts remain in s3 - the promise rejects
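A caller can distinguish these outcomes by inspecting the rejection. The helper below is a hypothetical illustration; the message strings it matches are the ones shown in the error examples later in this README:

```javascript
// Hypothetical helper that classifies a rejection from done() using the
// error messages documented in this README. The helper itself is an
// illustration, not part of the library's API.
function classifyCopyFailure(err) {
    if (err && err.message === 'multipart copy aborted') {
        return 'aborted-cleanly';        // abort succeeded, parts removed from s3
    }
    if (err && err.message === 'Abort procedure passed but copy parts were not removed') {
        return 'aborted-with-leftovers'; // orphaned parts may need manual cleanup
    }
    return 'abort-failed';               // abort itself failed; parts remain in s3
}

console.log(classifyCopyFailure({ message: 'multipart copy aborted' }));
// prints: aborted-cleanly
```

The 'aborted-with-leftovers' case is the one worth alerting on, since orphaned parts continue to incur storage costs until removed.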

Example

Positive

const bunyan = require('bunyan');
const { createDeps, CopyMultipart } = require("@jeffbski-rga/aws-s3-multipart-copy");
const {
    S3Client,
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand,
} = require("@aws-sdk/client-s3");

const awsClientS3 = {
    CreateMultipartUploadCommand,
    AbortMultipartUploadCommand,
    CompleteMultipartUploadCommand,
    UploadPartCopyCommand,
    ListPartsCommand,
};
const s3ClientConfig = {};
const s3Client = new S3Client(s3ClientConfig);
const awsClientDeps = createDeps(awsClientS3);

const logger = bunyan.createLogger({
    name: 'copy-object-multipart',
    level: 'info',
    version: '1.0.0',
    logType: 'copy-object-multipart-log',
    serializers: { err: bunyan.stdSerializers.err }
});

const requestContext = "request_context";
const params = {
  source_bucket: "source_bucket",
  object_key: "object_key",
  destination_bucket: "destination_bucket",
  copied_object_name: "someLogicFolder/copied_object_name",
  object_size: 70000000,
  copy_part_size_bytes: 50000000,
  copied_object_permissions: "bucket-owner-full-control",
  expiration_period: 100000,
  storage_class: 'STANDARD'
};

const copyMultipart = new CopyMultipart({ s3Client, awsClientDeps, params, logger, requestContext });
copyMultipart.done()
  .then((result) => {
    console.log(result);
  })
  .catch((err) => {
    // handle error
  });

/* Response:
result = {
    Bucket: "acexamplebucket",
    ETag: "\"4d9031c7644d8081c2829f4ea23c55f7-2\"",
    Expiration: 100000,
    Key: "bigobject",
    Location: "https://examplebucket.s3.amazonaws.com/bigobject"
}
*/

Examples of error messages

Negative 1 - abort action passed but copy parts were not removed

/*
err = {
    message: 'Abort procedure passed but copy parts were not removed',
    details: {
        Parts: ['part 1', 'part 2']
    }
}
*/

Negative 2 - abort action succeeded

/*
err = {
    message: 'multipart copy aborted',
    details: {
        Bucket: destination_bucket,
        Key: copied_object_name,
        UploadId: upload_id
    }
}
*/
