dangreaves/gatsby-adapter-aws

Gatsby adapter for AWS CDK

This Gatsby adapter enables deployments to AWS using a CDK construct.

  • Uploads static assets to S3 with CloudFront
  • Supports Gatsby Functions using Lambda functions
  • Supports Server-side Rendering (SSR) by packaging the SSR engine into either a Lambda function (for small projects) or ECS Fargate (for larger projects)

Contents

  1. Prerequisites
  2. Installation
  3. Adapter
  4. Construct
  5. Asset prefix
  6. Static assets
    1. Size limits
    2. Cache control
  7. Gatsby Functions
    1. Accessing Gatsby Function resources
  8. Server-side Rendering (SSR)
  9. Cache behavior options
  10. Distribution options
    1. Changing CloudFront options
    2. Disabling the cache
    3. Block search indexing with noindex
    4. Send custom headers to origin
    5. Configure a hosted zone
    6. Deploying additional distributions
  11. Contributors

Prerequisites

Your Gatsby version must be 5.12.0 or newer, since adapters were introduced in that release.

Installation

npm install @dangreaves/gatsby-adapter-aws

Adapter

Add the adapter to your gatsby-config file.

import { createAdapter } from "@dangreaves/gatsby-adapter-aws/adapter.js";

/** @type {import('gatsby').GatsbyConfig} */
export default {
  adapter: createAdapter(),
  assetPrefix: "/assets", // See "Asset prefix" section below
};

Construct

Add the GatsbySite construct to your AWS CDK stack.

Set gatsbyDir to the relative path to your Gatsby directory.

import * as cdk from "aws-cdk-lib";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";

import { GatsbySite } from "@dangreaves/gatsby-adapter-aws/cdk.js";

export class GatsbyStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props: cdk.StackProps) {
    super(scope, id, props);

    /**
     * Must be constructed externally and shared between GatsbySite constructs to
     * avoid hitting the "Cache policies per AWS account" quota.
     */
    const cachePolicy = new cloudfront.CachePolicy(this, "CachePolicy", {
      queryStringBehavior: cloudfront.CacheQueryStringBehavior.all(),
    });

    new GatsbySite(this, "GatsbySite", {
      gatsbyDir: "./site",
      distribution: { cachePolicy },
    });
  }
}

Asset prefix

When building Gatsby, you must set the asset prefix to /assets. This is so that CloudFront can determine which requests to send to the S3 origin, regardless of where the default cache behavior points.

You must add assetPrefix to your config file (see above) and specifically enable asset prefixing when building.

gatsby build --prefix-paths
# or
PREFIX_PATHS=true gatsby build

Static assets

Static assets are deployed to an S3 bucket.

During the Gatsby build, the adapter sorts static assets from the public directory into groups according to their MIME type and Cache-Control header.

During the CDK deployment, the construct creates a BucketDeployment for each of these groups. Each deployment zips the local assets, uploads the archive to an asset bucket managed by CDK, and executes a Lambda function which unzips the assets and uploads them to the S3 bucket.
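
As a rough illustration of the grouping step, the sketch below buckets assets by MIME type and Cache-Control value. It is not the adapter's actual code: the StaticAsset shape and the groupAssets function are hypothetical names chosen for this example.

```typescript
// Hypothetical shape for a static asset found in the public directory.
interface StaticAsset {
  path: string; // e.g. "/app-123.js"
  mimeType: string; // e.g. "text/javascript"
  cacheControl: string; // e.g. "public, max-age=31536000, immutable"
}

// Group assets so every group shares one MIME type and one Cache-Control value.
function groupAssets(assets: StaticAsset[]): Map<string, StaticAsset[]> {
  const groups = new Map<string, StaticAsset[]>();
  for (const asset of assets) {
    // The composite key combines both attributes used for grouping.
    const key = `${asset.mimeType}|${asset.cacheControl}`;
    const group = groups.get(key) ?? [];
    group.push(asset);
    groups.set(key, group);
  }
  return groups;
}

const exampleGroups = groupAssets([
  { path: "/app-123.js", mimeType: "text/javascript", cacheControl: "public, max-age=31536000, immutable" },
  { path: "/framework-456.js", mimeType: "text/javascript", cacheControl: "public, max-age=31536000, immutable" },
  { path: "/page-data/app-data.json", mimeType: "application/json", cacheControl: "public, max-age=0, must-revalidate" },
]);

// Two groups: one for the hashed JS files, one for the JSON file.
console.log(exampleGroups.size);
```

Each resulting group could then map to one BucketDeployment, since all files in a group share the same content type and caching rules.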

Size limits

If your Gatsby site generates a large number of files, the Lambda function which copies them to S3 may run out of resources (see Size limits in the AWS docs).

The error you see depends on which resource runs out.

  • If the Lambda function runs out of memory, you may see a SIGKILL or function timeout error.
  • If the Lambda function runs out of ephemeral storage, you may see a "No space left on device" error.

If you see these errors, use the bucketDeploymentOptions option to increase the resources.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  bucketDeploymentOptions: {
    memoryLimit: 2048,
    ephemeralStorageSize: cdk.Size.gibibytes(5),
  },
});

Cache control

The Cache-Control header is set for each asset when uploading to S3.

This header determines how the asset is cached in CloudFront and in the browser.

This adapter maintains a default set of rules, shown below.

{
  "/*.js": "IMMUTABLE",
  "/*.js.map": "IMMUTABLE",
  "/*.css": "IMMUTABLE",
  "/page-data/app-data.json": "NO_CACHE",
  "/~partytown/**": "NO_CACHE",
}

The key for each rule is a glob pattern (uses minimatch) and the value can be one of the following values.

  • IMMUTABLE - Asset will be cached forever in both the CDN and browser (public, max-age=31536000, immutable). Use this for assets which will never change, for example if they have a hash in their filename. Gatsby automatically hashes JS and CSS files generated by the framework.
  • NO_CACHE - Serve from the CDN if possible, but always revalidate that it's the latest version first (public, max-age=0, must-revalidate). This is most useful for assets which could change on each deploy.
  • String - Use a custom Cache-Control header. If you include an s-maxage directive, it affects the CDN only, which lets you cache the asset in the CDN while preventing the browser from reusing it without revalidation (for example, with max-age=0).

If the asset does not match any of the glob patterns, the default value provided by Gatsby will be used.
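
The mapping from rule value to header can be sketched as follows. The function name toCacheControlHeader is hypothetical and this is not the adapter's own code; the header strings are the ones listed above.

```typescript
// Expand a rule value into the Cache-Control header it stands for.
// "IMMUTABLE" and "NO_CACHE" are shorthands; any other string is used verbatim.
function toCacheControlHeader(rule: string): string {
  switch (rule) {
    case "IMMUTABLE":
      return "public, max-age=31536000, immutable";
    case "NO_CACHE":
      return "public, max-age=0, must-revalidate";
    default:
      return rule;
  }
}

console.log(toCacheControlHeader("IMMUTABLE"));
console.log(toCacheControlHeader("public, max-age=0, s-maxage=600, must-revalidate"));
```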

You may set your own values using the cacheControl option on the adapter (these values will be merged with the default patterns).

import { createAdapter } from "@dangreaves/gatsby-adapter-aws/adapter.js";

/** @type {import('gatsby').GatsbyConfig} */
export default {
  adapter: createAdapter({
    cacheControl: {
      "/data.json": "NO_CACHE",
      "/images/*.png": "IMMUTABLE",
      "/custom.txt": "public, max-age=0, s-maxage=600, must-revalidate",
    },
  }),
};
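
A minimal sketch of how the merge might behave, assuming a plain object spread in which your rules win when the same pattern appears in both sets. That precedence is an assumption rather than documented behavior, and mergeCacheControl is a hypothetical name.

```typescript
type CacheControlRules = Record<string, string>;

// Default rules as listed in this README.
const defaultRules: CacheControlRules = {
  "/*.js": "IMMUTABLE",
  "/*.js.map": "IMMUTABLE",
  "/*.css": "IMMUTABLE",
  "/page-data/app-data.json": "NO_CACHE",
  "/~partytown/**": "NO_CACHE",
};

// Assumption: custom rules are spread last, so they override matching defaults.
function mergeCacheControl(
  defaults: CacheControlRules,
  custom: CacheControlRules,
): CacheControlRules {
  return { ...defaults, ...custom };
}

const mergedRules = mergeCacheControl(defaultRules, {
  "/data.json": "NO_CACHE",
  "/images/*.png": "IMMUTABLE",
  "/custom.txt": "public, max-age=0, s-maxage=600, must-revalidate",
});

// The merged set contains the five defaults plus the three custom rules.
console.log(Object.keys(mergedRules).length);
```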

Gatsby Functions

If you include a Gatsby Function in your site, this adapter will package it up and deploy it to AWS as a Lambda function.

You can modify attributes of the function using the gatsbyFunctionOptions option. This is a callback which receives the Gatsby function definition and returns a set of options.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  gatsbyFunctionOptions: (fn) => {
    if ("/api/intensive" === fn.name) {
      return {
        target: "LAMBDA",
        memorySize: 1024, // Increase memory to 1 GB (1024 MB)
        functionOptions: {
          environment: {
            foo: "bar",
          },
        },
      };
    }

    return {
      target: "LAMBDA",
    };
  },
});

This adapter also supports deploying the function to AWS Fargate, which involves packaging the function as a Docker image and deploying it to a continuously running Elastic Container Service task. This is useful for functions which have high resource requirements or need to respond very quickly. If your function handles a very high volume of requests, it is also often cheaper to run it as a container than as a Lambda function.

If you choose the FARGATE target for one or more functions, you must also provide a cluster.

import * as ecs from "aws-cdk-lib/aws-ecs";

const cluster = new ecs.Cluster(this, "Cluster", { vpc });

new GatsbySite(this, "GatsbySite", {
  cluster,
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  gatsbyFunctionOptions: (fn) => {
    if ("/api/intensive" === fn.name) {
      return {
        target: "FARGATE",
      };
    }

    return {
      target: "LAMBDA",
    };
  },
});
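
If many functions need different settings, a lookup table can keep the callback short. This is a sketch only: the types below are simplified stand-ins for the adapter's own, resolveFunctionOptions is a hypothetical helper, and /api/report is a made-up function name.

```typescript
// Simplified stand-ins for the adapter's types (assumptions for this example).
type FunctionTarget = "LAMBDA" | "FARGATE";

interface FunctionOptionsSketch {
  target: FunctionTarget;
  memorySize?: number;
}

interface FunctionDefinitionSketch {
  name: string; // e.g. "/api/intensive"
}

// Per-function overrides, keyed by function name.
const functionOverrides: Record<string, FunctionOptionsSketch> = {
  "/api/intensive": { target: "FARGATE" },
  "/api/report": { target: "LAMBDA", memorySize: 1024 },
};

// Return the override for a function, falling back to a plain Lambda target.
function resolveFunctionOptions(
  fn: FunctionDefinitionSketch,
): FunctionOptionsSketch {
  const match: FunctionOptionsSketch | undefined = functionOverrides[fn.name];
  return match ?? { target: "LAMBDA" };
}

console.log(resolveFunctionOptions({ name: "/api/intensive" }).target);
console.log(resolveFunctionOptions({ name: "/api/other" }).target);
```

A function like resolveFunctionOptions could then be passed as the gatsbyFunctionOptions value in place of the inline callback.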

Accessing Gatsby Function resources

You can access the underlying function resources using the gatsbyFunctions property on the GatsbySite construct.

An example of this would be to grant read access to a Secrets Manager secret.

import * as secretsmanager from "aws-cdk-lib/aws-secretsmanager";

// Resolve secret by name.
const secret = secretsmanager.Secret.fromSecretNameV2(
  this,
  "AcmeSecret",
  "acme-token",
);

// Create the Gatsby site.
const gatsbySite = new GatsbySite(this, "GatsbySite", {
  cluster,
  gatsbyDir,
  distribution: { cachePolicy },
  ssrOptions: ssr ? { target: ssr } : undefined,
});

// Allow Lambda functions to read secret.
for (const gatsbyFunction of gatsbySite.gatsbyFunctions) {
  if ("LAMBDA" !== gatsbyFunction.target) continue;
  secret.grantRead(gatsbyFunction.lambdaFunction);
}

Server-side Rendering (SSR)

If your Gatsby site includes a getServerData export on any of the pages, then Gatsby will export an "SSR engine" function for deployment (see Using Server-side Rendering). This function is responsible for rendering the data for your SSR pages, both in HTML format (for document requests) and in JSON format (for page-data requests).

This adapter treats the SSR engine just like a Gatsby Function. You can use AWS Lambda or AWS Fargate to process the requests. If you have a small site, then Lambda (the default) should be enough, but if you have a large site (and thus a large SSR function), you may want to use AWS Fargate.

The function is connected to the "default" cache behavior in CloudFront, so all requests will go to the SSR handler, unless they match another behavior.

Configure the SSR engine using ssrOptions, which takes the same input as the Gatsby Functions documented above.

For example, if you wanted to deploy the SSR engine to Fargate, do this.

new GatsbySite(this, "GatsbySite", {
  cluster, // ECS cluster; assumed to be required here, as it is for FARGATE functions
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  ssrOptions: {
    target: "FARGATE",
  },
});

If you wanted to deploy to Lambda, but increase the memory limit, do this.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  ssrOptions: {
    target: "LAMBDA",
    memorySize: 512,
  },
});

If your Gatsby site is generating an SSR function but you don't want to use it, you can explicitly disable the SSR function, which will make the default cache behavior point to S3 instead.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: { cachePolicy },
  ssrOptions: {
    target: "DISABLED",
  },
});

Cache behavior options

CloudFront uses cache behaviors to decide which origin each request is sent to, based on a URL path pattern.

This adapter deals with wiring up the various cache behaviors to send requests to the S3 bucket (for static assets), Lambda and Elastic Container Service (for Gatsby Functions and/or SSR).

There are four types of cache behavior.

  • default - The cache behavior which most requests will hit. For static sites, this will use S3 as the origin, and for SSR sites, this will use the SSR handler as the origin.
  • page-data - Special cache behavior for page-data requests, which technically look like assets, but actually get routed to the SSR handler. This behavior is not used for static deployments.
  • assets - The cache behavior associated with static assets. This uses the /assets prefix, and always points to S3 as the origin.
  • functions - Individual cache behaviors created for each function. These use the function name as the path (e.g. /api/foo) and point to either Lambda or Fargate as the origin.
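
The routing described by these four behaviors can be sketched as a simple path matcher. This is an illustration, not how CloudFront or the adapter is implemented: the matchBehavior function, the /page-data/ path pattern, and the order in which the checks run are all assumptions.

```typescript
type BehaviorName = "assets" | "functions" | "page-data" | "default";

// Pick the cache behavior a request path would fall under.
// functionPaths holds the deployed function names, e.g. ["/api/foo"].
function matchBehavior(
  path: string,
  functionPaths: string[],
  ssrEnabled: boolean,
): BehaviorName {
  // Static assets live under the /assets prefix and always go to S3.
  if (path === "/assets" || path.startsWith("/assets/")) return "assets";
  // Each function gets its own behavior, keyed by its exact path.
  if (functionPaths.includes(path)) return "functions";
  // Assumed pattern: page-data requests only get a dedicated behavior for SSR sites.
  if (ssrEnabled && path.startsWith("/page-data/")) return "page-data";
  // Everything else falls through to the default behavior.
  return "default";
}

console.log(matchBehavior("/assets/app-123.js", ["/api/foo"], true));
console.log(matchBehavior("/api/foo", ["/api/foo"], true));
console.log(matchBehavior("/about", ["/api/foo"], true));
```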

Each cache behavior has a set of options associated with it, which you can control using cacheBehaviorOptions.

An example use of this option is to attach a Lambda@Edge function to the default cache behavior.

import * as lambda from "aws-cdk-lib/aws-lambda";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";

import { TypeScriptCode } from "@mrgrain/cdk-esbuild";

const originResponseFunction = new cloudfront.experimental.EdgeFunction(
  this,
  "OriginResponseFunction",
  {
    runtime: lambda.Runtime.NODEJS_18_X,
    handler: "cloudfront-origin-response.handler",
    code: new TypeScriptCode("functions.aws/src/cloudfront-origin-response.ts"),
  },
);

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    cacheBehaviorOptions: {
      default: {
        edgeLambdas: [
          {
            functionVersion: originResponseFunction.currentVersion,
            eventType: cloudfront.LambdaEdgeEventType.ORIGIN_RESPONSE,
          },
        ],
      },
    },
  },
});

Distribution options

If you want to change options for the CloudFront distribution itself, use the distribution option.

Changing CloudFront options

Use distributionOptions to set options on the underlying CloudFront distribution, such as a certificate and custom domain names.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    distributionOptions: {
      certificate,
      domainNames: ["example.com"],
    },
  },
});

Disabling the cache

To disable the cache entirely, set the cache policy to CACHING_DISABLED and set a ResponseHeadersPolicy which sends the Cache-Control header with the value no-store. This prevents both CloudFront and your users' browsers from caching the responses, so every request will hit the origin.

import * as cloudfront from "aws-cdk-lib/aws-cloudfront";

const responseHeadersPolicy = new cloudfront.ResponseHeadersPolicy(
  this,
  "ResponseHeadersPolicy",
  {
    customHeadersBehavior: {
      customHeaders: [
        { header: "Cache-Control", value: "no-store", override: true },
      ],
    },
  },
);

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cacheBehaviorOptions: {
      default: { responseHeadersPolicy },
      assets: { responseHeadersPolicy },
      functions: { responseHeadersPolicy },
    },
    cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
  },
});

Block search indexing with noindex

Search indexing can be blocked for the entire distribution by appending an X-Robots-Tag: noindex header to all responses.

See developers.google.com/search/docs/crawling-indexing/block-indexing for more information on how this works.

const responseHeadersPolicy = new cloudfront.ResponseHeadersPolicy(
  this,
  "ResponseHeadersPolicy",
  {
    customHeadersBehavior: {
      customHeaders: [
        { header: "X-Robots-Tag", value: "noindex", override: true },
      ],
    },
  },
);

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    cacheBehaviorOptions: {
      default: { responseHeadersPolicy },
      assets: { responseHeadersPolicy },
      functions: { responseHeadersPolicy },
    },
  },
});

Send custom headers to origin

To send a custom header to your origins, use the originCustomHeaders option. This is useful if you need to identify from your functions which distribution sent the request.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    originCustomHeaders: {
      "x-gatsby-preview": "true",
    },
  },
});
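
On the receiving side, a function could check for the header along these lines. This is a sketch assuming Node-style request headers, where header names arrive lowercased; isPreviewRequest and RequestHeaders are hypothetical names, not part of the adapter.

```typescript
// Node-style headers: values may be a string, a list of strings, or absent.
type RequestHeaders = Record<string, string | string[] | undefined>;

// Return true when the request came through a distribution which sets
// the x-gatsby-preview origin header to "true".
function isPreviewRequest(headers: RequestHeaders): boolean {
  const value = headers["x-gatsby-preview"];
  const first = Array.isArray(value) ? value[0] : value;
  return first === "true";
}

console.log(isPreviewRequest({ "x-gatsby-preview": "true" }));
console.log(isPreviewRequest({}));
```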

Configure a hosted zone

To create a Route53 zone with an apex record which points at the distribution, use the hostedZone option.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    distributionOptions: {
      certificate,
      domainNames: ["example.com"],
    },
    hostedZone: {
      domainName: "example.com",
    },
  },
});

Deploying additional distributions

You may deploy multiple distributions for the same Gatsby site. The underlying constructs, such as Lambda functions, will only be deployed once, and each distribution will point to the same resources. This is useful if you need to control distribution options, like cache settings, individually.

For example, your default distribution may use the default cache headers, and thus have SSR responses cached for a period of time. However, you might want a "preview" distribution which allows content editors to always see fresh content, without waiting for the cache to clear.

new GatsbySite(this, "GatsbySite", {
  gatsbyDir: "./site",
  distribution: {
    cachePolicy,
    distributionOptions: {
      certificate: mainCert,
      domainNames: ["example.com"],
    },
    hostedZone: {
      domainName: "example.com",
    },
  },
  additionalDistributions: {
    preview: {
      cachePolicy,
      distributionOptions: {
        certificate: previewCert,
        domainNames: ["preview.example.com"],
      },
      hostedZone: {
        domainName: "preview.example.com",
      },
      originCustomHeaders: {
        "x-gatsby-preview": "true",
      },
    },
  },
});

Contributors

Dan Greaves 💻
Sumesh Suvarna 💻

This project follows the all-contributors specification (emoji key). Contributions of any kind welcome!