Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow dsg/ssr renders without access to datastore if it's not required #38974

Merged
merged 13 commits into from
May 17, 2024

Conversation

pieh
Copy link
Contributor

@pieh pieh commented May 9, 2024

Description

This makes datastore usage optional for SSR / DSG, so that pages that don't contain graphql queries (mostly make sense for SSR) can be rendered without access to datastore:

  1. This inlines pages manifest into page-ssr module so we can resolve a page without access to initialized datastore (we currently rely on datastore to find SitePage node in it which meant that all requests needed it)
  2. It still tries to initialize query-engine eagerly, but we are not blocking request handling on it overall - we only will await on that if page actually has query to run against it

It also contains couple of bug fixes:

  • preserving same path for datastore when it's excluded between build when build cache is restored through adapter (we were figuring path before cache was restored and hence it was freshly generated every time)
  • add a fallback for datastore download in case url provided at build time end up not working (which is the case currently with deploying through netlify-cli currently)

Tests

Adjusted one of adapter test fixture variants to exclude datastore from function bundle so those code paths are now actually e2e tested (fixture does contain SSR pages with and without graphql query)

Related Issues

FRA-409

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label May 9, 2024
@pieh pieh removed the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label May 9, 2024
Comment on lines +267 to +294
const state = store.getState()
const buildActivityTimer = report.activityTimer(
`Building Rendering Engines`,
{ parentSpan: buildSpan }
)
try {
buildActivityTimer.start()
// bundle graphql-engine
engineBundlingPromises.push(
createGraphqlEngineBundle(program.directory, report, program.verbose)
)

engineBundlingPromises.push(
createPageSSRBundle({
rootDir: program.directory,
components: state.components,
staticQueriesByTemplate: state.staticQueriesByTemplate,
webpackCompilationHash: webpackCompilationHash as string, // we set webpackCompilationHash above
reporter: report,
isVerbose: program.verbose,
})
)
await Promise.all(engineBundlingPromises)
} catch (err) {
reporter.panic(err)
} finally {
buildActivityTimer.end()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved engine bundling a bit later to a point where we know already which pages would actually be DSG or SSR (need to be after preparePageTemplateConfigs call)

Comment on lines 198 to 201
/**
* @deprecated use findPageByPath exported from page-ssr module instead
*/
public findPageByPath(pathName: string): IGatsbyPage | undefined {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kept for backward-compat, we will not use this anymore when deployed but to not break various integrations this is kept for now

@@ -32,7 +32,7 @@ function getCDNObfuscatedPath(path: string): string {
return `${store.getState().status.cdnObfuscatedPrefix}-${path}`
}

export const LmdbOnCdnPath = getCDNObfuscatedPath(`data.mdb`)
export const getLmdbOnCdnPath = (): string => getCDNObfuscatedPath(`data.mdb`)
Copy link
Contributor Author

@pieh pieh May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making it lazy here, as getCDNObfuscatedPath is potentially using some values from cache that might not have been restored yet at import time (adapters restore it later)

that was causing a bug where generated path was different on each build while it should be the same to avoid publishing multiple versions of db to cdn

@@ -11,7 +11,9 @@ function maybeDropNamedPartOfWildcard(
return path.replace(/\*.+$/, `*`)
}

export function getRoutePathFromPage(page: IGatsbyPage): string {
export function getRoutePathFromPage(
page: Pick<IGatsbyPage, "path" | "matchPath">
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this (and other places I applied Pick<> to IGatsbyPage) was mainly to figure out minimal subset of IGatsbyPage fields that is actually needed for engines to work.

After everything was narrowed down this allowed me to produce final subset of fields to be inlined in engine ( https://github.com/gatsbyjs/gatsby/pull/38974/files#diff-6e15aa2b28579972ae31b662310cd05809325fe178e51e48e4a89bbbc3b34134R45-R54 )

Comment on lines -77 to +105
pathName,
graphqlEngine,
req,
spanContext,
telemetryResolverTimings,
}: {
graphqlEngine: GraphQLEngine
interface IGetDataBaseArgs {
pathName: string
req?: Partial<Pick<Request, "query" | "method" | "url" | "headers">>
spanContext?: Span | SpanContext
telemetryResolverTimings?: Array<IGraphQLTelemetryRecord>
}): Promise<ISSRData> {
}

interface IGetDataEagerEngineArgs extends IGetDataBaseArgs {
graphqlEngine: GraphQLEngine
}

interface IGetDataLazyEngineArgs extends IGetDataBaseArgs {
getGraphqlEngine: () => Promise<GraphQLEngine>
}

type IGetDataArgs = IGetDataEagerEngineArgs | IGetDataLazyEngineArgs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getData is public so had to preserve this function still continue to work with graphqlEngine being passed in as well as new way of allowing getting it lazily via getGraphqlEngine

function getGraphqlEngine(
req?: GatsbyFunctionRequest
): Promise<GraphQLEngineType> {
const origin = req?.rawUrl ? new URL(req.rawUrl).origin : cdnDatastoreOrigin
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

origin tracking here is to allow "retrying" to download datastore in case URL we pass from build wouldn't work (this is the case for CLI deploys which we use in our e2e tests, where that URL is wrong).

We still attempt to start downloading datastore at function init time as this is best perf/timing wise, but if that wouldn't work we will retry with URLs derived from "first request" url.

This is both useful "bug fix" but also requierment to test changes here in our e2e as otherwise we couldn't test case of excluding datastore from functions (one of test variants now exclude datastore)

@pieh pieh changed the title [wip] Optional lmdb in engines feat: allow dsg/ssr renders without access to datastore if it's not required May 14, 2024
@pieh pieh marked this pull request as ready for review May 14, 2024 14:34
mrstork
mrstork previously approved these changes May 15, 2024
Copy link

@mrstork mrstork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional:

  • script "prefix-never-exclude-datastore" to perhaps a name specifying its function
  • renaming some functions for readability (e.g. getGraphqlEngine, getGraphqlEngineInner, findPageByPath, findPageByPathInner)

@pieh pieh merged commit 884ecaf into master May 17, 2024
34 of 35 checks passed
@pieh pieh deleted the optional-lmdb-in-engines branch May 17, 2024 08:04
pieh added a commit that referenced this pull request May 17, 2024
…equired (#38974)

* fix: generate lmdb cdn path after state was reloaded from cache restoration

* feat: make lmdb datastore optional and lazy in engines

* use datastore exlucde and skip deploy cleanup temporarily

* fix lint

* build engines after page configs were prepared

* tmp: workaround local deploy problems with datastore exclussion

* set rawUrl on request and use it to extract origin for engine download in case url passed at build doesn't work

* no unhandled promise rejections, actually pass rawUrl

* make one of e2e variants use datastore exclussion, so base case is still tested

* revert adapters e2e test cleanup

* scripts rename

* findPageByPath renaming

* getGraphqlEngine renaming

(cherry picked from commit 884ecaf)
pieh added a commit that referenced this pull request May 17, 2024
…equired (#38974)

* fix: generate lmdb cdn path after state was reloaded from cache restoration

* feat: make lmdb datastore optional and lazy in engines

* use datastore exlucde and skip deploy cleanup temporarily

* fix lint

* build engines after page configs were prepared

* tmp: workaround local deploy problems with datastore exclussion

* set rawUrl on request and use it to extract origin for engine download in case url passed at build doesn't work

* no unhandled promise rejections, actually pass rawUrl

* make one of e2e variants use datastore exclussion, so base case is still tested

* revert adapters e2e test cleanup

* scripts rename

* findPageByPath renaming

* getGraphqlEngine renaming

(cherry picked from commit 884ecaf)
pieh added a commit that referenced this pull request May 17, 2024
…equired (#38974) (#38979)

* fix: generate lmdb cdn path after state was reloaded from cache restoration

* feat: make lmdb datastore optional and lazy in engines

* use datastore exlucde and skip deploy cleanup temporarily

* fix lint

* build engines after page configs were prepared

* tmp: workaround local deploy problems with datastore exclussion

* set rawUrl on request and use it to extract origin for engine download in case url passed at build doesn't work

* no unhandled promise rejections, actually pass rawUrl

* make one of e2e variants use datastore exclussion, so base case is still tested

* revert adapters e2e test cleanup

* scripts rename

* findPageByPath renaming

* getGraphqlEngine renaming

(cherry picked from commit 884ecaf)

Co-authored-by: Michal Piechowiak <[email protected]>
@pieh
Copy link
Contributor Author

pieh commented May 17, 2024

Released in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Published
Development

Successfully merging this pull request may close these issues.

2 participants