Collecting Requirements for Per-Language Splitting #88

LorisSigrist · 2024-04-22T11:28:13Z

Context

Paraglide currently splits messages by component / page. If you load a page with 3 client components (or your framework's equivalent), only the messages for those three components are sent to the client. However, they are currently sent in all languages. Ideally we would only send messages in the language that is displayed.

This issue collects ideas on how that could be achieved.

Expected Impact - Case Study Inlang.com

The average translation (1 message in one language) on Inlang.com is about 50 - 60 bytes. Multiply that by the number of languages (7) and you get the average impact per message: about 400 bytes.

There are about 200 messages on the website, but because of per-page splitting only an average of 20 are loaded when you go to a page. This leaves us with a bundle-size impact of 400 * 20 = 8kB per page on average.

If we got per-language splitting to work on top of that it could save 6 out of 7 bytes, leaving us at just over 1kB. This would be a huge win, but only if the language-splitting adds less than 7kB to the client bundle.

Inlang.com has 7 languages, which is more than most sites. Usually you would have between 2 and 4. So the realistic size limit for the per-language splitting runtime would be about 2kB. For context: i18next is 40kB.
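The arithmetic above can be reproduced with a small model. This is only a sketch: the 57-byte figure is an assumed value inside the quoted 50 - 60 byte range (chosen so that 7 languages come out at roughly 400 bytes per message), and the function name is made up for illustration.

```javascript
// Rough bundle-size model for the case study above.
// Inputs are the approximate figures quoted in the text, not measurements.
function messageBytesPerPage({ bytesPerTranslation, languages, messagesPerPage }) {
  // What is shipped today: every message in every language.
  const allLanguages = bytesPerTranslation * languages * messagesPerPage;
  // What per-language splitting would ship: every message in one language.
  const oneLanguage = bytesPerTranslation * messagesPerPage;
  return { allLanguages, oneLanguage, saved: allLanguages - oneLanguage };
}

// Inlang.com: 7 languages, about 20 messages per page.
const inlang = messageBytesPerPage({ bytesPerTranslation: 57, languages: 7, messagesPerPage: 20 });

// A more typical site with 3 languages and the same page size.
const typical = messageBytesPerPage({ bytesPerTranslation: 57, languages: 3, messagesPerPage: 20 });

console.log(inlang);  // { allLanguages: 7980, oneLanguage: 1140, saved: 6840 }
console.log(typical); // { allLanguages: 3420, oneLanguage: 1140, saved: 2280 }
```

The `saved` value is the budget a splitting runtime has to stay under to be a net win: roughly 7kB for 7 languages, and roughly 2kB for a 3-language site.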

Work done so far

We have already tried a few approaches & run into various challenges.

- Copying the routes/ directory for each language & using middleware to multiplex between the different builds based on language.
  - Imports from in/out of the routes/ folder are incredibly fragile
  - Doesn't work for all routers
  - Only works if the framework has a rewrite mechanism
- Post-processing the build output by copying each output file for each language and replacing messages with the language-specific version.
  - Doesn't work with compressed build outputs
  - Introduces various linking issues

Fundamentally this is a dynamic linking problem in a world of ESM and static linking, which is really hard.

Another promising idea that we haven't tried yet is to serialize the messages & pass them along with the page-data. However, there are open questions on how we would know which messages need to be sent.
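A minimal sketch of what the serialization idea could look like, under loud assumptions: the catalog shape, the function name, and the hard-coded list of message ids are all hypothetical, and real Paraglide messages are functions rather than plain strings. The list of ids is exactly the open question mentioned above; here it is simply passed in.

```javascript
// Hypothetical catalog shape: message id -> locale -> text.
const catalog = {
  greeting: { en: "Hello", de: "Hallo" },
  farewell: { en: "Goodbye", de: "Tschüss" },
  settings_title: { en: "Settings", de: "Einstellungen" },
};

// Picks the requested messages in a single locale and serializes them,
// so they could be embedded next to the page data.
function serializeMessagesForPage(catalog, locale, messageIds) {
  const messages = {};
  for (const id of messageIds) {
    const translations = catalog[id];
    if (translations && locale in translations) {
      messages[id] = translations[locale];
    }
  }
  return JSON.stringify({ locale, messages });
}

// Where this id list comes from is the unsolved part; it is hard-coded here.
const pageData = serializeMessagesForPage(catalog, "de", ["greeting", "farewell"]);
console.log(pageData); // {"locale":"de","messages":{"greeting":"Hallo","farewell":"Tschüss"}}
```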

Note: Lazy Loading is not the Solution

Any solution using fetch or await import is bound to introduce a render-fetch waterfall which drastically increases Time-To-Interactive. Eagerly loading messages in all languages is preferable in the vast majority of cases.

Most projects have between 2 and 4 languages; lazy loading only becomes justifiable at more than 10.

osdiab · 2024-04-30T05:04:59Z

Keenly watching this. Seems like a core make or break feature that determines if this library can truly scale.

LorisSigrist · 2024-04-30T06:49:12Z

Per-Language splitting is one of our big goals!

That being said, Paraglide already does scale really well. Because of its small footprint (tiny runtime, minified message ids, per-client-component-splitting) it already stays small, even when shipping extra languages.

We did some benchmarks on this:

- As long as you stay under 5 languages, Paraglide is already the smallest choice.
- If you're using a framework with Server Components / Islands / some sort of partial hydration, it stays the best choice for up to 10 languages.

Per-Language splitting will make it so that paraglide stays the best regardless of how many languages you have, but for a lot of projects it's already the best choice.

osdiab · 2024-05-24T14:25:52Z

> Another promising idea that we haven't tried yet is to serialize the messages & pass them along with the page-data. However, there are open questions on how we would know which messages need to be sent

Maybe leveraging AsyncLocalStorage (NextJS already seems to use this for headers()) to provide a request context could help, with the translation functions adding themselves to a list at runtime?

LorisSigrist · 2024-05-24T14:34:47Z

That's an interesting idea, however, that likely only catches the messages that are actually executed during server-rendering, not messages that are used conditionally. We would need those too.
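To make the limitation concrete, here is a small Node.js sketch of the AsyncLocalStorage idea. `trackedMessage`, `renderPage`, and the message ids are hypothetical names for illustration; only `AsyncLocalStorage` itself (from `node:async_hooks`) is a real API. It is not how Paraglide works today.

```javascript
const { AsyncLocalStorage } = require("node:async_hooks");

// One store per request: a Set of message ids used while rendering.
const requestContext = new AsyncLocalStorage();

// Hypothetical wrapper: records the message id in the current request's Set.
function trackedMessage(id, render) {
  return (...args) => {
    const used = requestContext.getStore();
    if (used) used.add(id);
    return render(...args);
  };
}

const m = {
  greeting: trackedMessage("greeting", () => "Hello"),
  error_banner: trackedMessage("error_banner", () => "Something went wrong"),
};

// Stand-in for server-rendering a page with a conditional message.
function renderPage({ hasError }) {
  return hasError ? m.error_banner() : m.greeting();
}

function renderWithTracking(props) {
  const used = new Set();
  const html = requestContext.run(used, () => renderPage(props));
  return { html, usedMessageIds: [...used] };
}

const result = renderWithTracking({ hasError: false });
console.log(result.usedMessageIds); // [ 'greeting' ]
```

`error_banner` never shows up in the list because that branch did not run on the server, yet the client would still need it if the error state occurs after hydration.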

osdiab · 2024-05-25T00:56:12Z

Hmm yeah, in that case it probably can’t be a runtime thing then. Maybe we can crawl the AST at compile time to find every invocation of a translation function, traversing from the starting point for each route (I think it should be clear for each metaframework, e.g. for NextJS any default export from a page/layout/route file; not sure how one would achieve this framework-agnostically though).
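A rough sketch of the compile-time crawl, with heavy simplifications that are my own assumptions: regular expressions stand in for a real AST walk, the project is an in-memory map of file name to source, and messages are assumed to be called as `m.some_id(...)`. A real implementation would use a proper parser and the framework's own route conventions.

```javascript
// In-memory stand-in for a project's source files.
const files = {
  "routes/home.js": `
    import { Header } from "../lib/header.js";
    export default () => m.home_title() + (loggedIn ? m.welcome_back() : "");
  `,
  "lib/header.js": `
    export const Header = () => m.nav_home();
  `,
  "routes/about.js": `
    export default () => m.about_title();
  `,
};

const IMPORT_PATTERN = /from\s+["']([^"']+)["']/g;
const MESSAGE_CALL_PATTERN = /\bm\.([A-Za-z_$][\w$]*)\s*\(/g;

// Resolves a relative import specifier against the importing file's directory.
function resolveImport(fromFile, specifier) {
  const parts = fromFile.split("/").slice(0, -1);
  for (const segment of specifier.split("/")) {
    if (segment === "..") parts.pop();
    else if (segment !== ".") parts.push(segment);
  }
  return parts.join("/");
}

// Walks the import graph from a route entry and collects every message id,
// including ids that only appear in conditional branches.
function collectMessageIds(entry, files, seen = new Set(), ids = new Set()) {
  if (seen.has(entry) || !(entry in files)) return ids;
  seen.add(entry);
  const source = files[entry];
  for (const match of source.matchAll(MESSAGE_CALL_PATTERN)) ids.add(match[1]);
  for (const match of source.matchAll(IMPORT_PATTERN)) {
    collectMessageIds(resolveImport(entry, match[1]), files, seen, ids);
  }
  return ids;
}

const homeIds = [...collectMessageIds("routes/home.js", files)].sort();
console.log(homeIds); // [ 'home_title', 'nav_home', 'welcome_back' ]
```

Unlike runtime tracking, the conditional `welcome_back` message is found, and `about_title` is left out because the home route never imports that file.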

minht11 · 2024-07-13T18:18:36Z

I think this could be solved with import maps. They allow loading different specifiers dynamically. The main caveat is that the import map must be inserted before any module loading occurs.

With this, only one language would be loaded. The downside is that i18n module keys could not be inlined inside the bundle like they are right now, or whole application chunks would need to be duplicated.

I tested it locally and it works, I am pretty sure something like this could be implemented by Paraglide relatively easily.

Main experiment code, missing en.js and de.js files.

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />

    <script type="application/javascript">
      const language = localStorage.getItem('language') ?? 'en';

      const im = document.createElement('script');
      im.type = 'importmap';
      im.textContent = JSON.stringify({
        imports: {
          language: `/${language}.js`,
        }
      });
      document.currentScript.after(im);
    </script>
    
  </head>
  <body>
    <div id="languageContent"></div>

    <button id="toggleLanguage">
      Toggle language
    </button>

    <script type="module">
      import { languageName } from 'language'

      languageContent.innerHTML = `<h1>${languageName}</h1>`

      toggleLanguage.addEventListener('click', async () => {
        const language = localStorage.getItem('language') ?? 'en';
        const newLanguage = language === 'en' ? 'de' : 'en';

        localStorage.setItem('language', newLanguage);

        window.location.reload();
      });
    </script>
  </body>
</html>
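The import-map construction from the experiment can be pulled into a small helper. This is a sketch, not part of the original experiment: `buildImportMap`, the list of supported locales, and the fallback are assumptions added here, mainly so that an unexpected value in localStorage cannot point the `language` specifier at a file that does not exist. The missing `en.js` and `de.js` files are assumed to each export a `languageName` string, since that is what the module script imports.

```javascript
// Assumed shape of the missing locale modules (not shown in the experiment):
//   en.js -> export const languageName = "English";
//   de.js -> export const languageName = "Deutsch";

// Builds the JSON for an import map that points the bare specifier
// "language" at the module for one locale.
function buildImportMap(locale, supportedLocales = ["en", "de"], fallback = "en") {
  // Guard against stale or tampered localStorage values.
  const chosen = supportedLocales.includes(locale) ? locale : fallback;
  return JSON.stringify({
    imports: {
      language: `/${chosen}.js`,
    },
  });
}

console.log(buildImportMap("de")); // {"imports":{"language":"/de.js"}}
console.log(buildImportMap("xx")); // {"imports":{"language":"/en.js"}}
```

In the inline script above, the result would be assigned to `im.textContent` in place of the inline `JSON.stringify` call.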

osdiab · 2024-07-14T03:34:22Z

I think the problem with that is that the goal is to pass the i18n strings in the initial page load, not in a separate HTTP request after the page loads in the browser. The script tag would need to be parsed and executed in the client’s browser rather than happening entirely on the server.

minht11 · 2024-07-14T08:50:40Z

If you inline the script inside the HTML, no separate request will be made; it loads synchronously with the HTML, and since the import-map script is very small, the cost is minimal, far less than even loading 2 languages.

Things won't be lazy loaded; the i18n strings just need to be in separate chunks for specifier imports to work (or a whole separate app bundle for each language). Module preload (native or polyfilled by Vite) should make a few more separate chunks a non-issue.

Also, my solution would work for SPAs too; in my use case I am not using a server or a meta framework. The server solution you were discussing sounds meta-framework specific.

LorisSigrist · 2024-07-29T09:24:57Z

Inlining the scripts would be very nice! We'll definitely prototype that.
I'm not yet sure how we would do the build-transforms necessary to do this.

How do we know which messages can be rendered on the current page? This info exists in the tree-shaking but we need to have it at runtime.
How do we get the client-side build to use the inlined messages?

It's a promising approach though. I imagine it would generalize quite well across frameworks.

ambigos1 · 2024-09-05T12:04:40Z

Hi @LorisSigrist.
I just noticed that all languages are being downloaded, and I only found out after I had translated my app into 57 languages :(

Performance is my top priority, and deploying my website with 57 languages will hurt my Core Web Vitals.

I am currently using the Svelte static adapter, and my whole website is prerendered for all languages.
Is there a way to prevent downloading the JavaScript files with all languages since they are generated on build time as static HTML files?

Thank you very much for your time :)

samuelstroschein · 2024-09-13T13:45:26Z

Hi @ambigos1

> Is there a way to prevent downloading the JavaScript files with all languages since they are generated on build time as static HTML files?

Not at the moment. I might set a bounty on this issue, if anyone is down to implement per-language splitting after #217.

samuelstroschein · 2024-11-19T16:24:33Z

The new Vite environments https://main.vitejs.dev/guide/api-environment could be the solution we have been waiting for: create one environment per locale.
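As a very rough illustration of "one environment per locale", the helper below generates an object that could be placed under the `environments` key of a Vite 6 config. It is an untested assumption, not a known-working setup: the environment names, the `__STATIC_LOCALE__` constant, and the output directories are hypothetical, other per-environment options are omitted, and getting Vite to actually build every environment needs additional builder configuration that is not shown.

```javascript
// Generates one environment entry per locale. Each entry replaces a
// hypothetical compile-time constant with that environment's locale.
function localeEnvironments(locales) {
  const environments = {};
  for (const locale of locales) {
    // Environment names are kept identifier-like, e.g. "client_en_US".
    const name = `client_${locale.replace(/-/g, "_")}`;
    environments[name] = {
      // `define` values are inserted as code, so strings must be JSON-quoted.
      define: { __STATIC_LOCALE__: JSON.stringify(locale) },
      build: { outDir: `dist/${locale}` },
    };
  }
  return environments;
}

const environments = localeEnvironments(["en", "de", "en-US"]);
console.log(Object.keys(environments)); // [ 'client_en', 'client_de', 'client_en_US' ]
```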

ambigos1 · 2024-12-16T09:02:41Z

@samuelstroschein
Vite 6 is out!
https://vite.dev/blog/announcing-vite6
:)

samuelstroschein · 2025-01-07T18:58:26Z

Making the locale/language tag getter static on a per build basis could be interesting. If the language getter is static on a given build, bundlers will tree-shake unused imports.

const jojo_mountain_day = (inputs, options = {}) => {
	const locale = "en";
	if (locale === "en") return en.jojo_mountain_day(inputs);
-	if (locale === "de") return de.jojo_mountain_day(inputs);
-	if (locale === "en-US") return en_US.jojo_mountain_day(inputs);
	return "jojo_mountain_day";
};

samuelstroschein · 2025-01-25T00:03:33Z

I will look into per language/locale splitting next week #201 (comment)

samuelstroschein · 2025-01-26T22:45:56Z

Trivial to implement for the compiler. Last open question is how to build per locale with bundlers. Vite's environment API could be the breakthrough.

1. add a staticLocale to the runtime
2. define staticLocale on compile
3. bundler tree-shakes messages that don't correspond to the static locale 🚀

+const staticLocale = "de"

const greeting = (inputs, options = {}) => {
+	const locale = staticLocale ?? options.locale ?? getLocale();
	if (locale === "en") return en.greeting(inputs);
	if (locale === "de") return de.greeting(inputs);
	return "greeting";
};

export { greeting };
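The precedence in the snippet above can be tried out in isolation. In this sketch the `createGreeting` factory, the sample messages, and the `runtimeLocale` variable are stand-ins invented so both cases fit in one file. A real compiler would emit `staticLocale` as a constant, which is what lets a bundler drop the other branches; the factory form here will not be tree-shaken.

```javascript
// Sample per-locale message modules (stand-ins for the compiled output).
const en = { greeting: ({ name }) => `Hello ${name}!` };
const de = { greeting: ({ name }) => `Hallo ${name}!` };

// Stand-in for the runtime's locale getter.
let runtimeLocale = "en";
const getLocale = () => runtimeLocale;

// Factory used only to compare a build with and without a static locale.
function createGreeting(staticLocale) {
  return (inputs, options = {}) => {
    // Same precedence as above: static locale, then per-call option, then runtime.
    const locale = staticLocale ?? options.locale ?? getLocale();
    if (locale === "en") return en.greeting(inputs);
    if (locale === "de") return de.greeting(inputs);
    return "greeting";
  };
}

const dynamicGreeting = createGreeting(undefined);
const germanBuildGreeting = createGreeting("de");

console.log(dynamicGreeting({ name: "Ada" }));                       // Hello Ada!
console.log(dynamicGreeting({ name: "Ada" }, { locale: "de" }));     // Hallo Ada!
console.log(germanBuildGreeting({ name: "Ada" }, { locale: "en" })); // Hallo Ada!
```

Note the last line: once a static locale is defined, it overrides both `options.locale` and `getLocale()`, so a per-locale build can only ever render its own language.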

moufmouf · 2025-01-29T14:53:28Z

Hey!

For the context, I'm in a situation where I have 10+ languages to translate and a SPA running in Svelte only mode (no SvelteKit). See #351.

I just had an idea I wanted to share here. I don't think it was mentioned before.

If I understand correctly, you are scratching your heads to avoid lazy-loading because you assume it will cause an additional round-trip:

> Any solution using fetch or await import is bound to introduce a render-fetch waterfall which drastically increases Time-To-Interactive. Eagerly loading messages in all languages is preferable in the vast majority of cases.

BUT! There are now ways to tell the browser to prefetch resources. For instance: Early hints: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/103

Basically, there might be no need to make one build per locale if you can ensure the correct message file is already loaded in the browser when your app is bootstrapping.

Did you guys already explore this solution?

samuelstroschein · 2025-01-29T18:35:37Z

@moufmouf thanks for the idea. might be something to it.

The waterfall is not the main issue. The main issue is preventing message functions like m.happy_elephant() from becoming async. The moment that happens, complexity will explode. Every render turns async, which requires suspense, etc.

export async function happy_elephant() {
    if (locale === "de") return await import("de.js")
    // ...
}

function Component() {
   // 💥 happy_elephant() is a promise
   // which will render <p>Promise</p>
   return <p>{m.happy_elephant()}</p>
}

What could work, however, is using ESM's new top-level await. If the locale is set before the top-level await of a message bundle function (the bundle function "bundles" the messages for all locales) is executed, then your approach could work! The bundler tree-shakes unused message bundle functions, and the message bundle function lazy loads the message defined by the locale!

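// top-level import of the messages in the current locale
// (assumes `locale` is set before this module is evaluated,
// and that each locale module exports its own happy_elephant)
const messages = await import(`./${locale}.js`)

export function happy_elephant() {
   return messages.happy_elephant()
}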

I will investigate this! This might have legs!
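The difference between the two shapes can be shown without a framework. This is a sketch under assumptions: `loadMessages`, `fakeImporter`, and the German text are invented for illustration, and the explicit `loadMessages` call plays the role that top-level await would play in a real ES module.

```javascript
// Async variant: every call site receives a Promise instead of a string.
async function happyElephantAsync() {
  return "Ein glücklicher Elefant";
}
const brokenMarkup = `<p>${happyElephantAsync()}</p>`;
console.log(brokenMarkup); // <p>[object Promise]</p>

// Alternative: resolve the locale's messages once, up front,
// then keep every message function synchronous.
let messages = null;

// `importer` is a hypothetical stand-in for a dynamic import of "./<locale>.js".
async function loadMessages(locale, importer) {
  messages = await importer(locale);
}

function happyElephant() {
  if (!messages) throw new Error("messages not loaded yet");
  return messages.happy_elephant();
}

// Fake locale module so the sketch runs without real files.
const fakeImporter = async (locale) => ({
  happy_elephant: () => (locale === "de" ? "Ein glücklicher Elefant" : "A happy elephant"),
});

loadMessages("de", fakeImporter).then(() => {
  // Rendering stays synchronous once loading has finished.
  console.log(`<p>${happyElephant()}</p>`); // <p>Ein glücklicher Elefant</p>
});
```

The trade-off is that nothing may render before the load completes, which is why the ordering constraint (locale set before the module is evaluated) matters.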

ambigos1 · 2025-01-30T12:33:11Z

> Hey!
>
> For the context, I'm in a situation where I have 10+ languages to translate and a SPA running in Svelte only mode (no SvelteKit). See #351.
>
> I just had an idea I wanted to share here. I don't think it was mentioned before.
>
> If I understand correctly, you are scratching your heads to avoid lazy-loading because you assume it will cause an additional round-trip:
>
> > Any solution using fetch or await import is bound to introduce a render-fetch waterfall which drastically increases Time-To-Interactive. Eagerly loading messages in all languages is preferable in the vast majority of cases.
>
> BUT! There are now ways to tell the browser to prefetch resources. For instance: Early hints: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/103
>
> Basically, there might be no need to make one build per locale if you can ensure the correct message file is already loaded in the browser when your app is bootstrapping.
>
> Did you guys already explore this solution?

I am using prerendering, and all the text is generated at compile time.
I don't even need all the messages to be loaded in production.
If I only knew how I could remove them from the build output, it would help me boost my performance.

samuelstroschein · 2025-01-30T16:22:00Z

> If I only knew how I could remove them from the build output, it would help me boost my performance

@ambigos1 nice one. opened #354

LorisSigrist added the Feature label Apr 22, 2024 — with Linear

LorisSigrist self-assigned this Apr 22, 2024

samuelstroschein mentioned this issue Sep 13, 2024

Dynamic import of locales #222

Closed

samuelstroschein mentioned this issue Oct 4, 2024

support i18n sveltejs/svelte#1320

Closed

samuelstroschein mentioned this issue Jan 24, 2025

Project: Paraglide JS 2.0 (variants, pluralization, gendering) #201

Open

2 tasks

moufmouf mentioned this issue Jan 29, 2025

[Question] Lazy-loading of languages #351

Closed

samuelstroschein mentioned this issue Jan 30, 2025

server side pre-rendered messages shouldn't get shipped to the client #354

Open
