Pages with utf-8 name don't work properly under SSR #10084
Comments
What's the purpose of using non-ASCII chars if your page name should be displayed as a valid URL?
|
@StarpTech you might want to have a link like |
@StarpTech non-ASCII URLs are displayed properly in all modern browsers and are used by popular sites, for example wikipedia.org |
Thanks for the examples. I have never used it. |
As a workaround you can use a dynamic page |
In version 9.2, client-side routing for pages with non-ASCII characters worked just fine. The issue was only with server-side routing, which could be worked around with a custom server.js using decodeURI(parsedUrl.pathname). After updating to version 9.5.1, client-side routing for pages with non-ASCII characters stopped working entirely. In development mode, clicking a link to such a page produces no navigation and no error message. After the routeChangeStart event, neither routeChangeComplete nor routeChangeError is fired; only after clicking another link is routeChangeError fired with "Error: Route Cancelled". Edit: |
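The server-side workaround mentioned in this comment (decoding the pathname in a custom server before routing) could look roughly like the sketch below. It is only an illustration: `decodePathname` is a hypothetical helper name, the base URL is a placeholder that the `URL` constructor requires, and how the decoded path is passed on to Next's request handler depends on your own custom server setup.

```javascript
// Hypothetical helper: returns the request pathname with percent-escapes
// decoded, e.g. "/%D1%82%D0%B5%D1%81%D1%82" becomes "/тест".
function decodePathname(requestUrl) {
  // The base is a placeholder; only the path part of requestUrl matters here.
  const { pathname } = new URL(requestUrl, "http://localhost");
  try {
    return decodeURI(pathname);
  } catch (err) {
    // decodeURI throws URIError on malformed escapes; keep the raw path then.
    if (err instanceof URIError) return pathname;
    throw err;
  }
}
```

The fallback on `URIError` is a design choice for this sketch, so that a malformed request path still reaches the router instead of crashing the server.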
Tested v9.5.0, v9.5.5, and v10.0.1 and none of them support statically generated pages with non-ascii names like |
Mostly done, but utf-8 page names are currently not supported in Next.js. Most likely a bug. See: vercel/next.js#19135 vercel/next.js#10084
I am experiencing the same issue with a route in the Thai language. Is there a workaround? next-10.0.3 |
…19135) This ensures we handle encoding/decoding for SSG prerendered/fallback pages correctly. Since we only encode path delimiters when outputting to the disk we need to match this encoding when building the `ssgCacheKey` to look-up the prerendered pages. This also fixes non-ascii prerendered paths (e.g. 商業日語) not matching correctly. This does not resolve 👉 #10084 and further investigation will be needed before addressing non-ascii paths for non-SSG pages. The encoding output was tested against https://tst-encoding-l7amu5b9c.vercel.app/ to ensure the values will match correctly on Vercel. Closes: #17582 Closes: #17642 x-ref: #14717
Tested again and found something really peculiar. It works as expected when deployed on Vercel. It does not work locally when running Sample code: https://github.com/jonrh/next-unicode-bugs Video showing it working on Vercel: I would also like to clarify that this is only testing static routes, not server side rendering (SSR/SSG) as the title of this issue states. |
Can trigger this issue with 9.5+. Any ETA on this as I really want to upgrade to React 17 and webpack 5? |
In case this helps someone, I fixed this issue the following way:
|
As a workaround I used rewrites on
async rewrites() {
return [
{
source: `/${encodeURIComponent('カート')}`,
destination: '/cart',
},
{
source: `/${encodeURIComponent('アカウント')}`,
destination: '/account',
},
]
} |
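If there are many such routes, the same rewrite entries could be generated from a lookup table instead of being written by hand. The sketch below assumes a hypothetical `localizedRoutes` mapping and a hypothetical `buildRewrites` helper; it mirrors the `encodeURIComponent` pattern from the comment above and is not taken from the Next.js documentation.

```javascript
// Hypothetical mapping from non-ASCII public paths to ASCII page routes.
const localizedRoutes = {
  "カート": "/cart",
  "アカウント": "/account",
};

// Builds the array that an async rewrites() function would return.
function buildRewrites(routes) {
  return Object.entries(routes).map(([name, destination]) => ({
    // Same encoding as in the comment above: percent-encode the page name.
    source: `/${encodeURIComponent(name)}`,
    destination,
  }));
}
```

In a config file, `rewrites()` would then return `buildRewrites(localizedRoutes)`.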
Note that there is a bug in Next.js where utf-8 page names do not work when developing locally. For example the route /málarameistari. However when deployed to Vercel it works as expected. For now I will deal with local dev pain later, probably by just manually rerouting to ASCII routes via a config. See: vercel/next.js#19135 vercel/next.js#10084
That happened to me with words containing ñ and ´ |
for me, export async function getStaticPaths() {
const { posts } = await request(CMS, POSTS);
const paths = posts.nodes.map((post) => ({
params: { slug: decodeURI(post.slug) },
}));
return { paths, fallback: false };
} |
We just released a new package that overcomes this issue (and many others): https://github.com/Avansai/next-multilingual Looking forward to hearing feedback on our approach. |
While this package shows some promise, shouldn't international URLs be supported by default? Internationalization is the concept of supporting multiple languages, which has nothing (or maybe only a little) to do with UTF-8-based URLs. It looks like this package enforces a language prefix on every URL, e.g. /fr/my-international-url For example: This doesn't work. However, if I name my page p%C3%A5sk it works... until I use getStaticPaths, then it breaks. Not to mention ISR revalidation doesn't work either. Using the rewrites approach above is also not quite feasible when you have multiple pages using e.g. getStaticPaths.
|
### Description

We're moving all paths to UTF-8 for a whole bunch of reasons such as:

- We know it'll be supported everywhere, across platforms, in the browser, and so on.
- We have no evidence that any user is using non-UTF-8 paths
- It's very very hard to manipulate paths without converting them to Rust Strings.
  - For instance the [only way to add a trailing slash](https://users.rust-lang.org/t/trailing-in-paths/43166/8) to a path is by doing `path.push("")`
  - `bstr` [implicitly converts](https://docs.rs/bstr/latest/bstr/#handling-of-invalid-utf-8) invalid Unicode into replacement characters, which is probably not what we want
  - `bstr` also explicitly notes that the end result of its conversion functions (which again either error or implicitly convert) is [“you’re guaranteed to write correct code for Unix, at the cost of getting a corner case wrong on Windows”](https://docs.rs/bstr/latest/bstr/#file-paths-and-os-strings)
    - Considering we know that we have Windows users and are committed to supporting Windows, that should be a higher priority than supporting hypothetical users using non-UTF-8 encodings.
- To quote [camino](https://docs.rs/camino/latest/camino/):
  - “Unicode is the common subset of supported paths across Windows and Unix platforms.”
  - “The '[makefile problem](https://www.mercurial-scm.org/wiki/EncodingStrategy#The_.22makefile_problem.22)' (which also applies to `Cargo.toml`, and any other metadata file that lists the names of other files) has *no general, cross-platform solution* in systems that support non-UTF-8 paths. However, restricting paths to UTF-8 eliminates this problem.”
    - Basically, if we have non-Unicode encodings, you could have “packages/星巴克” in your turbo.json that does not match to “packages/星巴克” in your file system because the file system is using big5 and turbo.json is using Unicode.
  - “There are already many systems, such as Cargo, that only support UTF-8 paths. If your own tool interacts with any such system, you can assume that paths are valid UTF-8 without creating any additional burdens on consumers.”
- [npm does not allow even Unicode in package names](https://github.com/npm/validate-npm-package-name). Only url-safe characters, i.e. characters, numbers and a few other ASCII characters
- Next has [issues with Unicode paths too](vercel/next.js#10084)
- How would you even import a non-Unicode JavaScript file? JavaScript strings are Unicode.
- `path-slash` also only works on `AsRef<str>` or requires a lossy conversion.
- Glob walking appears to assume UTF-8 as well.
- This simplifies our code significantly since we can drop a lot of errors on invalid Unicode that are sprinkled throughout the codebase.

### Testing Instructions

---------

Co-authored-by: --global <Nicholas Yang>
I keep getting this error when I go to non-ascii path in the local dev mode (
|
generateStaticParams behaves differently when doing SSG in dev vs build. So the slug params must have `encodeURI(param.slug)` called before being put into the list of generateStaticParams (even if they are not really encoded) and then any page must call `decodeURI(decodeURI(param.slug))` to make it work on build and dev. Blah. https://github.com/vercel/next.js/blame/5d5f58560f46b3300d2e5dc7de90025f46730da1/packages/next/src/server/base-server.ts#L2098C7-L2099 seems similar to: vercel/next.js#10084 vercel/next.js#30007
In dev mode, incoming `[slug]` requests get the `encodeURI` treatment, which then gives you the error that your `[slug]` is not part of the generateStaticParams result. BLAH! So, to make dev mode work, you have to ensure that params created by generateStaticParams that contain non-ASCII (UTF-8) characters get the `encodeURI` treatment before being returned, so that an incoming request finds the encoded param. But of course that makes your `{params: {slug}}` encoded on the page side, so you must `decodeURI` the slug on the page side to match it up. Do not encode for build, though, or the names will be encoded on the filesystem, which is undesired. https://github.com/vercel/next.js/blame/5d5f58560f46b3300d2e5dc7de90025f46730da1/packages/next/src/server/base-server.ts#L2098C7-L2099 seems similar to: vercel/next.js#10084 vercel/next.js#30007 fix: decode once, always
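One way to cope with a slug that may arrive raw, encoded once, or encoded twice is to decode it on the page side until it stops changing. The helper below is a minimal sketch under that assumption; `normalizeSlug` and `maxPasses` are hypothetical names and are not part of Next.js.

```javascript
// Hypothetical helper: decodes a slug until it stops changing, so raw,
// once-encoded, and twice-encoded inputs all normalize to the same value.
function normalizeSlug(slug, maxPasses = 3) {
  let current = slug;
  for (let i = 0; i < maxPasses; i += 1) {
    let decoded;
    try {
      decoded = decodeURI(current);
    } catch (err) {
      // Malformed escape sequence: stop and keep what we have so far.
      if (err instanceof URIError) return current;
      throw err;
    }
    if (decoded === current) return current;
    current = decoded;
  }
  return current;
}
```

Note that repeated decoding would also alter a slug that legitimately contains a literal percent-escape, so this only suits slugs where that cannot happen.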
I only took a quick look at this problem, but it seems related to: next.js/packages/next/src/shared/lib/router/utils/route-regex.ts Lines 114 to 116 in 7dbb66f
Is anyone eager to add support for non-ASCII (e.g. UTF-8) characters at least here (and presumably in more places; I cannot track down every dependency, sorry)? next.js/packages/next/src/shared/lib/router/utils/route-regex.ts Lines 81 to 97 in 7dbb66f
|
In my case, I encountered this issue with Arabic pathnames. After debugging a little, I noticed a misalignment between dev and export (on validation, I think). As a workaround, I did the following:
process.env.NODE_ENV === 'development' ? encodeURI(page) : page
// or
process.env.NODE_ENV === 'development' ? encodeURI(page) : decodeURI(page)
On dev, I encoded the pathname so it would match what the Next server has, but on
Full example:
const pages = ['من-نحن', 'سياسة-الخصوصية', 'الشروط-والأحكام']
export async function generateStaticParams() {
return pages.map((page) => ({
pathname: process.env.NODE_ENV === 'development' ? encodeURI(page) : page
}))
} |
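To check what this workaround produces in each environment without starting a dev server, the environment can be passed in as a parameter. This is a sketch of the comment's approach only: `toStaticParam` and `buildStaticParams` are hypothetical helper names.

```javascript
const pages = ["من-نحن", "سياسة-الخصوصية", "الشروط-والأحكام"];

// Hypothetical helper: encode only in development, as in the comment above.
function toStaticParam(page, nodeEnv = process.env.NODE_ENV) {
  return { pathname: nodeEnv === "development" ? encodeURI(page) : page };
}

// Returns the list a generateStaticParams() function would return.
function buildStaticParams(nodeEnv) {
  return pages.map((page) => toStaticParam(page, nodeEnv));
}
```

Calling `buildStaticParams("development")` yields percent-encoded pathnames, while any other value leaves them as written.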
What exactly is the main source of this problem? We could come up with a solution together and perhaps fix it by the v15 stable release |
It's been 4 years and no one else at Next is showing interest in fixing it
Bug report
Pages with utf-8 non-ASCII characters in their name don't work properly under SSR
Describe the bug
Pages with utf-8 non-ASCII characters in their name work just fine with client-side navigation,
but when rendered on server side return "404 This page could not be found."
To Reproduce
Steps to reproduce the behavior, please provide code snippets or a repository:
Expected behavior
I'm expecting to see page 'pages/тест.js' rendered
System information
Additional context
Minimal repository to reproduce bug: https://github.com/frei-0xff/nextjs-utf8-pagename