Error: EMFILE: too many open files, open 'xxx.mdx' (x10) : Content and Pages at very large scale #7241

Closed
ezzle opened this issue May 30, 2023 · 15 comments
Labels
feat: content collections - Related to the Content Collections feature (scope)
needs response - Issue needs response from OP
requires refactor - Bug, may take longer as fixing either requires refactors, breaking changes, or considering tradeoffs

Comments

@ezzle

ezzle commented May 30, 2023

> When generating the content collections, should we turn off the watcher?

That makes sense. We're spinning up one-off Vite servers in some places, and turning the watcher off could fix it. You can't shut it down entirely, though, since Vite always spins it up, but we can configure it to ignore all files.

Do you think Astro can help meet that content volume requirement ?

The issue is related to the number of files in 'src/content'. It appears at roughly 4000K files or more, with any Astro version. I am very new to the Astro community, coming from the ApostropheCMS ecosystem. I am hoping to achieve a more reactive and lean alternative with Astro and Alpine as the base. Unfortunately, the capacity to efficiently serve 1000K files or more is mandatory. Do you think Astro can help meet that content volume requirement?

We should be able to but seeing that many files is a first for me so maybe we're not doing something right at that scale. However, I think it's diverging from the original issue so I'd suggest opening another issue if you can find out which Astro version starts causing issues. The errors you get are because some other dependencies rely on the latest version of Astro, so you might need to downgrade the others too.

@bluwy et al., the #7125 change fixed my issues

Thanks for the update! I'll go ahead and get the fix out then.

Originally posted by @bluwy in #7073 (comment)

@ezzle
Author

ezzle commented May 30, 2023

Similar discussion on stackoverflow

https://stackoverflow.com/questions/8965606/node-and-error-emfile-too-many-open-files

Using
Astro 2.5.5
Windows 11
Chrome Version 113.0.5672.127 (Build officiel) (64 bits)
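
Editor's sketch: EMFILE means the process has hit its limit on simultaneously open file descriptors, so a common general mitigation is to cap how many files are read at once. The TypeScript below is only an illustration of that idea and is not how Astro reads content. `mapWithLimit`, `readAll`, and the default limit of 64 are hypothetical names and values chosen for the example.

```typescript
import { readFile } from "node:fs/promises";

/**
 * Maps `items` through `worker` with at most `limit` calls in flight.
 * Results are returned in input order.
 */
export async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  // Start between 1 and `limit` runners; they share the `next` cursor.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, run));
  return results;
}

// Hypothetical usage: read many content files, at most `limit` at a time.
export function readAll(paths: readonly string[], limit = 64): Promise<string[]> {
  return mapWithLimit(paths, limit, (p) => readFile(p, "utf8"));
}
```

Because each runner awaits one file before claiming the next, the number of open handles created by this code stays at or below `limit`, regardless of how many paths are passed in.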

@bluwy
Member

bluwy commented May 30, 2023

  1. Is there a version of Astro where this starts to happen? Or has it always happened?
  2. Did turning off the watcher fix the issue?

@bluwy bluwy added the needs response Issue needs response from OP label May 30, 2023
@ezzle
Author

ezzle commented May 30, 2023

  1. Is there a version of Astro where this starts to happen? Or has it always happened?
    Always.
  2. Did turning off the watcher fix the issue?
    No. How do I turn it off?

@bluwy
Member

bluwy commented May 31, 2023

// astro.config.js

export default {
  vite: {
    server: {
      watch: {
        // Glob that matches every path, so the watcher ignores all files
        ignored: ['**/*']
      }
    }
  }
}

https://vitejs.dev/config/server-options.html#server-watch

@ezzle
Author

ezzle commented May 31, 2023

Not a solution: with <= 5K files it works, with >= 10K files it fails.

 error   EMFILE: too many open files, open 'D:\Workspace\astro-esse\src\content\essential\en\simple_instance\kb_867684_class40.mdx'
Error: EMFILE: too many open files, open 'D:\Workspace\astro-esse\src\content\essential\en\simple_instance\kb_867684_class40.mdx'
13:34:02 [vite] Internal server error: EMFILE: too many open files, open 'D:\Workspace\astro-esse\node_modules\vite\dist\client\client.mjs'       

 error   [object Object]
  File:
    D:\Workspace\astro-esse\node_modules\astro\dist\content\utils.js:325:11
  Code:
    324 |   } catch (e) {
    > 325 |     throw new AstroError(AstroErrorData.UnknownContentCollectionError, { cause: e });
          |           ^
      326 |   }
      327 |   const { slug: frontmatterSlug } = await contentEntryType.getEntryInfo({
      328 |     fileUrl,
  Stacktrace:
UnknownContentCollectionError: [object Object]
    at getEntrySlug (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/utils.js:325:11)
    at async handleEvent (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/types-generator.js:202:27)
    at async runEvents (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/types-generator.js:267:24)
    at async Object.init (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/types-generator.js:69:5)
    at async attachListeners (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/server-listeners.js:46:5)
    at async attachContentServerListeners (file:///D:/Workspace/astro-esse/node_modules/astro/dist/content/server-listeners.js:27:5)
    at async dev (file:///D:/Workspace/astro-esse/node_modules/astro/dist/core/dev/dev.js:63:3)
    at async runCommand (file:///D:/Workspace/astro-esse/node_modules/astro/dist/cli/index.js:151:7)
    at async cli (file:///D:/Workspace/astro-esse/node_modules/astro/dist/cli/index.js:209:5)

@ezzle
Author

ezzle commented Jun 3, 2023

Tested 2.5.7: issue unchanged.

<= 15K files: the build is OK, but an error is thrown when accessing any page.

PS D:\Workspace\astro-esse> npm run dev

> [email protected] dev
> astro dev

   astro  v2.5.7 started in 2811ms

  ┃ Local    http://localhost:3003/
  ┃ Network  use --host to expose

07:16:05 [content] Watching src/content/ for changes
07:21:53 [content] Unsupported file types found. Prefix with an underscore (`_`) to ignore:
- essential-schema.ts
07:21:53 [content] Types generated
07:21:54 [astro] update D:/Workspace/astro-esse/.astro/types.d.ts (x2)
 error   EMFILE: too many open files, open 'D:\Workspace\astro-esse\src\content\essential\en\simple_instance\kb_867684_class47.mdx'
Error: EMFILE: too many open files, open 'D:\Workspace\astro-esse\src\content\essential\en\simple_instance\kb_86768  errno: -4066,
  syscall: 'open',
  code: 'EMFILE',
  path: 'D:\\Workspace\\astro-esse\\node_modules\\vscode-oniguruma\\release\\onig.wasm'
}

Node.js v18.15.0

@ezzle
Author

ezzle commented Jun 4, 2023

[ 80K content Files ]

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
[email protected] dev
> astro dev

   astro  v2.5.7 started in 611ms

  ┃ Local    http://localhost:3003/
  ┃ Network  use --host to expose

07:48:06 [content] Watching src/content/ for changes

<--- Last few GCs --->

[19668:000001F7971293B0]  3399362 ms: Scavenge 4031.0 (4116.5) -> 4030.8 (4127.5) MB, 13.3 / 0.0 ms  (average mu = 0.282, current mu = 0.145) allocation failure;
[19668:000001F7971293B0]  3399384 ms: Scavenge 4037.6 (4127.5) -> 4038.3 (4127.5) MB, 13.8 / 0.0 ms  (average mu = 0.282, current mu = 0.145) allocation failure;
[19668:000001F7971293B0]  3402407 ms: Scavenge 4038.5 (4127.5) -> 4037.6 (4150.0) MB, 3023.0 / 0.0 ms  (average mu = 0.282, current mu = 0.145) allocation failure;


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 00007FF7867B2BCF node_api_throw_syntax_error+175519
 2: 00007FF7867383A6 SSL_get_quiet_shutdown+64006
 3: 00007FF786739762 SSL_get_quiet_shutdown+69058

@matthewp
Contributor

Can you provide an example project (git or stackblitz) that demonstrates this bug? Thanks.

@matthewp matthewp added the needs repro Issue needs a reproduction label Jun 12, 2023
@ezzle
Author

ezzle commented Jun 12, 2023

Can you provide an example project (git or stackblitz) that demonstrates this bug? Thanks.

https://github.com/ezzle/docs/

@natemoo-re natemoo-re added - P4: important Violate documented behavior or significantly impacts performance (priority) and removed needs repro Issue needs a reproduction needs response Issue needs response from OP labels Jul 5, 2023
@natemoo-re natemoo-re added the feat: content collections Related to the Content Collections feature (scope) label Sep 13, 2023
@natemoo-re
Member

Looks like this continues to be a problem for people!

I believe our type generation is reading all of the files in src/content into memory, but it definitely shouldn't do that. We just need to process the frontmatter to generate types, which could be done by reading the file stream, and closing it as soon as the frontmatter has been read. That would significantly reduce our memory consumption.
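
Editor's sketch of the approach described above: read a file as a stream, stop as soon as the closing frontmatter fence is seen, and release the file handle immediately. This is an illustration under stated assumptions and not Astro's type-generation code. `readFrontmatter` is a hypothetical helper, and it assumes frontmatter is delimited by `---` lines at the very top of the file.

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

/**
 * Returns the raw text between the leading `---` fences, or null if the
 * file does not start with a complete frontmatter block.
 */
export async function readFrontmatter(path: string): Promise<string | null> {
  const stream = createReadStream(path, { encoding: "utf8" });
  const lines = createInterface({ input: stream, crlfDelay: Infinity });
  const collected: string[] = [];
  let started = false;
  let closed = false;

  try {
    for await (const line of lines) {
      if (!started) {
        // The first line must open the frontmatter block.
        if (line.trim() !== "---") return null;
        started = true;
        continue;
      }
      if (line.trim() === "---") {
        // Closing fence reached: stop without reading the body.
        closed = true;
        break;
      }
      collected.push(line);
    }
    return closed ? collected.join("\n") : null;
  } finally {
    // Release the file descriptor even when returning early.
    lines.close();
    stream.destroy();
  }
}
```

The returned string would still need to be parsed (for example as YAML) by the caller. The point of the sketch is only that the body of each file is never held in memory and the handle is closed after the first few lines.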

@gyulavoros

I'm facing the same issue, serving around ~1,500 markdown files.

I use the Vercel adapter in SSR mode. If any of my pages (or components) makes the following Astro call, Vercel returns an HTTP 500, and I can find EMFILE: too many open files in the logs:

await getCollection("<my_collection>");

Running Astro locally works fine (I'm on macOS). Both astro dev and astro build run fine. The problem happens only when Vercel tries to serve a content page dynamically.

Our setup is relatively new:

"astro": "3.1.4",
"@astrojs/vercel": "5.0.1"

@carsonyl

carsonyl commented Nov 13, 2023

I'm hitting this issue on Windows with Astro 3.4.3 and about 10k files. I tried it with both Markdown content and JSON data content collections. I'm using static site mode and astro dev.

Also tried: Astro 3.5.3, with and without experimental.contentCollectionCache.

@lilnasy lilnasy added requires refactor Bug, may take longer as fixing either requires refactors, breaking changes, or considering tradeoffs and removed - P4: important Violate documented behavior or significantly impacts performance (priority) labels Nov 21, 2023
@chadananda

I'm getting this error periodically. And I have only 243 files in src/contents/

"astro": "^4.3.1",
"@astrojs/vercel": "^7.0.2",

@florian-lefebvre
Member

Can you try with the content layer (https://docs.astro.build/en/reference/configuration-reference/#experimentalcontentlayer)? That may solve it

@ascorbic
Contributor

ascorbic commented Oct 8, 2024

Closing this as there's been no response. If anyone still has this issue, can they try the latest beta of Astro and see if it still happens? We are working on making Astro more scalable for very large sites, and the content layer should help there.

@ascorbic ascorbic closed this as completed Oct 8, 2024

10 participants