Extremely slow file handling with archives in Android #42

Open
Jaifroid opened this issue Jan 8, 2023 · 8 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@Jaifroid
Collaborator

Jaifroid commented Jan 8, 2023

The issue was initially discussed and diagnosed over at kiwix/kiwix-js-pwa#343. There, I thought the problem was with Emscripten's WORKERFS. However, using the large-file access test case in this repo, and debugging on a Chromium instance on a midrange Samsung Android device, WORKERFS has no problem loading and reading (instantly) a 92GB Wikipedia ZIM from a microSD card. See screenshot below. If we can read bytes from the end of the file nearly instantly, why is javascript-libzim getting such awful performance? Loading the Ray Charles ZIM into the WASM on Android takes 30 seconds in Samsung Internet (a Chromium browser), and nearly 90 seconds in Chrome. I was unable to load anything larger than 500MB into either browser.

The slowdown is mostly in instantiating the archive. Once the archive has registered, full-text searching is reasonable. This is why I initially thought the issue was to do with passing the ArrayBuffer to the Web Worker, but the big file test does this instantly. @mgautierfr, would you have any thoughts on what could be going on here?

image

NB we can't test on Firefox because Firefox on Android attempts to copy picked archives into memory (or possibly an internal file system), and crashes on anything larger than about 2GB.

@Jaifroid
Collaborator Author

Jaifroid commented Jan 8, 2023

I think I've narrowed it down to whatever lies behind this code in the bindings:

image

The app hangs (in Android) for a ridiculously long time even with Ray Charles on this JS invocation of the above code:

image

This strongly suggests that the problem is in libzim, or in the interaction between the file representation in Android and libzim. Note that this code runs AFTER the archive has been passed to the Web Worker, so problems with passing the ArrayBuffer are definitively ruled out.

@Jaifroid changed the title from "Extremely slow file handling with large archives in Android" to "Extremely slow file handling with archives in Android" on Jan 8, 2023
@Jaifroid
Collaborator Author

Jaifroid commented Jan 9, 2023

Perhaps @mgautierfr's analysis of search performance in openzim/libzim#418 is relevant to this issue, in particular the observation that most of the "lost" time is spent on I/O. However, the real bottleneck we're experiencing is in the function that loads the ZIM archive. Unless some of the caching ideas regarding Xapian search have been implemented, just loading the archive should not even initiate Xapian code, right?

@Jaifroid
Collaborator Author

Jaifroid commented Jan 9, 2023

This comment from mossroy is pertinent, but using pools of Web Workers wouldn't solve the loading bottleneck:

Currently, the performance of the libzim version of kiwix-js is worse than the plain javascript version, although the ZIM accesses are much faster with libzim (compiled in wasm).

After adding some timers in the code, we found that it comes from the fact that the WebWorker requested by the emscripten build is mono-threaded : all calls are enqueued and handled one after the other one.

There is almost no overhead of copying the Array in the WebWorker (to have a transferable object).

I currently don't see any other idea than using a pool of WebWorkers instead of only one. It would increase the memory usage because each one would hold its own instance of libzim (with its own cache etc). I don't see how to have a single instance of libzim shared among several WebWorkers for now

We might technically use the libzim directly in the ServiceWorker, instead of a WebWorker. But we can't access the files from the ServiceWorker
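
For reference, the pool idea in the quoted comment could look roughly like the sketch below. It is only an illustration: `WorkerPool` and the `createWorker` factory are hypothetical names, and each object the factory returns is assumed to wrap one Web Worker (with its own libzim instance) behind a `call(request)` method. As noted above, this would spread queued calls across workers but would not shorten the archive loading step itself.

```javascript
// Hypothetical round-robin pool. `createWorker(i)` is an assumed factory
// returning an object with a `call(request)` method (e.g. a thin wrapper
// around postMessage/onmessage for one Web Worker).
class WorkerPool {
  constructor(createWorker, size) {
    if (!Number.isInteger(size) || size < 1) {
      throw new RangeError('size must be a positive integer');
    }
    this.workers = Array.from({ length: size }, (_, i) => createWorker(i));
    this.next = 0;
  }

  // Hand out workers in turn so that no single queue serialises every call.
  pick() {
    const worker = this.workers[this.next];
    this.next = (this.next + 1) % this.workers.length;
    return worker;
  }

  // Forward a request to the next worker and return whatever it returns.
  dispatch(request) {
    return this.pick().call(request);
  }
}
```

Memory use grows with the pool size, since every worker would hold its own archive handle and caches.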

@mgautierfr
Collaborator

I don't know how your "system" is architected. Where is the I/O happening? When you access libzim, or when libzim accesses the file?

You mention full-text search. Full-text search is done by the Xapian library itself, and I don't know how it works internally. But it seems that loading the Xapian database needs a lot of I/O (a human can notice the delay even locally in C++). Once the database is loaded, subsequent searches are pretty quick. Maybe it is the same problem, amplified by somewhat slower I/O in JS/SW/WebWorker (even if 90s seems like a lot, even for that).

Your large-file access test seems to read only one byte (at different offsets). How does it behave when you try to read several bytes (a few KB, a few MB, a few hundred MB)? How does it behave when you access random offsets (not only increasing offsets)?
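
A minimal sketch of such a test, assuming the picked archive is available as a `Blob`/`File`: `planRandomReads` and `timeReads` are hypothetical helper names, not part of the existing test case. Note that this uses the asynchronous `Blob.arrayBuffer()`, whereas a synchronous reader inside a Web Worker may behave differently, so the timings are only indicative.

```javascript
// Build a list of `count` reads of `size` bytes at random offsets.
// `random` defaults to Math.random and can be swapped for repeatable runs.
function planRandomReads(fileSize, size, count, random = Math.random) {
  if (size > fileSize) throw new RangeError('read size exceeds file size');
  const maxStart = fileSize - size;
  const plan = [];
  for (let i = 0; i < count; i++) {
    plan.push({ offset: Math.floor(random() * (maxStart + 1)), size });
  }
  return plan;
}

// Read each planned slice from a Blob/File and record how long it took.
async function timeReads(blob, plan) {
  const results = [];
  for (const { offset, size } of plan) {
    const start = performance.now();
    const buffer = await blob.slice(offset, offset + size).arrayBuffer();
    results.push({ offset, size: buffer.byteLength, ms: performance.now() - start });
  }
  return results;
}
```

Running `timeReads(file, planRandomReads(file.size, n, 20))` for `n` of a few KB, a few MB and a few hundred MB would cover the cases asked about above.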

@Jaifroid
Collaborator Author

@mgautierfr The bottleneck is entirely inside the libzim WASM from what I can tell. It's not the Xapian I/O (inside libzim) that is causing the bottleneck, as we are simply loading the ZIM archive into libzim at this point. I presume it starts to read metadata, establish caches, etc. when it runs the new zim::Archive(filename) function inside libzim. This is what takes a ridiculous amount of time on Android. Xapian search is initiated later (only when a user starts to search), and that is actually pretty fast (at least relative to this bottleneck) once the archive has loaded.

File reading on Android is always slow, but not this slow with an archive like Ray Charles! Our legacy back end, which emulates libzim in JavaScript, can load full English Wikipedia on Android, a bit slowly, but acceptably. Hence I'm so surprised that simply loading Ray Charles in libzim WASM is taking so long on Android, and that archives larger than about 500MB never seem to finish loading. I guess we're a bit stuck with this issue if there's no scope for reducing I/O or whatever it is that is taking so long. I'll keep investigating...

@mgautierfr
Collaborator

Can you get a trace of all I/O made by libzim (how many bytes are read, and at which offsets)?
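
One way such a trace might be collected on the JavaScript side is to wrap the file system's read function. The sketch below is generic: `traceReads` and `summarise` are hypothetical helpers, and the `read(stream, buffer, offset, length, position)` shape is an assumption modelled on how Emscripten file system backends expose their stream operations. Which object to wrap in a given build (for example the WORKERFS stream operations) would need to be checked against the Emscripten version in use.

```javascript
// Wrap an object exposing read(stream, buffer, offset, length, position)
// so that every call is recorded. Returns the log and a restore function.
function traceReads(ops) {
  const original = ops.read;
  const log = [];
  ops.read = function (stream, buffer, offset, length, position) {
    const bytesRead = original.call(this, stream, buffer, offset, length, position);
    log.push({ position, length, bytesRead });
    return bytesRead;
  };
  return { log, restore: () => { ops.read = original; } };
}

// Summarise a trace: number of reads, total bytes, and backward seeks
// (reads that start before the start of the previous read).
function summarise(log) {
  let totalBytes = 0;
  let backwardSeeks = 0;
  for (let i = 0; i < log.length; i++) {
    totalBytes += log[i].bytesRead;
    if (i > 0 && log[i].position < log[i - 1].position) backwardSeeks++;
  }
  return { reads: log.length, totalBytes, backwardSeeks };
}
```

Installing the wrapper just before the archive is opened and calling `summarise` once loading finishes would show whether the time goes into many small reads, a few large ones, or scattered seeks.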

@Jaifroid
Collaborator Author

Good idea. I probably can't do that for the WASM build (which is a low-level binary format), but I can probably do it in the ASM (asm.js) version, which is plain JavaScript and so vaguely human-readable and traceable/debuggable.

@stale

stale bot commented May 26, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
