I'm trying to create an interactive CSV mapper / parser / uploader.
Assumptions:
CSV file is too large to fit in memory.
The file is uploaded in chunks.
The uploading / server-side processing is slower than the client-side parsing / processing.
I use web workers to keep the page responsive.
I'm queuing ajax requests and firing them one after another.
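The "queuing ajax requests and firing them one after another" part can be sketched as a small serial queue. This is a simulation under assumptions, not PapaParse code: `createUploadQueue` and its callback-based `send(chunk, done)` signature are made up for illustration; in the browser `send` would wrap a real `fetch()` or `XMLHttpRequest`.

```javascript
// Minimal serial upload queue: chunks wait in line and are sent one at
// a time; the next send starts only after the previous one calls done().
function createUploadQueue(send) {
  const pending = [];
  let busy = false;
  function pump() {
    if (busy || pending.length === 0) return;
    busy = true;
    const chunk = pending.shift();
    // `send` must invoke `done` once the server has acknowledged the chunk.
    send(chunk, function done() {
      busy = false;
      pump();
    });
  }
  return {
    enqueue(chunk) { pending.push(chunk); pump(); },
    size() { return pending.length; }
  };
}

// Usage sketch: acknowledgements are held in `acks` to mimic a slow
// server, so only one chunk is ever in flight at a time.
const sent = [];
const acks = [];
const queue = createUploadQueue((chunk, done) => { sent.push(chunk); acks.push(done); });
queue.enqueue('chunk-1');
queue.enqueue('chunk-2');
queue.enqueue('chunk-3');
// Only 'chunk-1' has been sent; the rest wait for acknowledgements.
acks.shift()(); // server acks chunk-1 → chunk-2 is sent
```

Note that nothing here throttles the producer: the `pending` array still grows without bound if parsing outpaces uploading, which is exactly the problem described below.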
Problem:
Each time a chunk has been read the data gets copied to the main thread.
Data gets parsed faster than it gets consumed, but I cannot pause the worker.
So memory usage increases and browser (theoretically) runs out of memory.
Possible solutions (also based on other issues mentioning pausing for web workers in general):
Add a callback that gets copied to the worker thread. This callback runs synchronously.
The callback can massage data before sending it to the main thread callback. (Or just compute some statistics and send those)
The worker callback can pause and resume parsing.
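The worker-callback idea above can be sketched as follows. This simulates a chunk-driven parser loop rather than using PapaParse's real internals; `createParser` and the `'pause'` return convention are assumptions for illustration only.

```javascript
// Simulated chunk-driven parser loop. The per-chunk callback runs
// synchronously inside the worker; returning 'pause' stops the loop,
// and resume() continues from the next unread chunk.
function createParser(chunks, onChunk) {
  let index = 0;
  let paused = false;
  function run() {
    paused = false;
    while (index < chunks.length && !paused) {
      // The callback may massage the data, or just compute statistics,
      // before anything is posted back to the main thread.
      if (onChunk(chunks[index++]) === 'pause') paused = true;
    }
  }
  return { run, resume: run, isPaused: () => paused };
}

// Usage sketch: pause after an interesting chunk, resume later.
const seen = [];
const parser = createParser(['a', 'b', 'c', 'd'], (chunk) => {
  seen.push(chunk);
  if (chunk === 'b') return 'pause';
});
parser.run();    // processes 'a', 'b', then pauses
parser.resume(); // processes 'c', 'd'
```

Because the callback is synchronous, the pause takes effect immediately, before the next chunk is read, which is the key difference from the message-based approach discussed next.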
Note that while #130 notes the downsides of allowing the asynchronous callback to pause workers, it does not explore other solutions.
For example, the parse object (Papa.parse(File, config)) could expose the pause, abort and resume functions and post messages to the worker. This would allow for asynchronous pausing (i.e. pausing would happen whenever the worker thread next checks its messages), but that is acceptable for most use cases.
For me it's unclear why resuming would be an issue. The only issue I currently see is that we cannot get a reference to the worker from outside the parse object, but that is something that can easily be fixed.
Just noticed that when using workers, Papa.parse has no return value at all. It would be easy to create a management object that holds a reference to the worker and exposes pause and resume functions that send messages to the worker, right?
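Such a management object could look roughly like this. None of this is existing PapaParse API: `createParseHandle` is hypothetical, and a plain object stands in for the real `Worker` since the point is only the message protocol.

```javascript
// A handle Papa.parse could return instead of nothing: it keeps the
// worker reference private and exposes pause/resume/abort, each of
// which just posts a control message to the worker.
function createParseHandle(worker) {
  return {
    pause: () => worker.postMessage({ type: 'pause' }),
    resume: () => worker.postMessage({ type: 'resume' }),
    abort: () => worker.postMessage({ type: 'abort' })
  };
}

// Stand-in for a real Worker: it just records control messages.
// In the browser this would be `new Worker('parser.js')`, and the
// worker script would check for these messages between chunks —
// hence the pause is asynchronous, as discussed above.
const messages = [];
const fakeWorker = { postMessage: (m) => messages.push(m.type) };

const handle = createParseHandle(fakeWorker);
handle.pause();
handle.resume();
```

The worker reference never leaks out of the handle, so the concern about getting a reference to the worker from outside the parse object goes away.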
Since this is an issue open since 2015, I don't have high hopes of it being addressed.
But I too have exactly the same scenario as described above. I think it's a pretty significant shortcoming not to be able to pause processing in workers.
Yes, it will slow down processing, as the FAQ warns, but in my opinion that should not be a reason not to have this feature at all. The user has to wait for server-side processing to complete anyway, so it doesn't matter if parsing is slower.
Currently we have no way of staggering that, as the parser just dumps chunk after chunk, which overloads the server.
The only way to solve this as far as I can see, would be to store the output chunks in memory, but that would defeat the purpose of streaming the file in the first place.
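One way to reconcile this without unbounded buffering is backpressure: pause the parser once the number of chunks produced but not yet uploaded crosses a high-water mark, and resume once it drops back to a low-water mark. A minimal sketch, assuming hypothetical `pause`/`resume` hooks (the thresholds and `createBackpressure` name are made up for illustration):

```javascript
// Backpressure controller: tracks chunks produced but not yet uploaded,
// pausing the producer at `high` in-flight chunks and resuming it once
// the backlog has drained down to `low`.
function createBackpressure({ high, low, pause, resume }) {
  let inFlight = 0;
  let paused = false;
  return {
    produced() {            // call from the parser's chunk callback
      inFlight++;
      if (!paused && inFlight >= high) { paused = true; pause(); }
    },
    consumed() {            // call when an upload completes
      inFlight--;
      if (paused && inFlight <= low) { paused = false; resume(); }
    },
    isPaused: () => paused
  };
}

// Usage sketch: pause at 3 buffered chunks, resume at 1.
const events = [];
const bp = createBackpressure({
  high: 3,
  low: 1,
  pause: () => events.push('pause'),   // would post {type:'pause'} to the worker
  resume: () => events.push('resume')  // would post {type:'resume'} to the worker
});
bp.produced(); bp.produced(); bp.produced(); // backlog hits 3 → pause
bp.consumed(); bp.consumed();                // backlog drains to 1 → resume
```

With two thresholds instead of one, the parser isn't toggled on every single chunk, and memory stays bounded by the high-water mark instead of growing with the file.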