The file size I'm using is about 200MB. I've already made this work by dumping everything into memory, but I want to use streams and put some custom logic into the parsing step to speed it up.
Doing this with streams is easy enough in Node. To do the same in the browser (as far as I can tell), you need to turn an <input> file into a stream, push it through a DecompressionStream('gzip').writable, and then push that into a stream-capable papaparse parser. So I have this:
const fileInput = document.getElementById('selectFileBtn');

async function parseGzippedCsv(file) {
  const parser = papaparse.parse(papaparse.NODE_STREAM_INPUT)
  parser.on('data', (chunk) => {
    data.push(chunk);
  });
  parser.on('end', () => {
    // I actually need to wrap in a Promise to make this work, but it fails before it gets here
    resolve(data);
  });
  file.stream()
    .pipeTo(new DecompressionStream('gzip').writable)
    .pipeTo(parser);
}

fileInput.addEventListener('change', async function(e) {
  const file = e.target.files[0];
  const data = await parseGzippedCsv(file);
});
The error is: TypeError: Cannot read properties of null (reading 'stream') on the line that creates the papaparse parser, so maybe I can't use papaparse.NODE_STREAM_INPUT in the browser...?
I've also tried to do something similar with csv-parse without success. I'm a bit surprised nobody else wants to do this in the browser 🤔
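For reference, here is a minimal sketch of how the browser side of this pipeline might be wired with the standard Streams API alone. One detail that affects the chaining above: `pipeTo()` returns a Promise rather than a stream, so it cannot be chained, whereas `pipeThrough()` accepts a transform such as `DecompressionStream` and returns its readable side. The names `createLineSplitter` and `parseGzippedCsvStream` are hypothetical, and the comma split is only a naive stand-in for a real CSV parser (it ignores quoted fields and embedded newlines).

```javascript
// Sketch only: helper names are hypothetical, not part of papaparse or csv-parse.

// Buffers text chunks and returns only complete lines; a trailing partial
// line is held back until the next chunk (or flush) arrives.
function createLineSplitter() {
  let buffered = '';
  return {
    push(chunk) {
      const parts = (buffered + chunk).split(/\r?\n/);
      buffered = parts.pop(); // last piece is an incomplete line, or ''
      return parts;
    },
    flush() {
      const rest = buffered;
      buffered = '';
      return rest === '' ? [] : [rest];
    },
  };
}

// Streams a gzipped CSV File/Blob row by row without holding it all in memory.
// onRow is a caller-supplied callback (assumption) that receives each row
// as an array of strings. Resolves with the number of rows seen.
async function parseGzippedCsvStream(file, onRow) {
  const textStream = file
    .stream()
    .pipeThrough(new DecompressionStream('gzip')) // gzip bytes -> raw bytes
    .pipeThrough(new TextDecoderStream());        // raw bytes -> UTF-8 text

  const splitter = createLineSplitter();
  const reader = textStream.getReader();
  let rowCount = 0;

  for (;;) {
    const { done, value } = await reader.read();
    const lines = done ? splitter.flush() : splitter.push(value);
    for (const line of lines) {
      onRow(line.split(',')); // naive split; swap in a real CSV parser here
      rowCount += 1;
    }
    if (done) break;
  }
  return rowCount;
}
```

In the `change` handler this could be called as `await parseGzippedCsvStream(e.target.files[0], (row) => { /* custom logic */ })`. Because the function is `async` and only returns once the reader reports `done`, no manual Promise wrapper is needed.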