Doubt regarding workSheetReader.abort and workSheetReader.skip function. #57

Open
gaurav-cointab opened this issue Sep 9, 2020 · 3 comments

@gaurav-cointab

Is my understanding of the workSheetReader.abort() and workSheetReader.skip() functions correct?

As I understand it, processing of a worksheet stops as soon as workSheetReader.abort() is called, and the reader then moves on to the next sheet. If all the remaining sheets are skipped by calling workSheetReader.skip(), processing of the Excel file should stop right there and the workBookReader.on('end') event should fire.

If that understanding is correct, then workSheetReader.abort() and workSheetReader.skip() are not working as expected.

I have a very large file (300 MB) and I read only the first sheet, like this:

if (workSheetReader.id > 1) {
    workSheetReader.skip();
    return;
}

I am trying to read only the header row with this code:

workSheetReader.on('row', function (row) {
    if (row.attributes.r == 1) {
        // do something with row 1 like save as column names
    } else {
        workSheetReader.abort();
    }
});

The code stops only after a good 10-15 minutes, so I assume it is processing all the other rows and sheets as well before it stops.
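
One possible reading of this symptom, sketched in plain JavaScript rather than with the library itself: ignoring rows after a flag is set does not stop the producer, whereas actually leaving the read loop does. The `rowSource` generator and both helper functions are hypothetical stand-ins, and whether workSheetReader.abort() behaves like the flag variant is an assumption, not something confirmed from the library source.

```javascript
// Hypothetical stand-in for a large worksheet: yields `total` row objects
// and counts how many rows were actually produced.
function* rowSource(total, counter) {
  for (let r = 1; r <= total; r += 1) {
    counter.produced += 1;
    yield { r };
  }
}

// Strategy 1: set a flag and ignore later rows.
// The producer still runs all the way to the end.
function readHeaderWithFlag(total) {
  const counter = { produced: 0 };
  let header = null;
  let aborted = false;
  for (const row of rowSource(total, counter)) {
    if (aborted) continue; // row is ignored, but it was still produced
    if (row.r === 1) header = row;
    else aborted = true;
  }
  return { header, produced: counter.produced };
}

// Strategy 2: leave the loop, which stops the producer itself.
function readHeaderWithBreak(total) {
  const counter = { produced: 0 };
  let header = null;
  for (const row of rowSource(total, counter)) {
    if (row.r !== 1) break; // first non-header row: stop reading
    header = row;
  }
  return { header, produced: counter.produced };
}

console.log(readHeaderWithFlag(1000000).produced);  // 1000000
console.log(readHeaderWithBreak(1000000).produced); // 2
```

With Node streams, the rough analogue of `break` is destroying the source stream (for example with `stream.destroy()`) instead of only ignoring its events.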

@DaSpawn
Owner

DaSpawn commented Sep 9, 2020

After a quick look, I do not think those functions are very efficient or work as intended, and I don't actually use them in production. I suspect they could be optimized, but I would need time and test data.

@gaurav-cointab
Author

What would you suggest as a better way to read just the header of the first sheet and then stop reading the Excel file?

@KEXUJIAN

> What would you suggest as a better way to read just the header of the first sheet and then stop reading the Excel file?

Hi, I ran into the same situation. Here is my workaround:

// skip other sheets
if (Number(workSheetReader.id) > 1) {
    workSheetReader.skip();
    return;
}

// handle the title
workSheetReader.on('row', row => {
    let i = Number(row.attributes.r) // assuming that the first line is the title
    if (i === 1) {
        // do something with the title

        // if you don't want to handle the sheet any more
        workSheetReader.removeAllListeners('row')
        workSheetReader.abort()
        workSheetReader.skip()
    }
}).prependListener('end', () => {
    // Prepend our own 'end' callback, which removes all 'end' listeners to avoid the error
    // "TypeError: Cannot read property 'path' of undefined
    // at Immediate.processBooks (/xxxxx/node_modules/xlsx-stream-reader/lib/workbook.js:188:65)".

    // workSheetReader.skip() emits its own 'end' event while the real stream is still being piped.
    // When the worksheet stream reaches its end, it emits workSheetReader's 'end' event again, and the
    // repeated 'end' events push the `currentBook` variable past the length of `waitingWorkSheets`.

    workSheetReader.removeAllListeners('end')
}).process()
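
The duplicate 'end' problem described in those comments can be reproduced with Node's EventEmitter alone. The sketch below is a hypothetical model, not the library's code: `runWorkbook`, `waiting`, and `current` are illustrative names standing in for the workbook logic, `waitingWorkSheets`, and `currentBook`. It relies on one documented EventEmitter behavior: listeners removed during an emit still run for that emit, so the guard lets the first 'end' through and silences the second.

```javascript
const { EventEmitter } = require('events');

// Hypothetical model: a "workbook" steps through queued sheets and advances
// an index every time the current sheet emits 'end'.
function runWorkbook(sheetCount, useGuard) {
  const waiting = [];
  for (let i = 1; i <= sheetCount; i += 1) waiting.push({ path: `sheet${i}.xml` });
  let current = 0;
  const processed = [];

  function processSheet() {
    if (current === waiting.length) return; // all sheets done
    const sheet = new EventEmitter();
    // Throws a TypeError if `current` has overrun the queue.
    processed.push(waiting[current].path);
    sheet.on('end', () => {
      current += 1;
      processSheet();
    });
    if (useGuard) {
      // Same idea as the workaround: the first 'end' removes every 'end' listener,
      // so a second 'end' has nothing left to trigger.
      sheet.prependListener('end', () => sheet.removeAllListeners('end'));
    }
    // Simulate the double 'end': one from skip(), one from the real stream.
    sheet.emit('end');
    sheet.emit('end');
  }

  processSheet();
  return processed;
}

console.log(runWorkbook(2, true)); // [ 'sheet1.xml', 'sheet2.xml' ]
try {
  runWorkbook(2, false);
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```

Without the guard, the second 'end' on the last sheet moves the index past the end of the queue, and reading `.path` from the missing entry throws.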
