fix: opening large files
VSCode only supports files up to 50MB so stop before resulting JSON gets to this size.
See microsoft/vscode#31078

Fixes #114, #74
dvirtz committed Jan 28, 2024
1 parent 6579bfa commit 9c76323
Showing 2 changed files with 19 additions and 1 deletion.
10 changes: 9 additions & 1 deletion README.md
@@ -53,6 +53,14 @@ The following setting options are available:
|`parquet-viewer.jsonSpace`|0|JSON indentation space, passed to `JSON.stringify` as is, see [mdn](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#parameters) for details. Doesn't apply when `parquet-viewer.backend` is `parquet-tools`.|
|`parquet-viewer.parquetToolsPath`|`parquet-tools`|The name of the parquet-tools executable or a path to the parquet-tools jar|

### What's new
## Notes

### Size limit

VSCode allows extensions to work on files smaller than 50MB.
If the data is larger, it will be truncated and a message indicating that will be appended to the output.
See https://github.com/microsoft/vscode/issues/31078 for details.

## What's new

See [CHANGELOG.md](CHANGELOG.md)
10 changes: 10 additions & 0 deletions src/parquet-document.ts
@@ -64,6 +64,10 @@ export default class ParquetDocument implements vscode.Disposable {
this._lastMod = mtimeMs;

const lines: string[] = [];
const encoder = new TextEncoder();
const FILE_SIZE_MB_LIMIT = 50;
const limitExceededMsg = JSON.stringify({warning: `file size exceeds ${FILE_SIZE_MB_LIMIT}MB limit`});
let totalByteLength = encoder.encode(limitExceededMsg).byteLength;

await vscode.window.withProgress({
location: vscode.ProgressLocation.Notification,
@@ -72,6 +76,12 @@ export default class ParquetDocument implements vscode.Disposable {
},
async (progress, token) => {
for await (const line of this._backend.toJson(this._parquetPath, token)) {
const lineByteLength = encoder.encode(`${line}${os.EOL}`).byteLength;
totalByteLength += lineByteLength;
if (totalByteLength >= FILE_SIZE_MB_LIMIT * 1024 * 1024) {
lines.push(limitExceededMsg);
break;
}
lines.push(line);
}
}
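
The byte-budget check in this hunk can be read in isolation as the sketch below. It is not code from the extension: `collectWithinLimit` and `fromArray` are hypothetical names, the end-of-line string defaults to `"\n"` instead of `os.EOL`, and the limit is passed in bytes rather than derived from `FILE_SIZE_MB_LIMIT`. It shows the two ideas the change relies on: the warning's size is reserved up front, and sizes are measured as UTF-8 bytes rather than string length.

```typescript
// Hypothetical helper, not part of the extension: collects lines from an
// async source and stops once the next line would reach the byte limit.
async function collectWithinLimit(
  source: AsyncIterable<string>,
  limitBytes: number,
  warning: string,
  eol: string = "\n",
): Promise<string[]> {
  const encoder = new TextEncoder();
  const lines: string[] = [];
  // Count the warning first so that the truncated output, warning included,
  // stays below the limit.
  let totalBytes = encoder.encode(warning).byteLength;

  for await (const line of source) {
    // Measure encoded bytes: a non-ASCII character can take more than one.
    totalBytes += encoder.encode(line + eol).byteLength;
    if (totalBytes >= limitBytes) {
      lines.push(warning);
      break;
    }
    lines.push(line);
  }
  return lines;
}

// Tiny async source for trying the helper out (also hypothetical).
async function* fromArray(items: string[]): AsyncGenerator<string> {
  for (const item of items) {
    yield item;
  }
}
```

With a 12-byte limit and a 1-byte warning, two 4-character ASCII lines fit and the third triggers truncation. A line such as `"ééé"` counts as 7 bytes including the newline, even though it is only 4 characters long, which is why the check uses `TextEncoder` rather than `string.length`.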