Error occurs when parsing Parquet files using Brotli compression #140

Closed
craxal opened this issue Sep 4, 2024 · 2 comments · Fixed by #144

craxal commented Sep 4, 2024

Steps to reproduce

Use the code under "Any other comments?" below to parse a Parquet file that uses Brotli compression.

Expected behaviour

The parse is successful.

Actual behaviour

An error occurs:

la.__wbindgen_malloc is not a function

Note that this does not happen for other supported compression methods.

Any logs, error output, etc?

This is the error:

{
  "name": "TypeError",
  "message": "la.__wbindgen_malloc is not a function",
  "stack": "TypeError: la.__wbindgen_malloc is not a function\n    at passArray8ToWasm (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:27:2264713)\n    at ju.exports.decompress (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:27:2265227)\n    at Object.inflate_brotli [as inflate] (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:27:2267375)\n    at Object.inflate (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:27:2267061)\n    at decodeDictionaryPage (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:72:280198)\n    at decodePage (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:72:279268)\n    at /Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:72:277570\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async e.readColumnChunk (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:72:277516)\n    at async e.readRowGroup (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:72:276909)"
}

Any other comments?

This occurs within the context of another app. I don't know for sure if this happens "out of the box" or not. Could this be due to a bad or missing Rust binding in brotli-wasm? Here's the TypeScript code that is used:

import { ParquetReader } from "@dsnp/parquetjs";

async function* generateRecord(reader: ParquetReader): AsyncGenerator<any[]> {
  const cursor = reader.getCursor();
  let record: any;
  while (record = await cursor.next()) {
    yield record;
  }
}

const buffer = ...; // Retrieve by downloading from some URL or reading from local file.

const reader = await ParquetReader.openBuffer(buffer);
const schema = reader.getSchema();
const columns = Object.keys(schema.fields);
const rows = [];

for await (const record of generateRecord(reader)) {
  const rowArray = columns.reduce<TabularCellValue[]>((acc, key) => {
    const value = record[key];
    if (value === null || value === undefined) {
      acc.push(undefined);
    } else if (value instanceof Date) {
      acc.push({ type: "Date", value: value.getTime() });
    } else if (typeof value === "bigint") {
      acc.push({ type: "BigInt", value: value.toString() });
    } else if (typeof value === "object") {
      acc.push(JSON.stringify(value, (_, v) => typeof v === "bigint" ? v.toString() : v));
    } else if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      acc.push(value);
    } else {
      acc.push(undefined);
    }

    return acc;
  }, []);
  rows.push(rowArray);
}

...
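
The per-cell branching in the snippet above can be pulled out into a small pure function, which makes it possible to test the conversion without a Parquet file or the Brotli code path. This is only a sketch: the issue never shows how `TabularCellValue` is defined, so the type below is an assumption inferred from the values the snippet pushes, and the names `toCellValue` and `toRow` are hypothetical.

```typescript
// Assumed shape: the issue uses TabularCellValue but does not show its definition.
type TabularCellValue =
  | undefined
  | string
  | number
  | boolean
  | { type: "Date"; value: number }
  | { type: "BigInt"; value: string };

// Hypothetical helper: converts one raw record value into a JSON-safe cell,
// following the same branch order as the reduce callback in the issue.
function toCellValue(value: unknown): TabularCellValue {
  if (value === null || value === undefined) {
    return undefined;
  }
  if (value instanceof Date) {
    return { type: "Date", value: value.getTime() };
  }
  if (typeof value === "bigint") {
    return { type: "BigInt", value: value.toString() };
  }
  if (typeof value === "object") {
    // JSON.stringify throws on bigint, so nested bigints are converted to strings.
    return JSON.stringify(value, (_key, v) => (typeof v === "bigint" ? v.toString() : v));
  }
  if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
    return value;
  }
  // Anything else (symbols, functions) is dropped.
  return undefined;
}

// Hypothetical helper: builds one row in column order; missing keys become undefined.
function toRow(record: Record<string, unknown>, columns: string[]): TabularCellValue[] {
  return columns.map((key) => toCellValue(record[key]));
}
```

With these helpers, the loop body would reduce to `rows.push(toRow(record, columns));`.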

wilwade commented Sep 11, 2024

craxal commented Sep 11, 2024

Actually, this may have been a problem on our end. I tried the same thing "out of the box", and it worked just fine. We use esbuild to bundle things, but it only works well with static imports.
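
For anyone hitting the same error in a bundled app, one workaround that might be worth trying is to keep `brotli-wasm` out of the bundle so Node resolves it from `node_modules` at runtime. The sketch below is an assumption, not something confirmed in this thread: the entry point and output paths are hypothetical, and only the `bundle`, `platform`, `outfile`, and `external` options are real esbuild build options. The local `BundleOptions` interface exists only so the snippet type-checks without esbuild installed.

```typescript
// Local subset of esbuild's build options, declared here so the sketch is self-contained.
interface BundleOptions {
  entryPoints: string[];
  bundle: boolean;
  platform: "node" | "browser" | "neutral";
  outfile: string;
  external: string[];
}

const bundleOptions: BundleOptions = {
  entryPoints: ["src/index.ts"], // hypothetical entry point
  bundle: true,
  platform: "node",
  outfile: "dist/src/index.js", // hypothetical output path
  // Assumption: excluding brotli-wasm leaves its import in the output as-is,
  // so the package and its .wasm file are loaded from node_modules at runtime.
  external: ["brotli-wasm"],
};

// With esbuild installed, these options would be passed as:
//   await esbuild.build(bundleOptions);
```

Whether this helps depends on how the app ships its `node_modules`; the thread itself only establishes that the unbundled setup worked.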
