During ideation for reporting data in DuckDB, I read one of their blog posts (https://duckdb.org/2023/03/03/json.html) that gives an example of loading a large (>10 GB) compressed JSON archive into memory.
It would be really nice if we could support this with our file IO. In theory, the following steps would need to happen:
- Add a `compression: str | None = None` argument on `(read|write)` that gives the option of applying compression when writing a record (also prompting a lookup in a dictionary of compression algos, like https://github.com/fsspec/filesystem_spec/blob/master/fsspec/utils.py#L138). This argument should also support the string "infer" as a special value, indicating that the compression should be inferred from the input filename (see the first sketch after this list).
- Give the option of passing a directory name to `(read|write)_batched` that takes all records, reads/writes them to that directory in the given driver mode, and then compresses said directory (see the second sketch after this list).
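
A minimal sketch of what the `compression` argument could look like, leaning on fsspec's `open`, which already accepts a `compression` keyword and understands `"infer"`. The `read`/`write` helpers and the JSON-lines layout here are assumptions for illustration, not the project's actual API:

```python
import json

import fsspec


def write(records: list[dict], path: str, compression: str | None = None) -> None:
    """Hypothetical writer: serialize records as JSON lines to ``path``.

    ``compression`` may be a codec name (e.g. "gzip"), None for no compression,
    or "infer" to pick the codec from the filename extension -- fsspec handles
    the lookup against its registry of compression algorithms.
    """
    with fsspec.open(path, "wt", compression=compression) as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read(path: str, compression: str | None = None) -> list[dict]:
    """Hypothetical reader mirroring ``write``."""
    with fsspec.open(path, "rt", compression=compression) as f:
        return [json.loads(line) for line in f if line.strip()]


# Usage: "infer" picks gzip from the .gz suffix.
write([{"id": 1}, {"id": 2}], "records.jsonl.gz", compression="infer")
print(read("records.jsonl.gz", compression="infer"))
```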
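
And a similarly hedged sketch of the batched variant: the `write_batched` name, the one-file-per-record layout, and the choice of a gzipped tarball as the "compressed directory" are all assumptions, since the issue only says the directory should be compressed once the records are written:

```python
import json
import tarfile
from pathlib import Path


def write_batched(records: list[dict], directory: str, compression: str | None = None) -> None:
    """Hypothetical batched writer: one JSON file per record in ``directory``,
    then optionally compress the whole directory into an archive."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    for i, record in enumerate(records):
        (out_dir / f"record_{i}.json").write_text(json.dumps(record))

    if compression == "gzip":
        # Bundle the directory into <directory>.tar.gz after all records are written.
        with tarfile.open(f"{directory}.tar.gz", "w:gz") as archive:
            archive.add(out_dir, arcname=out_dir.name)


# Usage: writes records/ and then packs it into records.tar.gz.
write_batched([{"id": 1}, {"id": 2}], "records", compression="gzip")
```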