Host minimal example dataset for use in tests and examples #11

Robinlovelace · 2024-07-08T10:28:36Z

We can host:

A minimal gzipped csv
Parquet translation

As a release, e.g.: https://github.com/Robinlovelace/spanishoddata/releases/tag/v0.0.1

If that does not allow direct querying of the parquet files, as I expect, we can host elsewhere e.g. on GitHub pages.
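A minimal sketch of what producing the "minimal gzipped csv" could look like, using only the Python standard library. The column names and values below are hypothetical placeholders, not the real schema of the data discussed here.

```python
import csv
import gzip
import io

# Hypothetical columns and values, for illustration only.
FIELDNAMES = ["date", "origin", "destination", "trips"]
ROWS = [
    {"date": "2024-01-01", "origin": "A", "destination": "B", "trips": 12},
    {"date": "2024-01-01", "origin": "B", "destination": "A", "trips": 7},
]


def to_gzipped_csv(rows, fieldnames):
    """Serialise rows to CSV text and return it gzip-compressed as bytes."""
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def read_gzipped_csv(payload):
    """Decompress gzip bytes and parse the CSV back into a list of dicts."""
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text, newline="")))


payload = to_gzipped_csv(ROWS, FIELDNAMES)
round_trip = read_gzipped_csv(payload)
print(len(payload), "bytes compressed;", len(round_trip), "rows read back")
```

Note that values read back from CSV are strings, so a test fixture built this way would need explicit type conversion before comparing against typed data.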

e-kotov · 2024-07-13T15:05:55Z

Uploading to releases works (at least for data sets that are not too large), as documented in this brilliant example: https://docs.ropensci.org/piggyback/articles/cloud_native.html#duckdb. The files in a GitHub release can be queried efficiently using {duckdb}. However, this is of course inferior to just having the parquet files in a hive-style layout on S3 storage. But it is probably good enough for a proof of concept.
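As a rough sketch of the idea above, the snippet below builds the download URL of a release asset and a DuckDB SQL query that reads it. The owner, repository, and tag come from the release linked earlier in this thread; the asset name `minimal.parquet` is a hypothetical placeholder, and no such file is assumed to exist. The code only builds strings and makes no network request.

```python
def release_asset_url(owner, repo, tag, filename):
    """Build the download URL GitHub uses for a file attached to a release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def parquet_preview_query(url, limit=5):
    """Return DuckDB SQL that reads a remote Parquet file and previews a few rows."""
    return f"SELECT * FROM read_parquet('{url}') LIMIT {limit}"


# 'minimal.parquet' is a hypothetical asset name, used for illustration only.
url = release_asset_url("Robinlovelace", "spanishoddata", "v0.0.1", "minimal.parquet")
query = parquet_preview_query(url)
print(query)
```

Assuming the `duckdb` Python package and its `httpfs` extension are available, a query like this could be passed to `duckdb.sql(query)`. Whether it performs well enough on release assets would need to be checked against a real uploaded file.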

Robinlovelace · 2024-07-14T14:47:54Z

That is really good to see!
