Improve indexes quickstart #197

Merged
merged 1 commit into from
Sep 30, 2024
21 changes: 21 additions & 0 deletions acceleration/indexes/README.md
@@ -2,6 +2,13 @@

Indexes can be created on accelerated datasets (for supported engines) to improve query performance. This quickstart shows how to create an index and compares query performance on the same accelerated dataset with and without it.
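
As a rough sketch of what such a setup can look like, the Spicepod fragment below declares the same Parquet file twice, once with an index and once without. The dataset names, file name, and `duckdb` file-mode acceleration are taken from the runtime log in this quickstart; the Spicepod name `indexes-quickstart` and the indexed column `trace_id` are assumptions made for illustration, so check the `spicepod.yaml` shipped in the `indexes` directory for the actual values.

```yaml
version: v1beta1
kind: Spicepod
name: indexes-quickstart # hypothetical name, for illustration only

datasets:
  # Accelerated dataset with an index on one column.
  - from: file:large_eth_traces.parquet
    name: traces
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      indexes:
        trace_id: enabled # assumed column name; use the column your queries filter on

  # Same source data, accelerated without any index, used as the baseline.
  - from: file:large_eth_traces.parquet
    name: traces_no_index
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
```

Declaring both datasets from the same file keeps the comparison fair: the only difference between the two accelerated tables is the `indexes` entry.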

Clone the Spice quickstarts repository and navigate to the `indexes` directory:

```bash
git clone https://github.com/spiceai/quickstarts.git
cd quickstarts/acceleration/indexes
```

**Step 1.** Download the large test dataset locally

Indexes only make a noticeable difference on large datasets. This dataset is 1.5 GB compressed and expands to more than 7 GB in memory.
@@ -18,6 +25,20 @@ Spice will start and load the dataset into sqlite. **This may take several minut
spice run
```

```console
2024-09-30T18:04:26.070605Z INFO runtime::flight: Spice Runtime Flight listening on 127.0.0.1:50051
2024-09-30T18:04:26.070827Z INFO runtime::http: Spice Runtime HTTP listening on 127.0.0.1:8090
2024-09-30T18:04:26.070596Z INFO runtime::metrics_server: Spice Runtime Metrics listening on 127.0.0.1:9090
2024-09-30T18:04:26.078670Z INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
2024-09-30T18:04:26.270747Z INFO runtime: Initialized results cache; max size: 128.00 MiB, item ttl: 1s
2024-09-30T18:04:26.286500Z INFO runtime: Dataset traces registered (file:large_eth_traces.parquet), acceleration (duckdb:file), results cache enabled.
2024-09-30T18:04:26.287326Z INFO runtime: Dataset traces_no_index registered (file:large_eth_traces.parquet), acceleration (duckdb:file), results cache enabled.
2024-09-30T18:04:26.287668Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset traces
2024-09-30T18:04:26.288332Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset traces_no_index
2024-09-30T18:05:00.532792Z INFO runtime::accelerated_table::refresh_task: Loaded 7,595,994 rows (7.04 GiB) for dataset traces in 34s 245ms.
2024-09-30T18:05:00.683737Z INFO runtime::accelerated_table::refresh_task: Loaded 7,595,994 rows (7.04 GiB) for dataset traces_no_index in 34s 395ms.
```

**Step 3.** Run a query on the dataset without an index

Once the datasets have loaded, run the following query in the SQL REPL via `spice sql` to get a single row: