Benchmarks ‐ Old

Matt Hicks edited this page Aug 21, 2024 · 1 revision

This benchmark uses the data files published by IMDb at https://datasets.imdbws.com.

Disclaimer: As many people point out, database benchmarks are often worthless as there is such a broad range of use cases, and no benchmark can effectively represent your needs. This benchmark is no different. It was created to compare the relative performance of general uses only.

The specific datasets used are "Title Akas" (26,838,043 records) and "Title Basics" (8,081,413 records).
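For readers who want to load the same data, here is a minimal sketch of parsing one line of the tab-separated "Title Akas" file. It assumes the eight-column layout that IMDb documents for `title.akas.tsv` and the `\N` marker for missing values. The `TitleAka` case class and `TitleAkaParser` object are hypothetical names for illustration and are not part of the benchmark code.

```scala
import scala.util.Try

// Hypothetical model of one "Title Akas" row; field names mirror the IMDb column headers.
final case class TitleAka(
  titleId: String,
  ordering: Int,
  title: String,
  region: Option[String],
  language: Option[String],
  types: Option[String],
  attributes: Option[String],
  isOriginalTitle: Boolean
)

object TitleAkaParser {
  // IMDb marks a missing value with the literal two-character string \N.
  private def opt(s: String): Option[String] =
    if (s == "\\N") None else Some(s)

  // Returns None for malformed lines, including the header row,
  // whose "ordering" column is not a number.
  def parse(line: String): Option[TitleAka] = {
    // A limit of -1 keeps trailing empty columns.
    val cols = line.split("\t", -1)
    if (cols.length != 8) None
    else
      Try(cols(1).toInt).toOption.map { ordering =>
        TitleAka(
          titleId = cols(0),
          ordering = ordering,
          title = cols(2),
          region = opt(cols(3)),
          language = opt(cols(4)),
          types = opt(cols(5)),
          attributes = opt(cols(6)),
          isOriginalTitle = cols(7) == "1"
        )
      }
  }
}
```

Each parsed row could then be handed to whichever database is under test. The import code itself is not shown here because it depends on each database's client API.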

The initial importing of both datasets can be seen in the graph below:

Import Data

While LightDB imports quickly, MongoDB scales up better for large dataset imports. For a smaller import ("Title Basics"), LightDB is still the fastest.

However, in most use cases, read speed matters most. The following graph shows the performance of querying each AKA record individually, one ID at a time:

Validate Ids

It is arguable that comparing an embedded, bare-metal database against the others is unfair because of the latency they have to account for, but that latency is also a prime reason an embedded database is worth considering. The graph above makes the difference hard to read, so for comparison: LightDB queries roughly 3,312,929 records per second, while the runner-up, ArangoDB, achieves only 193,541 records per second. That's about 17x faster for simple lookups. Lookups this fast suggest the database could serve non-linear, graph-style queries without being bound to a graph database's constructs. In addition, since queries are written in Scala code and run within the application context, the database query language is no longer a barrier to writing effective queries against the data.
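The throughput figures above can be reproduced in spirit with a small timing harness. The sketch below is not the benchmark's actual code: `LookupBenchmark`, `lookupsPerSecond`, and `speedup` are hypothetical names, and the `lookup` function stands in for whatever get-by-ID call a given database exposes.

```scala
object LookupBenchmark {
  // Calls `lookup` once per id and returns the number of lookups per second.
  // Fails if any id does not resolve, so a broken lookup cannot look "fast".
  def lookupsPerSecond[A](ids: Seq[String])(lookup: String => Option[A]): Double = {
    val start = System.nanoTime()
    var found = 0
    ids.foreach { id => if (lookup(id).isDefined) found += 1 }
    val elapsedSeconds = (System.nanoTime() - start) / 1e9
    require(found == ids.size, s"only $found of ${ids.size} ids resolved")
    ids.size / elapsedSeconds
  }

  // Ratio between two throughput figures.
  def speedup(fast: Double, slow: Double): Double = fast / slow
}
```

For example, an in-memory `Map` can stand in for a database: `LookupBenchmark.lookupsPerSecond(ids)(id => store.get(id))`. Applied to the numbers reported above, `LookupBenchmark.speedup(3312929.0, 193541.0)` is about 17.1, which is where the 17x figure comes from. A single wall-clock pass like this ignores JVM warm-up, so repeated runs would be needed for stable numbers.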

The final graph shows a filtered query instead of a simple key lookup. LightDB leverages Yahoo's HaloDB behind the scenes for blazing-fast key/value storage but utilizes Lucene to provide powerful indexing capabilities on the data:

Search Titles
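To illustrate the difference between a key lookup and a filtered query, here is a toy sketch of the general pattern of pairing a key/value store with a separate index. It uses plain in-memory collections as stand-ins for HaloDB and Lucene and does not use LightDB's API. `IndexedStore`, `Title`, `Store`, `get`, and `search` are all hypothetical names, and the tokenizer is far simpler than a Lucene analyzer.

```scala
object IndexedStore {
  final case class Title(id: String, name: String)

  // Lowercased word tokens of a title.
  private def tokens(text: String): Set[String] =
    text.toLowerCase.split("\\W+").filter(_.nonEmpty).toSet

  final class Store(titles: Seq[Title]) {
    // Key/value side: direct lookup by id.
    private val byId: Map[String, Title] =
      titles.map(t => t.id -> t).toMap

    // Index side: token -> ids of the titles containing that token.
    private val index: Map[String, Set[String]] =
      titles
        .flatMap(t => tokens(t.name).map(_ -> t.id))
        .groupBy(_._1)
        .map { case (token, pairs) => token -> pairs.map(_._2).toSet }

    // Simple key lookup, as in the ID validation benchmark.
    def get(id: String): Option[Title] = byId.get(id)

    // Filtered query: resolve matching ids through the index,
    // then fetch each record from the key/value side.
    def search(word: String): Seq[Title] =
      index.getOrElse(word.toLowerCase, Set.empty[String]).toSeq.sorted.flatMap(byId.get)
  }
}
```

The point of the split is that the index only has to produce IDs, so every matching record is still retrieved through the fast key lookup path.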