Implementing a custom database focused on Kawipiko's requirements #14

cipriancraciun · 2024-08-31T08:27:23Z

Maintainer

As Kawipiko is currently implemented, it could use any key-value store to serve its contents. Today it uses the venerable CDB, but it could be easily patched to use something like https://github.com/boltdb/bolt, SQLite, or anything else that resembles a key-value store.

However, Kawipiko's main focus is raw performance and security, so sticking with something like CDB serves its purposes better. Unfortunately, CDB does have some issues, one of which is the 4 GiB size limit, as issue #11 points out.

Thus, as time permits, I've been thinking that perhaps a better solution is to design a custom database format, inspired by CDB, that best suits Kawipiko's needs.

(Also, as discussed in #13, I intend to rewrite Kawipiko in Rust, thus this topic should be viewed from that perspective.)

As Kawipiko is implemented today, for each request it performs several key lookups:

one with the URL as the key, which yields a key for the HTTP body and another key for the HTTP headers;
one with the HTTP body key;
one with the HTTP headers key;
(if the URL is not found, the first lookup is retried with progressively shorter paths.)

Out of all these, only the first step actually requires a hash map; the other two steps can be plain array lookups.
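To make the split between the hashed step and the indexed steps concrete, here is a minimal in-memory sketch. All names (`Entry`, `Archive`, `resolve`, `sample_archive`) are hypothetical and not taken from Kawipiko; the sketch also assumes that several URLs may share one headers record, and it ignores how the arrays would be laid out on disk.

```rust
use std::collections::HashMap;

/// Hypothetical index entry: positions into the two payload arrays.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entry {
    body: u32,
    headers: u32,
}

/// Hypothetical in-memory layout; a real format would map these onto file offsets.
struct Archive {
    by_url: HashMap<String, Entry>, // step 1: the only hashed lookup
    bodies: Vec<Vec<u8>>,           // step 2: plain array indexing
    headers: Vec<Vec<u8>>,          // step 3: plain array indexing
}

impl Archive {
    /// Resolves a URL to its (headers, body) pair.
    /// Returns `None` if the URL is unknown or an index is out of range.
    fn resolve(&self, url: &str) -> Option<(&[u8], &[u8])> {
        let entry = self.by_url.get(url)?;
        let headers = self.headers.get(entry.headers as usize)?;
        let body = self.bodies.get(entry.body as usize)?;
        Some((headers.as_slice(), body.as_slice()))
    }
}

/// Builds a tiny archive in which two URLs share a single headers record.
fn sample_archive() -> Archive {
    let mut by_url = HashMap::new();
    by_url.insert("/index.html".to_string(), Entry { body: 0, headers: 0 });
    by_url.insert("/about.html".to_string(), Entry { body: 1, headers: 0 });
    Archive {
        by_url,
        bodies: vec![b"<h1>home</h1>".to_vec(), b"<h1>about</h1>".to_vec()],
        headers: vec![b"Content-Type: text/html\r\n".to_vec()],
    }
}
```

With this shape, `sample_archive().resolve("/about.html")` hashes only once, and the two follow-up steps are bounds-checked index operations.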

Also, the first step (the one that uses the URL as the key), which implies looping over progressively shorter paths (i.e. /a/b/c, then /a/b/*, then /a/*, then /*, etc.), could perhaps be implemented more efficiently by taking into account that the key is actually a path.
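The fallback sequence described above can be sketched as follows. This is only an illustration of the baseline loop, not Kawipiko's actual code: `candidate_keys` and `lookup` are hypothetical names, and the `*` suffix simply mirrors the example paths in this post.

```rust
use std::collections::HashMap;

/// Lists the keys to try for a request path, most specific first.
fn candidate_keys(path: &str) -> Vec<String> {
    let mut keys = vec![path.to_string()];
    let mut prefix = path;
    // Each iteration drops the last path segment and appends the `*` wildcard.
    while let Some(slash) = prefix.rfind('/') {
        keys.push(format!("{}/*", &prefix[..slash]));
        prefix = &prefix[..slash];
    }
    keys
}

/// Returns the value stored under the first candidate key that exists.
fn lookup<'a, V>(index: &'a HashMap<String, V>, path: &str) -> Option<&'a V> {
    candidate_keys(path).iter().find_map(|key| index.get(key))
}
```

For /a/b/c this performs up to four independent hash lookups and allocates a string per candidate. A path-aware structure (for example a trie keyed by path segment) could presumably resolve the same request in a single walk, which is the kind of improvement hinted at above.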

Also, taking into account that we are implementing HTTP, which has content negotiation (via Accept-Encoding, Accept-Language, etc.), we might also add support for "alternate" payloads stored alongside each entry, thus eliminating extra lookups.
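One possible shape for such "alternate" payloads is sketched below, purely as an assumption about how it might look: each URL entry carries a short list of encodings, stored in the server's order of preference, and the selection happens on that list without touching the index again. `Alternate` and `select` are hypothetical names, and the header parsing is deliberately simplified: q-values (including `q=0`) and the `*` wildcard are ignored.

```rust
/// Hypothetical record: one stored encoding of a body.
struct Alternate {
    encoding: &'static str, // e.g. "br", "gzip", or "identity"
    body: u32,              // index into the bodies array
}

/// Picks the first stored alternate whose encoding appears in the
/// `Accept-Encoding` value; falls back to "identity" if it is stored.
/// Simplification: q-values and the `*` wildcard are not interpreted.
fn select<'a>(alternates: &'a [Alternate], accept_encoding: &str) -> Option<&'a Alternate> {
    // Keep only the encoding names, dropping parameters such as `;q=0.8`.
    let accepted: Vec<&str> = accept_encoding
        .split(',')
        .map(|item| item.split(';').next().unwrap_or("").trim())
        .filter(|name| !name.is_empty())
        .collect();
    alternates
        .iter()
        .find(|alt| accepted.iter().any(|name| name.eq_ignore_ascii_case(alt.encoding)))
        .or_else(|| alternates.iter().find(|alt| alt.encoding == "identity"))
}
```

Because the alternates travel with the URL entry, choosing between them costs a scan over a handful of items rather than another keyed lookup per variant.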

What do you think about this? Any other ideas, observations, or proposals?

lemondevxyz · 2024-08-31T12:31:12Z


Although this isn't an area I am particularly well versed in, I can suggest two options that remove the size limit:

cdb64, as mentioned in issue #11 ("Feature request: Support for CDB64?")
ctrie "is a concurrent thread-safe lock-free implementation of a hash array mapped trie."

ctrie could allow for reloading the file without restarting the server, since its design is based on atomic swaps. It should be noted that ctrie is not as fast as CDB.

ctrie would also support single-file updates: CDB requires rebuilding the file, while ctrie doesn't. No attempts have been made to write the data structure to disk, and that could pose problems of its own...
