-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add interface for PaginatedKVStore. #22
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the PR to describe why pagination is needed?
/// Will create the given `primary_namespace` and `secondary_namespace` if not already present | ||
/// in the store. | ||
fn write( | ||
&self, primary_namespace: &str, secondary_namespace: &str, key: &str, time: i64, buf: &[u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why time
can't be an implementation detail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it could be handled by impl
internally, this would pose some limitations on impl
.
For example, a user of this interface might already have access to a monotonic counter and thus avoid a dependency on std::time
; iiuc, std::time has limited support in Wasm.
While it doesn’t matter much for ldk-server
right now, when we upstream something similar, we might want to do this in a non-std::time
-dependent fashion or atleast not at kv-store interface/impl level.
Additionally, providing time as an input could allow certain types of objects to be written without requiring time
, which is mainly useful in cases where the listing doesn’t need time ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it could be handled by
impl
internally, this would pose some limitations onimpl
.For example, a user of this interface might already have access to a monotonic counter and thus avoid a dependency on
std::time
; iiuc, std::time has limited support in Wasm.
Note that it doesn't need to be directly in the implementation but could be at the database-level, for instance.
While it doesn’t matter much for
ldk-server
right now, when we upstream something similar, we might want to do this in a non-std::time
-dependent fashion or atleast not at kv-store interface/impl level.Additionally, providing time as an input could allow certain types of objects to be written without requiring
time
, which is mainly useful in cases where the listing doesn’t need time ordering.
I suppose that's a fair point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it doesn't need to be directly in the implementation but could be at the database-level, for instance.
We could do this at DB level but that is tight coupling and using very db specific features.
Doing this without time dependency in implementation requires the feature "auto-populate creation_time"
With different SQL impls it might be possible but not every DB, hence we don't want to place that limitation.
For example:
-
Can we do this in sqllite?
Yes,creation_time INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
-
Can we do this in postgres?
Yes,creation_time INTEGER NOT NULL DEFAULT (EXTRACT(EPOCH FROM NOW())::INT)
-
Can we do this in mysql?
Yes,creation_time INT NOT NULL DEFAULT UNIX_TIMESTAMP(),
-
Can we do this in dynamoDB?
Afaik, No -
Can we do this in cloudflare-kv?
Afaik, No -
Can we do this in mongo-db?
Yes, -
Can we do this in couch-db?
Afaik, No -
Can we do this in redis?
Afaik, No
Hence, not providing this at interface level, places expectations/requirements on db-implementations that were not necessarily required. It is only relevant for wasm environments, but if we can easily avoid this, i think we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the time
parameter in the extension trait method as defined in #22 (comment) and discussed offline.
/// | ||
/// [`ListResponse`]: struct.ListResponse.html | ||
fn list( | ||
&self, primary_namespace: &str, secondary_namespace: &str, next_page_token: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than needing to wait on a dependency to be released, is there any reason why pagination couldn't be optional in KvStore
(i.e., an implementation could chose whether or not to support it)? An extension trait could add a list_paginated
method with a naive provided implementation to do the filtering. Then implementations could override it with something more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pagination cannot be optional; we need to expose pagination at the interface level in ldk-server
APIs. There could be 100,000 forwarded payments or payments, and we don’t want to load them all into memory.
There is no reason for objects that need to be fetched in a paginated manner to be retrieved using just a list, as it results in an "unbounded" request on both the database and the server, consuming unbounded resources.
While pagination isn’t optional, time-based ordering can be optional depending on the objects we are listing, which we can consider.
We need to override both write
and list
. I am open to suggestions, but I thought it might be confusing to combine them into a single trait.
It might be better to write/access objects that need pagination through a dedicated paginated interface, while those that don’t can use the normal kv_store
interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pagination cannot be optional; we need to expose pagination at the interface level in
ldk-server
APIs. There could be 100,000 forwarded payments or payments, and we don’t want to load them all into memory.There is no reason for objects that need to be fetched in a paginated manner to be retrieved using just a list, as it results in an "unbounded" request on both the database and the server, consuming unbounded resources.
By optional, I mean you can decided to use an implementation that does not implement the extension trait since the latter could provide a default implementation using the non-paginated version.
That said, we don't need to provide a naive implementation for the extension trait. We can leave it undefined to force any implementation to define it in order to avoid the problems that you mentioned.
While pagination isn’t optional, time-based ordering can be optional depending on the objects we are listing, which we can consider.
We need to override both
write
andlist
. I am open to suggestions, but I thought it might be confusing to combine them into a single trait. It might be better to write/access objects that need pagination through a dedicated paginated interface, while those that don’t can use the normalkv_store
interface.
I think there's some confusion here then. An extension trait is a separate trait that is a subtrait of another trait. So if you have KvStore
you can define PaginatedKvStore
as:
pub trait PaginatedKvStore: KvStore {
fn paginated_list(
&self, primary_namespace: &str, secondary_namespace: &str, next_page_token: Option<String>,
) -> Result<ListResponse, io::Error>;
}
This seems better than duplicating the entire SqliteStore
implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned elsewhere, I agree with Jeff here in that I find going the extension trait route much preferable. At least, if I'm not currently missing something that would prohibit us from going that way.
In particular if the goal is to upstream this trait and the corresponding implementations to LDK Node and eventually to LDK, I imagine it would be unnecessarily hard to reconcile the APIs at that point if we start off with two entirely separate but functionally and terminologically overlapping traits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extension option proposed above only works if we write all objects with time
all the time.
1.) it would need changes in ldk-node base sql at the minimum.
2.) there is time dependency in database impl. (being discussed in above comment), (if we don't want two write
methods in same trait.)
Due to changes in both write
and list
, i proposed going with separate paginated_kv_store interface.
Also, ldk-node no longer has kvstore trait, any changes to kv_store trait require changes in rust-lightning, although it would be good-to-have, we probably don't need paginated api in rust-lightning right now.
If we want to have time in write,
We can either have an extension trait like:
pub trait PaginatedKvStore: KvStore {
// this write method doesn't make much sense imo.
fn paginated_write(
&self, primary_namespace: &str, secondary_namespace: &str, key: &str, time: i64, buf: &[u8]
) -> Result<ListResponse, io::Error>;
fn paginated_list(
&self, primary_namespace: &str, secondary_namespace: &str, next_page_token: Option<String>,
) -> Result<ListResponse, io::Error>;
}
or separate trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another problem with this combined interface.
paginated_writes and normal write operations cannot be combined for the same object type (since normal writes don't use the time field currently).
Similarly, a paginated list operation can't be run on objects written through normal writes. I think that will make for a super confusing API with footguns everywhere.
Moreover, we would want paginated_writes and list to go through a separate table since we will have indexes on the time column.
Essentially, these two are two different data stores. If the implementation chooses to combine them, they are free to do so, but I don't think we should complicate the interface just to avoid duplication of methods in trait.
We can always implement two traits using the same base methods, instead of combining two different functionalities into one.
For ldk-node as well, I think it would make sense to support providing both KvStore and paginated_kv_store implementations separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another problem with this combined interface. paginated_writes and normal write operations cannot be combined for the same object type (since normal writes don't use the time field currently). Similarly, a paginated list operation can't be run on objects written through normal writes. I think that will make for a super confusing API with footguns everywhere.
Confusing to whom? It's not the implementors responsibility to get anything right. The caller (i.e., whichever struct is parameterized by the trait) is responsible for using it correctly. And here the caller and the author of the trait are one and the same.
Moreover, we would want paginated_writes and list to go through a separate table since we will have indexes on the time column.
Essentially, these two are two different data stores. If the implementation chooses to combine them, they are free to do so, but I don't think we should complicate the interface just to avoid duplication of methods in trait.
We can always implement two traits using the same base methods, instead of combining two different functionalities into one. For ldk-node as well, I think it would make sense to support providing both KvStore and paginated_kv_store implementations separately.
Hmm... not quite sure how you would support that in LDK Node. If anything, it would probably make more sense to decompose the API into KVRead
, KVWrite
, PaginatedKVRead
, and PaginatedKVWrite
. That way each part of the code is explicit about what the interface it needs. But when configuring LDK Node, you'd still need something that implements all the traits. So, in practice, an implementation would need to implement all those traits anyway.
My broader point, though, the onus is really on the user of the interface to do the right thing. How the traits are structured is less important until we talk about upstreaming. See #22 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusing to whom? It's not the implementor's responsibility to get anything right.
I think it is confusing inside ldk-node
or ldk-server
and creates foot-guns, such as not combining list
with paginated_write
or paginated_list
with list
/write
.
But when configuring LDK Node, you'd still need something that implements all the traits.
I'm not sure if it is correct to assume that users will want to use the same datastore for both normal kvstore
and paginated_kvstore
. NoSQL/DocumentDB/ObjectStorage are almost a perfect fit for kvstore
, whereas SQL-based databases might be a better fit for paginated_kvstore
.
So, I think users could be better served if they have the option to configure kvstore
and paginated_kvstore
separately. Combining them and adding more and more functionality to kvstore
pushes them toward using SQL databases for everything.
What about the overhead of indexing on the time column? Are we okay with introducing additional overhead to index the time column for all keys instead of just the paginated ones?
In my opinion, there are bigger trade-offs to consider here, like flexibility in database choice and indexing performance, than just duplication in the interface.
Separate tables
If the implementor of the trait wants to configure separate tables for paginated and normal writes, it isn't easy to do so.
If there are separate tables, we would also need a paginated_read
(just reads from the paginated table) and a paginated_remove
(just deletes from the paginated table).
So, in practice, an implementation would need to implement all those traits anyway.
The key point is that there can be two different implementations/schemas for the two traits (KvStore
and PaginatedKvStore
) being used.
Not quite sure how you would support that in LDK Node.
I think, just like we configure KvStore
in the builder, we would also have the option to configure paginated_kv_store
.
Another option could be to just use paginated_kv_store
or change the interface of the current KvStore
. But this restricts database, schema and indexing options.
My broader point, though, the onus is really on the user of the interface to do the right thing. How the traits are structured is less important until we talk about upstreaming.
The trait structure adds additional requirements on implementations, database schemas, and indexing.
More requirements for trait-implementations and database not just for user(ldk-node/ldk-server).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this offline with @G8XSU. Seems two independent traits would be ideal if we want to allow parameterizing LDK Node with two different stores, where one needs to be paginated. That would also mean we'd want two separate implementations -- one for each trait -- like SqliteStore
and PaginatedSqliteStore
. This is because each would need to be initialized differently (e.g., creating an index, using a different db/table name, etc.). If instead we had one type implementing both traits, you could imagine constructing one that is set up to work in a non-paginated manner but passing it for use with the paginated case, causing it to break.
@tnull What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed it offline in dev-sync, concluded with going for separate traits as outlined in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just one small fix then good to squash and merge.
Pagination in list is required because: 1. To bound query size for database. 2. Bound execution time of request on server. 3. Bound in-memory footprint of request on server. 4. Bound response size to client.
90dad9a
to
4a08248
Compare
Squashed with fix. |
PaginatedKVStore mainly add 2 functionalities in addition to that of KVStore:
Considered extending the KVStore trait but decided against it.
Reasons: