This repository has been archived by the owner on Jul 28, 2021. It is now read-only.

Stream API to get batch of messages from certain sequence with limit #266

Closed
FZambia opened this issue Jul 19, 2020 · 9 comments

@FZambia commented Jul 19, 2020

This is a follow-up issue after a short discussion with @ripienaar in the NATS Slack.

I have a system that uses streams (Centrifugo). At the moment we have two implementations of the stream data structure that fit Centrifugo's internal design: one in-memory, the other based on the Redis Stream data structure. I am investigating the possibility of adding JetStream as another option in the future. I am still weighing whether this is advisable at all, but decided to ask the question here as early as possible, while JetStream is in the tech preview stage.

Centrifugo is designed so that clients can paginate over a stream from a certain offset, asking for a batch of messages. Something like this in pseudocode:

```
messages = stream.Get(sinceSequence, limit)
```

It also needs a way to access the stream's current max sequence, which I suppose is already possible with $JS.API.STREAM.INFO.*.
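A minimal sketch of reading the stream's last sequence with the nats.go JetStream client (an API that arrived after this discussion); the stream name "events" is an assumption:

```go
package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// StreamInfo wraps $JS.API.STREAM.INFO.<stream> and includes the
	// last sequence currently in the stream.
	info, err := js.StreamInfo("events") // stream name assumed
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("last sequence:", info.State.LastSeq)
}
```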

But when it comes to getting a batch of messages from a stream, things are not so handy. As far as I understand, at the moment I can only create a consumer to load messages one by one and form a batch to return to the client (pull-based, or maybe push-based to avoid many RTTs for messages). But this does not seem like a good approach for my use case.

For example, Redis Streams allows iterating over a stream with something like:

```
XRANGE <STREAM_KEY> <OFFSET> + COUNT <LIMIT>
```
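For comparison, the same call through go-redis is a single round trip; the key and starting ID are placeholders:

```go
package streampage

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// redisRange reads up to limit entries starting at sinceID in one round
// trip, mirroring XRANGE <key> <sinceID> + COUNT <limit>.
func redisRange(ctx context.Context, rdb *redis.Client, key, sinceID string, limit int64) ([]redis.XMessage, error) {
	return rdb.XRangeN(ctx, key, sinceID, "+", limit).Result()
}
```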

I suppose it could be part of the JetStream admin API; it already has a method to get a single message from a stream: $JS.API.STREAM.MSG.GET.*. But looking at the Store interface, it only has a method to load a single message from memory or disk (LoadMsg). For the in-memory store I suppose this could work fine for requesting many messages, but for disk storage asking for a batch could be a problem (I am not sure how messages are stored on disk).
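To make the round-trip cost concrete, here is a sketch of emulating the desired batch call on top of the single-message API; the helper name and error handling are illustrative:

```go
package streampage

import "github.com/nats-io/nats.go"

// getBatch emulates stream.Get(sinceSequence, limit) using only the
// single-message lookup ($JS.API.STREAM.MSG.GET.<stream>): one request
// per message, which is exactly the cost being discussed here.
func getBatch(js nats.JetStreamContext, stream string, since uint64, limit int) ([]*nats.RawStreamMsg, error) {
	var batch []*nats.RawStreamMsg
	for seq := since; len(batch) < limit; seq++ {
		msg, err := js.GetMsg(stream, seq)
		if err != nil {
			// A sequence may be deleted or past the end of the stream;
			// real code would distinguish these cases.
			return batch, err
		}
		batch = append(batch, msg)
	}
	return batch, nil
}
```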

What I want to ask here is whether an API to iterate over a stream from a certain position with a certain limit is in the JetStream design vision. Should I expect it to become available at some point?

@ripienaar (Collaborator)

@derekcollison tl;dr: they want to be able to show a stream's contents on a web page and need to walk through it in pages. Seems fine, as we already have a get-msg API?

I imagine you ask for 10 messages with an inbox, we respond with each of them, and the last one has a header indicating it's the last, so they don't have to wait for some timeout if there are only 8 messages, not 10.
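A rough client-side sketch of that idea; the request subject and the terminating header are invented for illustration and are not part of any real API:

```go
package streampage

import (
	"time"

	"github.com/nats-io/nats.go"
)

// readPage sketches the proposed flow: one request, replies streamed to an
// inbox, and a (hypothetical) header on the final message ending the batch
// early so the client never has to wait out a timeout.
func readPage(nc *nats.Conn, subject string, req []byte) ([]*nats.Msg, error) {
	inbox := nats.NewInbox()
	sub, err := nc.SubscribeSync(inbox)
	if err != nil {
		return nil, err
	}
	defer sub.Unsubscribe()

	// subject would be something like a batch-get API endpoint; invented here.
	if err := nc.PublishRequest(subject, inbox, req); err != nil {
		return nil, err
	}

	var page []*nats.Msg
	for {
		msg, err := sub.NextMsg(2 * time.Second)
		if err != nil {
			return page, err
		}
		page = append(page, msg)
		if msg.Header.Get("Last") == "true" { // hypothetical terminator header
			return page, nil
		}
	}
}
```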

@derekcollison (Member)

I would do this with a simple pull-based consumer, which supports batching. You can know the last sequence number of the stream, as you pointed out, so you know when to stop. We are also (most likely) going to add that information to the ack reply, so you will not have to ask for it separately.

@ripienaar (Collaborator)

He doesn't want to round-trip to the server 90 times to build a single web page. And the get-msg API we have is awkward too.

A consumer like this was my first suggestion.

@derekcollison (Member)

I would not use the stream message API for this, at least as it exists today.

A pull-based consumer makes the most sense, but I understand the potential issues since it's durable, etc.

You can grab the state of the stream, create a durable pull-based consumer, and use the batching mechanism there. That will work. Make sure to delete the consumer when done.
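A sketch of that flow with the pull API that later landed in nats.go; the stream, subject, and durable names are assumptions:

```go
package streampage

import "github.com/nats-io/nats.go"

// fetchPage binds a durable pull consumer to the stream, fetches one batch,
// and cleans the consumer up afterwards, as suggested above.
func fetchPage(js nats.JetStreamContext, batch int) ([]*nats.Msg, error) {
	sub, err := js.PullSubscribe("events.>", "page-reader", nats.BindStream("EVENTS"))
	if err != nil {
		return nil, err
	}
	defer sub.Unsubscribe()
	// Explicitly delete the consumer when done, per the advice above
	// (recent nats.go versions may already remove consumers they created).
	defer js.DeleteConsumer("EVENTS", "page-reader")

	msgs, err := sub.Fetch(batch)
	if err != nil {
		return nil, err
	}
	for _, m := range msgs {
		m.Ack()
	}
	return msgs, nil
}
```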

@ripienaar (Collaborator)

You mean you wouldn't use it even if it supported a request to retrieve n messages?

The consumer approach will be really difficult, because a durable only works as long as no one ever hits Back :) so such a page would have to make ephemeral ones for every single request; it'll be super tedious.

@derekcollison (Member)

If the stream itself had an API to get batches of messages, that could possibly be useful. But we need to think that through more, along with the issues we have now with push-based backlogs, applications not keeping up, etc.

@ripienaar (Collaborator)

Yes, that API is the feature request in this issue.

We could cap it at 256 messages max, maybe?

@FZambia (Author) commented Jul 21, 2020

One thing to mention: I am planning to do this from the backend side, since clients communicate with the server over a custom protocol with its own existing encoding and mechanics.

I understand that the consumer approach should work, but it seems a bit heavy for my specific use case. As far as I understand, each consumer creation is an extra request to JetStream, each consumer deletion is an extra request to JetStream, each consumer may cost me a goroutine, and I have to wait an arbitrary and not very transparent amount of time while a batch of messages is being collected (actually, I don't understand at the moment whether I can control the batch size in pull consumers, or whether I only have the Next method available). And yes, this has to be done for every client request for stream messages in my case.

With Redis Streams the task is solved in one RTT to Redis (with the help of some Lua scripting that returns both the messages and the stream stats), and no extra goroutine is involved in the process.
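A sketch of that pattern with go-redis; the exact script shape is an assumption about what such a Lua helper looks like:

```go
package streampage

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// One EVAL returns both a page of entries and a stream stat, so the whole
// history request costs a single round trip and no extra goroutine.
var pageScript = redis.NewScript(`
local entries = redis.call('XRANGE', KEYS[1], ARGV[1], '+', 'COUNT', ARGV[2])
local len = redis.call('XLEN', KEYS[1])
return {entries, len}
`)

func redisPage(ctx context.Context, rdb *redis.Client, key, sinceID string, limit int) (interface{}, error) {
	return pageScript.Run(ctx, rdb, []string{key}, sinceID, limit).Result()
}
```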

I also asked a colleague about similar possibilities in Kafka, and it looks like this is only possible with KSQL-like tools or ksqlDB at the moment (I mean, without consumers involved).

@ripienaar (Collaborator)

Pull consumers now do batches, and that is probably what we'd suggest now, so closing this. Repeatedly accessing a stream directly is not the correct approach.
