Adds HSCAN key/value support to storage.redisIterator #392

Merged
frairon merged 2 commits into lovoo:master on Sep 17, 2022

Conversation

chrisfjones
Contributor

@chrisfjones chrisfjones commented Aug 23, 2022

The HSCAN result gives us a 1-dimensional array with [n] == key, [n+1] == value, so we take advantage of that in the redisIterator.
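To make the pairwise walk concrete, here is a minimal, self-contained sketch of the idea (not the actual diff; the type and names are invented for illustration):

```go
package main

import "fmt"

// hscanIterator is an illustrative iterator over one HSCAN page.
// HSCAN on a hash returns a flat slice [field1, value1, field2, value2, ...],
// so the iterator advances two positions per step and reads pairs[n] as the
// key and pairs[n+1] as the value.
type hscanIterator struct {
	pairs []string // flat HSCAN reply
	pos   int      // index of the current field (always even)
}

func newHscanIterator(pairs []string) *hscanIterator {
	return &hscanIterator{pairs: pairs, pos: -2} // first Next() lands on index 0
}

func (it *hscanIterator) Next() bool {
	it.pos += 2
	return it.pos+1 < len(it.pairs)
}

func (it *hscanIterator) Key() []byte   { return []byte(it.pairs[it.pos]) }
func (it *hscanIterator) Value() []byte { return []byte(it.pairs[it.pos+1]) }

func main() {
	it := newHscanIterator([]string{"key-a", "val-a", "key-b", "val-b"})
	for it.Next() {
		fmt.Printf("%s => %s\n", it.Key(), it.Value())
	}
}
```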

FWIW, we've been running this in production for the past 2 weeks with no issues. Before this change we were chugging through unexpected keys in VisitAll: most of them were values being passed as keys, and a handful were the offset values (one per partition). After this change, we are only Visiting the expected keys.

The HSCAN result gives us a 1-dimensional array with [n] == key, [n+1] == value so we take advantage of that in the iterator.
@chrisfjones chrisfjones marked this pull request as ready for review September 2, 2022 17:06
@chrisfjones
Contributor Author

@frairon 👋
Let me know if there is more I should add to this PR. We've been running our fork in prod for about a month now and it's all working as expected 👍

frairon
frairon previously approved these changes Sep 13, 2022
@frairon
Contributor

frairon commented Sep 13, 2022

Hey @chrisfjones,
Many thanks for the contribution, and sorry for the long delay - vacation took over :).
Anyway, to be honest we have not used the Redis storage in a very long time - we can consider it unmaintained. Glad that you took a shot at improving it, really! The changes don't look wrong, but I have no resources to double-check them with a running system at the moment, so let's just trust your prod environment.
Maybe we can add some system tests running as GitHub Actions in the future.

One question out of curiosity: what's your use case for VisitAll here?
To iterate over a table, the recommended approach is View.Iterator, since VisitAll can really slow down a processor - so I'm just curious.
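For reference, the View.Iterator approach looks roughly like this; a sketch assuming goka's View API, with the group name, codec, and broker address as placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/lovoo/goka"
	"github.com/lovoo/goka/codec"
)

func main() {
	brokers := []string{"localhost:9092"} // placeholder broker list

	// Read-only view on the group table; "example-group" is a placeholder.
	view, err := goka.NewView(brokers, goka.GroupTable("example-group"), new(codec.String))
	if err != nil {
		log.Fatal(err)
	}
	go view.Run(context.Background()) // keep the local table copy up to date (error handling omitted)

	// Iterate over the whole table without involving the processor.
	it, err := view.Iterator()
	if err != nil {
		log.Fatal(err)
	}
	defer it.Release()
	for it.Next() {
		val, err := it.Value()
		if err != nil {
			log.Printf("value error for key %s: %v", it.Key(), err)
			continue
		}
		fmt.Printf("%s => %v\n", it.Key(), val)
	}
}
```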
Thanks again!

@chrisfjones
Contributor Author

chrisfjones commented Sep 16, 2022

@frairon 👋
No worries about the delay, hope you had a refreshing break!

My use case for VisitAll is that I've built a little "time-window aggregation" framework for our needs. It's intended to work a bit like https://docs.ksqldb.io/en/latest/concepts/time-and-windows-in-ksqldb-queries/ but with a little more generic handling of "active sessions".

One of our use cases goes something like this (not exact, but it should illuminate the usage of goka):
We have a set of services that handle real-time collaborative text editing. For simplicity's sake, each character typed becomes an event. We want to "group" these events together by time and, after, say, 15 seconds, emit a bundled event for further downstream processing. If the user types 1 character and then pauses for 15 seconds, we still want that bundled event to fire at the end.

To accomplish this in goka, we've done the following (a rough sketch of the process function follows the list):

  • For each "session" type we'd like to manage (typing bundle in this case), we create a goka group and an associated "SessionHandler". The input is the "all typing events" topic and the output is the "typing bundles" topic.
  • The SessionHandler has logic to decide if a session should start (always, for this case) and when it should close (after 15 sec.).
  • The state of each session is maintained in the group table.
  • In the process function, we first check whether a session is already open or whether we need to create a new one.
  • We then append whatever new state we need to the session and check whether the session should close.
  • If the session is closed, we delete the corresponding key from the group table and emit the resulting bundle event to our output; otherwise we save the updated session to the group table and move on.
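To make that flow concrete, here is a rough, illustrative sketch of such a process callback. The TypingSession type, the codec, the topic names, and the fixed 15-second cutoff are all invented for this example; only the goka.Context calls (Value, SetValue, Delete, Emit, Key) are the actual goka API.

```go
// Package session is an illustrative sketch, not our actual framework code.
package session

import (
	"encoding/json"
	"time"

	"github.com/lovoo/goka"
)

// TypingSession is the per-key session state kept in the group table
// (hypothetical; the real framework is more generic).
type TypingSession struct {
	StartedAt time.Time `json:"started_at"`
	LastEvent time.Time `json:"last_event"`
	Chars     []string  `json:"chars"`
}

func (s *TypingSession) shouldClose(now time.Time) bool {
	return now.Sub(s.StartedAt) >= 15*time.Second
}

// sessionCodec is a minimal JSON codec so the session can live in the group table.
type sessionCodec struct{}

func (c *sessionCodec) Encode(v interface{}) ([]byte, error) { return json.Marshal(v) }
func (c *sessionCodec) Decode(data []byte) (interface{}, error) {
	var s TypingSession
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

// process handles one typing event: open or extend the session for ctx.Key(),
// then either close it (emit the bundle, delete the key) or persist the
// updated state back to the group table.
func process(ctx goka.Context, msg interface{}) {
	now := time.Now()

	var session *TypingSession
	if v := ctx.Value(); v != nil {
		session = v.(*TypingSession) // session already open for this key
	} else {
		session = &TypingSession{StartedAt: now} // start a new session
	}

	session.Chars = append(session.Chars, msg.(string))
	session.LastEvent = now

	if session.shouldClose(now) {
		// Session is over: emit the bundled event downstream and drop the key.
		ctx.Emit(goka.Stream("typing-bundles"), ctx.Key(), session)
		ctx.Delete()
		return
	}

	// Session stays open: save the updated state.
	ctx.SetValue(session)
}
```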

This would work great on its own except for the case where the user types a few characters and then leaves the document. In this case, the "session" will remain open for that key until someone else decides to come in and type in the doc again. We chose to handle this case with VisitAll, and it works like this (a rough sketch of the sweep follows the list):

  • Every X seconds, we call VisitAll on all active keys for all partitions handled by this processor.
  • For each active key, we grab the value as a session and determine whether that session can be closed (based on the current time).
  • If the session is closed we delete that key from the group table and emit the resulting bundle event to our output. It's this activity, I believe, that prevents us from using View.Iterator since we need a live GokaCtx for writes. Using VisitAll also helps us distribute this behavior by partition.
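Here is a sketch of the sweep itself, continuing the snippet above (it reuses TypingSession, sessionCodec, and process). goka's Visitor edge plus Processor.VisitAll is the mechanism assumed here; double-check the exact signatures against your goka version, and the group, topic, and broker names are again made up.

```go
package session

import (
	"context"
	"log"
	"time"

	"github.com/lovoo/goka"
	"github.com/lovoo/goka/codec"
)

// RunSweeper wires the group graph with an "expire" visitor and drives it
// with a ticker. Signatures of goka.Visitor / Processor.VisitAll may differ
// slightly between goka versions; treat this as a sketch.
func RunSweeper(ctx context.Context, brokers []string) error {
	g := goka.DefineGroup("typing-bundler",
		goka.Input("typing-events", new(codec.String), process),
		// The visitor runs with a live goka.Context per key, so it can
		// Delete and Emit just like the regular process callback.
		goka.Visitor("expire", func(vctx goka.Context, meta interface{}) {
			now := meta.(time.Time)
			session, ok := vctx.Value().(*TypingSession)
			if !ok || session == nil {
				return
			}
			if session.shouldClose(now) {
				vctx.Emit(goka.Stream("typing-bundles"), vctx.Key(), session)
				vctx.Delete()
			}
		}),
		goka.Output("typing-bundles", new(sessionCodec)),
		goka.Persist(new(sessionCodec)),
	)

	proc, err := goka.NewProcessor(brokers, g)
	if err != nil {
		return err
	}

	// Every X seconds (5 here), visit every key in every partition this
	// instance owns and close the sessions that have gone idle.
	go func() {
		ticker := time.NewTicker(5 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if err := proc.VisitAll(ctx, "expire", time.Now()); err != nil {
					log.Printf("VisitAll: %v", err)
				}
			}
		}
	}()

	return proc.Run(ctx)
}
```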

Furthermore, we are using RedisStorage to keep track of our group state so that our process can survive death and restarts without losing track of active sessions. So far we've adapted this pattern to various session types running in production, and although VisitAll does consume a bit of processing, it seems to be working well for our needs.

Apologies for the tome 😄 and thanks for your time! Hopefully that makes sense. I'm 99% certain this solution is overkill and there is a much simpler way to accomplish this, so I'd be anxious to hear of any alternatives.

Cheers!

@chrisfjones
Contributor Author

p.s. I've just now discovered #99 (a little late to the party 😅) and it looks like they're trying to accomplish a similar thing. I'll cross-reference the above comment over there and would be happy to provide more detail on our solution.

@frairon
Contributor

frairon commented Sep 17, 2022

The two failing tests seem unrelated - let's merge this and fix the tests separately.

@frairon frairon merged commit d13eca2 into lovoo:master Sep 17, 2022