-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Session recordings ingester #10799
Session recordings ingester #10799
Conversation
* move recordings to their own queue * refactor: use headers for metadata, avoid needing to pass JSON I've moved the sessionId etc to KafkaMessage headers rather than having in the JSON data. This way we can treat the actual message data as opaque, which should have processing benefits as well as memory (if we can get rid of those toStrings that is.). * ci(session-recordings-ingester): add GitHub Workflow This will handle building and testing the docker image. Will follow up with the production deploy separately. * fix working dir? * set docker context * wait for server * send data via kafka headers * chore: fix prod start * add .swcrc for docker image * chore: get ci tests into a better state, fix signal handling in docker * add metadata writing * tweak test script * fix typo in tests for eventSource * add stub for producing to ClickHouse * skip many events test, get CI green Co-authored-by: Harry Waye <[email protected]>
expect(sessionRecording.events[0]).toMatchObject(JSON.parse(fullEventString)) | ||
}) | ||
|
||
// TODO: Handle this case and re-enable the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rcmarron what do you think the cases here are? I know we get some guarantees on this from Kafka, but I guess there might be other parts of the system that we need to consider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For details on guarantees perhaps https://kafka.apache.org/documentation/#semantics is useful
* it works * update username + password in docker * fix some tests * add distinctIds to tests * update tests to use new minio creds * add healthcheck Co-authored-by: Harry Waye <[email protected]>
* thread pool for metadata read * read snapshots from object storage * typing fixes
…nts in the ingester (#11022) fix(recording-ingester): handle concurrency in the ingester
…n-recordings-ingester
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
Would adding a release-type allow this to be merged? Is there anything I can help with? e.g. testing |
@lancedouglas1 this has been put on hold for the time being as alternative solutions are discussed. Help with testing would be amazing though as well as any other input you can give. I gather you are on self hosted. May I ask what you use case/interest in this is for? |
I'm currently on cloud hosted as it works well, except the retention of recordings. My usecase has to do with retention lengths of recordings due to compliance. Basically, it is very rarely that we have to go back and review a session recording to confirm user activity with 60 days of the actual activity. It is almost always 90+ days later that other processes flag a session as requiring investigation. To be clear, we're not using recordings for security purposes, but rather we manage reservations for government owned properties, which means that we have many stakeholders with various interests in different types of citizenry-involved events that have happened in the past. |
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
This PR was closed due to lack of activity. Feel free to reopen if it's still relevant. |
This is a WIP branch for the work on Session Recordings ingestion revamp.