Measure session recording attempts #5478

paolodamico · 2021-08-05T21:02:23Z

Is your feature request related to a problem?

We know session recording is somewhat unreliable today, particularly under circumstances we don't yet understand (e.g. browser versions, network connectivity, device types). To improve this, we first need to measure it well.

Describe the solution you'd like

I'd like to be able to effectively measure failed session recordings (and ideally have some context on why they failed). One approach I was discussing with @macobo was to fire a regular PostHog event ("session recording started") whenever we activate session recording in the client. We can later match this event to a successfully completed session recording to get the ratio of success.
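A minimal sketch of that matching step, assuming we can export the session IDs that fired the proposed "session recording started" event and the session IDs that ended up with a stored recording. The function name and input shape are hypothetical, not existing PostHog code.

```python
def recording_success_ratio(started_session_ids, recorded_session_ids):
    """Fraction of sessions that fired the 'started' event and also have a recording.

    Both arguments are iterables of session IDs (hypothetical export format).
    Returns None when no recording was attempted, since the ratio is undefined.
    """
    started = set(started_session_ids)
    if not started:
        return None
    # Only count recordings that belong to a session which announced a start.
    succeeded = started & set(recorded_session_ids)
    return len(succeeded) / len(started)
```

Recordings without a matching start event are ignored here on purpose; they would point to a lost event rather than a lost recording.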

Describe alternatives you've considered

Using regular sessions as a proxy: measure the ratio of sessions, in projects with session recording enabled, that have no recording.
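A rough sketch of the proxy metric, assuming sessions are available as `(team_id, session_id)` pairs and that we know which teams have recording enabled. All names here are hypothetical and only illustrate the calculation.

```python
def sessions_without_recording_ratio(sessions, enabled_team_ids, recorded_session_ids):
    """Share of sessions in recording-enabled projects that have no recording.

    sessions: iterable of (team_id, session_id) pairs (hypothetical shape).
    Returns None when no session belongs to an enabled project.
    """
    enabled = set(enabled_team_ids)
    recorded = set(recorded_session_ids)
    # Only sessions from projects with recording enabled are eligible.
    eligible = {session_id for team_id, session_id in sessions if team_id in enabled}
    if not eligible:
        return None
    missing = eligible - recorded
    return len(missing) / len(eligible)
```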

Additional context

Work towards diagnosing causes.

Thank you for your feature request – we love each and every one!

paolodamico · 2021-08-05T21:03:04Z

@yakkomajuri not sure if this is something you'd want to pick up or if we should figure it out in Core Experience?

macobo · 2021-08-06T06:31:10Z

Note: I would personally suggest the "alternative" approach since it reflects the user's experience better. For example, you can realistically end up in a situation where the "posthog-js" metric shows 99% while the sessions/recordings ratio is only 60%. As for why, see #4884.

paolodamico · 2021-08-10T03:08:55Z

Well definitely one benefit of going with the approach you suggest is that we can measure already instead of waiting for data to come in. I would really appreciate it if someone with more context could help me build this query in Metabase. Maybe @macobo or @EDsCODE you could help?

paolodamico · 2021-10-12T04:59:31Z

@rcmarron @alexkim205 I think we should revive this and make sure we can measure % of failed recording attempts reliably.

rcmarron · 2021-10-12T17:47:01Z

100% agree @paolodamico

I think the original approach may be more appropriate now that the UX has changed (although both would be valuable). I just did a bit of digging, and it looks like we add a $phjs-rrweb-record property to the Capture Metrics event when we start a recording (https://github.com/PostHog/posthog-js/blob/4b82c682981542c0115d14f730f846a9278059e9/src/extensions/sessionrecording.js#L111).

I'm not very familiar with the Capture Metrics event, but @macobo, would it be appropriate to base a metric on that? Basically, I would check how many Capture Metrics events are fired with the $phjs-rrweb-record property that don't have a related session recording. I'm having trouble finding where the Capture Metrics event is actually fired...

The query would probably be too large if we're checking across teams, so I might try to sample it down somehow (maybe just take a subset of teams with recordings enabled).
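One possible way to sample teams, sketched under the assumption that team IDs are integers: hash each ID and keep a fixed percentage. The helper name and the 10% default are hypothetical. Hashing keeps the sample stable between runs, so the same teams are measured each time.

```python
import hashlib


def in_sample(team_id: int, sample_percent: int = 10) -> bool:
    """Deterministically decide whether a team falls into the sample.

    The team ID is hashed so that the choice does not depend on ID ordering
    and stays the same on every run.
    """
    digest = hashlib.sha256(str(team_id).encode("utf-8")).hexdigest()
    # Map the hash onto 0..99 and keep the lowest `sample_percent` buckets.
    return int(digest, 16) % 100 < sample_percent


# Example: pick the sampled subset out of a list of team IDs.
sampled_teams = [team_id for team_id in range(1, 1001) if in_sample(team_id)]
```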

Thoughts?

rcmarron · 2021-10-18T22:14:00Z

I have some revised thoughts here. It's helpful to think about two separate categories of missing recordings:

1. The server receives data, but it's incomplete
2. The server never receives data about the recording

After learning more about how session recording works, I think the vast majority of our 'missing recording' cases fall into the first category (see #2927, #6482). I'm sure there are cases of the 2nd category, but my guess is that the causes are out of our control (e.g. user with no network, analytics blockers etc.).

My proposal is that we focus on the first category. Fix those issues and then evaluate if we think there is more work to be done from there.

With that in mind, here is a query that measures the first case: https://metabase.posthog.net/question/167-recordings-w-o-a-full-snapshot-in-past-24hrs
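For illustration only (this is not the Metabase query itself), the same check can be sketched in Python. It assumes snapshot data is available as `(session_id, rrweb_event_type)` pairs and relies on rrweb's `EventType` enum using `2` for full snapshots; the function name and input shape are hypothetical.

```python
from collections import defaultdict

# rrweb's EventType.FullSnapshot value (assumed from the rrweb EventType enum).
FULL_SNAPSHOT = 2


def recordings_without_full_snapshot(snapshot_events):
    """Return the session IDs whose snapshot events include no full snapshot.

    snapshot_events: iterable of (session_id, rrweb_event_type) pairs.
    """
    types_by_session = defaultdict(set)
    for session_id, event_type in snapshot_events:
        types_by_session[session_id].add(event_type)
    return sorted(
        session_id
        for session_id, types in types_by_session.items()
        if FULL_SNAPSHOT not in types
    )
```

This only covers the first category: a session that never reached the server does not appear in the input at all.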

@paolodamico Does that work for you?

paolodamico · 2021-10-19T14:58:05Z

That context is helpful @rcmarron. I would argue, though, that we're assuming (2) is small and/or that we can't do anything about it, and unless we measure it we have no way of knowing (what if a bug in posthog-js is dropping a lot more sessions?). I would challenge us to measure this so we can then decide whether further action is warranted.

Even for cases like network issues there might be things to try: can we minimize payloads? Could we skip capturing images when the network is slow?

As a side note, it's great that we have that query, we now have a baseline and a solid way of measuring whether we achieved our goal for the sprint.

macobo · 2021-10-20T07:11:08Z

Suggestion: Don't target all session recordings. Sessions that bounce immediately or are very short are much lower-value than long sessions.

Suggested metrics:

- % of recordings where no full snapshots are missing (>30s)
- % of recordings where no intermediary events are missing (>30s)
  - (e.g. done by sending an autoincrementing value together with every snapshot event, storing this info along with compression info)
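A small sketch of how the autoincrementing value could be used, assuming every snapshot event in a recording carries an integer sequence number starting at 0. The function name is hypothetical.

```python
def missing_sequence_numbers(received, start=0):
    """Return the sequence numbers absent between `start` and the highest one received.

    received: iterable of integer sequence numbers seen for one recording.
    Arrival order and duplicates do not matter.
    """
    seen = set(received)
    if not seen:
        return []
    return [n for n in range(start, max(seen) + 1) if n not in seen]
```

A recording would count as complete for this metric when the result is empty. Note that events lost after the last received one cannot be detected this way, since nothing indicates how many events were sent in total.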

paolodamico · 2021-10-21T22:08:06Z

Really love those conceptual metrics! We have a sync conversation scheduled for tomorrow to discuss and figure out more specifics. For instance, I'd want us to make sure we're holistically tracking any session >30s.

rcmarron · 2021-10-27T22:42:28Z

Closing the loop here. After some discussions, we decided to measure session recordings via 2 metrics:

1. Number of session recordings that do not have a full snapshot. (https://metabase.posthog.net/question/167-recordings-w-o-a-full-snapshot-in-past-24hrs)
   - Note: It's on purpose that this isn't filtering out recordings less than 30 seconds. Even short recordings should have full snapshots, and if not it might be a sign of another issue like ph-js crashing when sending the snapshot.
2. Recording playbacks where the rrweb player fired warnings. It warns when it can't make sense of the sequence of events (e.g. events are missing or the sequence doesn't make sense). This metric needs to be implemented (see "Report when the rr-web player has warnings", #6704).

rcmarron · 2021-11-30T18:16:24Z

Closing this as part 1 is done and part 2 is captured here: #6704

paolodamico added the enhancement New feature or request label Aug 5, 2021

macobo added the session recording label Aug 6, 2021

macobo added the feature/session-analytics label Aug 6, 2021

paolodamico added the team-core-experience label Oct 12, 2021

rcmarron mentioned this issue Oct 27, 2021

Report when the rr-web player has warnings. #6704

Closed

rcmarron closed this as completed Nov 30, 2021
