What is sessions page? #4884

Closed
macobo opened this issue Jun 28, 2021 · 17 comments

@macobo
Contributor

macobo commented Jun 28, 2021

Conclusion: We should rewrite or create a new sessions page. Changes include removing "sessions" and replacing them with one recording = one row, no list of events feature under the table.

More details under #4884 (comment)


I think it is worthwhile to revisit the concept of the sessions page from the ground up to support session recording and our users better.

Problem statement

Our sessions page currently serves two very different use cases:

  1. Users can see dynamically constructed sessions on this page and see what actions a user took on the page
  2. Users can search and view interesting session recordings there (if enabled)

I'm not very confident in usecase (1) since I haven't heard users have great success with it.

In the current implementation, the two are often at odds with each other.

  1. "Sessions" are constructed dynamically from events in the backend, while session recordings are tab-based.
    1. This may cause 0 or 2 or 5 different "session recordings" to show up under one "session". Or one session recording to show up under multiple sessions.
    2. Session durations may not match session recording lengths, since the duration calculation is (or at least used to be) based on the difference between the first and last event.
    3. This can lead to a general feeling of "session recordings are missing"
    4. If autocapture is off users may end up with "invisible" session recordings if no normal events were captured.
  2. Performance - "sessions" are expensive to calculate, while "session recordings" are relatively inexpensive
    1. Due to this, we currently only show one day of sessions at a time on the page. This in turn hurts session recording badly: if you're searching for a rare event, you need to look through many pages.
  3. "Sessions" discriminate based on distinct_id (and not person_id)
    1. So if a user logs in and their distinct_id changes as a result, events prior to login would be counted as a separate session
    2. It's currently unclear where the session recording would end up - under one of the sessions?
  4. "Sessions" include backend-event only sessions which would not include any session recordings.
    1. If the backend uses a different distinct_id from clients javascript setup, the sessions are
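
To make the mismatch in point 1 concrete, here is a minimal sketch of how dynamically constructed sessions can diverge from tab-based recordings. It is not PostHog's actual query: the `Event` shape, the `build_sessions` and `recordings_per_session` helpers, and the 30-minute inactivity gap are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: a new "session" starts after 30 minutes without events.
SESSION_GAP = timedelta(minutes=30)


@dataclass(frozen=True)
class Event:
    distinct_id: str
    timestamp: datetime
    # Hypothetical field: the tab-specific id a recording is keyed on.
    # None stands for events sent from outside the browser (e.g. a backend).
    recording_id: Optional[str] = None


def build_sessions(events):
    """Group events per distinct_id, splitting when the gap exceeds SESSION_GAP."""
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        by_user[event.distinct_id].append(event)

    sessions = []
    for user_events in by_user.values():
        current = [user_events[0]]
        for event in user_events[1:]:
            if event.timestamp - current[-1].timestamp > SESSION_GAP:
                sessions.append(current)
                current = []
            current.append(event)
        sessions.append(current)
    return sessions


def recordings_per_session(sessions):
    """Count the distinct recordings that would be listed under each session."""
    return [len({e.recording_id for e in s if e.recording_id}) for s in sessions]


start = datetime(2021, 6, 28, 12, 0)
events = [
    Event("user-1", start, "tab-a"),
    Event("user-1", start + timedelta(minutes=5), "tab-b"),   # second tab, same session
    Event("user-1", start + timedelta(minutes=50), "tab-a"),  # tab-a again after a long gap
    Event("backend-1", start + timedelta(minutes=1)),         # backend event, no recording
]
sessions = build_sessions(events)
counts = recordings_per_session(sessions)
```

With this sample data, one session carries two recordings, one carries a single recording, and the backend-only session carries none, while the `tab-a` recording shows up under two different sessions.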

What can we do about it?

I don't have a good answer. Some thoughts:

I think the trouble begins with the definition of "session" and "session recording" being different. One groups all events (including backend/mobile/others) into one, while the other uses an internal tab-specific id generated on the JavaScript client. We can't really use the tab-specific id for sessions without losing support for events outside posthog-js to be displayed.

Some sort of a "sessions" <-> "session recording" mapping is really useful, since it allows searching for recordings where some specific event happened and looking up the specific time, which is context we don't have from the recording itself.

Perhaps the solution lies in flipping the script - a separate page for session recordings which is inversely linked to sessions?

However more context for usecase (1) would be needed here.

Additional context

@paolodamico @marcushyett-ph @kpthatsme Food for thought product-wise.

This is a follow-up to https://github.com/PostHog/product-internal/issues/86.

@mariusandra
Collaborator

Thanks for writing this! Having just implemented the "use the session recording duration if it's longer than the duration between the first and last events" feature (#4853), I felt this pain as well.
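
For readers who have not seen that change, the rule can be sketched roughly as below. This is not the code from #4853; the `session_duration` helper and its parameters are hypothetical and only illustrate the "take the longer of the two" idea.

```python
from datetime import datetime, timedelta
from typing import Optional


def session_duration(
    first_event: datetime,
    last_event: datetime,
    recording_duration: Optional[timedelta] = None,
) -> timedelta:
    """Return the event-based duration, unless the recording is longer."""
    event_span = last_event - first_event
    if recording_duration is not None and recording_duration > event_span:
        return recording_duration
    return event_span


# Example: events span 4 minutes, but the recording kept running for 9.
first = datetime(2021, 6, 28, 12, 0)
last = first + timedelta(minutes=4)
shown = session_duration(first, last, timedelta(minutes=9))
```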

We can't really use the tab-specific id for sessions without losing support for.

Missing the bit after "for...?" :). I expect you were referring to losing support for filtering?

I think from the user's perspective, the "right solution" is clear. Just like the events page has multiple tabs (events, actions, event properties, etc), we have a similar system here. Once you have enabled session recordings, the first/default tab under "sessions" is "session recordings", queried from the session recordings table. The second tab is "computed sessions" or whatever we call what we currently have.

Applying various filters (give me rageclicks for pro plan users), the session <-> session recording mapping, etc are implementation details that we can sort out once we prioritise this.

Usecase 1 still has some value IMO, and is clearly the only thing you can rely on if you disable session recordings (e.g. handling medical data requires you to), so I wouldn't nuke it. Just put it behind an extra click when possible.

@macobo
Contributor Author

macobo commented Jun 28, 2021

Missing the bit after "for...?" :). I expect you were referring to losing support for filtering?

Oops. Thanks, added missing bit to "We can't really use the tab-specific id for sessions without losing support for events outside posthog-js to be displayed."

I think from the user's perspective, the "right solution" is clear.

I kind of agree, but if we're rethinking this anyways it might make sense to talk with customers and revisit this anyways. The devil is in the details and our concept of a session might need tweaking!

@marcushyett-ph
Contributor

I don't have any strong gut opinions here - other than from my own experience of using the product. I found the difference between sessions and session recordings is not clear - especially if your main focus is using session recording. The concept of navigating session recordings through session events was uniquely valuable whilst trying to understand why users were or were not doing something.

I kind of agree, but if we're rethinking this anyways it might make sense to talk with customers and revisit this anyways. The devil is in the details and our concept of a session might need tweaking!

I agree with this - we should spend some time with our users getting to know why they use sessions / session recordings and the root problem they're trying to solve by doing it, this should give us a clearer idea of the direction we should go in. @paolodamico thoughts?

@kpthatsme
Contributor

@macobo awesome write-up, I feel like sessions has been a bit of an elephant in the room :)

Unfortunately I really don't have any strong ideas here yet. I don't think any product today solves the session analysis use case out of the box for the 80%, because measuring sessions seems to be one of the more arbitrary kinds of analysis.

It's tough because I think a lot of what would be measured in sessions can be measured in trends, funnels, session recordings, and retention – which arguably are better kinds of analyses today.

Let's say for example – we had a really flexible systematic way for implementors to start and end sessions – would it be more beneficial to roll that type of meaning into our other kinds of analyses (i.e. insights can now look at $session events and do time-based breakdowns)?

I'm not sure what the answer is to all this yet, but those are some of my thoughts. Excited to explore more about what the right thing to build here is.

@marcushyett-ph
Contributor

marcushyett-ph commented Jun 30, 2021

What is the purpose of Sessions?

Context

Based on the question Karl posed, I wanted to take a step back and think about why people need to use sessions or session recordings, to see if this can help us prioritize and build a better product. Warning: this is a bit of a long read and goes in a few quite different directions to help explore the purpose of sessions.

Analogy

I’m going to use the analogy of investigating a crime to help frame up the purpose of sessions.

For product analytics, we need to invert this analogy, we actually want as many people to be as successful as possible - so we’re trying to do the opposite to a detective, we already know who committed the crime (successful users) and we want to work out why everyone else didn’t commit the crime.

When investigating a crime, detectives are looking to identify 3 things about any potential suspect:

  • Means: Did they have the capability to commit the crime
    • Product: Do they have the tool they need to solve their problem?
  • Motive: Did they have a reason to commit the crime
    • Product: Do they have a problem that’s worth solving?
  • Opportunity: Were they in the right place at the right time to commit the crime
    • Product: Could they find the tool they need to solve their problem?

Types of Evidence

When investigating a crime there are a ton of different ways evidence can be broken down, for simplicity I’m going to stick to the following:

  • Primary: A witness directly saw or heard the crime being committed
    • Product:
      • User interviews
      • Emails or slack messages from users
      • User Reviews
  • Secondary: We have evidence that indirectly alludes to the crime being committed (e.g. a photo showing they were at a specific location on a specific day)
    • Product:
      • Sessions & session recordings
      • Funnels
      • Trends

How does Secondary Evidence help us understand Means, Motive and Opportunity?

As highlighted above, sessions and session recordings are secondary evidence, so we cannot rely on them to give us the full picture of what happened and must make some interpretations as to what they might mean.

Using the framework for committing a crime above we can consider how sessions help us validate these dimensions:

  • Means
    • Does the user have the skills required to use the product (e.g. don’t have the skills to interpret the data)
      • Evidence we might expect to see:
        • They go to a screen and spend time there without progressing to the next screen
        • They click around aimlessly looking for something familiar (triggering lots of up funnel events in a short period of time)
        • They’re looking at tooltips and documentation
  • Motive
    • Does the user have a problem that they need this product to solve (e.g. They don’t have any intent to use the product, they just saw it on reddit and thought it was interesting)
      • Evidence we might expect to see from sessions:
        • Dropping-off early in the onboarding experience
        • Using fake / spurious data to fill in forms
        • Rapidly skipping steps in the onboarding experience without reading further
  • Opportunity
    • Does the user find the feature they’re looking for to solve their problem (e.g. They have the skills to interpret the data but they cannot find it)
      • Evidence we might expect to see from sessions:
        • Rage clicks when they have been in a certain part of the product for an extended period of time
        • Methodic searching of features, clicking every button within a certain area of the product
        • Searching documentation and retracing their steps back through the product

Can you summarise this into something snappier?

Sure, I believe the main reasons people want to use sessions are as follows:

  • Identify Opportunities:
    • In what parts of the product are people trying to do more than the product enables them to do today?
    • How do people respond to changes in our product?
    • How can we make important features easy to discover?
  • Understand Confusions:
    • Can people work out how to use our product?
    • Are people trying to use the product in a way it wasn’t intended?
    • Do we have the right guidance and documentation to unblock users?
  • Experience Frustrations
    • What bugs are people encountering and what are the consequences?
    • Where are they getting stuck?
    • How might they feel when something goes wrong?

Is there an alternative abstraction we could use other than a session?

Since users do a lot on a product, it’s hard to navigate a long continuous list of events, actions or recordings. Sessions are a simple way to break down their actions into chunks that are likely related to each other and make it easier to navigate a potentially large amount of data.

Potentially, by removing the session abstraction altogether, we could group events and recordings by 4 dimensions to help people navigate to the exact event information they need, without exposing the session concept at all:

  • Who
    • Find all events and recordings for certain people
  • When
    • Find all events and recordings that happened during a certain time period
  • Where
    • Find all events and recordings that took place in a certain part of the product
  • Type
    • Find all recordings where a certain type of event occurred
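
As a rough illustration of the four dimensions, the sketch below filters a flat list of events with no session concept involved. The `Event` fields, the `find_events` and `find_recordings` helpers, and the event names are hypothetical and chosen only for this example; "where" is approximated by a URL prefix.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Event:
    person_id: str
    timestamp: datetime
    url: str
    name: str
    recording_id: Optional[str] = None  # None when nothing was recorded


def find_events(events, who=None, when=None, where=None, event_type=None):
    """Filter events by who (set of person ids), when ((start, end) tuple),
    where (URL prefix) and event_type (event name). None means "any"."""
    matches = []
    for e in events:
        if who is not None and e.person_id not in who:
            continue
        if when is not None and not (when[0] <= e.timestamp <= when[1]):
            continue
        if where is not None and not e.url.startswith(where):
            continue
        if event_type is not None and e.name != event_type:
            continue
        matches.append(e)
    return matches


def find_recordings(events, **filters):
    """Return the ids of recordings that contain at least one matching event."""
    matched = find_events(events, **filters)
    return sorted({e.recording_id for e in matched if e.recording_id})


t0 = datetime(2021, 6, 30, 9, 0)
events = [
    Event("alice", t0, "/insights/trends", "pageview", "rec-1"),
    Event("alice", t0 + timedelta(minutes=10), "/sessions", "rageclick", "rec-1"),
    Event("bob", t0 + timedelta(minutes=20), "/sessions", "rageclick", "rec-2"),
    Event("bob", t0 + timedelta(days=1), "/sessions", "pageview"),
]

rageclick_recordings = find_recordings(events, where="/sessions", event_type="rageclick")
alice_early = find_recordings(events, who={"alice"}, when=(t0, t0 + timedelta(minutes=5)))
```

The same four arguments are close to what a trends filter already expresses, which is the point made in the next paragraph.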

What's interesting about this abstraction (and food for thought) is that it is very similar to how filters in trends work today. If we integrate session information and recordings well into trends, we might be able to do without a separate sessions tool.

This is just some early thinking - would appreciate any further thoughts and feedback.

@clarkus
Contributor

clarkus commented Jul 2, 2021

I like the inverted crime-solving analogy and I think it makes a lot of sense. While this is targeted at "catching the perpetrator" there could be a secondary goal of identifying other possible perpetrators through their common attributes and behaviors. If we can identify an archetypal user, someone who really matched our desired behaviors, that archetype could be used as a model for matching against other users. Imagine finding this archetype, and then a simple action that said something to the effect of "show me more users like this". Secondary to that kind of powerful action, we could roll up common attributes for a given set of sessions and show distinct user lists and a weighted list of attributes based on their recurrence across sessions. This is weird to say, but take FBI profiling and apply that to session exploration as it relates to persons. It could be pretty powerful.

This is all very focused on the who aspect of sessions, but similar solutions could be applied to describe the when, where, and type aspects of sessions.

There is some related discussion at #4960 (comment) and https://github.com/PostHog/product-internal/issues/92#issuecomment-872602869.

@paolodamico
Contributor

Love that you brought this up @macobo! (and apologies for the delay).

Proposal (TLDR)

  • Run an experiment based on @mariusandra's proposal for users who have session recordings enabled, where we put the current sessions page in a separate tab and default to a session recordings list (in the persons page as well). Let's measure how many users still use the sessions page (and get some qualitative feedback on this too).
  • @clarkus and I can explore a different way of visualizing the event stream, particularly in the scope of a person, where we provide these exploratory capabilities in a way that better solves these "reverse-detective" needs and causes less confusion.
  • A bigger question I think is worth exploring. Should we have sessions-based analytics? Our current capabilities here are quite limited. Session-based analytics is also traditionally used more for web analytics vs. product analytics. Thoughts?

Rationale / My thoughts:

  • My gut feeling is that people use the sessions page to explore and discover what they don't know (aligned to the cases @marcushyett-ph pointed out), and it's preferable to events because it is indeed a bit easier to digest. Not only that, but it's pretty much the only view that aligns with this exploratory mental model (Events are a continuous stream, you can very easily lose your place in line and having a mix of users makes it hard to listen to a story. With persons, it might be easier, but you still have to find the right person to look at (i.e. extra effort)).
  • Identifying periods of inactivity is useful. When you're going through the stream of events from a user to investigate, clearly seeing drop-offs helps understand the story. Even the "arbitrary" 30-min dropoff is useful now. Perhaps different values will be relevant for different products. Is there a better way to visualize this?
  • Finally, our sessions-based analytics are quite bare and having this session page introduces confusion and missed expectations.

@macobo
Contributor Author

macobo commented Aug 3, 2021

Given this will come into focus next sprint - any new thoughts and developments here, input from customers? cc @paolodamico

@paolodamico
Contributor

I don't have any new context from users at this point but do hope to be able to provide more soon. I do think the proposed experiment approach could be a solid way to start. We could in the meantime consider alternative approaches / use cases for this view.

FYI we have this doc in which we're starting to discuss the potential sub-focus (theme) for the next sprint (e.g. session recording), and other potential approaches to sessions.

@macobo
Contributor Author

macobo commented Aug 10, 2021

So I spent today looking at a bunch of session recording tools and I think I have an approach similar to @paolodamico's in mind.

Read https://github.com/PostHog/product-internal/issues/127#issuecomment-895908203 before this!

From first principles:

Key reasons to use sessions / session recording

Adapted from https://www.fullstory.com/resources/the-definitive-guide-to-session-replay

  1. Reproduce and solve bugs
  2. Supporting customers via context
  3. Conversion rate optimization
    • Special case: Understand and improve onboarding [UX]
  4. Improve user experience:
    • Generally, looking at how users are using $feature
    • Understand and improve onboarding [UX]

We're not currently excelling at any of these.

What to focus on

Strategically I'd focus on 3 and 4 for now. 2 will become good on its own when 3-4 are great, and 1 requires extra tooling built on top (e.g. network/console level capture), which is a distraction right now.

In conversion rate optimization/nailing diagnosis, step 3 is analyzing things qualitatively, that's where session recording comes in.

What role does sessions play here

  • If you’re doing conversion rate analysis, allow users to “jump in” and see sampling of recordings from other contexts
  • Sessions page allows users to find the sessions they’d like to watch based on advanced criteria
  • To make that less cumbersome (so it doesn’t feel like reinventing the wheel every time), the tooling often allows using “saved filters” or “segments”.
  • Also predefine some key and tricky ones to get people off to the races.
    - E.g. “user experienced an error” or “frustrating experience” or “dead clicks” etc
  • Sessions should bring the most interesting sessions to the forefront (e.g. errors, frustrations)

Where are we going wrong right now

I think the big issue is that there’s sessions and session replays. Users don’t care about the difference!

Minor issues are:

  • Bad UX
  • Incomplete filtering (e.g. no frustration/error/etc measures/events)

How to improve

  • Let’s scrap the current sessions page (or in more practical ways, make a new page which is the default when session recording is turned on).
  • Make a page focused on session recordings.
    • When showing a session recording, do an “inverse” lookup - look up events for person_id during $timerange
    • This removes the feeling of “missing sessions” - short bouncy sessions get removed
  • Scrap “list of events” under recording - clicking on session jumps to recording
    • Show list of events only there.
    • If we have good ideas re summarization later I think we can use those. However I think we can delay this feature.
  • Capability-wise, let’s extend the filters to allow looking up people from funnels, etc. E.g. run the funnel query, get time per person and then look up the session recording.
    • We don't want users to start constructing these from within sessions but if we nail diagnosing causes quantitatively this comes in handy.
  • Let’s add predefined filters for “frustration”, “dead clicks”, etc.
  • Focus on improving the player. E.g. skipping inactive, better “affordances” on the bar around when events and inactivity occurs.
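
The "inverse" lookup and the funnel-to-recording lookup from the list above could look roughly like the sketch below. It assumes one row per recording with a person and a time range; the `Recording` and `Event` shapes and the helper names `events_for_recording` and `recording_at` are hypothetical, not an existing PostHog API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Recording:
    id: str
    person_id: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Event:
    person_id: str
    timestamp: datetime
    name: str


def events_for_recording(recording: Recording, events: List[Event]) -> List[Event]:
    """Inverse lookup: events for the recording's person within its time range."""
    matches = [
        e
        for e in events
        if e.person_id == recording.person_id
        and recording.start <= e.timestamp <= recording.end
    ]
    return sorted(matches, key=lambda e: e.timestamp)


def recording_at(person_id: str, at: datetime, recordings: List[Recording]) -> Optional[Recording]:
    """Funnel-style lookup: the recording covering a given person and moment, if any."""
    for r in recordings:
        if r.person_id == person_id and r.start <= at <= r.end:
            return r
    return None


t0 = datetime(2021, 8, 10, 14, 0)
rec = Recording("rec-1", "person-1", t0, t0 + timedelta(minutes=15))
events = [
    Event("person-1", t0 + timedelta(minutes=10), "signup"),
    Event("person-1", t0 + timedelta(minutes=2), "pageview"),
    Event("person-1", t0 + timedelta(minutes=40), "pageview"),  # after the recording ended
    Event("person-2", t0 + timedelta(minutes=5), "pageview"),   # someone else
]

names = [e.name for e in events_for_recording(rec, events)]
hit = recording_at("person-1", t0 + timedelta(minutes=10), [rec])
miss = recording_at("person-1", t0 + timedelta(minutes=40), [rec])
```

Because the recording is the starting point, a recording with no matching events still gets its row; it simply has an empty event list.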

Note this list is not logistical, just some musings on how I'd do this in isolation. I'd prefer to solve step 2 - diagnose causes qualitatively first :)

@paolodamico
Contributor

Very aligned with you @macobo, and I think this context will be extremely relevant for everyone working on this. On the specifics of what to actually work on for the next sprint, here's what I'm proposing:

  • Capturing other useful events/properties & allowing filtering and ordering (will help with prioritization), like “frustration”, “dead clicks”, requests load time, page performance / resource usage, bounces, etc.
  • Focus on improving the player. eg. skipping inactive, improving the playback bar, better “affordances” on the bar around when events and inactivity occurs, clarify redacted inputs, …
  • Capturing and reliability issues (Sentry errors, compression errors, ….)
  • Improving the play page experience ?, clear and useful list of events, additional useful context ** unsure if we have enough context for this.

Specifically I've been thinking about the sessions / recordings page and I would like to challenge it. My strongest argument against working on this right now is that this page will not actually be solving for our Diagnosing Causes goal. It's certainly helpful (particularly for the other use cases of session recording) and will definitely be needed to avoid confusion even with Diagnosing Causes, but it seems tangential. This being said, I'm making a proposal anyways to justify getting rid of the sessions page (with numbers & user feedback) and creating an A/B test for this, but it might not be the right time to ship this yet. Thoughts?

@macobo
Contributor Author

macobo commented Aug 11, 2021

My strongest argument against working on this right now is that this page will not actually be solving for our Diagnosing Causes goal.

I agree - the goal of my previous post was to outline a longer-term solution (and how it fits into the larger scheme), not to outline the plan for the next sprint :) However let's take a look now!

Thoughts on logistics/tactics

So the improvements you listed fall into three camps:

1. Improving the player / play page

Improving this is in no way blocked by the improvements listed above.

However, the improvements here IMO only softly align with "Diagnosing causes", since they're too far down the hierarchy of things a user needs to do.

2. Improving sessions page (via ordering, etc)

This is closer to the "Diagnosing causes" root (kind of the 3rd step). However, there is one reason why I think it's better to make these improvements after we do the steps outlined in my previous post:

Performance is tricky in the current sessions page.

  1. The way we're dynamically grouping sessions together would make e.g. improving ordering really tricky :)
  2. One large issue with the current sessions page is the date filter (you only see recordings from a single day at a time); this hurts investigation really badly at e.g. the scale we work on.

Both of these become much more trivial by making the page more bare-bones and stripping out the "sessions" -> "recordings" dichotomy.

3. Improving reliability

I don't think this is really a priority if we create a new page because:

  1. Most of the "recording is missing" reasons which are outlined in the first post (person_id vs distinct_id, mismatching times, etc) just don't exist with this page.
  2. Users also won't feel the pain as acutely

There's also a fourth hidden improvement:

4. Improving "linking" pages together

The more functionality we add here (e.g. user dropped out at Xth step in funnel) the more work we create to refactor later. See also the argument on "performance" under 2.


That said, all is not perhaps as rosy as I portrayed in my original post. Replacing the page requires figuring out some "tricky" questions like:

  1. What happens to the users page?
  2. How do we want to handle linking features together
  3. If we keep the two pages alive at the same time, do we add functionality to both?

etc.

That said, I think it's important to get this ball rolling and what you propose here makes sense imo as a first step:

This being said, I'm making a proposal anyways to justify getting rid of the sessions page (with numbers & user feedback) and creating an A/B test for this, but it might not be the right time to ship this yet.

@marcushyett-ph
Contributor

This sounds like a reasonable focus to me:

Capturing other useful events/properties & allowing filtering and ordering (will help with prioritization), like “frustration”, “dead clicks”, requests load time, page performance / resource usage, bounces, etc.
Focus on improving the player. eg. skipping inactive, improving the playback bar, better “affordances” on the bar around when events and inactivity occurs, clarify redacted inputs, …
Capturing and reliability issues (Sentry errors, compression errors, ….)
Improving the play page experience ?, clear and useful list of events, additional useful context ** unsure if we have enough context for this.

However I'd be keen to explicitly call out solving the link between funnels (or persons modal more generally) and a specific (part of a) session recording - I think this is essential for anyone to diagnose a cause using session recordings.

Would be great to work with the wider team(s) to break these down into clear projects individual team members can tackle or collaborate on during the next sprint?

@marcushyett-ph
Contributor

@macobo If (hypothetically) people are using the persons modal as the main entry point to session recordings, would we not come across the same issue with some people having missing recordings etc? (e.g. we cannot just filter out people if they don't have a recording)

@macobo
Contributor Author

macobo commented Aug 11, 2021

Thing is I don't think the issue is "reliability" in most cases.

Here are some scenarios where the user is currently left with a feeling of "sessions are missing".

  1. Application sent some events from the backend (resulting in a separate "session" in the FE)
  2. User got "posthog.identify"d in the middle of the session (person_id vs distinct_id)
  3. User reads a TOC page for 40 minutes (falling asleep at the computer) but scrolls consistently. This will result in 2 "sessions" but 1 session recording
  4. User bounced on the page immediately (e.g. <5s session)
  5. Site is problematic for session recording - e.g. has deep nested HTML consistently changing, making the session recording events huge
  6. User is on a bad mobile network, we didn't manage to receive the events

Now, while 4-6 are real cases, every session recording tool struggles with them and it'll be hard to make more than incremental progress on this. However issues 1-3 are in my opinion more visible and completely caused by our own product decisions (which I'd hope to fix via solution laid out above).

For person modal - I think solving 1-3 will lead to more of an improvement there than 4-6 for e.g. funnels (which naturally take a longer time to complete)
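
Scenario 2 can be sketched as follows. The distinct_id to person mapping, the ids, and the event names below are made up for illustration; the point is only that grouping the same events by distinct_id splits them in two, while grouping by person keeps them together.

```python
from collections import defaultdict

# Hypothetical mapping: both ids belong to the same person once they are identified.
PERSON_OF = {
    "anon-123": "person-1",
    "user@example.com": "person-1",
}

# Illustrative event stream: the distinct_id changes at login.
events = [
    {"distinct_id": "anon-123", "event": "pageview"},
    {"distinct_id": "anon-123", "event": "clicked login"},
    {"distinct_id": "user@example.com", "event": "identified"},
    {"distinct_id": "user@example.com", "event": "pageview"},
]


def group_events(events, key):
    """Group event names by the value returned from key(event)."""
    groups = defaultdict(list)
    for e in events:
        groups[key(e)].append(e["event"])
    return dict(groups)


by_distinct_id = group_events(events, key=lambda e: e["distinct_id"])
by_person = group_events(events, key=lambda e: PERSON_OF[e["distinct_id"]])
```

Here `by_distinct_id` has two groups and `by_person` has one, so a single tab-based recording would have to be attached to one of two "sessions" in the first case.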

@paolodamico
Contributor

Alright, created a PR (PostHog/posthog.com#2028) to finish the discussion and reach a final conclusion as this issue has become too bloated. Keeping this issue around because there are a bunch of issues which will likely be solved by whatever ends up being the final solution, and we should update those.

@PostHog PostHog locked as resolved and limited conversation to collaborators Sep 17, 2021
@paolodamico
Contributor

We're still crossing a few ts (pun intended), but the fundamental issue has been addressed, closing.

7 participants