Implement Auto Session Tracking #1290

Closed
antonpirker opened this issue Feb 21, 2022 · 16 comments

Comments

@antonpirker
Member

antonpirker commented Feb 21, 2022

Sentry can monitor the health of releases by checking session data it receives from the SDK.
Other SDKs already collect this session data automatically: the Python SDK does, and the Ruby SDK now does as well. We also want this for PHP.

For reference see:

We want to implement request-mode sessions, which are aggregated in the SDK (as opposed to application-mode sessions, which are sent as soon as they finish). The main reason is the scale of most PHP servers out there, which would otherwise overload the Sentry ingestion pipelines.

In the Ruby and Python implementations, a session is one request-response cycle, and a SessionFlusher running in a separate thread collects the session data and sends it to the server in bulk once a minute.

Basically, this is what the feature should do when enabled:

  • Decide when a session should be created and start it in memory (set the status of the ongoing session to ok; also set release, environment, user, and session_mode).
  • Decide when a session should end and, when that happens, update the status of the current session to exited.
  • When an error is raised, set the status of the current session to crashed or errored.
  • Every 60 seconds, aggregate all current sessions into one JSON payload as described here: https://develop.sentry.dev/sdk/sessions/#session-aggregates-payload
  • When the aggregation is done, send the JSON payload to the Sentry server.
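
The aggregation step in the list above could look roughly like the following. This is only a sketch, written in Python because that is one of the SDKs that already has this feature; it is not taken from any Sentry SDK. The class name `SessionAggregator` and its methods are hypothetical, and the payload field names (`aggregates`, `attrs`, `started`, `exited`, `errored`, `crashed`) are assumed from the linked session-aggregates page and should be checked against it. It also leaves out the hard part for PHP, namely where the buckets live between requests.

```python
from collections import defaultdict
from datetime import datetime, timezone


class SessionAggregator:
    """Hypothetical in-memory aggregator for request-mode sessions."""

    # Final statuses a finished session can be counted under (assumed names).
    FINAL_STATUSES = ("exited", "errored", "crashed")

    def __init__(self, release, environment):
        self.attrs = {"release": release, "environment": environment}
        # One bucket of counters per minute in which sessions started.
        self._buckets = defaultdict(lambda: dict.fromkeys(self.FINAL_STATUSES, 0))

    def add_session(self, started, status):
        """Count one finished session; `started` must be a timezone-aware datetime."""
        if status not in self.FINAL_STATUSES:
            raise ValueError(f"unexpected session status: {status!r}")
        # Round the start time down to the minute so sessions share a bucket.
        minute = started.astimezone(timezone.utc).replace(second=0, microsecond=0)
        self._buckets[minute][status] += 1

    def flush(self):
        """Return one aggregated payload and reset, or None if nothing was recorded."""
        if not self._buckets:
            return None
        aggregates = []
        for minute, counts in sorted(self._buckets.items()):
            entry = {"started": minute.strftime("%Y-%m-%dT%H:%M:%SZ")}
            # Only include counters that are non-zero.
            entry.update({name: n for name, n in counts.items() if n})
            aggregates.append(entry)
        self._buckets.clear()
        return {"aggregates": aggregates, "attrs": dict(self.attrs)}
```

In the threaded SDKs, something like `flush()` would be called from the background flusher once a minute and the result handed to the transport.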

Because PHP is single-threaded, this issue should be the start of a discussion on how this can be achieved, if it can be achieved at all.

You can also look at how this was done in Ruby: getsentry/sentry-ruby#1715

@stayallive
Collaborator

For reference: #1254 is related to the problem of having a storage/buffer that spans multiple requests, plus a flusher task, in PHP. This is probably impractical without external (buffer) storage and a task runner.

@Jean85
Collaborator

Jean85 commented Feb 21, 2022

Exactly. In PHP we have no native way to batch that information anywhere, because we cannot assume any external infrastructure or thread that we could leverage to do something like this out of the box.

Maybe something is feasible in the framework integrations, but that too would require a couple of assumptions or something that has to be manually enabled.

@antonpirker
Member Author

Do Laravel or Symfony have something that can be used out of the box? Having this only in one framework is also a possibility.

@smeubank
Member

smeubank commented Feb 21, 2022

I wonder if we are building up a number of use cases for a sidecar approach. What that might look like is certainly open for discussion. But say, for example, a self-hosted Relay that the SDK sends such information to, and that handles the complexity needed for client reports, session tracking, and other future features for gathering performance-type data to aggregate and send to Sentry.

edit: not a real agent but a sidecar service

@mfb
Contributor

mfb commented Feb 21, 2022

For apps that have a queue worker (which is a lot of apps, though not all), the app could give the SDK a callback for adding items (JSON, or JSON plus any other metadata needed to send the request) to the queue, and the queue worker would then need one or more SDK methods to aggregate and send off the items. This logic could actually be used for all Sentry events, so that they can be sent separately from the request process (which you often want to keep free for handling requests) and aggregated together for efficiency. This is basically the app bringing its own agent, I guess.
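
As a rough illustration of that idea (Python for brevity; every name here, including `QueueingTransport`, `capture`, and `drain_and_aggregate`, is hypothetical and not part of any Sentry SDK), the app hands the SDK an enqueue callback, and the queue worker later aggregates whatever was queued:

```python
import json
from collections import Counter
from typing import Callable, Iterable


class QueueingTransport:
    """Hypothetical transport that defers sending by pushing items onto the app's queue."""

    def __init__(self, enqueue: Callable[[str], None]):
        # `enqueue` is supplied by the app or framework integration.
        self._enqueue = enqueue

    def capture(self, item: dict) -> None:
        # Serialize during the request; no network call happens here.
        self._enqueue(json.dumps(item))


def drain_and_aggregate(raw_items: Iterable[str]) -> dict:
    """Run by the queue worker: count sessions by status and collect other events."""
    session_counts = Counter()
    events = []
    for raw in raw_items:
        item = json.loads(raw)
        if item.get("type") == "session":
            session_counts[item["status"]] += 1
        else:
            events.append(item)
    return {"session_counts": dict(session_counts), "events": events}
```

A framework integration would pass its own queue's push function as `enqueue` and call `drain_and_aggregate` from a worker job before sending the result to Sentry.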

@antonpirker
Member Author

Sounds like an idea for how to do this. Could the SDK discover the queue worker by itself and hook into it, so the SDK can use the queue worker without the user needing to set anything up by hand?
And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

@mfb
Contributor

mfb commented Feb 22, 2022

I don't think there is any commonality in how frameworks set up queues, and the SDK is pretty abstract, so I think this would have to happen at the level of integration plugins/libraries that have more awareness of the particular app/framework they are running in.

@mfb
Contributor

mfb commented Feb 22, 2022

But if the SDK made it possible, then that integration could wire it up.

@Jean85
Collaborator

Jean85 commented Feb 23, 2022

And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

I would say that it's something that happens a lot with bigger dev teams and apps. Having background workers is becoming more common thanks to libraries like Symfony Messenger, but it's still something that you add long after the launch of your app, and it's set up manually.

Auto-discovery is probably partially doable in the Symfony integration, but you would still be injecting workload into the users' queues, and that could be troublesome; I would strongly object to having that in opt-out mode.

@HazAT
Member

HazAT commented Feb 23, 2022

I think that to make this applicable/accessible in most cases, we might need to go down the route that @smeubank briefly mentioned: an agent (Relay).
While we already support Relay acting as a kind of agent, there is still a lot of room for improvement to make the experience more seamless.
For example, we could do something like Scout APM: download the agent in the background on first-time use and then run it on the side. See https://github.com/scoutapp/scout-apm-php/blob/eaf275883dd2640ea2ad9ed6e568314554e334f0/src/CoreAgent/Downloader.php#L100

I am not saying this is the way, I am just not sure if adding support for Sessions only for Laravel and only if you run background queues/workers makes sense.

@ste93cry
Collaborator

ste93cry commented Feb 23, 2022

For what it's worth, as a user I would never ever want something to be downloaded in the background on my behalf and run without me knowing about it. I would much prefer to have a real agent (as a PHP extension or as an external dependency), even if it means that out of the box I have one more manual step to do to set it up.

@Jean85
Collaborator

Jean85 commented Feb 23, 2022

I agree with @ste93cry; in fact, every other performance and monitoring service that I have tried (Blackfire.io, NewRelic, DataDog) goes with an extension (that eventually spawns a background process) or with a clear, dedicated agent to be deployed.

@mfb
Contributor

mfb commented Feb 23, 2022

Yeah, I think this would have to be something the SDK could provide infrastructure for, not fully automatic functionality. I maintain the Sentry integration for Drupal, and if it were possible to add Sentry events to Drupal's queue subsystem, I'd definitely have to provide various opt-in configurations around that, e.g. to make sure someone who didn't understand what was going on didn't flood their mission-critical queue with unexpected stuff :)

A PHP extension definitely makes sense for making performance tracing instrumentation easier; if it existed, it could be leveraged for other functionality as well, such as this.

@antonpirker
Member Author

Thanks everyone for the input. Really amazing!

tl;dr: The agent is probably the best way to go.

I will close this issue now and we will start discussions about the agent approach in Sentry. When we have any news, you will be the first to know!
Thanks again!

@github-actions

This comment was marked as off-topic.

@cleptric
Member

Closing this for now, we might revisit this in the future.

@cleptric closed this as not planned on Oct 20, 2022
8 participants