Add more metrics to the stats #1128

Closed
josecelano opened this issue Dec 12, 2024 · 1 comment · Fixed by #1130


josecelano commented Dec 12, 2024

The tracker collects these metrics:

{
  "torrents": 171983,
  "seeders": 41739,
  "completed": 19352,
  "leechers": 130658,
  "tcp4_connections_handled": 43,
  "tcp4_announces_handled": 43,
  "tcp4_scrapes_handled": 0,
  "tcp6_connections_handled": 0,
  "tcp6_announces_handled": 0,
  "tcp6_scrapes_handled": 0,
  "udp4_connections_handled": 13970767,
  "udp4_announces_handled": 36213856,
  "udp4_scrapes_handled": 819566,
  "udp4_errors_handled": 11157054,
  "udp6_connections_handled": 0,
  "udp6_announces_handled": 0,
  "udp6_scrapes_handled": 0,
  "udp6_errors_handled": 0
}

that are exposed via an API endpoint (https://127.0.0.1/api/v1/stats?token=MyAccessToken).
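As a rough illustration of how a client might consume this payload, the sketch below parses the sample above and derives a few aggregate values. The `summarize` helper and the idea of comparing errors against handled requests are assumptions made for this example, not part of the tracker; a real client would fetch the JSON from the stats endpoint rather than use a literal string.

```python
import json

# Sample payload copied from the issue text. A real client would fetch it
# from the stats endpoint instead of embedding it.
STATS_JSON = """
{
  "torrents": 171983,
  "seeders": 41739,
  "completed": 19352,
  "leechers": 130658,
  "udp4_connections_handled": 13970767,
  "udp4_announces_handled": 36213856,
  "udp4_scrapes_handled": 819566,
  "udp4_errors_handled": 11157054
}
"""

# Counters that are summed to get the number of UDP/IPv4 requests "handled".
UDP4_REQUEST_KEYS = (
    "udp4_connections_handled",
    "udp4_announces_handled",
    "udp4_scrapes_handled",
)


def summarize(stats: dict) -> dict:
    """Derive a few aggregates from a stats payload (hypothetical helper)."""
    udp4_handled = sum(stats[key] for key in UDP4_REQUEST_KEYS)
    return {
        "peers": stats["seeders"] + stats["leechers"],
        "udp4_handled": udp4_handled,
        # Assumption: errors are compared against handled requests only to
        # get a rough sense of scale; the tracker does not define this ratio.
        "udp4_errors_per_handled": (
            stats["udp4_errors_handled"] / udp4_handled if udp4_handled else 0.0
        ),
    }


summary = summarize(json.loads(STATS_JSON))
print(summary)
```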

I've been working on the integration with Prometheus and Grafana, and I have realized there are critical missing metrics that could help identify problems.

For example, regarding udp4_announces_handled, we actually increase the counter after calling the core tracker but before sending the response to the client. That means a counted request might not have been fully handled at all. In the current implementation, "handled" only means that the request has at least reached the core tracker.

We use a ring buffer for the active requests, and requests being processed can be aborted. I would like to know:

  1. The number of requests that have been received (with the receiver). At this point, we don't even know the type.
  2. The number of requests that have been aborted (force push in the ring buffer). We could know the type, but that would require some changes, and it could decrease performance because we need to ask the processor about the current state of processing.
  3. The number of responses successfully sent to the client.

NOTE: we have to check the impact on performance after adding these new metrics.

Proposal for names:

  • udp4_requests
  • udp4_requests_aborted
  • udp4_responses_sent

Same for udp6.
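To make the three measurement points concrete, here is a minimal sketch of where each proposed counter could be incremented around a bounded buffer of active requests. It is not the tracker's implementation: the `ActiveRequests` class, its capacity, and the "evict the oldest entry" policy are assumptions chosen for illustration, and only the counter names come from the proposal above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Udp4Stats:
    """Counters named after the proposal; everything else is hypothetical."""
    udp4_requests: int = 0
    udp4_requests_aborted: int = 0
    udp4_responses_sent: int = 0


class ActiveRequests:
    """Bounded buffer of in-flight requests (stand-in for the ring buffer)."""

    def __init__(self, capacity: int, stats: Udp4Stats) -> None:
        self._slots: deque = deque()
        self._capacity = capacity
        self._stats = stats

    def receive(self, request_id: int) -> None:
        # 1. Count every request as soon as it is received, before its
        #    type is known.
        self._stats.udp4_requests += 1
        if len(self._slots) == self._capacity:
            # 2. Force push: the buffer is full, so the oldest in-flight
            #    request is aborted to make room (assumed eviction policy).
            self._slots.popleft()
            self._stats.udp4_requests_aborted += 1
        self._slots.append(request_id)

    def respond(self, request_id: int) -> bool:
        try:
            self._slots.remove(request_id)
        except ValueError:
            return False  # The request was aborted before a response was sent.
        # 3. Count the response only once it has actually been sent.
        self._stats.udp4_responses_sent += 1
        return True


stats = Udp4Stats()
active = ActiveRequests(capacity=2, stats=stats)
for request_id in range(3):
    active.receive(request_id)
results = [active.respond(request_id) for request_id in range(3)]
print(stats, results)
```

With a capacity of 2 and three requests arriving before any response, the first request is evicted, so in this sketch the number of responses sent ends up lower than the number of requests received by exactly the number of aborted requests.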

What do you think @da2ce7? This will allow us to quickly detect problems. If it affects performance, we can enable/disable it with a configuration option.

@josecelano josecelano added Enhancement / Feature Request Something New - Developer - Torrust Improvement Experience - Admin - Enjoyable to Install and Setup our Software labels Dec 12, 2024
@josecelano josecelano self-assigned this Dec 13, 2024
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 13, 2024
In the stats endpoint the new values are:

- udp4_requests
- udp6_requests
@josecelano josecelano linked a pull request Dec 13, 2024 that will close this issue
6 tasks
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 13, 2024
In the stats endpoint the new values are:

- udp4_responses
- udp6_responses
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 13, 2024
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 13, 2024
josecelano added a commit that referenced this issue Dec 16, 2024
6ca82e9 feat: [#1128] add new metric UDP total requests aborted (Jose Celano)
9499fd8 feat: [#1128] add new metric UDP total responses (Jose Celano)
286fe02 feat: [#1128] add new metric UDP total requests (Jose Celano)

Pull request description:

  Add more metrics to the UDP tracker stats. The new values are:

  - `udp4_requests`: total number of requests received from IPv4 clients.
  - `udp6_requests`: total number of requests received from IPv6 clients.
  - `udp4_responses`: total number of responses sent to IPv4 clients.
  - `udp6_responses`: total number of responses sent to IPv6 clients.
  - `udp_requests_aborted`: total number of requests aborted to make room in the active requests buffer.

  ### Notes

  - Responses sent might differ from requests received because of aborted requests.
  - When we [merge the IP ban service](#1124), we can add a new metric for the total number of IPs banned.
  - I want to add these new metrics to the [live demo Grafana dashboard](torrust/torrust-demo#20).

  ### Subtasks

  - [x] `udp4_requests`
  - [x] `udp6_requests`
  - [x] `udp4_responses`
  - [x] `udp6_responses`
  - [x] `udp_requests_aborted`
  - [x] Benchmarking to check how it affects performance before merging it.

ACKs for top commit:
  josecelano:
    ACK 6ca82e9

Tree-SHA512: 7fbf75b264b191f5c58fcecde8d5e783bbe54ee1c1799acdddc04a9ef64b7196d8b95d1bcad420b1df269bc7929e44417a1d164c6953b00804b0d1e5f0b36e7d

josecelano commented Dec 16, 2024

Hi @da2ce7, after running the demo tracker for approximately 6 hours with the new metrics, I'm getting this data:

With 880 req/sec, we have ~3 requests aborted in the ring buffer (active requests).
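Figures like these can be derived from two samples of the cumulative counters. The sketch below shows one way to do that; the `per_second_rates` helper and both snapshots are made up for illustration. The comment above does not state the time window for the ~3 aborted requests, so reading it as a per-second rate here is an assumption.

```python
def per_second_rates(before: dict, after: dict, elapsed_seconds: float) -> dict:
    """Turn two snapshots of cumulative counters into per-second rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        key: (after[key] - before[key]) / elapsed_seconds
        for key in after
        if key in before
    }


# Hypothetical snapshots taken 60 seconds apart. The values are invented and
# only chosen to resemble the scale mentioned in the comment.
before = {
    "udp4_requests": 1_000_000,
    "udp4_responses": 999_000,
    "udp_requests_aborted": 100,
}
after = {
    "udp4_requests": 1_052_800,
    "udp4_responses": 1_051_620,
    "udp_requests_aborted": 280,
}

rates = per_second_rates(before, after, elapsed_seconds=60.0)
# Fraction of received requests that were aborted during the window.
aborted_fraction = rates["udp_requests_aborted"] / rates["udp4_requests"]
print(rates, aborted_fraction)
```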

