Sentry fails to catch up when QPS is higher than 1000 #3471

ZLBillShaw · 2024-12-13T03:57:31Z

Self-Hosted Version

24.9.0

CPU Architecture

x86_64

Docker Version

kubernetes

Docker Compose Version

kubernetes

Machine Specification

My system meets the minimum system requirements of Sentry

Steps to Reproduce

When my SDK captures a high volume of errors, events show up in Sentry with a delay. I suspect the Kafka consumers are not keeping up.
I’ve increased maxBatchSize to 10,000 for postProcessForwardErrors, workerEvents, ingestConsumerEvents, outcomesConsumer, and replacer, but the delays persist. Should I adjust the number of partitions for the topics? Below is my topic configuration:

topics:
      - name: events
        # Number of partitions for this topic
        partitions: 30
        config:
          "message.timestamp.type": LogAppendTime
      - name: event-replacements
        partitions: 10
      - name: snuba-commit-log
        partitions: 10
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: cdc
      - name: transactions
        partitions: 20
        config:
          "message.timestamp.type": LogAppendTime
      - name: snuba-transactions-commit-log
        partitions: 10
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: snuba-metrics
        config:
          "message.timestamp.type": LogAppendTime
      - name: outcomes
        partitions: 20
      - name: outcomes-billing
        partitions: 20
      - name: ingest-sessions
      - name: snuba-sessions-commit-log
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: snuba-metrics-commit-log
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: scheduled-subscriptions-events
        partitions: 20
      - name: scheduled-subscriptions-transactions
        partitions: 10
      - name: scheduled-subscriptions-sessions
      - name: scheduled-subscriptions-metrics
      - name: scheduled-subscriptions-generic-metrics-sets
      - name: scheduled-subscriptions-generic-metrics-distributions
      - name: scheduled-subscriptions-generic-metrics-counters
      - name: events-subscription-results
        partitions: 20
      - name: transactions-subscription-results
        partitions: 10
      - name: sessions-subscription-results
      - name: metrics-subscription-results
      - name: generic-metrics-subscription-results
      - name: snuba-queries
        partitions: 20
        config:
          "message.timestamp.type": LogAppendTime
      - name: processed-profiles
        config:
          "message.timestamp.type": LogAppendTime
      - name: profiles-call-tree
      - name: ingest-replay-events
        config:
          "message.timestamp.type": LogAppendTime
          "max.message.bytes": "15000000"
      - name: snuba-generic-metrics
        config:
          "message.timestamp.type": LogAppendTime
      - name: snuba-generic-metrics-sets-commit-log
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: snuba-generic-metrics-distributions-commit-log
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: snuba-generic-metrics-counters-commit-log
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: generic-events
        partitions: 20
        config:
          "message.timestamp.type": LogAppendTime
      - name: snuba-generic-events-commit-log
        partitions: 20
        config:
          "cleanup.policy": "compact,delete"
          "min.compaction.lag.ms": "3600000"
      - name: group-attributes
        partitions: 20
        config:
          "message.timestamp.type": LogAppendTime
      - name: snuba-attribution
        partitions: 20
      - name: snuba-dead-letter-metrics
      - name: snuba-dead-letter-sessions
      - name: snuba-dead-letter-generic-metrics
      - name: snuba-dead-letter-replays
      - name: snuba-dead-letter-generic-events
        partitions: 10
      - name: snuba-dead-letter-querylog
        partitions: 10
      - name: snuba-dead-letter-group-attributes
        partitions: 10
      - name: ingest-attachments
        partitions: 20
      - name: ingest-transactions
        partitions: 20
      - name: ingest-events
        ## If the number of exceptions increases, it is recommended to increase the number of partitions for ingest-events
        partitions: 30
      - name: ingest-replay-recordings
      - name: ingest-metrics
      - name: ingest-performance-metrics
      - name: ingest-monitors
      - name: profiles
      - name: ingest-occurrences
        partitions: 25
      - name: snuba-spans
      - name: shared-resources-usage
      - name: snuba-metrics-summaries
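
One way to reason about the partition question: within a single Kafka consumer group, each partition is assigned to at most one consumer, so a topic's partition count caps how many consumer replicas can work in parallel. The sketch below is only a rough sizing aid. The per-consumer throughput figure is a hypothetical placeholder that would have to be measured on the actual deployment, and `consumers_needed` and `effective_consumers` are illustrative helpers, not part of Sentry or Kafka.

```python
import math


def consumers_needed(target_qps: float, per_consumer_qps: float) -> int:
    """Smallest number of consumers whose combined throughput covers target_qps."""
    if target_qps < 0 or per_consumer_qps <= 0:
        raise ValueError("target_qps must be >= 0 and per_consumer_qps must be > 0")
    return math.ceil(target_qps / per_consumer_qps)


def effective_consumers(replicas: int, partitions: int) -> int:
    """Consumers in one group that can actually be assigned a partition."""
    return min(replicas, partitions)


# Hypothetical numbers: 1000 events/s arriving, 40 events/s per consumer (assumed).
needed = consumers_needed(1000, 40)
print("consumers needed:", needed)
# ingest-events has 30 partitions in the config above.
print("consumers that can be active:", effective_consumers(needed, 30))
```

If the required number of consumers exceeds the partition count, the extra replicas sit idle, which is the case where raising the partition count (together with the replica count) could help.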

Expected Result

The most recently processed event in Sentry should match the event I just captured, with no noticeable delay.

Actual Result

The delay before the latest errors appear keeps growing as QPS increases.
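
A delay that keeps growing is consistent with a backlog building up whenever events arrive faster than they are consumed. The arithmetic can be sketched as follows; the rates are purely hypothetical and were not measured on this deployment, and both helper functions are illustrative names only.

```python
def backlog_after(seconds: float, arrival_qps: float, consume_qps: float,
                  start: float = 0.0) -> float:
    """Messages waiting after `seconds`, assuming constant rates; never below zero."""
    return max(0.0, start + (arrival_qps - consume_qps) * seconds)


def delay_seconds(backlog: float, consume_qps: float) -> float:
    """Approximate wait for a newly arrived event queued behind the backlog."""
    if consume_qps <= 0:
        raise ValueError("consume_qps must be > 0")
    return backlog / consume_qps


# Hypothetical: 1000 events/s arriving, 800 events/s consumed, for 10 minutes.
lag = backlog_after(600, arrival_qps=1000, consume_qps=800)
print("backlog:", lag, "messages")
print("approximate delay:", delay_seconds(lag, 800), "seconds")
```

Under these assumptions the delay only stops growing once the consumption rate matches or exceeds the arrival rate.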

Event ID

No response


github-project-automation bot added this to Self-hosted Sentry Dec 13, 2024

getsantry bot added the Waiting for: Product Owner label Dec 13, 2024

getsantry bot added this to GitHub Issues with 👀 3 Dec 13, 2024

getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 13, 2024
