feat: use RTC(Run-to-completion) model to speed up cache read #2837

Merged Jul 31, 2024 (3 commits)

cheniujh (Collaborator) commented Jul 31, 2024

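The prerequisite for this PR is #2629 (authored by @chenbt-hz).

What this PR does: read requests look up the embedded, sharded Redis cache directly on the network thread, which avoids contention on the ThreadPool's task_queue and skips the cost of handing the request off to another thread.

Main conclusion: when Pika's embedded Redis cache has a high hit rate and concurrency is high, GET QPS rises to 1.5-2x the original, and p99 latency is only half of the original.

Where the performance gain comes from:

  1. The cache is queried directly on the network thread, which removes the cost of switching to a worker thread.
  2. Pika's thread pool has multiple workers competing for a single TaskQueue, and a worker waiting on that queue has no lightweight waiting strategy; it goes straight to a heavy cv.wait. Before the Redis cache existed this was probably not a bottleneck (every request hit RocksDB, so most likely Pika's upper layer was waiting on RocksDB anyway). With the Redis cache and a high hit rate, however, the hand-off through the thread pool becomes the performance bottleneck under high concurrency.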
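
To make the two points above concrete, here is a minimal C++17 sketch of the run-to-completion idea: the network thread tries the sharded cache first and only hands the request to the shared task queue on a miss. Every name in it (`ShardedCache`, `TaskQueue`, `HandleGet`) is hypothetical and chosen for illustration; none of this is Pika's actual code.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sharded in-memory cache: each shard has its own lock, so
// lookups issued from many network threads rarely contend with each other.
class ShardedCache {
 public:
  explicit ShardedCache(std::size_t shards) : shards_(shards) {}

  void Put(const std::string& key, std::string value) {
    Shard& s = ShardFor(key);
    std::unique_lock<std::shared_mutex> lock(s.mu);
    s.map[key] = std::move(value);
  }

  std::optional<std::string> Get(const std::string& key) {
    Shard& s = ShardFor(key);
    std::shared_lock<std::shared_mutex> lock(s.mu);
    auto it = s.map.find(key);
    if (it == s.map.end()) return std::nullopt;
    return it->second;
  }

 private:
  struct Shard {
    std::shared_mutex mu;
    std::unordered_map<std::string, std::string> map;
  };

  Shard& ShardFor(const std::string& key) {
    return shards_[std::hash<std::string>{}(key) % shards_.size()];
  }

  std::vector<Shard> shards_;
};

// Hypothetical stand-in for the single queue that all workers share.
class TaskQueue {
 public:
  void Push(std::function<void()> task) {
    std::lock_guard<std::mutex> lock(mu_);
    tasks_.push_back(std::move(task));
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> lock(mu_);
    return tasks_.size();
  }

 private:
  std::mutex mu_;
  std::deque<std::function<void()>> tasks_;
};

// Runs on the network thread. On a cache hit the reply is returned at once;
// on a miss the request is queued for a worker and std::nullopt is returned.
std::optional<std::string> HandleGet(ShardedCache& cache, TaskQueue& queue,
                                     const std::string& key) {
  if (auto hit = cache.Get(key)) {
    return hit;  // run to completion: no queueing, no thread hand-off
  }
  queue.Push([key] { (void)key; /* slow path: a worker would query the DB */ });
  return std::nullopt;
}
```

In this sketch a hit never touches the queue's mutex, which is the contention the description identifies; only misses pay for the hand-off.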

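PS: For stability, only GET and HGET are supported for now. In theory, any single-key read command on a data structure with caching enabled can take the RTC path (operations that are slow even in pure memory must not). We will first validate GET and HGET in production and, once they prove stable, move other eligible read commands onto the RTC path.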
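
The eligibility rule in this note can be written as a small predicate. The sketch below is an assumption-laden illustration: `CmdInfo` and its fields are hypothetical and do not mirror Pika's command table or flag names.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical per-command metadata, invented for this example.
struct CmdInfo {
  std::string name;             // lower-case command name
  bool is_read;                 // command does not modify data
  bool single_key;              // command touches exactly one key
  bool cache_enabled_for_type;  // cache is enabled for this data structure
  bool slow_in_memory;          // expensive even when served from memory
};

// Conservative allow-list: only GET and HGET until they are proven stable.
inline bool IsAllowListed(const std::string& name) {
  static const std::unordered_set<std::string> kAllowed = {"get", "hget"};
  return kAllowed.count(name) > 0;
}

// A command may take the RTC path only if it meets the theoretical
// requirements and is also on the current allow-list.
inline bool CanTakeRtcPath(const CmdInfo& cmd) {
  return cmd.is_read && cmd.single_key && cmd.cache_enabled_for_type &&
         !cmd.slow_in_memory && IsAllowListed(cmd.name);
}
```

Keeping the allow-list separate from the structural checks means further commands can be enabled later by editing one set.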

See Appendix 1 below for the detailed test report.

chenbt-hz and others added 2 commits July 31, 2024 12:06
…#2629)

* (Demo) Do read cmd before task queue. && add workflow_dispatch for manual action

* Check authed and write lock, fix go test error in MacOS and cache mode judge

* fix some ut error by  commands filter  and return logic

* rollback some flags, but add kCmdReadBeforeQueue for get, mget, hget, hgetall, hmget

* fix mem error in macos

* move mget and hmget;add before_queue_time metrics

* fix cost to copy cmd_table by remove c_ptr

coderabbitai bot commented Jul 31, 2024

Walkthrough

The recent changes enhance the functionality of the Pika system by introducing new methods for better command handling and timing metrics. Notably, timestamps are now tracked before queue processes, and caching capabilities for command processing have been improved. Overall, these modifications aim to optimize performance and provide deeper insights into command execution.

Changes

| File | Change Summary |
| --- | --- |
| include/pika_client_conn.h | Added before_queue_ts_ to TimeStat for enhanced timing, and new methods in PikaClientConn for command management. |
| include/pika_command.h | Introduced isCacheRead() and DoReadCommandInCache() methods in Cmd for improved cache interaction. |
| src/pika_client_conn.cc | Enhanced PikaClientConn with command interception logic and refined logging for command processing. |
| src/pika_command.cc | Improved formatting and added DoReadCommandInCache() for handling cache reads in command processing. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant PikaClientConn
    participant Cache

    Client->>PikaClientConn: Send Command (Get)
    PikaClientConn->>PikaClientConn: IsInterceptedByRTC()
    alt Intercepted
        PikaClientConn->>Cache: ReadCmdInCache()
        Cache-->>PikaClientConn: Return Cached Data
    else Not Intercepted
        PikaClientConn->>PikaClientConn: Process Command Normally
    end
    PikaClientConn-->>Client: Return Result


@github-actions github-actions bot added the ✏️ Feature New feature or request label Jul 31, 2024
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f2d8e9c and d741698.

Files selected for processing (4)
  • include/pika_client_conn.h (3 hunks)
  • include/pika_command.h (4 hunks)
  • src/pika_client_conn.cc (5 hunks)
  • src/pika_command.cc (5 hunks)
Additional comments not posted (28)
include/pika_client_conn.h (4)

22-22: LGTM!

The addition of before_queue_ts_ enhances the granularity of time measurement.


41-43: Ensure the logic for before_queue_time() is correct.

The method calculates the time difference between before_queue_ts_ and enqueue_ts_, contingent on process_done_ts_ exceeding dequeue_ts_. This logic seems correct, but verify that it aligns with the intended behavior.
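
A minimal sketch of the logic described in this comment, for readers who want to check it. The field names follow the review text; the struct layout, the zero fallback, and the time units are assumptions and are not copied from include/pika_client_conn.h.

```cpp
#include <cstdint>

// Illustrative only: mirrors the description in the review comment.
struct TimeStat {
  std::uint64_t before_queue_ts_ = 0;  // request first seen on the network thread
  std::uint64_t enqueue_ts_ = 0;       // request pushed onto the task queue
  std::uint64_t dequeue_ts_ = 0;       // worker picked the request up
  std::uint64_t process_done_ts_ = 0;  // command finished

  // Time spent before the request entered the queue. It is reported only
  // when the command has completed (process_done_ts_ > dequeue_ts_);
  // otherwise this sketch returns 0.
  std::uint64_t before_queue_time() const {
    return process_done_ts_ > dequeue_ts_ ? enqueue_ts_ - before_queue_ts_ : 0;
  }
};
```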


76-76: LGTM!

The addition of IsInterceptedByRTC lets the connection decide whether a command takes the run-to-completion (RTC) path.


80-80: LGTM!

The addition of ReadCmdInCache enhances the command processing capabilities by leveraging caching.

src/pika_client_conn.cc (6)

241-241: LGTM!

The inclusion of before_queue_time in the slow log provides deeper insights into command processing timings.


260-269: LGTM!

The method correctly checks if the command should be intercepted based on the cache settings.


277-277: Ensure the timestamp is correctly set.

The method sets both enqueue_ts_ and before_queue_ts_ to the current time. Verify that this is the intended behavior.


290-300: LGTM!

The logic correctly handles the interception of single commands and attempts to read from the cache.


337-380: LGTM!

The method performs necessary checks and updates command statistics upon successful cache reads.


include/pika_command.h (2)

540-540: LGTM!

The method correctly checks if a command is intended to read from a cache.


584-584: LGTM!

The method enhances the command processing capabilities by leveraging caching.

src/pika_command.cc (16)

246-246: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


395-395: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


403-403: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


739-739: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


743-743: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


747-747: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


751-751: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


755-755: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


759-759: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


763-763: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


769-769: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


773-773: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


776-776: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


779-779: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


782-782: LGTM! Formatting change improves consistency.

The spacing around the bitwise OR operator enhances readability and maintains consistency.


914-934: LGTM! New method DoReadCommandInCache enhances command handling.

The method correctly handles shared locks, checks cache status, and returns a boolean indicating cache read success.

Ensure that this method is appropriately integrated and used within the codebase.
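
The behaviour summarised in this comment (take a shared lock, check cache status, return whether the read was served from cache) can be sketched as follows. `CacheContext`, `CacheStatus`, and `DoReadInCacheSketch` are hypothetical names for this example; the real DoReadCommandInCache in src/pika_command.cc may differ in every detail.

```cpp
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical cache states; only kOk means the cache may serve reads.
enum class CacheStatus { kNone, kLoading, kOk };

// Hypothetical stand-in for the state a command needs to reach the cache.
struct CacheContext {
  std::shared_mutex db_mu;                  // shared by readers, exclusive for writers
  CacheStatus status = CacheStatus::kNone;  // current cache state
  std::unordered_map<std::string, std::string> entries;
};

// Returns true only when the reply was served entirely from the cache.
// A false result tells the caller to fall back to the normal path.
bool DoReadInCacheSketch(CacheContext& ctx, const std::string& key,
                         std::string* reply) {
  std::shared_lock<std::shared_mutex> guard(ctx.db_mu);  // readers do not block each other
  if (ctx.status != CacheStatus::kOk) return false;      // cache not ready
  auto it = ctx.entries.find(key);
  if (it == ctx.entries.end()) return false;              // cache miss
  *reply = it->second;
  return true;
}
```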

Verification successful

The method DoReadCommandInCache is defined in src/pika_command.cc, declared in include/pika_command.h, and used in src/pika_client_conn.cc. The usage seems appropriate, but let's verify the context in which it is used in src/pika_client_conn.cc to ensure it is correctly implemented.


The method DoReadCommandInCache is correctly integrated and used within the codebase.

The method is appropriately called in src/pika_client_conn.cc after verifying the cache status, and its return value is used to determine further processing.

  • src/pika_client_conn.cc: Correct usage of DoReadCommandInCache within the context of cache status check and read command processing.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the new method `DoReadCommandInCache`.

# Test: Search for the method usage. Expect: Occurrences of `DoReadCommandInCache`.
rg --type cpp -A 5 $'DoReadCommandInCache'

Length of output: 951


Script:

#!/bin/bash
# Description: Verify the context of the usage of `DoReadCommandInCache` in `src/pika_client_conn.cc`.

# Search for the surrounding lines of the usage of `DoReadCommandInCache` in `src/pika_client_conn.cc` to understand its context.
rg --type cpp -A 10 -B 10 $'DoReadCommandInCache' src/pika_client_conn.cc

Length of output: 862

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d741698 and 4b16567.

Files selected for processing (1)
  • src/pika_client_conn.cc (5 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/pika_client_conn.cc

@AlexStocks

This comment was marked as resolved.

@AlexStocks AlexStocks merged commit 578fe41 into OpenAtomFoundation:unstable Jul 31, 2024
15 checks passed
cheniujh (Collaborator, Author) commented Aug 1, 2024

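Appendix 1: Detailed test results

Test configuration

Machine configuration:

  • 48 cores
  • 256GB DDR4 Memory

Pika configuration:

  • thread-num = 50 (the number of network threads)
  • thread-pool-size = 30
  • Redis-Cache size limit: 50GB (effectively ample space: any key hit by a GET can enter the cache)
  • Redis-Cache shard count: 32

The test has two phases

Phase 1: populate data

Load about 100 million keys into the DB with the following command:
redis-benchmark -h * -p 19221 -t set -n 100000000 -r 100000000 -d 128 -c 300 --threads 20

Phase 2: run the following GET benchmark command repeatedly

Each run issues 3 million GETs over keys in the range 0 to 1 million. A GET brings the key into Pika's embedded Redis cache, so the cache hit rate rises gradually as the command is run more times:
redis-benchmark -h * -p 19221 -t get -n 3000000 -r 1500000 -c 300 --threads 20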

Test results

RTC (this PR)

Run 1 of the GET benchmark command:

Summary:

  • throughput summary: 399254.72 requests per second
  • latency summary (msec): avg: 0.730 min: 0.040 p50: 0.679 p95: 1.407 p99: 1.807 max: 5.167

Peak Redis-Cache hit rate (per second): below 48%
Cumulative average cache hit rate: 33%

Run 2 of the GET benchmark command:

Summary:

  • throughput summary: 629987.38 requests per second
  • latency summary (msec): avg: 0.434 min: 0.040 p50: 0.383 p95: 0.911 p99: 1.231 max: 4.903

Peak Redis-Cache hit rate (per second): below 72%
Cumulative average cache hit rate: 49%

Run 3 of the GET benchmark command:

Summary:

  • throughput summary: 747384.19 requests per second
  • latency summary (msec): avg: 0.333 min: 0.040 p50: 0.303 p95: 0.631 p99: 0.863 max: 4.135

Peak Redis-Cache hit rate (per second): below 77%
Cumulative average cache hit rate: 57%

Run 4 of the GET benchmark command:

Summary:

  • throughput summary: 749063.62 requests per second
  • latency summary (msec): avg: 0.322 min: 0.040 p50: 0.295 p95: 0.591 p99: 0.807 max: 3.167

Peak Redis-Cache hit rate (per second): below 78%
Cumulative average cache hit rate: 62%

Original code (no RTC)

Run 1 of the GET benchmark command:

Summary:

  • throughput summary: 362932.50 requests per second
  • latency summary (msec): avg: 0.798 min: 0.064 p50: 0.719 p95: 1.527 p99: 1.951 max: 5.439

Peak Redis-Cache hit rate (per second): below 69%
Cumulative average cache hit rate: 50%

Run 2 of the GET benchmark command:

Summary:

  • throughput summary: 392511.72 requests per second
  • latency summary (msec): avg: 0.986 min: 0.056 p50: 0.943 p95: 1.743 p99: 2.167 max: 4.983

Peak Redis-Cache hit rate (per second): below 86%
Cumulative average cache hit rate: 66%

Run 3 of the GET benchmark command:

Summary:

  • throughput summary: 385496.75 requests per second
  • latency summary (msec): avg: 1.013 min: 0.064 p50: 0.967 p95: 1.783 p99: 2.223 max: 5.951

Peak Redis-Cache hit rate (per second): below 87%
Cumulative average cache hit rate: 73%

Run 4 of the GET benchmark command:

Summary:

  • throughput summary: 392169.84 requests per second
  • latency summary (msec): avg: 1.004 min: 0.064 p50: 0.959 p95: 1.775 p99: 2.207 max: 5.519

Peak Redis-Cache hit rate (per second): below 88%
Cumulative average cache hit rate: 77%
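
As a quick cross-check of the headline claim, the per-run ratios can be computed from the figures above. This is a small C++ sketch; the struct and function names are invented for the example, and the numbers are copied from the eight runs reported in this comment.

```cpp
#include <vector>

// One benchmark run: throughput and p99 latency as reported above.
struct RunResult {
  double qps;     // requests per second
  double p99_ms;  // p99 latency in milliseconds
};

// Runs 1-4 with RTC (this PR).
inline std::vector<RunResult> RtcRuns() {
  return {{399254.72, 1.807}, {629987.38, 1.231},
          {747384.19, 0.863}, {749063.62, 0.807}};
}

// Runs 1-4 with the original code (no RTC).
inline std::vector<RunResult> BaselineRuns() {
  return {{362932.50, 1.951}, {392511.72, 2.167},
          {385496.75, 2.223}, {392169.84, 2.207}};
}

// Throughput of the RTC run relative to the baseline run (> 1 is faster).
inline double QpsSpeedup(const RunResult& rtc, const RunResult& base) {
  return rtc.qps / base.qps;
}

// p99 latency of the RTC run relative to the baseline run (< 1 is better).
inline double P99Ratio(const RunResult& rtc, const RunResult& base) {
  return rtc.p99_ms / base.p99_ms;
}
```

Pairing the runs by index gives throughput ratios of roughly 1.10, 1.61, 1.94, and 1.91, and p99 ratios of roughly 0.93, 0.57, 0.39, and 0.37. The paired runs were measured at different cache hit rates (for example 62% versus 77% cumulative in run 4), so these ratios are a rough comparison rather than a like-for-like one.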

chejinge pushed a commit that referenced this pull request Aug 6, 2024
@coderabbitai coderabbitai bot mentioned this pull request Sep 19, 2024
cheniujh added a commit to cheniujh/pika that referenced this pull request Sep 24, 2024
…omFoundation#2837)

* feat: Improve the RTC process of Read/Write model  (OpenAtomFoundation#2629)

* (Demo) Do read cmd before task queue. && add workflow_dispatch for manual action

* Check authed and write lock, fix go test error in MacOS and cache mode judge

* fix some ut error by  commands filter  and return logic

* rollback some flags, but add kCmdReadBeforeQueue for get, mget, hget, hgetall, hmget

* move mget and hmget;add before_queue_time metrics

* fix cost to copy cmd_table by remove c_ptr

---------

Co-authored-by: chenbt <[email protected]>
cheniujh added a commit to cheniujh/pika that referenced this pull request Sep 24, 2024