Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: dont store cmd(Get/Set/Hget/Hset) with too large key in cache #2849

Merged
merged 2 commits into from
Sep 24, 2024

Conversation

QlQlqiqi
Copy link
Contributor

@QlQlqiqi QlQlqiqi commented Aug 7, 2024

fix: #2557

Summary by CodeRabbit

  • New Features

    • Introduced a configuration parameter to set the maximum key size for caching.
    • Added new methods to manage and retrieve maximum key size within the caching system.
  • Bug Fixes

    • Implemented checks to prevent caching of commands with excessively large keys, enhancing memory management.
  • Tests

    • Added integration tests to verify functionality related to key size limits in caching operations, ensuring robustness against oversized keys.

Copy link

coderabbitai bot commented Aug 7, 2024

Walkthrough

The changes introduce a new configuration parameter to limit the maximum key size in the Redis cache, significantly enhancing cache management. Various command classes implement validations to prevent oversized keys from being stored, ensuring efficient memory usage. Integration tests have been added to verify correct behavior when encountering large keys.

Changes

Files Change Summary
conf/pika.conf Added max-key-size-in-cache parameter for key size control.
include/pika_command.h Introduced pure virtual function IsTooLargeKey in Cmd class.
include/pika_conf.h Added methods for setting and getting max_key_size_in_cache_, now using std::atomic_int.
include/pika_hash.h Implemented IsTooLargeKey in HGetCmd to evaluate key size.
include/pika_kv.h Added IsTooLargeKey methods in SetCmd and GetCmd for key size validation.
src/cache/include/config.h Defined DEFAULT_CACHE_MAX_KEY_SIZE constant.
src/pika_client_conn.cc Added checks in ReadCmdInCache to prevent caching oversized keys.
src/pika_conf.cc Enhanced Load method to validate max key size configuration.
src/pika_hash.cc Incorporated key size checks in DoUpdateCache of HSetCmd and HGetCmd.
src/pika_kv.cc Added key size checks in DoUpdateCache methods of SetCmd and GetCmd.
tests/integration/hash_test.go Introduced tests for handling oversized keys in hash commands.
tests/integration/set_test.go Added tests for handling oversized keys in set commands.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant PikaClientConn
    participant Cache
    participant PikaConf

    Client->>PikaClientConn: Send Set Command with Key
    PikaClientConn->>PikaConf: Get max_key_size_in_cache
    PikaClientConn->>PikaClientConn: Check if Key Size > max_key_size
    alt Key Size Valid
        PikaClientConn->>Cache: Store Key in Cache
    else Key Size Too Large
        PikaClientConn-->>Client: Reject Command
    end
Loading

Assessment against linked issues

Objective Addressed Explanation
Large keys are not stored in cache (2557)
Prevent oversized keys from being cached (2557)
Implement size checks for cache operations (2557)
Ensure configurable key size limits (2557)

Possibly related PRs

  • feat: Improve the RTC process of Read/Write model  #2629: This PR introduces a new function BatchReadCmdInCache in the PikaClientConn class, which involves reading commands from the cache and includes checks related to cache operations, aligning with the changes made in the main PR regarding cache key size management.
  • docs:4.0.1 changelog #2902: While primarily a changelog update, this PR mentions enhancements to cache operations, which may relate to the overall context of managing cache behavior introduced in the main PR.

Suggested labels

🧹 Updates, 📒 Documentation

Poem

🐰 In the cache where keys do play,
Large ones find no place to stay.
With size checks in our code so fine,
Memory's safe, and all's divine!
Hop along, let logic bloom,
For every key, there's just enough room! 🌟


Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 71d44f1 and fafdd42.

Files selected for processing (1)
  • src/pika_client_conn.cc (1 hunks)
Additional comments not posted (1)
src/pika_client_conn.cc (1)

355-360: LGTM!

The added check for key size before caching the command is a good optimization. It prevents storing commands with excessively large keys in the cache, which can lead to inefficient cache usage. The logic is implemented correctly:

  • It uses the IsTooLargeKey function to compare the key size against the configured max_key_size_in_cache limit.
  • If the key is too large, it properly decrements the response number to maintain the correct count.
  • Returning false allows the command to be processed through the normal execution path instead of from the cache.

Overall, this change enhances the caching mechanism by avoiding the caching of commands with large keys.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    -- I pushed a fix in commit <commit_id>, please review it.
    -- Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    -- @coderabbitai generate unit testing code for this file.
    -- @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    -- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    -- @coderabbitai read src/utils.ts and generate unit testing code.
    -- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    -- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added ☢️ Bug Something isn't working ✏️ Feature New feature or request labels Aug 7, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
include/pika_conf.h (1)

415-415: Ensure thread safety in SetCacheMaxKeySize.

The SetCacheMaxKeySize method sets the value of max_key_size_in_cache_. Unlike other setter methods in the same file, it does not use a locking mechanism to ensure thread safety. Consider adding a lock guard to maintain consistency and thread safety.

  • include/pika_conf.h: Add std::lock_guard l(rwlock_); to SetCacheMaxKeySize.
Analysis chain

Ensure thread safety in SetCacheMaxKeySize.

The SetCacheMaxKeySize method sets the value of max_key_size_in_cache_. Ensure that this method is consistent with other setter methods in terms of locking shared resources.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistency of locking mechanisms in setter methods.

# Test: Search for setter methods and check for locking mechanisms.
rg --type cpp -A 5 $'void Set'

Length of output: 61456

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 4588e26 and c1ebe48.

Files selected for processing (11)
  • conf/pika.conf (1 hunks)
  • include/pika_command.h (1 hunks)
  • include/pika_conf.h (3 hunks)
  • include/pika_hash.h (1 hunks)
  • include/pika_kv.h (2 hunks)
  • src/cache/include/config.h (1 hunks)
  • src/pika_client_conn.cc (1 hunks)
  • src/pika_conf.cc (1 hunks)
  • src/pika_hash.cc (2 hunks)
  • src/pika_kv.cc (2 hunks)
  • tests/integration/set_test.go (2 hunks)
Files skipped from review due to trivial changes (1)
  • src/cache/include/config.h
Additional comments not posted (15)
include/pika_hash.h (1)

57-57: LGTM! The method implementation is correct.

The method IsTooLargeKey correctly checks if the key size exceeds the given maximum size.

tests/integration/set_test.go (2)

408-427: LGTM! The test case correctly verifies that large keys are not stored in the cache.

The test case creates a large key, sets a value, retrieves it, and checks the cache info to ensure the key is not cached.


431-442: LGTM! The helper function implementation is correct.

The function extractValue correctly parses the data and retrieves the value associated with the specified key.

src/pika_client_conn.cc (1)

355-360: LGTM! The new check correctly handles commands with large keys.

The check uses the IsTooLargeKey method to determine if the key size exceeds the predefined limit. If the key is too large, the response number is decremented, and the method returns false.

include/pika_command.h (1)

532-532: LGTM! The pure virtual function enforces derived classes to implement key size validation.

This change ensures that each command can have its own logic for determining if a key is too large.

src/pika_hash.cc (2)

78-81: LGTM! The key size validation optimizes the cache update process.

The comment clarifies the purpose of the validation, and the check ensures that large keys are not stored in the cache.


130-132: LGTM! The key size validation enhances the robustness of the caching mechanism.

The check ensures that large keys are not stored in the cache, improving performance and resource management.

include/pika_kv.h (2)

32-32: LGTM! The key size validation is correctly implemented.

The method uses a static cast to convert max_sz to uint32_t for comparison, ensuring correct functionality.


68-68: LGTM! The key size validation is correctly implemented.

The method uses a static cast to convert max_sz to uint32_t for comparison, ensuring correct functionality.

conf/pika.conf (1)

571-574: New configuration parameter added: max-key-size-in-cache.

This parameter specifies the maximum size of keys that can be stored in the Redis cache. The comments provide clear guidance on its usage, including the default value and the behavior when set to -1.

src/pika_conf.cc (1)

639-645: New configuration parameter handling: max-key-size-in-cache.

The code correctly initializes the max_key_size_in_cache variable with the default value and updates it based on the configuration file. If the value is less than or equal to zero, it resets to the default value. This ensures that the parameter remains valid.

include/pika_conf.h (2)

886-886: Ensure thread safety in max_key_size_in_cache.

The max_key_size_in_cache method retrieves the value of max_key_size_in_cache_. Ensure that this method is consistent with other getter methods in terms of locking shared resources.


1036-1036: Ensure correct initialization of max_key_size_in_cache_.

The max_key_size_in_cache_ member variable is added as an std::atomic_int and initialized to 512. Ensure that this initialization is consistent with other member variables.

Verification successful

Initialization of max_key_size_in_cache_ is consistent.

The max_key_size_in_cache_ member variable is correctly initialized in a manner consistent with other std::atomic_int member variables in the same file.

  • include/pika_conf.h:
    • std::atomic_int cache_num_ = 5;
    • std::atomic_int cache_mode_ = 1;
    • std::atomic_int cache_string_ = 1;
    • std::atomic_int cache_set_ = 1;
    • std::atomic_int cache_zset_ = 1;
    • std::atomic_int cache_hash_ = 1;
    • std::atomic_int cache_list_ = 1;
    • std::atomic_int cache_bit_ = 1;
    • std::atomic_int zset_cache_start_direction_ = 0;
    • std::atomic_int zset_cache_field_num_per_key_ = 512;
    • std::atomic_int cache_maxmemory_policy_ = 1;
    • std::atomic_int cache_maxmemory_samples_ = 5;
    • std::atomic_int cache_lfu_decay_time_ = 1;
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistency of initialization of member variables.

# Test: Search for member variables and check for initialization.
rg --type cpp -A 5 $'std::atomic_int'

Length of output: 1793

src/pika_kv.cc (2)

109-111: LGTM! But verify the function and configuration usage.

The added condition to check if the key size exceeds a configured maximum value before updating the cache is correct and aligns with the PR objectives.

However, ensure that the function IsTooLargeKey and the configuration g_pika_conf->max_key_size_in_cache are correctly implemented and used.

Verification successful

Verified: The function IsTooLargeKey and the configuration g_pika_conf->max_key_size_in_cache are correctly implemented and used.

  • The function IsTooLargeKey checks if the key size exceeds the configured maximum value.
  • The configuration parameter g_pika_conf->max_key_size_in_cache is properly defined and used to store the maximum key size.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation and usage of `IsTooLargeKey` and `g_pika_conf->max_key_size_in_cache`.

# Test: Search for the function definition and configuration usage. Expect: Correct implementation and usage.
rg --type cpp 'IsTooLargeKey'
rg --type cpp 'max_key_size_in_cache'

Length of output: 2086


188-190: LGTM! But verify the function and configuration usage.

The added condition to check if the key size exceeds a configured maximum value before updating the cache is correct and aligns with the PR objectives.

However, ensure that the function IsTooLargeKey and the configuration g_pika_conf->max_key_size_in_cache are correctly implemented and used.

Verification successful

LGTM! The function and configuration usage are correctly implemented.

The added condition to check if the key size exceeds a configured maximum value before updating the cache is correct and aligns with the PR objectives. The function IsTooLargeKey and the configuration g_pika_conf->max_key_size_in_cache are correctly implemented and used.

  • IsTooLargeKey correctly compares the key size with the maximum allowed size.
  • max_key_size_in_cache is properly defined and used in the configuration files.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation and usage of `IsTooLargeKey` and `g_pika_conf->max_key_size_in_cache`.

# Test: Search for the function definition and configuration usage. Expect: Correct implementation and usage.
rg --type cpp 'IsTooLargeKey'
rg --type cpp 'max_key_size_in_cache'

Length of output: 2086


Script:

#!/bin/bash
# Extract the implementation of IsTooLargeKey from include/pika_kv.h
rg --type cpp --context 5 'bool IsTooLargeKey' include/pika_kv.h

# Extract the relevant code for max_key_size_in_cache from src/pika_conf.cc and include/pika_conf.h
rg --type cpp --context 5 'max_key_size_in_cache' src/pika_conf.cc include/pika_conf.h

Length of output: 4318

@QlQlqiqi QlQlqiqi force-pushed the big-key-not-in-cache branch from c1ebe48 to 427c9ef Compare August 7, 2024 10:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c1ebe48 and 427c9ef.

Files selected for processing (11)
  • conf/pika.conf (1 hunks)
  • include/pika_command.h (1 hunks)
  • include/pika_conf.h (3 hunks)
  • include/pika_hash.h (1 hunks)
  • include/pika_kv.h (2 hunks)
  • src/cache/include/config.h (1 hunks)
  • src/pika_client_conn.cc (1 hunks)
  • src/pika_conf.cc (1 hunks)
  • src/pika_hash.cc (2 hunks)
  • src/pika_kv.cc (2 hunks)
  • tests/integration/set_test.go (2 hunks)
Files skipped from review as they are similar to previous changes (10)
  • conf/pika.conf
  • include/pika_command.h
  • include/pika_conf.h
  • include/pika_hash.h
  • include/pika_kv.h
  • src/cache/include/config.h
  • src/pika_client_conn.cc
  • src/pika_hash.cc
  • src/pika_kv.cc
  • tests/integration/set_test.go
Additional comments not posted (1)
src/pika_conf.cc (1)

639-645: LGTM! But verify the usage of the new configuration.

The code changes are approved.

However, ensure that the new configuration max-key-size-in-cache is properly used throughout the codebase.

@QlQlqiqi QlQlqiqi changed the title feat: dont store cmd(Get/Set/Hget/Hset) with too large key in cache(fix: #2557) feat: dont store cmd(Get/Set/Hget/Hset) with too large key in cache Aug 7, 2024
@QlQlqiqi QlQlqiqi force-pushed the big-key-not-in-cache branch from 427c9ef to ba5903b Compare August 7, 2024 10:45
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 427c9ef and ba5903b.

Files selected for processing (12)
  • conf/pika.conf (1 hunks)
  • include/pika_command.h (1 hunks)
  • include/pika_conf.h (3 hunks)
  • include/pika_hash.h (1 hunks)
  • include/pika_kv.h (2 hunks)
  • src/cache/include/config.h (1 hunks)
  • src/pika_client_conn.cc (1 hunks)
  • src/pika_conf.cc (1 hunks)
  • src/pika_hash.cc (2 hunks)
  • src/pika_kv.cc (2 hunks)
  • tests/integration/hash_test.go (2 hunks)
  • tests/integration/set_test.go (2 hunks)
Files skipped from review as they are similar to previous changes (9)
  • conf/pika.conf
  • include/pika_command.h
  • include/pika_conf.h
  • include/pika_kv.h
  • src/cache/include/config.h
  • src/pika_client_conn.cc
  • src/pika_conf.cc
  • src/pika_hash.cc
  • src/pika_kv.cc
Additional comments not posted (4)
tests/integration/hash_test.go (1)

369-388: LGTM!

The test case effectively verifies that large keys are not stored in the cache. The logic is sound, and the use of extractValue improves readability.

include/pika_hash.h (1)

57-57: LGTM!

The IsTooLargeKey method correctly checks if the key size exceeds the specified limit using a static cast for comparison.

tests/integration/set_test.go (2)

408-427: LGTM!

The test case effectively verifies that large keys are not stored in the cache. The logic is sound, and the use of extractValue improves readability.


431-442: LGTM!

The extractValue function is well-implemented and improves the readability of the test cases by encapsulating the logic for parsing the cache info string.

@QlQlqiqi QlQlqiqi force-pushed the big-key-not-in-cache branch from ba5903b to 71d44f1 Compare August 7, 2024 13:41
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between ba5903b and 71d44f1.

Files selected for processing (9)
  • include/pika_command.h (1 hunks)
  • include/pika_conf.h (3 hunks)
  • include/pika_hash.h (1 hunks)
  • include/pika_kv.h (2 hunks)
  • src/cache/include/config.h (1 hunks)
  • src/pika_client_conn.cc (1 hunks)
  • src/pika_conf.cc (1 hunks)
  • src/pika_hash.cc (2 hunks)
  • src/pika_kv.cc (2 hunks)
Files skipped from review as they are similar to previous changes (8)
  • include/pika_command.h
  • include/pika_conf.h
  • include/pika_hash.h
  • include/pika_kv.h
  • src/cache/include/config.h
  • src/pika_conf.cc
  • src/pika_hash.cc
  • src/pika_kv.cc
Additional comments not posted (1)
src/pika_client_conn.cc (1)

355-360: Ensure the IsTooLargeKey method is efficient.

The added code correctly checks if the key size exceeds the maximum allowed size before caching. However, ensure that the IsTooLargeKey method is efficient and does not introduce significant overhead.

Verification successful

Ensure the IsTooLargeKey method is efficient.

The IsTooLargeKey method implementations in pika_hash.h and pika_kv.h are efficient as they perform a simple size comparison. The implementation in pika_command.h returns false by default, which is likely intended for base class behavior.

  • include/pika_hash.h: bool IsTooLargeKey(const int &max_sz) override { return key_.size() > static_cast<uint32_t>(max_sz); }
  • include/pika_kv.h: bool IsTooLargeKey(const int& max_sz) override { return key_.size() > static_cast<uint32_t>(max_sz); }
  • include/pika_kv.h: bool IsTooLargeKey(const int &max_sz) override { return key_.size() > static_cast<uint32_t>(max_sz); }
  • include/pika_command.h: virtual bool IsTooLargeKey(const int &max_sz) { return false; }

The logic is correct and should not introduce significant overhead.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of `IsTooLargeKey` method.

# Test: Search for the implementation of `IsTooLargeKey`. Expect: Efficient implementation.
rg --type cpp --pattern 'bool IsTooLargeKey'

Length of output: 458


Script:

#!/bin/bash
# Description: Verify the implementation of `IsTooLargeKey` method.

# Test: Search for the implementation of `IsTooLargeKey`. Expect: Efficient implementation.
rg 'bool IsTooLargeKey'

Length of output: 485

@chejinge chejinge requested review from chejinge and Mixficsol August 16, 2024 13:25
@baixin01
Copy link
Collaborator

@chejinge @Mixficsol 本周review

@Issues-translate-bot
Copy link

Bot detected the issue body's language is not English, translate it automatically.


@chejinge @Mixficsol review this week

@@ -352,6 +352,12 @@ bool PikaClientConn::ReadCmdInCache(const net::RedisCmdArgsType& argv, const std
}
// Initial
c_ptr->Initial(argv, current_db_);
// dont store cmd with too large key(only Get/HGet cmd can reach here)
// the cmd with large key should be non-exist in cache, except for pre-stored
if(c_ptr->IsTooLargeKey(g_pika_conf->max_key_size_in_cache())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

代码格式,空格

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@chejinge chejinge merged commit 8f0cd18 into OpenAtomFoundation:unstable Sep 24, 2024
13 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4.0.1 ☢️ Bug Something isn't working ✏️ Feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large keys are not stored in cache
6 participants