-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cognee config to telemetry #265
Conversation
WalkthroughThe changes in the pull request primarily focus on modifying the Changes
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
cognee/modules/pipelines/operations/run_tasks.py (3)
164-166
: Add config parameter validation and error handlingWhile the configuration logging is a good addition, consider adding validation and error handling:
- Validate that the config dictionary is not None
- Handle potential JSON serialization errors for non-serializable config values
async def run_tasks_with_telemetry(tasks: list[Task], data, pipeline_name: str, config: dict): + if config is None: + config = {} + try: logger.debug("\nRunning pipeline with configuration:\n%s\n", json.dumps(config, indent = 1)) + except TypeError as e: + logger.warning(f"Failed to serialize config for logging: {str(e)}") + logger.debug("\nRunning pipeline with non-serializable configuration\n")
173-174
: Consider including config in all telemetry eventsThe configuration is only included in the "Pipeline Run Started" event. For consistency and debugging purposes, consider including it in "Pipeline Run Completed" and "Pipeline Run Errored" events as well.
198-200
: Add error handling for settings retrievalThe settings retrieval could fail. Consider adding error handling to ensure the pipeline can still run with default configuration if settings retrieval fails.
async def run_tasks(tasks: list[Task], data = None, pipeline_name: str = "default_pipeline"): - config = get_current_settings() + try: + config = get_current_settings() + except Exception as e: + logger.warning(f"Failed to retrieve settings: {str(e)}. Using default configuration.") + config = {} async for result in run_tasks_with_telemetry(tasks, data, pipeline_name, config): yield result
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
cognee/modules/pipelines/operations/run_tasks.py
(3 hunks)
🔇 Additional comments (2)
cognee/modules/pipelines/operations/run_tasks.py (2)
2-2
: LGTM: Import changes are appropriate
The reorganized imports maintain clean structure and include all necessary dependencies for the new configuration logging functionality.
Also applies to: 8-8
Line range hint 164-200
: Verify telemetry data handling
Please ensure that:
- All direct calls to run_tasks_with_telemetry are updated with the config parameter
- Monitoring systems can handle the additional configuration data in telemetry events
✅ Verification successful
Let me gather more information about the telemetry data handling.
No changes needed for telemetry handling
The telemetry implementation is safe and properly handles the additional configuration data:
- The only direct call to
run_tasks_with_telemetry
is from within therun_tasks
function, which correctly provides all required parameters including the config. - The telemetry system, implemented in
shared/utils.py
, accepts arbitrary additional properties through itsadditional_properties
parameter, making it compatible with the configuration data being passed. - The pipeline telemetry events ("Pipeline Run Started", "Pipeline Run Completed", "Pipeline Run Errored") are consistently used across the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for direct calls to run_tasks_with_telemetry
rg "run_tasks_with_telemetry\(" --type py -B 2 -A 2
Length of output: 863
Script:
#!/bin/bash
# Search for telemetry event handling and monitoring
rg "send_telemetry\(" --type py -B 2 -A 2
# Search for any configuration related telemetry processing
rg "Pipeline Run Started" --type py -B 2 -A 2
Length of output: 15135
@@ -190,7 +195,6 @@ async def run_tasks_with_telemetry(tasks: list[Task], data, pipeline_name: str): | |||
|
|||
async def run_tasks(tasks: list[Task], data = None, pipeline_name: str = "default_pipeline"): | |||
config = get_current_settings() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move also this config inside run_tasks_with_telemetry
, then we don't have to pass it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
}) | ||
|
||
} | config | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This formatting looks like there is some issue with the code. :) Maybe move the closing bracket )
up with config
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
….py and make the indentation a bit more readable
ff35912
to
462fcef
Compare
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9573981 | Triggered | Generic Password | f30bf35 | cognee/tests/test_deduplication.py | View secret |
9573981 | Triggered | Generic Password | f30bf35 | .github/workflows/test_deduplication.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Summary by CodeRabbit