-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Fix ubuntu 24.04 segmentation fault #422
Conversation
WalkthroughThe pull request encompasses a comprehensive update to GitHub Actions workflow configurations across multiple files. The primary change involves systematically replacing Changes
Poem
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
4c83d8a
to
ff6d8cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (2)
.github/workflows/py_lint.yml (1)
Remove outdated linting commands as the project uses a different toolchain now
The commented linting commands are obsolete as the project has moved to a different set of tools:
- mypy is still used but should be uncommented as it's an active dependency
- flake8 and bandit have been replaced by ruff and pylint
- black and isort can be removed as ruff handles formatting
🔗 Analysis chain
Line range hint
63-75
: Consider cleaning up or restoring commented linting commands.There are several commented-out linting commands. If these are no longer needed, consider removing them. If they are still relevant, consider restoring them to maintain code quality checks.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if these linting tools are still in use elsewhere # Note: This helps determine if we should restore these checks echo "Checking for mypy usage..." rg --type yaml 'mypy' .github/workflows/ echo "Checking for flake8 usage..." rg --type yaml 'flake8' .github/workflows/ echo "Checking for bandit usage..." rg --type yaml 'bandit' .github/workflows/Length of output: 580
Script:
#!/bin/bash # Check for linting tool configurations and dependencies echo "Checking for config files..." fd -t f "mypy.ini|.flake8|pyproject.toml|.bandit" echo -e "\nChecking content of pyproject.toml if exists..." if [ -f "pyproject.toml" ]; then echo "=== Dependencies section ===" rg -A 5 "\[tool.poetry.dependencies\]|\[build-system\]" pyproject.toml fi echo -e "\nChecking for requirements files..." fd "requirements.*\.txt"Length of output: 791
Script:
#!/bin/bash # Check dev dependencies and documentation echo "Checking dev dependencies in pyproject.toml..." rg -A 10 "\[tool.poetry.dev-dependencies\]|\[tool.poetry.group.dev.dependencies\]" pyproject.toml echo -e "\nChecking documentation for linting mentions..." fd -t f "README|CONTRIBUTING" -E "*.pyc"Length of output: 745
.github/workflows/cd_prd.yaml (1)
Line range hint
89-89
: Fix inconsistent step ID in outputs.The outputs section references
steps.export-promethai-docker-tag
but the step ID isexport-cognee-docker-tag
. This mismatch will cause the output to be undefined.- cognee_image_tag: ${{ steps.export-promethai-docker-tag.outputs.cognee_image_tag }} + cognee_image_tag: ${{ steps.export-cognee-docker-tag.outputs.cognee_image_tag }}
🧹 Nitpick comments (4)
.github/workflows/test_qdrant.yml (1)
20-20
: LGTM! Consider adding SQLite verification step.The update to
ubuntu-latest
is good. Since this PR addresses SQLite issues, consider adding a verification step to ensure that Qdrant integration tests don't trigger SQLite-related segmentation faults when SQLite is used as the metadata store..github/workflows/test_weaviate.yml (2)
20-20
: LGTM! Consider adding SQLite verification step.The update to
ubuntu-latest
is good. Since this PR addresses SQLite issues, consider adding a verification step to ensure that Weaviate integration tests don't trigger SQLite-related segmentation faults when SQLite is used as the metadata store.
Line range hint
1-1
: Consider adding dedicated SQLite integration tests.To ensure long-term stability with SQLite on Ubuntu 24.04:
- Add a dedicated SQLite integration test workflow
- Document the SQLite compatibility improvements and known limitations
- Consider adding SQLite-specific CI badges to indicate compatibility status
Would you like me to help create a dedicated SQLite integration test workflow or documentation?
.github/workflows/test_pgvector.yml (1)
20-20
: Consider adding parallel SQLite testing workflowSince this workflow successfully tests PostgreSQL (which works on Ubuntu 24.04), consider creating a parallel workflow for SQLite testing with similar structure to:
- Isolate and reproduce the reported segmentation faults
- Compare behavior between Ubuntu 22.04 and 24.04
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
.github/workflows/cd.yaml
(1 hunks).github/workflows/cd_prd.yaml
(1 hunks).github/workflows/ci.yaml
(1 hunks).github/workflows/community_greetings.yml
(1 hunks).github/workflows/docker_compose.yml
(1 hunks).github/workflows/dockerhub.yml
(1 hunks).github/workflows/profiling.yaml
(1 hunks).github/workflows/py_lint.yml
(1 hunks).github/workflows/release_discord_action.yml
(1 hunks).github/workflows/reusable_notebook.yml
(1 hunks).github/workflows/reusable_python_example.yml
(1 hunks).github/workflows/ruff_format.yaml
(1 hunks).github/workflows/ruff_lint.yaml
(1 hunks).github/workflows/test_deduplication.yml
(1 hunks).github/workflows/test_milvus.yml
(1 hunks).github/workflows/test_neo4j.yml
(1 hunks).github/workflows/test_pgvector.yml
(1 hunks).github/workflows/test_python_3_10.yml
(1 hunks).github/workflows/test_python_3_11.yml
(1 hunks).github/workflows/test_python_3_12.yml
(1 hunks).github/workflows/test_qdrant.yml
(1 hunks).github/workflows/test_weaviate.yml
(1 hunks)cognee/modules/users/methods/get_default_user.py
(2 hunks)
✅ Files skipped from review due to trivial changes (10)
- .github/workflows/test_milvus.yml
- .github/workflows/dockerhub.yml
- .github/workflows/release_discord_action.yml
- .github/workflows/ruff_lint.yaml
- .github/workflows/ruff_format.yaml
- .github/workflows/test_deduplication.yml
- .github/workflows/ci.yaml
- .github/workflows/test_python_3_10.yml
- .github/workflows/community_greetings.yml
- .github/workflows/profiling.yaml
🔇 Additional comments (9)
.github/workflows/docker_compose.yml (1)
15-15
: Verify Docker Compose tests with SQLite on Ubuntu 24.04.While updating to
ubuntu-latest
will help test on Ubuntu 24.04, ensure that the Docker Compose tests specifically validate the SQLite functionality to catch any potential segmentation faults.Let's check if SQLite is being tested in the Docker setup:
.github/workflows/test_neo4j.yml (1)
18-18
: Verify Neo4j compatibility with Ubuntu 24.04While updating to
ubuntu-latest
will help test newer Ubuntu versions, ensure that the Neo4j dependencies and Python packages are compatible with Ubuntu 24.04.Consider adding explicit test coverage for both SQLite and PostgreSQL database backends in this workflow to help identify the reported segmentation faults.
.github/workflows/reusable_python_example.yml (1)
25-25
: Review Python 3.12.x compatibility with SQLiteThe workflow uses Python 3.12.x, which might affect SQLite compatibility differently than other Python versions. Consider:
- Testing with multiple Python versions
- Adding specific SQLite tests to verify the reported segmentation faults
.github/workflows/reusable_notebook.yml (1)
25-25
: Test database operations in notebooksSince notebooks might contain database operations, ensure that:
- Notebooks using SQLite are tested on both Ubuntu versions
- Any segmentation faults during notebook execution are properly caught and reported
.github/workflows/test_python_3_12.yml (1)
20-20
: 🛠️ Refactor suggestionEnhance SQLite testing in comprehensive test suite
This comprehensive test workflow is ideal for investigating the SQLite segmentation faults. Consider:
- Adding specific SQLite test cases in both unit and integration tests
- Using environment variables to toggle between SQLite and PostgreSQL backends
- Adding error handling to capture and report segmentation faults
Run the following script to check for existing SQLite tests:
.github/workflows/test_python_3_11.yml (1)
20-20
: 🛠️ Refactor suggestionConsider using explicit Ubuntu version for reproducing SQLite issues.
Since the PR aims to address SQLite segmentation faults specifically on Ubuntu 24.04, using
ubuntu-latest
might not consistently reproduce the issue as the latest version may change. Consider explicitly usingubuntu-24.04
to ensure reliable test results.- runs-on: ubuntu-latest + runs-on: ubuntu-24.04.github/workflows/cd.yaml (1)
20-20
: LGTM! Using latest Ubuntu for CD workflow is appropriate.For CD workflows that don't involve SQLite testing, using
ubuntu-latest
is acceptable as it ensures we get the latest security updates and features..github/workflows/py_lint.yml (1)
19-19
: LGTM! Using latest Ubuntu for linting is appropriate.For linting workflows, using
ubuntu-latest
is acceptable as it ensures we get the latest security updates and features..github/workflows/cd_prd.yaml (1)
20-20
: LGTM! Using latest Ubuntu for production CD workflow is appropriate.For CD workflows that don't involve SQLite testing, using
ubuntu-latest
is acceptable as it ensures we get the latest security updates and features.
@@ -1,4 +1,4 @@ | |||
from sqlalchemy.orm import joinedload | |||
from sqlalchemy.orm import selectinload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Consider updating other joinedload
instances for SQLite compatibility
Similar to the change in get_default_user.py
, other files using joinedload
might benefit from switching to selectinload
to prevent SQLite segmentation faults:
cognee/modules/users/methods/get_user.py
: Usesjoinedload(User.groups)
- identical case to the fixed onecognee/modules/users/permissions/methods/check_permission_on_documents.py
:joinedload(ACL.resources)
cognee/modules/data/methods/create_dataset.py
andSqlAlchemyAdapter.py
: Both usejoinedload(Dataset.data)
🔗 Analysis chain
Good choice using selectinload to address SQLite segmentation faults.
The change from joinedload
to selectinload
is a good solution for SQLite compatibility issues. While joinedload
uses LEFT OUTER JOINs that can trigger SQLite limitations, selectinload
uses separate SELECT statements which are more reliable with SQLite.
Let's verify if there are other instances of joinedload
that might need similar updates:
Also applies to: 14-14
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other instances of joinedload that might need updating
rg "joinedload" --type py
Length of output: 926
For some reason unknown to me using the joinedload for getting the default user would cause a segmentation fault in Ubuntu 24.04, problem is solved by using the selectinload option for joining data |
Current testing seems to suggest issues with running Cognee on Ubuntu 24.04 when using SQLite. When running Cognee on Ubuntu 24.04 with PostgreSQL there are no segmentation fault issues.
Summary by CodeRabbit
Infrastructure
ubuntu-latest
across multiple workflow filestrigger_deployment
job in the continuous deployment workflowCode Optimization
joinedload
toselectinload