added boilerplate for user management #123

Vasilije1990 · 2024-07-21T18:21:00Z

Summary by CodeRabbit

New Features
- Introduced a new job for setting up Docker Buildx in CI/CD workflows for efficient multi-platform builds.
- Added a PostgreSQL service to the CI/CD workflow, improving the testing environment for database interactions.
- Implemented a new step to ensure PostgreSQL is fully initialized before running tests, enhancing reliability.
Bug Fixes
- Improved error handling and logging for user-related actions.
Documentation
- Updated user management API documentation to reflect new endpoints and functionalities.

coderabbitai · 2024-07-21T18:21:09Z

Warning

Rate limit exceeded

@borisarzentar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 3 minutes and 48 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 6d38bcd and e3c3f35.

Walkthrough

The recent enhancements to the FastAPI application significantly improve user management, database interactions, and overall application performance. New authentication routes, along with the SqlAlchemyAdapter, streamline user operations and database handling. Asynchronous processing and user-specific permission checks offer a robust framework for secure data interactions. These changes collectively elevate the functionalities, making the application more adaptable and efficient in managing user data and permissions.

Changes

Files	Change Summary
`cognee/api/client.py`	Introduced new routes for authentication and user management; added an asynchronous lifespan context manager for database initialization.
`cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py`	Added `SqlAlchemyAdapter` class to manage databases asynchronously, supporting DuckDB and PostgreSQL with flexible session handling.
`cognee/api/v1/search/search.py`	Enhanced `search` function to include a `user` parameter for user-specific operations, improving the relevance of search results.
`cognee/modules/users/permissions/methods/check_permissions_on_documents.py`, `.../get_document_ids_for_user.py`	Added asynchronous permission management functions for document access, enhancing security and user-specific operations.
`.github/workflows/test_common.yml`	New job added for Docker setup and PostgreSQL service configured for robust testing; includes readiness checks for PostgreSQL.
`.github/workflows/test_neo4j.yml`, `.github/workflows/test_qdrant.yml`, `.github/workflows/test_weaviate.yml`	Integrated PostgreSQL service configurations into GitHub Actions for a more robust testing environment, including readiness checks for services.
`pyproject.toml`	Updated dependency versions for stability and added new libraries for user management and asynchronous database interactions.

Poem

In the meadow where bunnies play,
New routes for users brighten the day!
With hops and jumps, we now can see,
Secure access, oh joy, oh glee!
Authentication’s a breeze, we cheer,
For every user, friends are near! 🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 86c7aa2 and 866270b.

Files selected for processing (4)

cognee/api/client.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/schemas.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)

Additional context used

Ruff

cognee/api/client.py

46-46: Module level import not at top of file

(E402)

48-48: Module level import not at top of file

(E402)

50-50: Module level import not at top of file

(E402)

51-51: Module level import not at top of file

(E402)

52-52: Module level import not at top of file

(E402)

114-114: Redefinition of unused authenticated_route from line 80

(F811)

Additional comments not posted (17)

cognee/infrastructure/databases/relational/user_authentication/schemas.py (3)

6-7: LGTM!

The class UserRead correctly inherits from schemas.BaseUser[uuid.UUID].

10-11: LGTM!

The class UserCreate correctly inherits from schemas.BaseUserCreate.

14-15: LGTM!

The class UserUpdate correctly inherits from schemas.BaseUserUpdate.

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (5)

11-12: LGTM!

The class Base correctly inherits from DeclarativeBase.

15-16: LGTM!

The class User correctly inherits from SQLAlchemyBaseUserTableUUID and Base.

25-27: LGTM!

The function create_db_and_tables correctly creates the database and tables using SQLAlchemy.

30-32: LGTM!

The function get_async_session correctly provides an async session for database operations.

35-36: LGTM!

The function get_user_db correctly provides a user database using SQLAlchemy.

cognee/infrastructure/databases/relational/user_authentication/users.py (6)

18-33: LGTM!

The class UserManager correctly inherits from UUIDIDMixin and BaseUserManager. The methods for user registration, password reset, and verification are correctly implemented.

36-37: LGTM!

The function get_user_manager correctly provides a user manager using UserManager.

43-44: LGTM!

The function get_jwt_strategy correctly provides a JWT strategy for authentication.

47-51: LGTM!

The function auth_backend correctly sets up an authentication backend using JWT strategy.

53-53: LGTM!

The function fastapi_users correctly sets up FastAPI users using UserManager and auth_backend.

55-55: LGTM!

The function current_active_user correctly sets up the current active user using fastapi_users.

cognee/api/client.py (3)

54-76: LGTM! Verify router functionality.

The routers are correctly set up and prefixed. Ensure that they function as expected.

79-81: LGTM! Verify function usage.

The function correctly uses dependency injection for current_active_user. Ensure that the function is used as intended.

83-87: LGTM! Verify database setup.

The context manager correctly sets up the database and tables. Ensure that the database setup is functioning as expected.

coderabbitai · 2024-07-21T18:25:04Z

cognee/api/client.py

+from contextlib import asynccontextmanager
+
+from fastapi import Depends, FastAPI
+
+from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
+from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
+from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager - from fastapi import Depends, FastAPI - from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables - from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate - from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users + from contextlib import asynccontextmanager + from fastapi import Depends, FastAPI + from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables + from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate + from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users

Committable suggestion was skipped due to low confidence.

Tools

Ruff

46-46: Module level import not at top of file

(E402)

48-48: Module level import not at top of file

(E402)

50-50: Module level import not at top of file

(E402)

51-51: Module level import not at top of file

(E402)

52-52: Module level import not at top of file

(E402)

cognee/api/client.py

gitguardian · 2024-07-21T18:59:11Z

⚠️ GitGuardian has uncovered 16 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
9573981	Triggered	Generic Password	`77e8c1b`	cognee/infrastructure/databases/relational/config.py	View secret
9573981	Triggered	Generic Password	`e785b30`	docker-compose.yml	View secret
9573981	Triggered	Generic Password	`e785b30`	cognee/infrastructure/databases/relational/config.py	View secret
9573981	Triggered	Generic Password	`401167b`	cognee/infrastructure/databases/relational/config.py	View secret
9573981	Triggered	Generic Password	`896a2ce`	.github/workflows/test_common.yml	View secret
9573981	Triggered	Generic Password	`896a2ce`	.github/workflows/test_qdrant.yml	View secret
9573981	Triggered	Generic Password	`896a2ce`	.github/workflows/test_weaviate.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_weaviate.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_qdrant.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_qdrant.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_neo4j.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_common.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_weaviate.yml	View secret
9573981	Triggered	Generic Password	`e492a18`	.github/workflows/test_common.yml	View secret
13323556	Triggered	Username Password	`73dd6c2`	.github/workflows/test_neo4j.yml	View secret
13323556	Triggered	Username Password	`4749995`	.github/workflows/test_neo4j.yml	View secret

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secrets safely. Learn here the best practices.
Revoke and rotate these secrets.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

coderabbitai

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 866270b and 77e8c1b.

Files selected for processing (4)

cognee/infrastructure/databases/relational/config.py (2 hunks)
cognee/infrastructure/databases/relational/create_relational_engine.py (1 hunks)
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)

Files skipped from review as they are similar to previous changes (1)

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

Additional context used

Ruff

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

94-98: f-string without any placeholders

Remove extraneous f prefix

(F541)

102-102: f-string without any placeholders

Remove extraneous f prefix

(F541)

Additional comments not posted (13)

cognee/infrastructure/databases/relational/create_relational_engine.py (1)

8-10: Enum DBProvider looks good.

The enum correctly defines the database types DUCKDB and POSTGRES.

cognee/infrastructure/databases/relational/config.py (1)

14-15: Configuration changes look good.

The db_provider configuration option is correctly added and the create_relational_engine function is updated accordingly. The to_dict method is also updated to include the new db_provider attribute.

Also applies to: 34-34

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (11)

5-10: Initialization method looks good.

The __init__ method correctly initializes the SQLAlchemyAdapter with the provided db_type, db_path, and db_name.

11-20: Dataset retrieval method looks good.

The get_datasets method correctly retrieves and filters distinct schema names from the database.

22-25: File metadata retrieval method looks good.

The get_files_metadata method correctly retrieves file metadata from the specified dataset.

27-31: Table creation method looks good.

The create_table method correctly creates a table with the specified schema name, table name, and table configuration.

33-35: Table deletion method looks good.

The delete_table method correctly deletes the specified table.

37-42: Data insertion method looks good.

The insert_data method correctly inserts data into the specified table.

44-59: Data retrieval method looks good.

The get_data method correctly retrieves data from the specified table and applies filters if provided.

61-64: Query execution method looks good.

The execute_query method correctly executes a raw SQL query.

66-80: Data loading method looks good.

The load_cognify_data method correctly loads data into the cognify table.

106-119: Data deletion method looks good.

The delete_cognify_data method correctly deletes all data from the cognify table and drops the table.

121-125: Database deletion method looks good.

The delete_database method correctly deletes the database file and its write-ahead log file.

coderabbitai · 2024-07-21T19:02:19Z

cognee/infrastructure/databases/relational/create_relational_engine.py

+def create_relational_engine(db_path: str, db_name: str, db_provider:str):
    LocalStorage.ensure_directory_exists(db_path)

-    return DuckDBAdapter(
-        db_name = db_name,
-        db_path = db_path,
-    )
+    llm_config = get_relationaldb_config()
+
+    provider = DBProvider(llm_config.llm_provider)
+
+
+    if provider == DBProvider.DUCKDB:
+
+        return DuckDBAdapter(
+            db_name = db_name,
+            db_path = db_path,
+        )
+    elif provider == DBProvider.POSTGRES:
+        return SQLAlchemyAdapter(
+            db_name = db_name,
+            db_path = db_path,
+            db_type = db_provider,
+        )


Add error handling for invalid llm_config.llm_provider.

The function does not handle the case where llm_config.llm_provider is not a valid DBProvider. Consider adding error handling for this scenario.

+ try: + provider = DBProvider(llm_config.llm_provider) + except ValueError: + raise ValueError(f"Invalid database provider: {llm_config.llm_provider}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def create_relational_engine(db_path: str, db_name: str, db_provider:str):

LocalStorage.ensure_directory_exists(db_path)

return DuckDBAdapter(

db_name = db_name,

db_path = db_path,

)

llm_config = get_relationaldb_config()

provider = DBProvider(llm_config.llm_provider)

if provider == DBProvider.DUCKDB:

return DuckDBAdapter(

db_name = db_name,

db_path = db_path,

)

elif provider == DBProvider.POSTGRES:

return SQLAlchemyAdapter(

db_name = db_name,

db_path = db_path,

db_type = db_provider,

)

def create_relational_engine(db_path: str, db_name: str, db_provider:str):

LocalStorage.ensure_directory_exists(db_path)

llm_config = get_relationaldb_config()

try:

provider = DBProvider(llm_config.llm_provider)

except ValueError:

raise ValueError(f"Invalid database provider: {llm_config.llm_provider}")

if provider == DBProvider.DUCKDB:

return DuckDBAdapter(

db_name = db_name,

db_path = db_path,

)

elif provider == DBProvider.POSTGRES:

return SQLAlchemyAdapter(

db_name = db_name,

db_path = db_path,

db_type = db_provider,

)

cognee/api/client.py

borisarzentar · 2024-07-22T10:19:29Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+
+    def delete_cognify_data(self):
+        with self.engine.connect() as connection:
+            connection.execute(text("""


We create a table here and delete it immediately in the next line. Should we check if table exists and delete if it does?

borisarzentar · 2024-07-22T10:20:51Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+                    document_id_target STRING NULL
+                );
+            """))
+            query = text(f"""


If we just created the table this query won't return anything, so we can just set records to []. Otherwise we can execute the query.

borisarzentar · 2024-07-22T10:25:07Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+
+
+
+llm_config = get_relationaldb_config()


llm_config -> relationaldb_config. In line 26 as well.

borisarzentar · 2024-07-22T10:30:50Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+)
+from fastapi_users.db import SQLAlchemyUserDatabase
+
+from app.db import User, get_user_db


What is add.db?

coderabbitai

Actionable comments posted: 15

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 77e8c1b and e785b30.

Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

Files selected for processing (8)

cognee/api/client.py (3 hunks)
cognee/infrastructure/databases/relational/config.py (2 hunks)
cognee/infrastructure/databases/relational/create_relational_engine.py (1 hunks)
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
docker-compose.yml (1 hunks)
pyproject.toml (1 hunks)

Files skipped from review as they are similar to previous changes (3)

cognee/infrastructure/databases/relational/config.py
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
cognee/infrastructure/databases/relational/user_authentication/users.py

Additional context used

Ruff

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

117-121: f-string without any placeholders

Remove extraneous f prefix

(F541)

125-125: f-string without any placeholders

Remove extraneous f prefix

(F541)

cognee/api/client.py

29-29: Module level import not at top of file

(E402)

51-51: Module level import not at top of file

(E402)

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

56-56: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

119-119: Redefinition of unused authenticated_route from line 85

(F811)

Additional comments not posted (7)

cognee/infrastructure/databases/relational/create_relational_engine.py (4)

1-2: Import statement is correct.

The Enum import is necessary for defining the DBProvider enumeration.

3-7: Import statements are correct.

The imported modules and classes are necessary for the functionality of the create_relational_engine function.

8-11: Enumeration definition is correct.

The DBProvider enumeration is correctly defined and will facilitate the selection of the appropriate database adapter.

12-16: Function signature and directory existence check are correct.

The function signature has been expanded to include additional parameters for database configuration. The LocalStorage.ensure_directory_exists method ensures the directory exists.

docker-compose.yml (2)

63-75: Service configuration for PostgreSQL is correct.

The service configuration includes necessary environment variables, volume mapping, port exposure, and network connection.

81-83: Volume declaration for PostgreSQL data is correct.

The volume declaration ensures data persistence across container restarts.

pyproject.toml (1)

72-73: Dependencies are correctly added.

The fastapi-users and asyncpg dependencies are necessary for user management and asynchronous database operations.

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/create_relational_engine.py

+    if provider == DBProvider.DUCKDB:
+        # return DuckDBAdapter(
+        #     db_name = db_name,
+        #     db_path = db_path,
+        # )
+        return SQLAlchemyAdapter(
+            db_name = db_name,
+            db_path = db_path,
+            db_type = db_provider,
+            db_host=db_host,
+            db_port=db_port,
+            db_user=db_user,
+            db_password=db_password
+        )
+    elif provider == DBProvider.POSTGRES:
+        return SQLAlchemyAdapter(
+            db_name = db_name,
+            db_path = db_path,
+            db_type = db_provider,
+            db_host= db_host,
+            db_port= db_port,
+            db_user= db_user,
+            db_password= db_password
+        )


Remove commented-out code and add error handling for invalid db_provider values.

The function correctly selects the database adapter based on the db_provider value. However, the commented-out code for DuckDBAdapter should be removed to maintain code cleanliness. Additionally, error handling for invalid db_provider values should be added.

- # return DuckDBAdapter( - # db_name = db_name, - # db_path = db_path, - # ) + try: + provider = DBProvider(db_provider) + except ValueError: + raise ValueError(f"Invalid database provider: {db_provider}")

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    async def get_async_session(self):
+        async_session_maker = self.Session
+        async with async_session_maker() as session:
+            yield session


Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+ async def get_async_session(self) -> AsyncSession: + """ + Yields an asynchronous SQLAlchemy session. + """

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def get_async_session(self):

async_session_maker = self.Session

async with async_session_maker() as session:

yield session

async def get_async_session(self) -> AsyncSession:

"""

Yields an asynchronous SQLAlchemy session.

"""

async_session_maker = self.Session

async with async_session_maker() as session:

yield session

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def get_session(self):
+        session_maker = self.Session
+        with session_maker() as session:
+            yield session


Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+ def get_session(self) -> sessionmaker: + """ + Yields a synchronous SQLAlchemy session. + """

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def delete_database(self):
+        from cognee.infrastructure.files.storage import LocalStorage
+        LocalStorage.remove(self.db_location)
+        if LocalStorage.file_exists(self.db_location + ".wal"):
+            LocalStorage.remove(self.db_location + ".wal")


Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+ def delete_database(self) -> None: + """ + Deletes the database file and its write-ahead log file. + """

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def delete_cognify_data(self):
+        with self.engine.connect() as connection:
+            connection.execute(text("""
+                CREATE TABLE IF NOT EXISTS cognify (
+                    document_id STRING,
+                    layer_id STRING,
+                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                    updated_at TIMESTAMP DEFAULT NULL,
+                    processed BOOLEAN DEFAULT FALSE,
+                    document_id_target STRING NULL
+                );
+            """))
+            connection.execute(text("DELETE FROM cognify;"))
+            connection.execute(text("DROP TABLE cognify;"))


Add type hints, docstrings, and remove redundant table creation logic.

Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before deletion.

+ def delete_cognify_data(self) -> None: + """ + Deletes data from the `cognify` table. + """ - connection.execute(text(""" - CREATE TABLE IF NOT EXISTS cognify ( - document_id STRING, - layer_id STRING, - created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, - updated_at TIMESTAMP DEFAULT NULL, - processed BOOLEAN DEFAULT FALSE, - document_id_target STRING NULL - ); - """))

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def delete_cognify_data(self):

with self.engine.connect() as connection:

connection.execute(text("""

CREATE TABLE IF NOT EXISTS cognify (

document_id STRING,

layer_id STRING,

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP DEFAULT NULL,

processed BOOLEAN DEFAULT FALSE,

document_id_target STRING NULL

);

"""))

connection.execute(text("DELETE FROM cognify;"))

connection.execute(text("DROP TABLE cognify;"))

def delete_cognify_data(self) -> None:

"""

Deletes data from the `cognify` table.

"""

with self.engine.connect() as connection:

connection.execute(text("DELETE FROM cognify;"))

connection.execute(text("DROP TABLE cognify;"))

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def get_datasets(self):
+        with self.engine.connect() as connection:
+            result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;"))
+            tables = [row['schema_name'] for row in result]
+        return list(
+            filter(
+                lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee",
+                tables
+            )
+        )


Add type hints, docstrings, and improve filtering logic.

Include type hints and docstrings for better readability and maintainability. Improve filtering logic for readability.

+ def get_datasets(self) -> List[str]: + """ + Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'. + """ - return list( - filter( - lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee", - tables - ) - ) + return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"]

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def get_datasets(self):

with self.engine.connect() as connection:

result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;"))

tables = [row['schema_name'] for row in result]

return list(

filter(

lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee",

tables

)

)

def get_datasets(self) -> List[str]:

"""

Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'.

"""

with self.engine.connect() as connection:

result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;"))

tables = [row['schema_name'] for row in result]

return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"]

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def load_cognify_data(self, data):
+        metadata = MetaData()
+        cognify_table = Table(
+            'cognify', metadata,
+            Column('document_id', String),
+            Column('layer_id', String),
+            Column('created_at', TIMESTAMP, server_default=text('CURRENT_TIMESTAMP')),
+            Column('updated_at', TIMESTAMP, nullable=True, default=None),
+            Column('processed', Boolean, default=False),
+            Column('document_id_target', String, nullable=True)
+        )
+        metadata.create_all(self.engine)
+        insert_query = cognify_table.insert().values(document_id=text(':document_id'), layer_id=text(':layer_id'))
+        with self.engine.connect() as connection:
+            connection.execute(insert_query, data)


Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+ def load_cognify_data(self, data: List[Dict[str, Any]]) -> None: + """ + Loads data into the `cognify` table. + """

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-22T22:04:13Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str):
+        self.db_location = os.path.abspath(os.path.join(db_path, db_name))
+        # self.engine = create_engine(f"{db_type}:///{self.db_location}")
+        if db_type == "duckdb":
+            self.engine = create_engine(f"duckdb:///{self.db_location}")
+            self.Session = sessionmaker(bind=self.engine)
+
+        else:
+            print("Name: ", db_name)
+            print("User: ", db_user)
+            print("Password: ", db_password)
+            print("Host: ", db_host)
+            print("Port: ", db_port)
+            self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}")
+            self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False)


Use logging instead of print statements and conditionally construct self.db_location.

Replace print statements with logging for better maintainability. Construct self.db_location only if the database type is DuckDB.

- self.db_location = os.path.abspath(os.path.join(db_path, db_name)) + if db_type == "duckdb": + self.db_location = os.path.abspath(os.path.join(db_path, db_name)) - print("Name: ", db_name) - print("User: ", db_user) - print("Password: ", db_password) - print("Host: ", db_host) - print("Port: ", db_port) + logger.info(f"Name: {db_name}") + logger.info(f"User: {db_user}") + logger.info(f"Password: {db_password}") + logger.info(f"Host: {db_host}") + logger.info(f"Port: {db_port}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str):

self.db_location = os.path.abspath(os.path.join(db_path, db_name))

# self.engine = create_engine(f"{db_type}:///{self.db_location}")

if db_type == "duckdb":

self.engine = create_engine(f"duckdb:///{self.db_location}")

self.Session = sessionmaker(bind=self.engine)

else:

print("Name: ", db_name)

print("User: ", db_user)

print("Password: ", db_password)

print("Host: ", db_host)

print("Port: ", db_port)

self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}")

self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False)

def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str):

if db_type == "duckdb":

self.db_location = os.path.abspath(os.path.join(db_path, db_name))

self.engine = create_engine(f"duckdb:///{self.db_location}")

self.Session = sessionmaker(bind=self.engine)

else:

logger.info(f"Name: {db_name}")

logger.info(f"User: {db_user}")

logger.info(f"Password: {db_password}")

logger.info(f"Host: {db_host}")

logger.info(f"Port: {db_port}")

self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}")

self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False)

coderabbitai · 2024-07-22T22:04:14Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def execute_query(self, query):
+        with self.engine.connect() as connection:
+            result = connection.execute(text(query))
+            return [dict(row) for row in result]


Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+ def execute_query(self, query: str) -> List[Dict[str, Any]]: + """ + Executes the provided SQL query. + """ - result = connection.execute(text(query)) + result = connection.execute(text(":query"), {'query': query})

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-22T22:04:14Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+    def fetch_cognify_data(self, excluded_document_id: str):
+        with self.engine.connect() as connection:
+            connection.execute(text("""
+                CREATE TABLE IF NOT EXISTS cognify (
+                    document_id STRING,
+                    layer_id STRING,
+                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                    updated_at TIMESTAMP DEFAULT NULL,
+                    processed BOOLEAN DEFAULT FALSE,
+                    document_id_target STRING NULL
+                );
+            """))
+            query = text(f"""
+                SELECT document_id, layer_id, created_at, updated_at, processed
+                FROM cognify
+                WHERE document_id != :excluded_document_id AND processed = FALSE;
+            """)
+            records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall()
+            if records:
+                document_ids = tuple(record['document_id'] for record in records)
+                update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")
+                connection.execute(update_query, {'document_ids': document_ids})
+            return [dict(record) for record in records]


Add type hints, docstrings, and remove extraneous f prefixes.

Include type hints and docstrings for better readability and maintainability. Remove extraneous f prefixes from f-strings without placeholders.

+ def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]: + """ + Fetches data from the `cognify` table based on the provided `excluded_document_id`. + """ - query = text(f""" - SELECT document_id, layer_id, created_at, updated_at, processed - FROM cognify - WHERE document_id != :excluded_document_id AND processed = FALSE; - """) + query = text(""" + SELECT document_id, layer_id, created_at, updated_at, processed + FROM cognify + WHERE document_id != :excluded_document_id AND processed = FALSE; + """) - update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;") + update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def fetch_cognify_data(self, excluded_document_id: str):

with self.engine.connect() as connection:

connection.execute(text("""

CREATE TABLE IF NOT EXISTS cognify (

document_id STRING,

layer_id STRING,

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP DEFAULT NULL,

processed BOOLEAN DEFAULT FALSE,

document_id_target STRING NULL

);

"""))

query = text(f"""

SELECT document_id, layer_id, created_at, updated_at, processed

FROM cognify

WHERE document_id != :excluded_document_id AND processed = FALSE;

""")

records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall()

if records:

document_ids = tuple(record['document_id'] for record in records)

update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")

connection.execute(update_query, {'document_ids': document_ids})

return [dict(record) for record in records]

def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]:

"""

Fetches data from the `cognify` table based on the provided `excluded_document_id`.

"""

with self.engine.connect() as connection:

connection.execute(text("""

CREATE TABLE IF NOT EXISTS cognify (

document_id STRING,

layer_id STRING,

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP DEFAULT NULL,

processed BOOLEAN DEFAULT FALSE,

document_id_target STRING NULL

);

"""))

query = text("""

SELECT document_id, layer_id, created_at, updated_at, processed

FROM cognify

WHERE document_id != :excluded_document_id AND processed = FALSE;

""")

records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall()

if records:

document_ids = tuple(record['document_id'] for record in records)

update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")

connection.execute(update_query, {'document_ids': document_ids})

return [dict(record) for record in records]

Tools

Ruff

117-121: f-string without any placeholders

Remove extraneous f prefix

(F541)

125-125: f-string without any placeholders

Remove extraneous f prefix

(F541)

coderabbitai

Actionable comments posted: 6

Outside diff range, codebase verification and nitpick comments (1)

cognee/api/client.py (1)

Line range hint 31-59:
Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager
- from fastapi import Depends, FastAPI
- from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
- from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
- from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
+ from contextlib import asynccontextmanager
+ from fastapi import Depends, FastAPI
+ from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
+ from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
+ from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users

Also applies to: 91-123

Tools

Ruff

31-31: Module level import not at top of file

(E402)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e785b30 and dab5c87.

Files selected for processing (4)

cognee/api/client.py (4 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/routers.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)

Files skipped from review as they are similar to previous changes (1)

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

Additional context used

Ruff

cognee/infrastructure/databases/relational/user_authentication/users.py

12-12: Redefinition of unused Depends from line 4

Remove definition: Depends

(F811)

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

122-122: Redefinition of unused authenticated_route from line 88

(F811)

Additional comments not posted (3)

cognee/infrastructure/databases/relational/user_authentication/users.py (3)

37-38: LGTM!

The function get_user_manager is correctly implemented.

44-45: LGTM!

The function get_jwt_strategy is correctly implemented.

64-70: LGTM!

The function has_permission is correctly implemented.

coderabbitai · 2024-07-22T22:32:05Z

cognee/infrastructure/databases/relational/user_authentication/routers.py

+@permission_router.post("/groups/{group_id}/permissions")
+async def assign_permission_to_group(group_id: int, permission: str, db: Session = Depends(get_user_db)):
+    group = db.query(Group).filter(Group.id == group_id).first()
+    if not group:
+        raise HTTPException(status_code=404, detail="Group not found")
+    perm = db.query(Permission).filter(Permission.name == permission).first()
+    if not perm:
+        perm = Permission(name=permission)
+        db.add(perm)
+    group.permissions.append(perm)
+    db.commit()
+    return {"msg": "Permission added to group"}


Add error handling and logging.

The function should handle database operation errors and log the actions for better traceability.

import logging logger = logging.getLogger(__name__) @permission_router.post("/groups/{group_id}/permissions") async def assign_permission_to_group(group_id: int, permission: str, db: Session = Depends(get_user_db)): try: group = db.query(Group).filter(Group.id == group_id).first() if not group: raise HTTPException(status_code=404, detail="Group not found") perm = db.query(Permission).filter(Permission.name == permission).first() if not perm: perm = Permission(name=permission) db.add(perm) group.permissions.append(perm) db.commit() logger.info(f"Permission '{permission}' added to group '{group_id}'") return {"msg": "Permission added to group"} except Exception as e: db.rollback() logger.error(f"Error assigning permission to group: {e}") raise HTTPException(status_code=500, detail="Internal Server Error")

coderabbitai · 2024-07-22T22:32:05Z

cognee/infrastructure/databases/relational/user_authentication/routers.py

+@permission_router.post("/users/{user_id}/groups")
+async def add_user_to_group(user_id: int, group_id: int, db: Session = Depends(get_user_db)):
+    user = db.query(User).filter(User.id == user_id).first()
+    group = db.query(Group).filter(Group.id == group_id).first()
+    if not user or not group:
+        raise HTTPException(status_code=404, detail="User or group not found")
+    user.groups.append(group)
+    db.commit()
+    return {"msg": "User added to group"}


Add error handling and logging.

The function should handle database operation errors and log the actions for better traceability.

import logging logger = logging.getLogger(__name__) @permission_router.post("/users/{user_id}/groups") async def add_user_to_group(user_id: int, group_id: int, db: Session = Depends(get_user_db)): try: user = db.query(User).filter(User.id == user_id).first() group = db.query(Group).filter(Group.id == group_id).first() if not user or not group: raise HTTPException(status_code=404, detail="User or group not found") user.groups.append(group) db.commit() logger.info(f"User '{user_id}' added to group '{group_id}'") return {"msg": "User added to group"} except Exception as e: db.rollback() logger.error(f"Error adding user to group: {e}") raise HTTPException(status_code=500, detail="Internal Server Error")

coderabbitai

Actionable comments posted: 8

Outside diff range, codebase verification and nitpick comments (1)

cognee/infrastructure/data/models/Dataset.py (1)
5-5: Verify the implications of changing the base class.

There are still several references to ModelBase in the codebase, indicating that the transition to Base is incomplete. Ensure all references to ModelBase are updated to Base and that all necessary migrations are handled.

cognee/infrastructure/pipeline/models/Operation.py

cognee/modules/pipelines/models/Task.py

cognee/modules/pipelines/models/Pipeline.py

cognee/modules/pipelines/models/PipelineTask.py

cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py

Analysis chain

Verify the implications of changing the base class.

The change from ModelBase to Base might have significant implications on how the Dataset class interacts with the database. Ensure that this change is compatible with the rest of the codebase and that all necessary migrations are handled.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage and compatibility of the new base class `Base` in the codebase.

# Test: Search for the `Dataset` class usage. Expect: No issues with the new base class.
rg --type python -A 5 $'class Dataset(Base)'
Length of output: 76

Script:
#!/bin/bash
# Description: Verify the usage and compatibility of the new base class `Base` in the codebase.

# Test: Search for the `Dataset` class usage. Expect: No issues with the new base class.
rg --type py -A 5 $'class Dataset(Base)'
Length of output: 41

Script:
#!/bin/bash
# Description: Verify the current base class used by the `Dataset` class in the codebase.

# Test: Search for the `Dataset` class definition. Expect: Identify the current base class used.
rg --type py -A 5 $'class Dataset'
Length of output: 963

Script:
#!/bin/bash
# Description: Verify that there are no remaining references to `ModelBase` in the codebase.

# Test: Search for any references to `ModelBase`. Expect: No references found.
rg --type py $'ModelBase'
Length of output: 937

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dab5c87 and 14e1eba.

Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

Files selected for processing (11)

cognee/api/client.py (4 hunks)
cognee/infrastructure/data/models/Data.py (1 hunks)
cognee/infrastructure/data/models/Dataset.py (1 hunks)
cognee/infrastructure/data/models/DatasetData.py (1 hunks)
cognee/infrastructure/databases/relational/ModelBase.py (1 hunks)
cognee/infrastructure/databases/relational/init.py (1 hunks)
cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py (2 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/routers.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
pyproject.toml (2 hunks)

Files skipped from review due to trivial changes (1)

cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py

Files skipped from review as they are similar to previous changes (2)

cognee/infrastructure/databases/relational/user_authentication/routers.py
pyproject.toml

Additional context used

Ruff

cognee/infrastructure/databases/relational/user_authentication/users.py

12-12: Redefinition of unused Depends from line 4

Remove definition: Depends

(F811)

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

12-12: Redefinition of unused DeclarativeBase from line 7

Remove definition: DeclarativeBase

(F811)

12-12: Redefinition of unused sessionmaker from line 7

Remove definition: sessionmaker

(F811)

12-12: Redefinition of unused Session from line 7

Remove definition: Session

(F811)

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

121-121: Redefinition of unused authenticated_route from line 87

(F811)

Additional comments not posted (14)

cognee/infrastructure/databases/relational/ModelBase.py (1)

2-6: LGTM! The new base class definition aligns with modern SQLAlchemy practices.

The use of DeclarativeBase is a more contemporary approach and improves maintainability.

cognee/infrastructure/databases/relational/__init__.py (1)

1-1: LGTM! The import statement change is consistent with the new base class definition.

The update ensures that the module imports the correct base class.

cognee/infrastructure/data/models/DatasetData.py (1)

4-6: LGTM! The class inheritance change is consistent with the new base class definition.

The update ensures that the DatasetData class inherits from the correct base class.

cognee/infrastructure/data/models/Data.py (1)

5-5: Verify the implications of changing the base class.

The change from ModelBase to Base might have significant implications on how the Data class interacts with the database. Ensure that this change is compatible with the rest of the codebase and that all necessary migrations are handled.

cognee/infrastructure/databases/relational/user_authentication/users.py (4)

37-38: LGTM!

The function get_user_manager correctly yields an instance of UserManager.

44-45: LGTM!

The function get_jwt_strategy correctly returns a JWTStrategy with the specified secret and lifetime.

59-63: LGTM!

The function get_user_permissions correctly retrieves and returns the permissions for a user.

65-71: LGTM!

The function has_permission correctly checks if a user has a specific permission.

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (2)

44-57: LGTM!

The database configuration and session management are correctly implemented.

51-61: LGTM!

The utility functions for creating the database and tables, and for getting the async session and user database are correctly implemented.

cognee/api/client.py (4)

85-85: LGTM!

The permission router is correctly implemented.

Line range hint 124-139:
LGTM!

The root and health check endpoints are correctly implemented.

Tools

Ruff

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

121-121: Redefinition of unused authenticated_route from line 87

(F811)

Line range hint 140-261:
LGTM!

The dataset-related endpoints are correctly implemented.

Line range hint 262-349:
LGTM!

The endpoints for settings, adding data, cognitive processing, and searching are correctly implemented.

coderabbitai · 2024-07-23T11:49:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+# Association table for many-to-many relationship between users and groups
+user_group = Table('user_group', Base.metadata,
+                   Column('user_id', UUID, ForeignKey('users.id')),
+                   Column('group_id', UUID, ForeignKey('groups.id')))
+
+# Association table for many-to-many relationship between groups and permissions
+group_permission = Table('group_permission', Base.metadata,
+                         Column('group_id', UUID, ForeignKey('groups.id')),
+                         Column('permission_id', UUID, ForeignKey('permissions.id')))


Consider adding explicit primary keys for association tables.

While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.

- user_group = Table('user_group', Base.metadata, - Column('user_id', UUID, ForeignKey('users.id')), - Column('group_id', UUID, ForeignKey('groups.id'))) + user_group = Table('user_group', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('user_id', UUID, ForeignKey('users.id')), + Column('group_id', UUID, ForeignKey('groups.id'))) - group_permission = Table('group_permission', Base.metadata, - Column('group_id', UUID, ForeignKey('groups.id')), - Column('permission_id', UUID, ForeignKey('permissions.id'))) + group_permission = Table('group_permission', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('group_id', UUID, ForeignKey('groups.id')), + Column('permission_id', UUID, ForeignKey('permissions.id')))

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Association table for many-to-many relationship between users and groups

user_group = Table('user_group', Base.metadata,

Column('user_id', UUID, ForeignKey('users.id')),

Column('group_id', UUID, ForeignKey('groups.id')))

# Association table for many-to-many relationship between groups and permissions

group_permission = Table('group_permission', Base.metadata,

Column('group_id', UUID, ForeignKey('groups.id')),

Column('permission_id', UUID, ForeignKey('permissions.id')))

# Association table for many-to-many relationship between users and groups

user_group = Table('user_group', Base.metadata,

Column('id', UUID, primary_key=True, index=True),

Column('user_id', UUID, ForeignKey('users.id')),

Column('group_id', UUID, ForeignKey('groups.id')))

# Association table for many-to-many relationship between groups and permissions

group_permission = Table('group_permission', Base.metadata,

Column('id', UUID, primary_key=True, index=True),

Column('group_id', UUID, ForeignKey('groups.id')),

Column('permission_id', UUID, ForeignKey('permissions.id')))

coderabbitai · 2024-07-23T11:49:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+class User(SQLAlchemyBaseUserTableUUID, Base):
+    __tablename__ = 'users'
+    groups = relationship('Group', secondary=user_group, back_populates='users')
+
+class Group(Base):
+    __tablename__ = 'groups'
+    id = Column(UUID, primary_key=True, index=True)
+    name = Column(String, unique=True, index=True)
+    users = relationship('User', secondary=user_group, back_populates='groups')
+    permissions = relationship('Permission', secondary=group_permission, back_populates='groups')
+
+class Permission(Base):
+    __tablename__ = 'permissions'
+    id = Column(UUID, primary_key=True, index=True)
+    name = Column(String, unique=True, index=True)
+    groups = relationship('Group', secondary=group_permission, back_populates='permissions')


Add id column to the User model.

The User model is missing the id column, which is necessary for primary key identification.

- class User(SQLAlchemyBaseUserTableUUID, Base): - __tablename__ = 'users' - groups = relationship('Group', secondary=user_group, back_populates='users') + class User(SQLAlchemyBaseUserTableUUID, Base): + __tablename__ = 'users' + id = Column(UUID, primary_key=True, index=True) + groups = relationship('Group', secondary=user_group, back_populates='users')

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class User(SQLAlchemyBaseUserTableUUID, Base):

__tablename__ = 'users'

groups = relationship('Group', secondary=user_group, back_populates='users')

class Group(Base):

__tablename__ = 'groups'

id = Column(UUID, primary_key=True, index=True)

name = Column(String, unique=True, index=True)

users = relationship('User', secondary=user_group, back_populates='groups')

permissions = relationship('Permission', secondary=group_permission, back_populates='groups')

class Permission(Base):

__tablename__ = 'permissions'

id = Column(UUID, primary_key=True, index=True)

name = Column(String, unique=True, index=True)

groups = relationship('Group', secondary=group_permission, back_populates='permissions')

class User(SQLAlchemyBaseUserTableUUID, Base):

__tablename__ = 'users'

id = Column(UUID, primary_key=True, index=True)

groups = relationship('Group', secondary=user_group, back_populates='users')

class Group(Base):

__tablename__ = 'groups'

id = Column(UUID, primary_key=True, index=True)

name = Column(String, unique=True, index=True)

users = relationship('User', secondary=user_group, back_populates='groups')

permissions = relationship('Permission', secondary=group_permission, back_populates='groups')

class Permission(Base):

__tablename__ = 'permissions'

id = Column(UUID, primary_key=True, index=True)

name = Column(String, unique=True, index=True)

groups = relationship('Group', secondary=group_permission, back_populates='permissions')

coderabbitai · 2024-07-23T11:49:32Z

cognee/api/client.py

@@ -13,6 +13,8 @@
 from fastapi.middleware.cors import CORSMiddleware
 from pydantic import BaseModel

+from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager - from fastapi import Depends, FastAPI - from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables - from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate - from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users + from contextlib import asynccontextmanager + from fastapi import Depends, FastAPI + from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router + from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables + from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate + from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users

Also applies to: 31-31, 53-59

coderabbitai · 2024-07-23T11:49:32Z

cognee/api/client.py

+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # Not needed if you setup a migration system like Alembic
+    await create_db_and_tables()
+    yield
+app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan)


Remove duplicate lifespan context manager.

The lifespan context manager is defined twice, which can cause conflicts.

- @asynccontextmanager - async def lifespan(app: FastAPI): - # Not needed if you setup a migration system like Alembic - await create_db_and_tables() - yield

Also applies to: 90-95

@Vasilije1990 This one

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

coderabbitai · 2024-07-23T11:49:32Z

cognee/api/client.py

+@app.get("/authenticated-route")
+async def authenticated_route(user: User = Depends(current_active_user)):
+    return {"message": f"Hello {user.email}!"}
+


Remove duplicate function authenticated_route.

The function authenticated_route is defined twice, which can cause conflicts.

- @app.get("/authenticated-route") - async def authenticated_route(user: User = Depends(current_active_user)): - return {"message": f"Hello {user.email}!"}

Also applies to: 120-123

@Vasilije1990 This one

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

coderabbitai

Actionable comments posted: 4

Outside diff range, codebase verification and nitpick comments (2)

cognee/modules/data/processing/process_documents.py (1)

4-4: Document the new parameters.

The new parameters user and user_permissions should be documented to explain their purpose and usage.

+    """
+    Process documents and add user-specific data.
+
+    Args:
+        documents (list[Document]): List of documents to process.
+        parent_node_id (str, optional): ID of the parent node. Defaults to None.
+        user (str, optional): User information. Defaults to None.
+        user_permissions (str, optional): User permissions. Defaults to None.
+    """

cognee/api/v1/cognify/cognify_v2.py (1)

39-39: Document the new user_id parameter.

The new parameter user_id should be documented to explain its purpose and usage.

+    """
+    Cognify datasets and add user-specific data.
+
+    Args:
+        datasets (Union[str, list[str]], optional): List of datasets to process. Defaults to None.
+        root_node_id (str, optional): ID of the root node. Defaults to None.
+        user_id (str, optional): User ID. Defaults to "default_user".
+    """

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 14e1eba and 36e156e.

Files selected for processing (3)

cognee/api/v1/cognify/cognify_v2.py (3 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/modules/data/processing/process_documents.py (2 hunks)

Additional context used

Ruff

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

12-12: Redefinition of unused DeclarativeBase from line 7

Remove definition: DeclarativeBase

(F811)

12-12: Redefinition of unused sessionmaker from line 7

Remove definition: sessionmaker

(F811)

12-12: Redefinition of unused Session from line 7

Remove definition: Session

(F811)

Additional comments not posted (4)

cognee/api/v1/cognify/cognify_v2.py (4)

1-11: Imports look good.

The imports include necessary modules for user authentication and permission management.

34-38: PermissionDeniedException class looks good.

The class is correctly defined and necessary for handling permission errors.

40-47: Permission check logic looks good.

The permission check logic ensures that only authorized users can proceed with the operations.

83-83: Pass user information to tasks.

The tasks now include user information, which is necessary for user-specific processing.

coderabbitai · 2024-07-23T13:16:44Z

cognee/modules/data/processing/process_documents.py

+            document_dict = document.to_dict()
+            document_dict["user"] = user
+            document_dict["user_permissions"] = user_permissions


Fix the unused document_dict.

The new dictionary document_dict is created but not used in the nodes list. Use document_dict instead of calling document.to_dict() again.

- nodes.append((str(document.id), document.to_dict())) + nodes.append((str(document.id), document_dict))

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

document_dict = document.to_dict()

document_dict["user"] = user

document_dict["user_permissions"] = user_permissions

document_dict = document.to_dict()

document_dict["user"] = user

document_dict["user_permissions"] = user_permissions

nodes.append((str(document.id), document_dict))

coderabbitai · 2024-07-23T13:16:44Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+class User(SQLAlchemyBaseUserTableUUID, Base):
+    __tablename__ = 'users'
+    groups = relationship('Group', secondary=user_group, back_populates='users')


Add id column to the User model.

The User model is missing the id column, which is necessary for primary key identification.

- class User(SQLAlchemyBaseUserTableUUID, Base): - __tablename__ = 'users' - groups = relationship('Group', secondary=user_group, back_populates='users') + class User(SQLAlchemyBaseUserTableUUID, Base): + __tablename__ = 'users' + id = Column(UUID, primary_key=True, index=True) + groups = relationship('Group', secondary=user_group, back_populates='users')

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class User(SQLAlchemyBaseUserTableUUID, Base):

__tablename__ = 'users'

groups = relationship('Group', secondary=user_group, back_populates='users')

class User(SQLAlchemyBaseUserTableUUID, Base):

__tablename__ = 'users'

id = Column(UUID, primary_key=True, index=True)

groups = relationship('Group', secondary=user_group, back_populates='users')

@Vasilije1990 This one

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

it already does this when it inherits from SQLAlchemyBaseUserTableUUID

coderabbitai · 2024-07-23T13:16:44Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+from typing import AsyncGenerator, Generator
+
+from fastapi import Depends
+from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase
+from sqlalchemy import create_engine, UUID
+from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session
+from cognee.infrastructure.databases.relational import get_relationaldb_config, Base
+from sqlalchemy import Column, String, ForeignKey, Table, Integer
+
+
+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship


Remove redefined imports and organize them.

The imports DeclarativeBase, sessionmaker, and Session are redefined on line 12. Additionally, imports should be organized according to PEP 8 guidelines.

- from typing import AsyncGenerator, Generator - from fastapi import Depends - from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase - from sqlalchemy import create_engine, UUID - from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine - from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session - from cognee.infrastructure.databases.relational import get_relationaldb_config, Base - from sqlalchemy import Column, String, ForeignKey, Table, Integer - from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship + from typing import AsyncGenerator, Generator + from fastapi import Depends + from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase + from sqlalchemy import create_engine, UUID, Column, String, ForeignKey, Table, Integer + from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine + from sqlalchemy.orm import sessionmaker, Session, relationship + from cognee.infrastructure.databases.relational import get_relationaldb_config, Base

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from typing import AsyncGenerator, Generator

from fastapi import Depends

from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase

from sqlalchemy import create_engine, UUID

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session

from cognee.infrastructure.databases.relational import get_relationaldb_config, Base

from sqlalchemy import Column, String, ForeignKey, Table, Integer

from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship

from typing import AsyncGenerator, Generator

from fastapi import Depends

from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase

from sqlalchemy import create_engine, UUID, Column, String, ForeignKey, Table, Integer

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

from sqlalchemy.orm import sessionmaker, Session, relationship

from cognee.infrastructure.databases.relational import get_relationaldb_config, Base

Tools

Ruff

12-12: Redefinition of unused DeclarativeBase from line 7

Remove definition: DeclarativeBase

(F811)

12-12: Redefinition of unused sessionmaker from line 7

Remove definition: sessionmaker

(F811)

12-12: Redefinition of unused Session from line 7

Remove definition: Session

(F811)

borisarzentar · 2024-07-23T21:19:16Z

cognee/api/client.py

+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # Not needed if you setup a migration system like Alembic
+    await create_db_and_tables()
+    yield
+app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan)


@Vasilije1990 This one

borisarzentar · 2024-07-23T21:19:27Z

cognee/api/client.py

+@app.get("/authenticated-route")
+async def authenticated_route(user: User = Depends(current_active_user)):
+    return {"message": f"Hello {user.email}!"}
+


@Vasilije1990 This one

borisarzentar · 2024-07-23T21:21:22Z

cognee/api/client.py

@@ -267,8 +345,8 @@ def start_api_server(host: str = "0.0.0.0", port: int = 8000):
        relational_config.create_engine()

        from cognee.modules.data.deletion import prune_system, prune_data
-        asyncio.run(prune_data())
-        asyncio.run(prune_system(metadata = True))
+        # asyncio.run(prune_data())


Please uncomment this after you are done.

borisarzentar · 2024-07-23T21:27:10Z

cognee/api/v1/cognify/cognify_v2.py

+        super().__init__(self.message)
+
+async def cognify(datasets: Union[str, list[str]] = None, root_node_id: str = None, user_id:str="default_user"):
+    session: AsyncSession = async_session_maker()


Let's put this login in a function in some module and just use it here.
Something like def check_permissions(user) for example.

Let's discuss this live

borisarzentar · 2024-07-23T21:32:19Z

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

+            self.Session = sessionmaker(bind=self.engine)
+
+        else:
+            print("Name: ", db_name)


Let's remove print statements

borisarzentar · 2024-07-23T21:37:12Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship
+
+# Association table for many-to-many relationship between users and groups
+user_group = Table('user_group', Base.metadata,


Can we define this like we did for Data and Dataset?

Can you specify exactly what needs to be done?

borisarzentar · 2024-07-23T21:37:53Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+class User(SQLAlchemyBaseUserTableUUID, Base):
+    __tablename__ = 'users'
+    groups = relationship('Group', secondary=user_group, back_populates='users')


@Vasilije1990 This one

borisarzentar · 2024-07-23T21:41:12Z

cognee/infrastructure/databases/relational/user_authentication/routers.py

+    if not group:
+        raise HTTPException(status_code=404, detail="Group not found")
+    perm = db.query(Permission).filter(Permission.name == permission).first()
+    if not perm:


perm -> permission

Do we want to create a new permission freely here?

Don't get it, what is the exact problem?

borisarzentar · 2024-07-23T21:44:35Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+        permissions.update(permission.name for permission in group.permissions)
+    return permissions
+
+def has_permission(permission: str):


Use this function where we need to check for permissions.

Where exactly?

borisarzentar · 2024-07-23T21:45:01Z

docker-compose.yml

@@ -60,8 +60,24 @@ services:
      - 3001:3000
    networks:
      - cognee-network
+  postgres:


Space between.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 3

Outside diff range, codebase verification and nitpick comments (1)

cognee/api/client.py (1)
345-346: Uncomment or remove the commented-out code.

Please uncomment this code if it is needed, or remove it if it is no longer necessary.
- # asyncio.run(prune_data())
- # asyncio.run(prune_system(metadata = True))
+ asyncio.run(prune_data())
+ asyncio.run(prune_system(metadata = True))

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 36e156e and 3f5e665.

Files selected for processing (1)

cognee/api/client.py (4 hunks)

Additional context used

Ruff

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

Additional comments not posted (3)

cognee/api/client.py (3)

85-85: LGTM!

The inclusion of permission_router appears correct and follows the pattern used for other routers.

61-83: LGTM!

The inclusion of authentication and user management routers appears correct and follows the pattern used for other routers.

Also applies to: 96-117

Line range hint 1-345:
LGTM!

The overall structure and completeness of the file appear to be well-organized and functional.

Tools

Ruff

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

coderabbitai · 2024-07-24T14:49:20Z

cognee/api/client.py

+from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router
+


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager - from fastapi import Depends, FastAPI - from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables - from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate - from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users + from contextlib import asynccontextmanager + from fastapi import Depends, FastAPI + from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router + from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables + from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate + from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users

Also applies to: 31-32, 53-60

coderabbitai · 2024-07-24T14:49:20Z

cognee/api/client.py

+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # Not needed if you setup a migration system like Alembic
+    await create_db_and_tables()
+    yield
+app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan)


Remove duplicate lifespan context manager.

The lifespan context manager is defined twice, which can cause conflicts.

- @asynccontextmanager - async def lifespan(app: FastAPI): - # Not needed if you setup a migration system like Alembic - await create_db_and_tables() - yield

Also applies to: 90-95

coderabbitai · 2024-07-24T14:49:20Z

cognee/api/client.py

+@app.get("/authenticated-route")
+async def authenticated_route(user: User = Depends(current_active_user)):
+    return {"message": f"Hello {user.email}!"}
+


Remove duplicate function authenticated_route.

The function authenticated_route is defined twice, which can cause conflicts.

- @app.get("/authenticated-route") - async def authenticated_route(user: User = Depends(current_active_user)): - return {"message": f"Hello {user.email}!"}

Also applies to: 120-123

coderabbitai

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3f5e665 and a93bd53.

Files selected for processing (2)

cognee/api/client.py (4 hunks)
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)

Additional context used

Ruff

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py

112-116: f-string without any placeholders

Remove extraneous f prefix

(F541)

120-120: f-string without any placeholders

Remove extraneous f prefix

(F541)

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

53-53: Module level import not at top of file

(E402)

55-55: Module level import not at top of file

(E402)

57-57: Module level import not at top of file

(E402)

58-58: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

Additional comments not posted (14)

cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (14)

7-16: Add type hints, docstrings, and conditionally construct db_location.

Include type hints and docstrings for better readability and maintainability. Construct db_location only if the database type is DuckDB.

-    def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str):
+    def __init__(self, db_type: str, db_path: str, db_name: str, db_user: str, db_password: str, db_host: str, db_port: str):
+        """
+        Initializes the SQLAlchemyAdapter with the specified database configuration.
+        """
-        self.db_location = os.path.abspath(os.path.join(db_path, db_name))
+        if db_type == "duckdb":
+            self.db_location = os.path.abspath(os.path.join(db_path, db_name))

19-22: Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+    async def get_async_session(self) -> AsyncSession:
+        """
+        Yields an asynchronous SQLAlchemy session.
+        """

24-27: Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+    def get_session(self) -> sessionmaker:
+        """
+        Yields a synchronous SQLAlchemy session.
+        """

29-38: Add type hints, docstrings, and improve filtering logic.

Include type hints and docstrings for better readability and maintainability. Improve filtering logic for readability.

+    def get_datasets(self) -> List[str]:
+        """
+        Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'.
+        """
-        return list(
-            filter(
-                lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee",
-                tables
-            )
-        )
+        return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"]

40-43: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def get_files_metadata(self, dataset_name: str) -> List[Dict[str, Any]]:
+        """
+        Retrieves file metadata from the specified dataset.
+        """
-            result = connection.execute(text(f"SELECT id, name, file_path, extension, mime_type FROM {dataset_name}.file_metadata;"))
+            result = connection.execute(text("SELECT id, name, file_path, extension, mime_type FROM :dataset_name.file_metadata;"), {'dataset_name': dataset_name})

45-49: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def create_table(self, schema_name: str, table_name: str, table_config: List[Dict[str, str]]) -> None:
+        """
+        Creates a table with the specified schema and configuration.
+        """
-            connection.execute(text(f"CREATE SCHEMA IF NOT EXISTS {schema_name};"))
-            connection.execute(text(f"CREATE TABLE IF NOT EXISTS {schema_name}.{table_name} ({', '.join(fields_query_parts)});"))
+            connection.execute(text("CREATE SCHEMA IF NOT EXISTS :schema_name;"), {'schema_name': schema_name})
+            connection.execute(text("CREATE TABLE IF NOT EXISTS :schema_name.:table_name (:fields_query_parts);"), {'schema_name': schema_name, 'table_name': table_name, 'fields_query_parts': ', '.join(fields_query_parts)})

51-53: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def delete_table(self, table_name: str) -> None:
+        """
+        Deletes the specified table.
+        """
-            connection.execute(text(f"DROP TABLE IF EXISTS {table_name};"))
+            connection.execute(text("DROP TABLE IF EXISTS :table_name;"), {'table_name': table_name})

55-60: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def insert_data(self, schema_name: str, table_name: str, data: List[Dict[str, Any]]) -> None:
+        """
+        Inserts data into the specified table.
+        """
-        insert_query = text(f"INSERT INTO {schema_name}.{table_name} ({columns}) VALUES {values};")
+        insert_query = text("INSERT INTO :schema_name.:table_name (:columns) VALUES :values;")
+        with self.engine.connect() as connection:
+            connection.execute(insert_query, {'schema_name': schema_name, 'table_name': table_name, 'columns': columns, 'values': values, **data})

62-77: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def get_data(self, table_name: str, filters: Optional[Dict[str, Union[str, List[str]]]] = None) -> Dict[str, Any]:
+        """
+        Retrieves data from the specified table based on the provided filters.
+        """
-            query = f"SELECT * FROM {table_name}"
+            query = "SELECT * FROM :table_name"
-                filter_conditions = " AND ".join([
-                    f"{key} IN ({', '.join([f':{key}{i}' for i in range(len(value))])})" if isinstance(value, list)
-                    else f"{key} = :{key}" for key, value in filters.items()
-                ])
-                query += f" WHERE {filter_conditions};"
+                filter_conditions = " AND ".join([
+                    f"{key} IN (:values)" if isinstance(value, list)
+                    else f"{key} = :{key}" for key, value in filters.items()
+                ])
+                query += f" WHERE {filter_conditions};"
+                query = text(query)
+                results = connection.execute(query, {'table_name': table_name, 'values': ', '.join([f':{key}{i}' for i in range(len(value))]), **filters})

79-82: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def execute_query(self, query: str) -> List[Dict[str, Any]]:
+        """
+        Executes the provided SQL query.
+        """
-            result = connection.execute(text(query))
+            result = connection.execute(text(":query"), {'query': query})

84-98: Add type hints, docstrings, and use parameterized queries.

Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.

+    def load_cognify_data(self, data: List[Dict[str, Any]]) -> None:
+        """
+        Loads data into the `cognify` table.
+        """
-        insert_query = cognify_table.insert().values(document_id=text(':document_id'), layer_id=text(':layer_id'))
+        insert_query = cognify_table.insert().values(document_id=:document_id, layer_id=:layer_id)

100-122: Add type hints, docstrings, and remove redundant table creation logic.

Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before fetching data. Remove extraneous f prefixes from f-strings without placeholders.

+    def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]:
+        """
+        Fetches data from the `cognify` table based on the provided `excluded_document_id`.
+        """
-            connection.execute(text("""
-                CREATE TABLE IF NOT EXISTS cognify (
-                    document_id STRING,
-                    layer_id STRING,
-                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                    updated_at TIMESTAMP DEFAULT NULL,
-                    processed BOOLEAN DEFAULT FALSE,
-                    document_id_target STRING NULL
-                );
-            """))
-            query = text(f"""
+            query = text("""
+                SELECT document_id, layer_id, created_at, updated_at, processed
+                FROM cognify
+                WHERE document_id != :excluded_document_id AND processed = FALSE;
+            """)
-                update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")
+                update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")

Tools

Ruff

112-116: f-string without any placeholders

Remove extraneous f prefix

(F541)

120-120: f-string without any placeholders

Remove extraneous f prefix

(F541)

124-137: Add type hints, docstrings, and remove redundant table creation logic.

Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before deletion.

+    def delete_cognify_data(self) -> None:
+        """
+        Deletes data from the `cognify` table.
+        """
-            connection.execute(text("""
-                CREATE TABLE IF NOT EXISTS cognify (
-                    document_id STRING,
-                    layer_id STRING,
-                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                    updated_at TIMESTAMP DEFAULT NULL,
-                    processed BOOLEAN DEFAULT FALSE,
-                    document_id_target STRING NULL
-                );
-            """))

139-143: Add type hints and docstrings.

Include type hints and docstrings for better readability and maintainability.

+    def delete_database(self) -> None:
+        """
+        Deletes the database file and its write-ahead log file.
+        """

coderabbitai

Actionable comments posted: 9

Outside diff range, codebase verification and nitpick comments (2)

cognee/api/v1/authenticate_user/authenticate_user.py (1)
1-1: The authenticate_user_method function does not exist in the authentication_db module.

The file cognee/infrastructure/databases/relational/user_authentication/authentication_db.py does not contain a function named authenticate_user_method.

Please review the import statement in cognee/api/v1/authenticate_user/authenticate_user.py and ensure that the correct module and function are being imported.

Analysis chain

Verify the import statement.

Ensure that the authenticate_user_method function exists in the authentication_db module.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the existence of the `authenticate_user_method` function in the `authentication_db` module.

# Test: Search for the function definition. Expect: The function definition exists.
ast-grep --lang python --pattern $'def authenticate_user_method($_):'
Length of output: 70

Script:
#!/bin/bash
# Description: Locate the `authentication_db` module and verify the existence of the `authenticate_user_method` function.

# Locate the `authentication_db` module
fd 'authentication_db.py'

# Search for the function definition within the located file(s)
fd 'authentication_db.py' --exec ast-grep --lang python --pattern $'def authenticate_user_method($_):' {}
Length of output: 216
cognee/api/v1/create_user/create_user.py (1)
1-1: The create_user_method function does not exist in the authentication_db module.

The module cognee/infrastructure/databases/relational/user_authentication/authentication_db.py exists.

However, the function create_user_method was not found within this module.

Please review and correct the import statement in cognee/api/v1/create_user/create_user.py.

Analysis chain

Verify the import statement.

Ensure that the create_user_method function exists in the authentication_db module.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the existence of the `create_user_method` function in the `authentication_db` module.

# Test: Search for the function definition. Expect: The function definition exists.
ast-grep --lang python --pattern $'def create_user_method($_, $_, $_):'
Length of output: 72

Script:
#!/bin/bash
# Description: Re-verify the existence of the `create_user_method` function in the `authentication_db` module with a broader search pattern.

# Test: Search for any function named `create_user_method`. Expect: The function definition exists.
ast-grep --lang python --pattern 'def create_user_method($$):'

# Additionally, verify the module path to ensure it exists.
fd 'authentication_db.py' | xargs -I {} echo "Found module: {}"
Length of output: 228

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a93bd53 and 9616545.

Files selected for processing (8)

cognee/api/v1/authenticate_user/init.py (1 hunks)
cognee/api/v1/authenticate_user/authenticate_user.py (1 hunks)
cognee/api/v1/create_user/init.py (1 hunks)
cognee/api/v1/create_user/create_user.py (1 hunks)
cognee/api/v1/reset_user_password/init.py (1 hunks)
cognee/api/v1/verify_user_token/init.py (1 hunks)
cognee/api/v1/verify_user_token/verify_user_token.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)

Files skipped from review due to trivial changes (4)

cognee/api/v1/authenticate_user/init.py
cognee/api/v1/create_user/init.py
cognee/api/v1/reset_user_password/init.py
cognee/api/v1/verify_user_token/init.py

Additional context used

Ruff

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

15-15: Redefinition of unused DeclarativeBase from line 9

Remove definition: DeclarativeBase

(F811)

15-15: Redefinition of unused sessionmaker from line 9

Remove definition: sessionmaker

(F811)

15-15: Redefinition of unused Session from line 9

Remove definition: Session

(F811)

149-149: Local variable user_manager is assigned to but never used

Remove assignment to unused variable user_manager

(F841)

158-158: Do not use bare except

(E722)

174-174: Do not use bare except

(E722)

Additional comments not posted (10)

cognee/api/v1/verify_user_token/verify_user_token.py (1)
1-1: Verify the import statement.

Ensure that the user_check_token function exists in the authentication_db module.

Verification successful

The user_check_token function exists in the authentication_db module. The import statement is correct.

The user_check_token function is defined towards the end of the authentication_db module.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the existence of the `user_check_token` function in the `authentication_db` module.

# Test: Search for the function definition. Expect: The function definition exists.
ast-grep --lang python --pattern $'def user_check_token($_):'
Length of output: 62

Script:
#!/bin/bash
# Description: List all function definitions in the `authentication_db` module to verify the presence of `user_check_token`.

# Locate the `authentication_db` module
auth_db_path=$(fd authentication_db.py)

# List all function definitions in the `authentication_db` module
ast-grep --lang python --pattern $'def $_($_):' $auth_db_path
Length of output: 258

Script:
#!/bin/bash
# Description: Display the contents of the `authentication_db` module to verify the presence of `user_check_token`.

# Locate the `authentication_db` module
auth_db_path=$(fd authentication_db.py)

# Display the contents of the `authentication_db` module
cat $auth_db_path
Length of output: 7948
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (9)
20-27: Consider adding explicit primary keys for association tables.

While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.
- user_group = Table('user_group', Base.metadata,
-                   Column('user_id', UUID, ForeignKey('users.id')),
-                   Column('group_id', UUID, ForeignKey('groups.id')))
+ user_group = Table('user_group', Base.metadata,
+                   Column('id', UUID, primary_key=True, index=True),
+                   Column('user_id', UUID, ForeignKey('users.id')),
+                   Column('group_id', UUID, ForeignKey('groups.id')))

- group_permission = Table('group_permission', Base.metadata,
-                         Column('group_id', UUID, ForeignKey('groups.id')),
-                         Column('permission_id', UUID, ForeignKey('permissions.id')))
+ group_permission = Table('group_permission', Base.metadata,
+                         Column('id', UUID, primary_key=True, index=True),
+                         Column('group_id', UUID, ForeignKey('groups.id')),
+                         Column('permission_id', UUID, ForeignKey('permissions.id')))
35-46: LGTM!

The model classes Group and Permission are correctly defined with primary keys and relationships.

60-65: LGTM!

The configuration section is correctly defined.

67-69: LGTM!

The function create_db_and_tables is correctly defined.

71-73: LGTM!

The function get_async_session is correctly defined.

76-77: LGTM!

The function get_user_db is correctly defined.

80-81: LGTM!

The function hash_password is correctly defined.

114-122: LGTM!

The function reset_user_password_method is correctly defined.

48-57: Verify the relationship to dataset_data.

The ACL class includes a relationship to dataset_data, which should be reviewed for correctness.

Verification successful

Verify the relationship to dataset_data.

The ACL class includes a relationship to dataset_data, which should be reviewed for correctness.

Verified: The relationship to dataset_data is correctly defined.

The ACL class includes a correctly defined relationship to dataset_data based on the table name specified in the DatasetData model.

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py: Relationship to dataset_data is correctly defined.

cognee/infrastructure/data/models/DatasetData.py: __tablename__ is set to "dataset_data".
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the relationship to `dataset_data` in the `ACL` class.

# Test: Search for the `dataset_data` definition. Expect: Correct relationship definition.
rg --type python -A 5 $'dataset_data'
Length of output: 67

Script:
#!/bin/bash
# Description: Verify the relationship to `dataset_data` in the `ACL` class.

# Test: Search for the `dataset_data` definition. Expect: Correct relationship definition.
rg --type py -A 5 $'dataset_data'
Length of output: 4769

coderabbitai · 2024-07-26T14:21:31Z

cognee/api/v1/verify_user_token/verify_user_token.py

+async def verify_user_token(token: str):
+
+    output = await user_check_token(token=token)
+    return output


Add error handling and a docstring.

The function lacks error handling and documentation. Consider adding a try-except block and a docstring.

+ """ + Verify the user token. + + Args: + token (str): The user token to verify. + + Returns: + output: The result of the token verification. + """ async def verify_user_token(token: str): + try: output = await user_check_token(token=token) + except Exception as e: + # Handle the exception (e.g., log the error, raise a custom exception, etc.) + raise e return output

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-26T14:21:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+# async def generate_verification_token(email: str, tokens_db: dict) -> str:
+#     async with get_async_session_context() as session:
+#         async with get_user_db_context(session) as user_db:
+#             if not await user_db.get_by_email(email):
+#                 raise ValueError("User does not exist")
+#             token = str(uuid.uuid4())
+#             tokens_db[token] = email
+#             return token
+
+# async def verify_user_method(token: str, tokens_db: dict) -> bool:
+#     async with get_async_session_context() as session:
+#         async with get_user_db_context(session) as user_db:
+#             email = tokens_db.get(token)
+#             if not email or not await user_db.get_by_email(email):
+#                 return False
+#             user = await user_db.get_by_email(email)
+#             user.is_verified = True
+#             await user_db.update(user)
+#             return True


Address commented-out functions.

The commented-out functions for generating and verifying tokens should be reviewed for relevance and either implemented or removed.

Do you want me to help implement these functions or open a GitHub issue to track this task?

coderabbitai · 2024-07-26T14:21:31Z

cognee/api/v1/authenticate_user/authenticate_user.py

+async def authenticate_user():
+    """
+    This function is used to authenticate a user.
+    """
+    output = await authenticate_user_method()
+    return output


Add error handling and improve the docstring.

The function lacks error handling and the docstring could be more descriptive. Consider adding a try-except block and enhancing the docstring.

async def authenticate_user(): - """ - This function is used to authenticate a user. - """ + """ + Authenticate a user. + + Returns: + output: The result of the user authentication. + """ + try: output = await authenticate_user_method() + except Exception as e: + # Handle the exception (e.g., log the error, raise a custom exception, etc.) + raise e return output

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def authenticate_user():

"""

This function is used to authenticate a user.

"""

output = await authenticate_user_method()

return output

async def authenticate_user():

"""

Authenticate a user.

Returns:

output: The result of the user authentication.

"""

try:

output = await authenticate_user_method()

except Exception as e:

# Handle the exception (e.g., log the error, raise a custom exception, etc.)

raise e

return output

coderabbitai · 2024-07-26T14:21:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+async def authenticate_user_method(email: str, password: str) -> Optional[User]:
+    try:
+        async with get_async_session_context() as session:
+            async with get_user_db_context(session) as user_db:
+                async with get_user_manager_context(user_db) as user_manager:
+                    credentials = OAuth2PasswordRequestForm(username=email, password=password)
+                    user = await user_manager.authenticate(credentials)
+                    if user is None or not user.is_active:
+                        return None
+                    return user
+    except Exception as e:
+        print(f"Error during authentication: {e}")
+        return None


Improve exception handling by logging the exception.

The function authenticate_user_method is correctly defined, but the exception handling can be improved by logging the exception.

- except Exception as e: - print(f"Error during authentication: {e}") + except Exception as e: + print(f"Error during authentication: {e}") + raise e

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-07-26T14:21:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+async def create_user_method(email: str, password: str, is_superuser: bool = False):
+    try:
+        async with get_async_session_context() as session:
+            async with get_user_db_context(session) as user_db:
+                async with get_user_manager_context(user_db) as user_manager:
+                    user = await user_manager.create(
+                        UserCreate(email=email, password=password, is_superuser=is_superuser)
+                    )
+                    print(f"User created: {user.email}")
+    except UserAlreadyExists:
+        print(f"User {email} already exists")


Improve exception handling by logging the exception.

The function create_user_method is correctly defined, but the exception handling can be improved by logging the exception.

- except UserAlreadyExists: - print(f"User {email} already exists") + except UserAlreadyExists as e: + print(f"User {email} already exists: {e}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def create_user_method(email: str, password: str, is_superuser: bool = False):

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db) as user_manager:

user = await user_manager.create(

UserCreate(email=email, password=password, is_superuser=is_superuser)

)

print(f"User created: {user.email}")

except UserAlreadyExists:

print(f"User {email} already exists")

async def create_user_method(email: str, password: str, is_superuser: bool = False):

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db) as user_manager:

user = await user_manager.create(

UserCreate(email=email, password=password, is_superuser=is_superuser)

)

print(f"User created: {user.email}")

except UserAlreadyExists as e:

print(f"User {email} already exists: {e}")

coderabbitai · 2024-07-26T14:21:31Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+import hashlib
+import uuid
+from typing import AsyncGenerator, Generator, Optional
+
+from fastapi import Depends
+from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase
+from sqlalchemy import create_engine, UUID
+from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session
+from cognee.infrastructure.databases.relational import get_relationaldb_config, Base
+from sqlalchemy import Column, String, ForeignKey, Table, Integer
+from contextlib import asynccontextmanager
+from fastapi_users.exceptions import UserAlreadyExists
+from fastapi_users.authentication import JWTStrategy
+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship


Clean up redundant and redefined imports.

The imports DeclarativeBase, sessionmaker, and Session are redefined on line 15. Additionally, there are redundant imports that can be removed to improve readability and maintainability.

- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session - from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship + from sqlalchemy.orm import relationship

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

import hashlib

import uuid

from typing import AsyncGenerator, Generator, Optional

from fastapi import Depends

from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase

from sqlalchemy import create_engine, UUID

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session

from cognee.infrastructure.databases.relational import get_relationaldb_config, Base

from sqlalchemy import Column, String, ForeignKey, Table, Integer

from contextlib import asynccontextmanager

from fastapi_users.exceptions import UserAlreadyExists

from fastapi_users.authentication import JWTStrategy

from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship

import hashlib

import uuid

from typing import AsyncGenerator, Generator, Optional

from fastapi import Depends

from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase

from sqlalchemy import create_engine, UUID

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

from cognee.infrastructure.databases.relational import get_relationaldb_config, Base

from sqlalchemy import Column, String, ForeignKey, Table, Integer

from contextlib import asynccontextmanager

from fastapi_users.exceptions import UserAlreadyExists

from fastapi_users.authentication import JWTStrategy

from sqlalchemy.orm import relationship

Tools

Ruff

15-15: Redefinition of unused DeclarativeBase from line 9

Remove definition: DeclarativeBase

(F811)

15-15: Redefinition of unused sessionmaker from line 9

Remove definition: sessionmaker

(F811)

15-15: Redefinition of unused Session from line 9

Remove definition: Session

(F811)

coderabbitai · 2024-07-26T14:21:32Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+async def user_create_token(user: User) -> Optional[str]:
+    try:
+        async with get_async_session_context() as session:
+            async with get_user_db_context(session) as user_db:
+                async with get_user_manager_context(user_db) as user_manager:
+                    if user is None:
+                        return None
+                    strategy = get_jwt_strategy()
+                    token = await strategy.write_token(user)
+                    if token is not None:
+                        return token
+                    else:
+                        return None
+    except:
+        return None


Remove unused variable and improve exception handling.

The function user_create_token is correctly defined, but the unused user_manager variable should be removed. Additionally, the bare except should be replaced with a specific exception.

- async with get_user_manager_context(user_db) as user_manager: + async with get_user_manager_context(user_db): ... - except: + except Exception as e: + print(f"Error during token creation: {e}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def user_create_token(user: User) -> Optional[str]:

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db) as user_manager:

if user is None:

return None

strategy = get_jwt_strategy()

token = await strategy.write_token(user)

if token is not None:

return token

else:

return None

except:

return None

async def user_create_token(user: User) -> Optional[str]:

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db):

if user is None:

return None

strategy = get_jwt_strategy()

token = await strategy.write_token(user)

if token is not None:

return token

else:

return None

except Exception as e:

print(f"Error during token creation: {e}")

return None

Tools

Ruff

149-149: Local variable user_manager is assigned to but never used

Remove assignment to unused variable user_manager

(F841)

158-158: Do not use bare except

(E722)

coderabbitai · 2024-07-26T14:21:32Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+async def user_check_token(token: str) -> bool:
+    try:
+        async with get_async_session_context() as session:
+            async with get_user_db_context(session) as user_db:
+                async with get_user_manager_context(user_db) as user_manager:
+                    if token is None:
+                        return False
+                    strategy = get_jwt_strategy()
+                    user = await strategy.read_token(token, user_manager)
+                    if user is None or not user.is_active:
+                        return False
+                    else:
+                        return True
+    except:
+        return False


Improve exception handling.

The function user_check_token is correctly defined, but the bare except should be replaced with a specific exception.

- except: - return False + except Exception as e: + print(f"Error during token validation: {e}") + return False

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def user_check_token(token: str) -> bool:

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db) as user_manager:

if token is None:

return False

strategy = get_jwt_strategy()

user = await strategy.read_token(token, user_manager)

if user is None or not user.is_active:

return False

else:

return True

except:

return False

async def user_check_token(token: str) -> bool:

try:

async with get_async_session_context() as session:

async with get_user_db_context(session) as user_db:

async with get_user_manager_context(user_db) as user_manager:

if token is None:

return False

strategy = get_jwt_strategy()

user = await strategy.read_token(token, user_manager)

if user is None or not user.is_active:

return False

else:

return True

except Exception as e:

print(f"Error during token validation: {e}")

return False

Tools

Ruff

174-174: Do not use bare except

(E722)

coderabbitai · 2024-07-26T14:21:32Z

cognee/api/v1/create_user/create_user.py

+async def create_user(email: str, password: str, is_superuser: bool = False):
+    output = await create_user_method(email=email, password=password, is_superuser=is_superuser)
+    return output


Add error handling and a docstring.

The function lacks error handling and documentation. Consider adding a try-except block and a docstring.

+ """ + Create a new user. + + Args: + email (str): The email of the user. + password (str): The password of the user. + is_superuser (bool): Whether the user is a superuser. Defaults to False. + + Returns: + output: The result of the user creation. + """ async def create_user(email: str, password: str, is_superuser: bool = False): + try: output = await create_user_method(email=email, password=password, is_superuser=is_superuser) + except Exception as e: + # Handle the exception (e.g., log the error, raise a custom exception, etc.) + raise e return output

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def create_user(email: str, password: str, is_superuser: bool = False):

output = await create_user_method(email=email, password=password, is_superuser=is_superuser)

return output

"""

Create a new user.

Args:

email (str): The email of the user.

password (str): The password of the user.

is_superuser (bool): Whether the user is a superuser. Defaults to False.

Returns:

output: The result of the user creation.

"""

async def create_user(email: str, password: str, is_superuser: bool = False):

try:

output = await create_user_method(email=email, password=password, is_superuser=is_superuser)

except Exception as e:

# Handle the exception (e.g., log the error, raise a custom exception, etc.)

raise e

return output

coderabbitai

Actionable comments posted: 9

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9616545 and 218d322.

Files selected for processing (7)

cognee/api/v1/authenticate_user/authenticate_user.py (1 hunks)
cognee/api/v1/cognify/cognify_v2.py (3 hunks)
cognee/api/v1/create_user/create_user.py (1 hunks)
cognee/api/v1/verify_user_token/verify_user_token.py (1 hunks)
cognee/infrastructure/data/models/DatasetData.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)

Files skipped from review as they are similar to previous changes (3)

cognee/api/v1/authenticate_user/authenticate_user.py
cognee/api/v1/create_user/create_user.py
cognee/api/v1/verify_user_token/verify_user_token.py

Additional context used

Ruff

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

15-15: Redefinition of unused DeclarativeBase from line 9

Remove definition: DeclarativeBase

(F811)

15-15: Redefinition of unused sessionmaker from line 9

Remove definition: sessionmaker

(F811)

15-15: Redefinition of unused Session from line 9

Remove definition: Session

(F811)

cognee/api/v1/cognify/cognify_v2.py

83-83: Undefined name hashed_user_id

(F821)

83-83: Undefined name user_permissions

(F821)

cognee/infrastructure/databases/relational/user_authentication/users.py

14-14: Redefinition of unused Depends from line 5

Remove definition: Depends

(F811)

19-19: Redefinition of unused JWTStrategy from line 10

Remove definition: JWTStrategy

(F811)

150-150: Local variable user_manager is assigned to but never used

Remove assignment to unused variable user_manager

(F841)

159-159: Do not use bare except

(E722)

175-175: Do not use bare except

(E722)

Additional comments not posted (17)

cognee/infrastructure/data/models/DatasetData.py (4)

8-8: Ensure consistency with ORM framework.

The class now inherits from Base instead of ModelBase. Verify that this change aligns with the ORM framework being used.

13-14: Primary key adjustment.

The primary key definitions for dataset_id and data_id have been removed. Ensure that the new primary key setup with id is consistent with the overall data model.

15-17: Unique constraint added.

The unique constraint on dataset_id and data_id ensures data integrity. This is a good practice to prevent duplicate entries.

19-19: Relationship with ACL model.

The relationship with the ACL model establishes a back-reference. Ensure that the ACL model has the corresponding relationship defined.
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (4)
36-42: Ensure relationships are correctly defined.

The Group model defines several relationships. Ensure that the corresponding relationships are defined in the related models.

43-47: Ensure relationships are correctly defined.

The Permission model defines a relationship with the Group model. Ensure that the corresponding relationship is defined in the Group model.

49-58: Ensure relationships are correctly defined.

The ACL model defines relationships with DatasetData, User, and Group. Ensure that the corresponding relationships are defined in the related models.

31-34: Add id column to the User model.

The User model is missing the id column, which is necessary for primary key identification.
+    id = Column(UUID, primary_key=True, index=True)
Likely invalid or redundant comment.
cognee/api/v1/cognify/cognify_v2.py (2)

34-37: LGTM!

The PermissionDeniedException class is correctly implemented.

39-47: Verify the integration of user permissions.

The function now retrieves user permissions and checks for the required "write" permission. Ensure that the get_user_permissions function and the fastapi_users manager are correctly implemented and integrated.
cognee/infrastructure/databases/relational/user_authentication/users.py (7)
26-41: Replace print statements with proper logging.

Using print statements for logging is not ideal for production code. Consider using the logging module.
import logging

logger = logging.getLogger(__name__)

class UserManager(UUIDIDMixin, BaseUserManager[User, uuid.UUID]):
    reset_password_token_secret = SECRET
    verification_token_secret = SECRET

    async def on_after_register(self, user: User, request: Optional[Request] = None):
        logger.info(f"User {user.id} has registered.")

    async def on_after_forgot_password(self, user: User, token: str, request: Optional[Request] = None):
        logger.info(f"User {user.id} has forgot their password. Reset token: {token}")

    async def on_after_request_verify(self, user: User, token: str, request: Optional[Request] = None):
        logger.info(f"Verification requested for user {user.id}. Verification token: {token}")
44-45: LGTM!

The get_user_manager function correctly yields the UserManager instance.

51-52: LGTM!

The get_jwt_strategy function correctly returns the JWT strategy.

66-70: LGTM!

The get_user_permissions function correctly retrieves and returns the user's permissions.

72-78: LGTM!

The has_permission function correctly checks the user's permissions and raises an HTTP exception if the user lacks the required permission.

81-82: LGTM!

The hash_password function correctly hashes the password using SHA-256.

115-123: LGTM!

The reset_user_password_method function correctly resets the user's password and updates the database.

coderabbitai · 2024-07-27T08:12:40Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase
+from sqlalchemy import create_engine, UUID
+from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
+from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session


Clean up redundant and redefined imports.

The imports DeclarativeBase, sessionmaker, and Session are redefined. Additionally, there are redundant imports that can be removed to improve readability and maintainability.

- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session - from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship + from sqlalchemy.orm import relationship

Also applies to: 15-15

coderabbitai · 2024-07-27T08:12:40Z

cognee/infrastructure/databases/relational/user_authentication/authentication_db.py

+user_group = Table('user_group', Base.metadata,
+                   Column('user_id', UUID, ForeignKey('users.id')),
+                   Column('group_id', UUID, ForeignKey('groups.id')))
+
+# Association table for many-to-many relationship between groups and permissions
+group_permission = Table('group_permission', Base.metadata,
+                         Column('group_id', UUID, ForeignKey('groups.id')),
+                         Column('permission_id', UUID, ForeignKey('permissions.id')))


Consider adding explicit primary keys for association tables.

While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.

- user_group = Table('user_group', Base.metadata, - Column('user_id', UUID, ForeignKey('users.id')), - Column('group_id', UUID, ForeignKey('groups.id'))) + user_group = Table('user_group', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('user_id', UUID, ForeignKey('users.id')), + Column('group_id', UUID, ForeignKey('groups.id'))) - group_permission = Table('group_permission', Base.metadata, - Column('group_id', UUID, ForeignKey('groups.id')), - Column('permission_id', UUID, ForeignKey('permissions.id'))) + group_permission = Table('group_permission', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('group_id', UUID, ForeignKey('groups.id')), + Column('permission_id', UUID, ForeignKey('permissions.id')))

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

user_group = Table('user_group', Base.metadata,

Column('user_id', UUID, ForeignKey('users.id')),

Column('group_id', UUID, ForeignKey('groups.id')))

# Association table for many-to-many relationship between groups and permissions

group_permission = Table('group_permission', Base.metadata,

Column('group_id', UUID, ForeignKey('groups.id')),

Column('permission_id', UUID, ForeignKey('permissions.id')))

user_group = Table('user_group', Base.metadata,

Column('id', UUID, primary_key=True, index=True),

Column('user_id', UUID, ForeignKey('users.id')),

Column('group_id', UUID, ForeignKey('groups.id')))

# Association table for many-to-many relationship between groups and permissions

group_permission = Table('group_permission', Base.metadata,

Column('id', UUID, primary_key=True, index=True),

Column('group_id', UUID, ForeignKey('groups.id')),

Column('permission_id', UUID, ForeignKey('permissions.id')))

coderabbitai · 2024-07-27T08:12:40Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+async def create_user_method(email: str, password: str, is_superuser: bool = False):
+    try:
+        async with get_async_session_context() as session:
+            async with get_user_db_context(session) as user_db:
+                async with get_user_manager_context(user_db) as user_manager:
+                    user = await user_manager.create(
+                        UserCreate(email=email, password=password, is_superuser=is_superuser)
+                    )
+                    print(f"User created: {user.email}")
+    except UserAlreadyExists:
+        print(f"User {email} already exists")


Replace print statements with proper logging.

Using print statements for logging is not ideal for production code. Consider using the logging module.

import logging logger = logging.getLogger(__name__) async def create_user_method(email: str, password: str, is_superuser: bool = False): try: async with get_async_session_context() as session: async with get_user_db_context(session) as user_db: async with get_user_manager_context(user_db) as user_manager: user = await user_manager.create( UserCreate(email=email, password=password, is_superuser=is_superuser) ) logger.info(f"User created: {user.email}") except UserAlreadyExists: logger.warning(f"User {email} already exists")

coderabbitai · 2024-07-27T08:12:41Z

cognee/api/v1/cognify/cognify_v2.py

@@ -60,7 +80,7 @@ async def run_cognify_pipeline(dataset_name: str, files: list[dict]):
                root_node_id = "ROOT"

            tasks = [
-                Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
+                Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type


Undefined variables hashed_user_id and user_permissions.

The variables hashed_user_id and user_permissions are used but not defined in the provided code. Ensure that these variables are correctly defined and passed to the Task instances.

- Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type + Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Tools

Ruff

83-83: Undefined name hashed_user_id

(F821)

83-83: Undefined name user_permissions

(F821)

coderabbitai

Actionable comments posted: 8

Outside diff range, codebase verification and nitpick comments (3)

cognee/api/v1/add/add.py (2)
89-89: The function call to data_resources does not match the new signature.

File: cognee/api/v1/add/add.py

Lines: 89-89

Issue: The function call to data_resources does not include the user_id parameter, which is now required by the updated function signature.

Analysis chain

LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to data_resources match the new signature.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify all function calls to `data_resources` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'data_resources'
Length of output: 69

Script:
#!/bin/bash
# Description: Verify all function calls to `data_resources` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'data_resources'
Length of output: 920
55-55: Update the function call to match the new signature.

The function call to add_files in cognee/api/v1/add/add.py does not include the user_id parameter. Update this call to match the new function signature.

cognee/api/v1/add/add.py: Line with return await add_files(file_paths, dataset_name)

Analysis chain

LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to add_files match the new signature.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify all function calls to `add_files` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'add_files'
Length of output: 64

Script:
#!/bin/bash
# Description: Verify all function calls to `add_files` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'add_files'
Length of output: 573
cognee/api/v1/cognify/cognify_v2.py (1)
43-43: Function calls to cognify need updating to match the new signature.

The cognify function has been updated to include a user_id parameter, but several calls to this function in the codebase do not include this new parameter. Please update these calls to match the new function signature.

cognee/tests/test_weaviate.py

cognee/tests/test_qdrant.py

cognee/tests/test_neo4j.py

cognee/tests/test_library.py

Analysis chain

LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to cognify match the new signature.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify all function calls to `cognify` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'cognify'
Length of output: 62

Script:
#!/bin/bash
# Description: Verify all function calls to `cognify` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'cognify'
Length of output: 34467

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 218d322 and b4d1a73.

Files selected for processing (3)

cognee/api/v1/add/add.py (3 hunks)
cognee/api/v1/cognify/cognify_v2.py (3 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)

Additional context used

Ruff

cognee/api/v1/cognify/cognify_v2.py

61-61: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

89-89: Undefined name hashed_user_id

(F821)

89-89: Undefined name user_permissions

(F821)

cognee/infrastructure/databases/relational/user_authentication/users.py

14-14: Redefinition of unused Depends from line 5

Remove definition: Depends

(F811)

20-20: Redefinition of unused JWTStrategy from line 10

Remove definition: JWTStrategy

(F811)

151-151: Local variable user_manager is assigned to but never used

Remove assignment to unused variable user_manager

(F841)

160-160: Do not use bare except

(E722)

176-176: Do not use bare except

(E722)

coderabbitai · 2024-07-27T10:48:44Z

cognee/api/v1/cognify/cognify_v2.py

+
+            async with get_async_session_context() as session:
+
+                out = await has_permission_document(user_id, file["id"], "write", session)


Remove assignment to unused variable out.

The variable out is assigned but never used.

- out = await has_permission_document(user_id, file["id"], "write", session)

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

out = await has_permission_document(user_id, file["id"], "write", session)

await has_permission_document(user_id, file["id"], "write", session)

Tools

Ruff

61-61: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

coderabbitai · 2024-07-27T10:48:45Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+async def give_permission_document(user: Optional[User], document_id: str, permission: str,
+                                   session: AsyncSession):
+
+    acl_entry = ACL(
+        document_id=document_id,
+        user_id=user.id,
+        permission=permission
+    )
+    session.add(acl_entry)
+    await session.commit()
+
+
+    if user.is_superuser:
+        permission = 'all_permissions'  # Example permission, change as needed
+    acl_entry = ACL(
+        document_id=document_id,
+        user_id=user.id,
+        permission=permission
+    )
+    session.add(acl_entry)
+    await session.commit()


Simplify code for superuser permissions.

The code for superuser permissions is redundant and can be simplified.

async def give_permission_document(user: Optional[User], document_id: str, permission: str, session: AsyncSession): acl_entry = ACL( document_id=document_id, user_id=user.id, permission=permission ) session.add(acl_entry) await session.commit() if user.is_superuser: acl_entry = ACL( document_id=document_id, user_id=user.id, permission='all_permissions' # Example permission, change as needed ) session.add(acl_entry) await session.commit()

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def give_permission_document(user: Optional[User], document_id: str, permission: str,

session: AsyncSession):

acl_entry = ACL(

document_id=document_id,

user_id=user.id,

permission=permission

)

session.add(acl_entry)

await session.commit()

if user.is_superuser:

permission = 'all_permissions' # Example permission, change as needed

acl_entry = ACL(

document_id=document_id,

user_id=user.id,

permission=permission

)

session.add(acl_entry)

await session.commit()

async def give_permission_document(user: Optional[User], document_id: str, permission: str,

session: AsyncSession):

acl_entry = ACL(

document_id=document_id,

user_id=user.id,

permission=permission

)

session.add(acl_entry)

await session.commit()

if user.is_superuser:

acl_entry = ACL(

document_id=document_id,

user_id=user.id,

permission='all_permissions' # Example permission, change as needed

)

session.add(acl_entry)

await session.commit()

coderabbitai

Actionable comments posted: 13

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b4d1a73 and 7930586.

Files selected for processing (2)

cognee/api/v1/cognify/cognify_v2.py (3 hunks)
cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)

Additional context used

Ruff

cognee/api/v1/cognify/cognify_v2.py

62-62: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

90-90: Undefined name hashed_user_id

(F821)

90-90: Undefined name user_permissions

(F821)

cognee/infrastructure/databases/relational/user_authentication/users.py

14-14: Redefinition of unused Depends from line 5

Remove definition: Depends

(F811)

20-20: Redefinition of unused JWTStrategy from line 10

Remove definition: JWTStrategy

(F811)

154-154: Local variable user_manager is assigned to but never used

Remove assignment to unused variable user_manager

(F841)

163-163: Do not use bare except

(E722)

179-179: Do not use bare except

(E722)

Additional comments not posted (11)

cognee/api/v1/cognify/cognify_v2.py (7)

4-13: Imports look good.

The new imports are necessary for the added user management functionalities.

38-41: Exception class implementation looks good.

The PermissionDeniedException class is correctly implemented.

42-43: Function signature update looks good.

The cognify function now includes a user_id parameter, which is necessary for user permission checks.

Line range hint 44-48:
Initialization segment looks good.

The segment initializes configurations and creates a task status table, which is necessary for the function.

Line range hint 49-53:
Dataset handling segment looks good.

The segment handles the datasets parameter and calls cognify recursively if datasets are not provided.

Tools

Ruff

62-62: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

90-90: Undefined name hashed_user_id

(F821)

90-90: Undefined name user_permissions

(F821)

65-73: Task status update segment looks good.

The segment handles task status updates using a lock, which is necessary for synchronization.

Line range hint 127-137:
Existing datasets handling segment looks good.

The segment handles existing datasets and calls run_cognify_pipeline.

Tools

Ruff

62-62: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

90-90: Undefined name hashed_user_id

(F821)

90-90: Undefined name user_permissions

(F821)

cognee/infrastructure/databases/relational/user_authentication/users.py (4)

45-46: Function implementation looks good.

The get_user_manager function is correctly implemented.

52-53: Function implementation looks good.

The get_jwt_strategy function is correctly implemented.

70-74: Function implementation looks good.

The get_user_permissions function is correctly implemented.

76-82: Function implementation looks good.

The has_permission function is correctly implemented.

coderabbitai · 2024-07-27T10:52:26Z

cognee/api/v1/cognify/cognify_v2.py

+
+        for file in files:
+            file["id"] = str(uuid.uuid4())
+            file["name"] = file["name"].replace(" ", "_")
+
+            async with get_async_session_context() as session:
+                active_user = await fast_api_users_init()
+
+                out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)


Remove assignment to unused variable out.

The variable out is assigned but never used.

- out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session) + await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for file in files:

file["id"] = str(uuid.uuid4())

file["name"] = file["name"].replace(" ", "_")

async with get_async_session_context() as session:

active_user = await fast_api_users_init()

out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)

for file in files:

file["id"] = str(uuid.uuid4())

file["name"] = file["name"].replace(" ", "_")

async with get_async_session_context() as session:

active_user = await fast_api_users_init()

await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)

Tools

Ruff

62-62: Local variable out is assigned to but never used

Remove assignment to unused variable out

(F841)

coderabbitai · 2024-07-27T10:52:26Z

cognee/api/v1/cognify/cognify_v2.py

+                    cognee_config = get_cognify_config()
+                    graph_config = get_graph_config()
+                    root_node_id = None
+
+                    if graph_config.infer_graph_topology and graph_config.graph_topology_task:
+                        from cognee.modules.topology.topology import TopologyEngine
+                        topology_engine = TopologyEngine(infer=graph_config.infer_graph_topology)
+                        root_node_id = await topology_engine.add_graph_topology(files = files)
+                    elif graph_config.infer_graph_topology and not graph_config.infer_graph_topology:
+                        from cognee.modules.topology.topology import TopologyEngine
+                        topology_engine = TopologyEngine(infer=graph_config.infer_graph_topology)
+                        await topology_engine.add_graph_topology(graph_config.topology_file_path)
+                    elif not graph_config.graph_topology_task:
+                        root_node_id = "ROOT"
+
+                    tasks = [
+                        Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
+                        Task(establish_graph_topology, topology_model = KnowledgeGraph), # Set the graph topology for the document chunk data
+                        Task(expand_knowledge_graph, graph_model = KnowledgeGraph), # Generate knowledge graphs from the document chunks and attach it to chunk nodes
+                        Task(filter_affected_chunks, collection_name = "chunks"), # Find all affected chunks, so we don't process unchanged chunks
+                        Task(
+                            save_data_chunks,
+                            collection_name = "chunks",
+                        ), # Save the document chunks in vector db and as nodes in graph db (connected to the document node and between each other)
+                        run_tasks_parallel([
+                            Task(
+                                summarize_text_chunks,
+                                summarization_model = cognee_config.summarization_model,
+                                collection_name = "chunk_summaries",
+                            ), # Summarize the document chunks
+                            Task(
+                                classify_text_chunks,
+                                classification_model = cognee_config.classification_model,
+                            ),
+                        ]),
+                        Task(remove_obsolete_chunks), # Remove the obsolete document chunks.
+                    ]
+
+                    pipeline = run_tasks(tasks, [
+                        PdfDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "pdf" else
+                        AudioDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "audio" else
+                        ImageDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "image" else
+                        TextDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"])
+                        for file in files
+                    ])
+
+                    async for result in pipeline:
+                        print(result)
+
+                    update_task_status(dataset_name, "DATASET_PROCESSING_FINISHED")
+                except Exception as error:
+                    update_task_status(dataset_name, "DATASET_PROCESSING_ERROR")
+                    raise error


Improve error handling in the task execution segment.

Avoid using bare except statements to catch specific exceptions.

- except Exception as error: + except Exception as e: + logger.error(f"Error during task execution: {e}")

Committable suggestion was skipped due to low confidence.

Tools

Ruff

90-90: Undefined name hashed_user_id

(F821)

90-90: Undefined name user_permissions

(F821)

coderabbitai · 2024-07-27T10:52:26Z

cognee/api/v1/cognify/cognify_v2.py

+                        root_node_id = "ROOT"
+
+                    tasks = [
+                        Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type


Undefined variables hashed_user_id and user_permissions.

The variables hashed_user_id and user_permissions are used but not defined in the provided code. Ensure that these variables are correctly defined and passed to the Task instances.

- Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type + Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type

Tools

Ruff

90-90: Undefined name hashed_user_id

(F821)

90-90: Undefined name user_permissions

(F821)

coderabbitai · 2024-07-27T10:52:26Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, get_user_db, \
+    get_async_session, ACL
+from fastapi.security import OAuth2PasswordRequestForm
+from fastapi_users.authentication import JWTStrategy


Remove redundant import.

The JWTStrategy import on line 20 is redundant and should be removed.

- from fastapi_users.authentication import JWTStrategy

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from fastapi_users.authentication import JWTStrategy

Tools

Ruff

20-20: Redefinition of unused JWTStrategy from line 10

Remove definition: JWTStrategy

(F811)

coderabbitai · 2024-07-27T10:52:26Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+                    print(f"User created: {user.email}")
+    except UserAlreadyExists:
+        print(f"User {email} already exists")


Replace print statements with proper logging.

Using print statements for logging is not ideal for production code. Consider using the logging module.

- print(f"User {email} already exists") + logger.warning(f"User {email} already exists")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

print(f"User created: {user.email}")

except UserAlreadyExists:

print(f"User {email} already exists")

print(f"User created: {user.email}")

except UserAlreadyExists:

logger.warning(f"User {email} already exists")

coderabbitai · 2024-07-27T10:52:26Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+    async def on_after_request_verify(
+        self, user: User, token: str, request: Optional[Request] = None
+    ):
+        print(f"Verification requested for user {user.id}. Verification token: {token}")


Replace print statements with proper logging.

Using print statements for logging is not ideal for production code. Consider using the logging module.

- print(f"Verification requested for user {user.id}. Verification token: {token}") + logger.info(f"Verification requested for user {user.id}. Verification token: {token}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

print(f"Verification requested for user {user.id}. Verification token: {token}")

logger.info(f"Verification requested for user {user.id}. Verification token: {token}")

coderabbitai · 2024-07-27T10:52:27Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+                        return False
+                    else:
+                        return True
+    except:


Avoid bare except statements.

Avoid using bare except statements to catch specific exceptions.

- except: + except Exception as e: + logger.error(f"Error checking token: {e}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

except:

except Exception as e:

logger.error(f"Error checking token: {e}")

Tools

Ruff

179-179: Do not use bare except

(E722)

coderabbitai · 2024-07-27T10:52:27Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+                        return token
+                    else:
+                        return None
+    except:


Avoid bare except statements.

Avoid using bare except statements to catch specific exceptions.

- except: + except Exception as e: + logger.error(f"Error creating token: {e}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

except:

except Exception as e:

logger.error(f"Error creating token: {e}")

Tools

Ruff

163-163: Do not use bare except

(E722)

coderabbitai · 2024-07-27T10:52:27Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+)
+from fastapi_users.exceptions import UserAlreadyExists
+from fastapi_users.db import SQLAlchemyUserDatabase
+from fastapi import Depends, HTTPException, status


Remove redundant import.

The Depends import on line 14 is redundant and should be removed.

- from fastapi import Depends, HTTPException, status + from fastapi import HTTPException, status

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from fastapi import Depends, HTTPException, status

from fastapi import HTTPException, status

Tools

Ruff

14-14: Redefinition of unused Depends from line 5

Remove definition: Depends

(F811)

coderabbitai · 2024-07-27T10:52:27Z

cognee/infrastructure/databases/relational/user_authentication/users.py

+                        return None
+                    return user
+    except Exception as e:
+        print(f"Error during authentication: {e}")


Replace print statements with proper logging and avoid bare except statements.

Using print statements for logging is not ideal for production code. Consider using the logging module. Additionally, avoid using bare except statements to catch specific exceptions.

- print(f"Error during authentication: {e}") + logger.error(f"Error during authentication: {e}")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

print(f"Error during authentication: {e}")

logger.error(f"Error during authentication: {e}")

coderabbitai

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 709a10c and 0519986.

Files selected for processing (1)

.github/workflows/test_neo4j.yml (3 hunks)

Additional comments not posted (3)

.github/workflows/test_neo4j.yml (3)

31-42: LGTM! PostgreSQL service configuration is correctly added.

The services section is correctly configured to add a PostgreSQL service for the CI workflow.

59-65: LGTM! Waiting for PostgreSQL to be ready is a good practice.

The step ensures that PostgreSQL is ready before running tests, which is crucial for avoiding connection issues.

82-86: LGTM! PostgreSQL environment variables are correctly added.

The environment variables are necessary for the tests to connect to the PostgreSQL instance.

coderabbitai

Actionable comments posted: 15

Outside diff range, codebase verification and nitpick comments (3)

cognee/modules/search/graph/search_similarity.py (1)
Line range hint 3-20:
Correct the return type hint and update the docstring.

The return type hint should be list[dict] instead of list[str, str]. Additionally, update the docstring to reflect the changes.
- async def search_similarity(query: str) -> list[str, str]:
+ async def search_similarity(query: str) -> list[dict]:
    """
    Parameters:
    - query (str): The query string to filter nodes by.

    Returns:
-   - list(chunk): A list of objects providing information about the chunks related to query.
+   - list[dict]: A list of dictionaries providing information about the chunks related to query.
    """
cognee/api/v1/cognify/cognify_v2.py (1)
40-50: Update docstring to reflect the new function signature.

The function signature now includes a user parameter. Update the docstring to reflect this change.
"""
Process datasets and manage user permissions.

Args:
    datasets (Union[str, list[str]], optional): The datasets to process.
    user (User, optional): The user performing the operation.
"""
Tools

Ruff

46-46: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)
cognee/api/v1/cognify/cognify.py (1)
Line range hint 61-85:
Improve error handling in the task execution segment.

Avoid using bare except statements to catch specific exceptions.
-        except Exception as error:
+        except (SQLAlchemyError, ValueError) as error:

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0519986 and 4749995.

Files selected for processing (27)

cognee-frontend/src/app/page.tsx (1 hunks)
cognee-frontend/src/modules/datasets/cognifyDataset.ts (1 hunks)
cognee-frontend/src/modules/ingestion/DataView/DataView.module.css (1 hunks)
cognee-frontend/src/modules/ingestion/DataView/DataView.tsx (4 hunks)
cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (2 hunks)
cognee-frontend/src/modules/ingestion/useDatasets.ts (2 hunks)
cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2 hunks)
cognee/api/client.py (8 hunks)
cognee/api/v1/add/add.py (5 hunks)
cognee/api/v1/cognify/cognify.py (5 hunks)
cognee/api/v1/cognify/cognify_v2.py (4 hunks)
cognee/api/v1/datasets/datasets.py (1 hunks)
cognee/api/v1/search/search.py (4 hunks)
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
cognee/modules/data/extraction/data_summary/summarize_text_chunks.py (1 hunks)
cognee/modules/data/models/Data.py (1 hunks)
cognee/modules/data/models/Dataset.py (1 hunks)
cognee/modules/data/operations/get_dataset_data.py (1 hunks)
cognee/modules/data/operations/retrieve_datasets.py (1 hunks)
cognee/modules/pipelines/models/PipelineRun.py (1 hunks)
cognee/modules/pipelines/operations/get_pipeline_status.py (1 hunks)
cognee/modules/pipelines/operations/log_pipeline_status.py (1 hunks)
cognee/modules/search/graph/search_similarity.py (1 hunks)
cognee/modules/topology/topology.py (2 hunks)
cognee/modules/users/models/Permission.py (1 hunks)
cognee/modules/users/permissions/methods/check_permissions_on_documents.py (1 hunks)
cognee/modules/users/permissions/methods/get_document_ids_for_user.py (1 hunks)

Files skipped from review due to trivial changes (2)

cognee-frontend/src/modules/ingestion/DataView/DataView.module.css
cognee-frontend/src/modules/ingestion/useDatasets.ts

Files skipped from review as they are similar to previous changes (10)

cognee-frontend/src/modules/datasets/cognifyDataset.ts
cognee-frontend/src/modules/ingestion/DataView/DataView.tsx
cognee/api/v1/datasets/datasets.py
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
cognee/modules/pipelines/models/PipelineRun.py
cognee/modules/pipelines/operations/get_pipeline_status.py
cognee/modules/pipelines/operations/log_pipeline_status.py
cognee/modules/topology/topology.py
cognee/modules/users/permissions/methods/check_permissions_on_documents.py
cognee/modules/users/permissions/methods/get_document_ids_for_user.py

Additional context used

Ruff

cognee/modules/data/models/Dataset.py

19-19: Undefined name Data

(F821)

cognee/modules/data/models/Data.py

22-22: Undefined name Dataset

(F821)

cognee/api/v1/cognify/cognify_v2.py

46-46: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

56-57: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

Additional comments not posted (51)

cognee/modules/data/operations/retrieve_datasets.py (1)

1-3: Imports look good.

The necessary modules for database operations are correctly imported.

cognee/modules/users/models/Permission.py (1)

1-5: Imports look good.

The necessary modules for defining the SQLAlchemy model are correctly imported.

cognee/modules/search/graph/search_similarity.py (1)

22-27: Function looks good.

The parse_payload function is well-structured and correctly extracts the required fields from the payload.

cognee/modules/data/operations/get_dataset_data.py (3)

1-5: Import statements look good.

The necessary modules are imported, including UUID, SQLAlchemy components, and project-specific modules.

6-8: Good error handling for missing parameters.

The function correctly raises a ValueError if neither dataset_id nor dataset_name is provided.

10-11: Efficient database engine retrieval.

The function retrieves the database engine efficiently using get_relational_engine().
cognee/modules/data/models/Dataset.py (5)
9-11: Correct table name definition.

The __tablename__ attribute correctly defines the table name as "datasets".

12-17: Proper column definitions.

The columns are defined correctly, including UUID primary key and datetime fields with default values.

19-22: Fix undefined name Data.

The type hint for data uses Data, which is not defined in this file. This will cause a runtime error.
- data: Mapped[List["Data"]] = relationship(
+ from .Data import Data
+ data: Mapped[List[Data]] = relationship(
Tools

Ruff

19-19: Undefined name Data

(F821)

24-31: Good JSON serialization method.

The to_json method correctly serializes the Dataset object to a JSON-compatible dictionary.

1-8: Import statements look good, but missing import for Data.

The necessary modules are imported, including UUID, typing, datetime, and SQLAlchemy components. However, the Data model is referenced but not imported.
+ from .Data import Data
Likely invalid or redundant comment.
cognee/modules/data/extraction/data_summary/summarize_text_chunks.py (5)

Line range hint 1-10:
Import statements look good.

The necessary modules are imported, including asyncio, typing, pydantic, and project-specific modules.

Line range hint 12-13:
Good initial check for empty data chunks.

The function correctly returns early if the data_chunks list is empty.

Line range hint 15-17:
Efficient use of asyncio.gather for concurrent processing.

The function uses asyncio.gather to process text chunks concurrently, which is efficient.

Line range hint 19-21:
Efficient vector engine retrieval and collection creation.

The function retrieves the vector engine and creates a collection efficiently.

Line range hint 23-32:
Good addition of document_id to payload.

The addition of document_id to the payload enhances the data structure by including the document identifier, which is crucial for tracking or referencing the source document.
cognee/modules/data/models/Data.py (4)
1-7: Imports look good.

The imports are appropriate and necessary for the functionality of the file.

9-10: Class declaration looks good.

The class declaration and table name definition are correct.

12-20: Column definitions look good.

The columns are appropriately defined with correct data types and default values.

27-37: Address the commented-out datasets field.

The to_json method is well-defined, but the datasets field is commented out. Ensure it is either included or removed based on the requirements.
-            # "datasets": [dataset.to_json() for dataset in self.datasets]
+            "datasets": [dataset.to_json() for dataset in self.datasets]
cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2)

10-10: Review the change of text property to any.

Changing the text property to any can introduce potential type safety issues. Ensure that this change is necessary and justified.

101-103: Rendering logic looks good.

The conditional rendering logic ensures that message.text is displayed correctly based on its type, enhancing the component's robustness.

cognee/api/v1/search/search.py (4)

Line range hint 1-15:
Imports look good.

The imports are appropriate and necessary for the functionality of the file.

Line range hint 17-39:
Class declarations look good.

The class declarations are well-defined and appropriate for the file's functionality.

45-63: Function changes look good.

The enhancements to the search function improve functionality by integrating user context into the search process.

Line range hint 65-89:
Function looks good.

The specific_search function is well-defined and leverages asyncio for concurrent execution of search tasks.

cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (2)

43-43: Update state variable to include name property.

The dataset state variable has been expanded to include both id and name properties. This change is necessary to support the updated rendering logic.

100-100: Render dataset name instead of id.

The rendering logic within the modal has been updated to display the name of the dataset instead of its id. This improves the user interface by providing more meaningful information.

cognee-frontend/src/app/page.tsx (4)

51-51: Update function parameter to include name property.

The onDatasetCognify function's parameter type has been updated to include both id and name properties. This change is necessary to support the updated notification messages.

52-52: Update notification message to use dataset name.

The notification message has been updated to use the name property of the dataset instead of the id. This improves the clarity of the notifications.

56-56: Update success notification message to use dataset name.

The success notification message has been updated to use the name property of the dataset instead of the id. This improves the clarity of the notifications.

59-59: Update failure notification message to use dataset name.

The failure notification message has been updated to use the name property of the dataset instead of the id. This improves the clarity of the notifications.

cognee/api/v1/add/add.py (6)

18-18: Introduce user parameter in add function.

The user parameter allows for user-specific operations during data processing. This change enhances the functionality by enabling permission checks and assignments.

54-54: Pass user parameter to add_files function.

The user parameter is passed to the add_files function to ensure user-specific operations are handled during file processing.

58-58: Introduce user parameter in add_files function.

The user parameter allows for user-specific operations during file processing. This change enhances the functionality by enabling permission checks and assignments.

107-107: Introduce user parameter in data_resources function.

The user parameter allows for user-specific operations during data processing. This change enhances the functionality by enabling permission checks and assignments.

153-154: Assign read and write permissions to the user.

The give_permission_on_document function is used to assign read and write permissions to the user for the processed data.

157-157: Get or create default user if none is provided.

If no user is provided, the function retrieves or creates a default user. This ensures that user-specific operations can still be performed.
cognee/api/v1/cognify/cognify_v2.py (6)
25-29: New imports for user management.

The imports for user management are correctly added. Ensure that these modules are available and correctly implemented.

35-39: LGTM! The PermissionDeniedException class is correctly defined.

The exception class is well-defined and follows standard practices.

49-50: Ensure default user retrieval is correct.

The default user is retrieved if no user is provided. Ensure that the get_default_user function is correctly implemented and handles any potential errors.

66-70: Ensure permissions are correctly checked.

The check_permissions_on_documents function is used to check user permissions. Ensure that this function is correctly implemented and handles any potential errors.

154-155: LGTM! The generate_dataset_name function is correctly implemented.

The function follows best practices and is correctly implemented.

7-7: Update import statement for get_relational_engine.

The import statement has been updated to import get_relational_engine. Ensure that this change aligns with the new database interaction approach.

Verification successful

Update import statement for get_relational_engine.

The import statement has been updated to import get_relational_engine. This change aligns with the new database interaction approach and is correctly used across the codebase.

cognee/api/v1/cognify/cognify_v2.py

cognee/modules/users/methods/get_default_user.py

cognee/modules/users/permissions/methods/give_permission_on_document.py

cognee/modules/pipelines/operations/get_pipeline_status.py

cognee/modules/users/permissions/methods/get_document_ids_for_user.py

cognee/modules/users/permissions/methods/check_permissions_on_documents.py

cognee/modules/pipelines/operations/log_pipeline_status.py

cognee/modules/data/deletion/prune_system.py

cognee/modules/data/operations/ensure_dataset_exists.py

cognee/modules/data/operations/retrieve_datasets.py

cognee/modules/users/methods/create_user.py

cognee/modules/data/operations/get_dataset_data.py

cognee/infrastructure/databases/relational/get_relational_engine.py

cognee/infrastructure/databases/relational/create_db_and_tables.py

cognee/infrastructure/databases/relational/__init__.py

cognee/api/v1/add/add.py

cognee/api/v1/datasets/datasets.py

cognee/api/v1/cognify/cognify.py
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of `get_relational_engine` in the codebase.

# Test: Search for the function usage. Expect: Only occurrences of the new function.
rg --type python -A 5 $'get_relational_engine'
Length of output: 76

Script:
#!/bin/bash
# Description: Verify the usage of `get_relational_engine` in the codebase.

# Test: Search for the function usage. Expect: Only occurrences of the new function.
rg -t py -A 5 'get_relational_engine'
Length of output: 18774
cognee/api/client.py (1)

291-292: LGTM! The get_settings function is correctly implemented.

The function follows best practices and is correctly implemented.
cognee/api/v1/cognify/cognify.py (6)
49-49: Ensure create_task_status_table is correctly awaited.

The create_task_status_table function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.

56-56: Ensure get_datasets is correctly awaited.

The get_datasets function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.

86-94: Ensure get_dataset_data is correctly awaited.

The get_dataset_data function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.

Line range hint 96-138:
Ensure file processing logic is correct.

The function process_batch processes batches of files. Ensure that the file processing logic is correctly implemented and handles any potential errors.

Line range hint 170-229:
Ensure text processing logic is correct.

The function process_text processes text chunks. Ensure that the text processing logic is correctly implemented and handles any potential errors.

33-36: Update import statement for get_relational_engine.

The import statement has been updated to import get_relational_engine. Ensure that this change aligns with the new database interaction approach.

Verification successful

Update import statement for get_relational_engine.

The import statement for get_relational_engine is correctly updated and aligns with the new database interaction approach. The function is widely used across the codebase, indicating its necessity and proper integration.

cognee/api/v1/cognify/cognify.py: The import and usage of get_relational_engine are consistent with the rest of the codebase.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of `get_relational_engine` in the codebase.

# Test: Search for the function usage. Expect: Only occurrences of the new function.
rg --type python -A 5 $'get_relational_engine'
Length of output: 76

Script:
#!/bin/bash
# Description: Verify the usage of `get_relational_engine` in the codebase.

# Test: Search for the function usage. Expect: Only occurrences of the new function.
rg -A 5 'get_relational_engine'
Length of output: 18768

coderabbitai · 2024-08-06T13:35:06Z

cognee/modules/data/models/Data.py

+    datasets: Mapped[List["Dataset"]] = relationship(
+        secondary = DatasetData.__tablename__,
+        back_populates = "data"
+    )


Fix undefined name Dataset.

The datasets relationship references Dataset, which is not defined or imported. Change it to DatasetData to align with the import.

- datasets: Mapped[List["Dataset"]] = relationship( + datasets: Mapped[List["DatasetData"]] = relationship(

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

datasets: Mapped[List["Dataset"]] = relationship(

secondary = DatasetData.__tablename__,

back_populates = "data"

)

datasets: Mapped[List["DatasetData"]] = relationship(

secondary = DatasetData.__tablename__,

back_populates = "data"

)

Tools

Ruff

22-22: Undefined name Dataset

(F821)

coderabbitai · 2024-08-06T13:35:07Z

cognee/modules/data/operations/retrieve_datasets.py

+async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]:
+    db_engine = get_relational_engine()
+
+    async with db_engine.get_async_session() as session:
+        datasets = (await session.scalars(
+            select(Dataset).filter(Dataset.name.in_(dataset_names))
+        )).all()
+
+        return datasets


Consider adding error handling.

The function lacks error handling, which is crucial for robust database operations. Consider adding try-except blocks to handle potential exceptions.

from sqlalchemy.exc import SQLAlchemyError async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]: db_engine = get_relational_engine() try: async with db_engine.get_async_session() as session: datasets = (await session.scalars( select(Dataset).filter(Dataset.name.in_(dataset_names)) )).all() return datasets except SQLAlchemyError as e: # Handle the exception (e.g., log it and return an empty list or re-raise) print(f"Database error: {e}") return []

coderabbitai · 2024-08-06T13:35:07Z

cognee/modules/data/operations/retrieve_datasets.py

+from cognee.infrastructure.databases.relational import get_relational_engine
+from ..models import Dataset
+
+async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]:


Correct the type hint for the parameter.

Use List[str] instead of list[str] for consistency with Python type hinting conventions.

- async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]: + async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]:

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]:

from typing import List

async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]:

coderabbitai · 2024-08-06T13:35:07Z

cognee/modules/users/models/Permission.py

+class Permission(Base):
+    __tablename__ = "permissions"
+
+    id = Column(UUID, primary_key = True, index = True, default = uuid4)


Correct the default value for id.

The default value for id should be a callable (i.e., uuid4 without parentheses).

- id = Column(UUID, primary_key = True, index = True, default = uuid4) + id = Column(UUID, primary_key = True, index = True, default = uuid4)

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-08-06T13:35:07Z

cognee/modules/users/models/Permission.py

+    created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc))
+    updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc))


Use func.now() and func.current_timestamp() for timestamps.

Replace the lambda functions for created_at and updated_at with func.now() and func.current_timestamp() from SQLAlchemy for better readability and consistency.

- created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc)) - updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc)) + from sqlalchemy.sql import func + created_at = Column(DateTime(timezone = True), default = func.now()) + updated_at = Column(DateTime(timezone = True), onupdate = func.current_timestamp())

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc))

updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc))

from sqlalchemy.sql import func

created_at = Column(DateTime(timezone = True), default = func.now())

updated_at = Column(DateTime(timezone = True), onupdate = func.current_timestamp())

coderabbitai · 2024-08-06T13:35:08Z

cognee/api/client.py

+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    # Not needed if you setup a migration system like Alembic
+    await create_db_and_tables()
+    yield
+


Remove duplicate lifespan context manager.

The lifespan context manager is defined twice, which can cause conflicts.

- @asynccontextmanager - async def lifespan(app: FastAPI): - # Not needed if you setup a migration system like Alembic - await create_db_and_tables() - yield

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@asynccontextmanager

async def lifespan(app: FastAPI):

# Not needed if you setup a migration system like Alembic

await create_db_and_tables()

yield

coderabbitai · 2024-08-06T13:35:08Z

cognee/api/client.py

+from contextlib import asynccontextmanager
+


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager

Committable suggestion was skipped due to low confidence.

Tools

Ruff

31-31: Module level import not at top of file

(E402)

coderabbitai · 2024-08-06T13:35:08Z

cognee/api/client.py

+from cognee.api.v1.users.routers import get_auth_router, get_register_router,\
+    get_reset_password_router, get_verify_router, get_users_router
+
+from cognee.api.v1.permissions.get_permissions_router import get_permissions_router


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from cognee.api.v1.permissions.get_permissions_router import get_permissions_router

Committable suggestion was skipped due to low confidence.

Tools

Ruff

59-59: Module level import not at top of file

(E402)

coderabbitai · 2024-08-06T13:35:09Z

cognee/api/client.py

+from cognee.api.v1.users.routers import get_auth_router, get_register_router,\
+    get_reset_password_router, get_verify_router, get_users_router


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from cognee.api.v1.users.routers import get_auth_router, get_register_router,\ - get_reset_password_router, get_verify_router, get_users_router

Committable suggestion was skipped due to low confidence.

Tools

Ruff

56-57: Module level import not at top of file

(E402)

coderabbitai · 2024-08-06T13:35:09Z

cognee/api/client.py

+@app.get("/datasets", response_model = list)
 async def get_datasets():
-    from cognee.api.v1.datasets.datasets import datasets
-    return datasets.list_datasets()
+    try:
+        from cognee.api.v1.datasets.datasets import datasets
+        datasets = await datasets.list_datasets()
+
+        return JSONResponse(
+            status_code = 200,
+            content = [dataset.to_json() for dataset in datasets],
+        )
+    except Exception as error:
+        raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error


Improve error handling in the get_datasets function.

Avoid using bare except statements to catch specific exceptions.

- except Exception as error: + except (SQLAlchemyError, ValueError) as error:

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@app.get("/datasets", response_model = list)

async def get_datasets():

from cognee.api.v1.datasets.datasets import datasets

return datasets.list_datasets()

try:

from cognee.api.v1.datasets.datasets import datasets

datasets = await datasets.list_datasets()

return JSONResponse(

status_code = 200,

content = [dataset.to_json() for dataset in datasets],

)

except Exception as error:

raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error

@app.get("/datasets", response_model = list)

async def get_datasets():

try:

from cognee.api.v1.datasets.datasets import datasets

datasets = await datasets.list_datasets()

return JSONResponse(

status_code = 200,

content = [dataset.to_json() for dataset in datasets],

)

except (SQLAlchemyError, ValueError) as error:

raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error

coderabbitai

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)

.github/workflows/docker_compose.yml (1)
38-38: Ensure consistent usage of docker compose or docker-compose commands.

There are mixed references to docker compose and docker-compose in the codebase:

docker compose is used in docs/blog/posts/from-demo-to-production-3.md.

docker-compose is used in README.md.

To avoid confusion or potential errors, ensure consistent usage of either docker compose or docker-compose throughout the codebase.

docs/blog/posts/from-demo-to-production-3.md

README.md

Analysis chain

Verify compatibility with docker compose command for down.

Ensure that the new command docker compose -f docker-compose.yml down is compatible with your workflow and does not introduce any issues.

Verify consistency of docker compose command usage.

Ensure that the new command docker compose -f docker-compose.yml down is consistently used and correctly referenced in the codebase.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify compatibility with `docker compose` command for down.

# Test: Check if `docker compose` command is available. Expect: Command found.
docker compose version
Length of output: 98

Script:
#!/bin/bash
# Search for references to `docker compose` and `docker-compose` in the codebase to verify consistency and correctness.
rg "docker compose" -A 2
rg "docker-compose" -A 2
Length of output: 536

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 4749995 and 896a2ce.

Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

Files selected for processing (6)

.github/workflows/docker_compose.yml (1 hunks)
.github/workflows/test_common.yml (2 hunks)
.github/workflows/test_neo4j.yml (3 hunks)
.github/workflows/test_qdrant.yml (1 hunks)
.github/workflows/test_weaviate.yml (1 hunks)
pyproject.toml (3 hunks)

Files skipped from review as they are similar to previous changes (1)

.github/workflows/test_neo4j.yml

Additional comments not posted (18)

.github/workflows/docker_compose.yml (3)

24-24: Verify compatibility with docker compose command for build.

Ensure that the new command docker compose -f docker-compose.yml build is compatible with your workflow and does not introduce any issues.

28-28: Verify compatibility with docker compose command for up.

Ensure that the new command docker compose -f docker-compose.yml up -d is compatible with your workflow and does not introduce any issues.

20-20: Verify compatibility with Docker Buildx v3.

Ensure that the new version docker/setup-buildx-action@v3 is compatible with your workflow and does not introduce any breaking changes.
.github/workflows/test_qdrant.yml (4)
Line range hint 50-72:
Verify the run_qdrant_integration_test job steps configuration.

Ensure that the job steps are correctly configured and align with best practices.

Verification successful

The run_qdrant_integration_test job steps are correctly configured and align with best practices.

The workflow sets up Python, installs dependencies using Poetry, and runs the integration tests with the necessary environment variables.

The recent failures are likely due to issues in the test script or environment, not the workflow configuration.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the `run_qdrant_integration_test` job steps configuration.

# Test: Check the job steps configuration for errors. Expect: No errors.
gh workflow view test_qdrant.yml
Length of output: 804

Script:
#!/bin/bash
# Description: View the YAML configuration of the `test_qdrant.yml` workflow.

# Command to view the YAML configuration of the workflow.
gh workflow view test_qdrant.yml --yaml
Length of output: 1558
38-49: Verify the new PostgreSQL service configuration.

Ensure that the new PostgreSQL service is correctly configured and aligns with best practices.

21-27: Verify the new setup_docker job configuration.

Ensure that the new job setup_docker is correctly configured and aligns with best practices.

32-32: Verify compatibility with ubuntu-latest environment.

Ensure that the new environment ubuntu-latest is compatible with your workflow and does not introduce any issues.

Verification successful

Environment change to ubuntu-latest is not the cause of workflow failures.

The recent failures in the workflow are due to missing configuration fields, specifically the 'username' field in the PostgresCredentials configuration. The change from macos-latest to ubuntu-latest is not the root cause of these failures.

Ensure that all necessary configuration values are provided for the workflow to run successfully.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify compatibility with `ubuntu-latest` environment.

# Test: Check the environment configuration for errors. Expect: No errors.
gh workflow view test_qdrant.yml
Length of output: 804

Script:
#!/bin/bash
# Description: Fetch and analyze logs of recent failed runs for the `test_qdrant.yml` workflow.

# Get the list of recent runs for the workflow
gh run list --workflow test_qdrant.yml --limit 5 --json conclusion,workflowDatabaseId,name,headSha,headBranch,url,status

# Fetch logs for the most recent failed run
gh run view 10267916576 --log
Length of output: 218035
.github/workflows/test_weaviate.yml (4)
Line range hint 50-72:
Verify the run_weaviate_integration_test job steps configuration.

Ensure that the job steps are correctly configured and align with best practices.

Verification successful

The run_weaviate_integration_test job steps are correctly configured and align with best practices.

The job includes steps for checking out the repository, setting up Python, installing Poetry and dependencies, creating a directory, and running the integration tests with the necessary environment variables.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the `run_weaviate_integration_test` job steps configuration.

# Test: Check the job steps configuration for errors. Expect: No errors.
gh workflow view test_weaviate.yml
Length of output: 824

Script:
#!/bin/bash
# Description: Retrieve the YAML content of the `test_weaviate.yml` workflow file.

# Fetch the YAML content of the workflow file to inspect the job steps configuration.
gh workflow view test_weaviate.yml --yaml
Length of output: 1572
32-32: Verify compatibility with ubuntu-latest environment.

Ensure that the new environment ubuntu-latest is compatible with your workflow and does not introduce any issues.

38-49: Verify the new PostgreSQL service configuration.

Ensure that the new PostgreSQL service is correctly configured and aligns with best practices.

21-27: Verify the new setup_docker job configuration.

Ensure that the new job setup_docker is correctly configured and aligns with best practices.
pyproject.toml (5)

29-29: LGTM! Fixed version for sqlalchemy ensures stability.

Changing the version constraint from ^2.0.21 to 2.0.21 helps avoid potential breaking changes from minor updates.

32-32: LGTM! Updated debugpy version ensures compatibility.

Updating from ^1.8.0 to 1.8.2 likely addresses specific bug fixes or required features.

41-41: LGTM! Updated dlt version with PostgreSQL support enhances functionality.

Updating from 0.5.1 to {extras = ["postgres"], version = "^0.5.2"} adds PostgreSQL support, improving the library's capabilities.

72-72: LGTM! Added fastapi-users for user management.

Introducing fastapi-users with sqlalchemy extras adds essential user management capabilities to the project.

73-73: LGTM! Added asyncpg for asynchronous PostgreSQL interactions.

Introducing asyncpg with version ^0.29.0 enhances the project's capability to handle asynchronous PostgreSQL database operations.

.github/workflows/test_common.yml (2)

21-26: LGTM! Added setup_docker job for multi-platform builds.

The new job sets up Docker Buildx using docker/setup-buildx-action@v3, enhancing the workflow's capability to handle multi-platform builds efficiently.

50-60: LGTM! Added postgres service for PostgreSQL database setup.

The new service uses the latest PostgreSQL image, sets up environment variables, and defines a volume for persistent storage, improving the testing environment's fidelity and functionality.

coderabbitai

Actionable comments posted: 2

Outside diff range, codebase verification and nitpick comments (1)

cognee/api/v1/search/search.py (1)
45-47: Verify the function usage in the codebase.

The following instances of the search function calls do not include the user parameter, which is required by the new function signature. Please update these calls to match the new signature:

cognee/tests/test_weaviate.py

cognee/tests/test_qdrant.py

cognee/tests/test_neo4j.py

cognee/tests/test_library.py

cognee/modules/search/vector/search_traverse.py

cognee/modules/search/graph/search_adjacent.py

cognee/modules/search/graph/search_summary.py

cognee/modules/search/graph/search_similarity.py

cognee/modules/cognify/llm/resolve_cross_graph_references.py

cognee/infrastructure/databases/vector/vector_db_interface.py

cognee/infrastructure/databases/vector/falkordb/FalkorDBAdapter.py

cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py

cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py

cognee/infrastructure/databases/vector/weaviate_db/WeaviateAdapter.py

cognee/api/v1/search/search.py

cognee/api/v1/cognify/cognify.py

cognee/api/client.py

Analysis chain

Verify the function usage in the codebase.

Ensure that all function calls to search match the new signature.

Let's correct the regex pattern and re-run the search to verify all function calls to search match the new signature.

Let's correct the pattern format and use double quotes to avoid the regex parse error.

Let's try escaping the parentheses in the pattern to resolve the regex parse error.

Let's use fd to find Python files and then search for the function calls using rg.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify all function calls to `search` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'search('
Length of output: 98

Script:
#!/bin/bash
# Description: Verify all function calls to `search` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 'search('
Length of output: 98

Script:
#!/bin/bash
# Description: Verify all function calls to `search` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 "search("
Length of output: 98

Script:
#!/bin/bash
# Description: Verify all function calls to `search` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 "search\\("
Length of output: 65

Script:
#!/bin/bash
# Description: Verify all function calls to `search` match the new signature.

# Find all Python files and search for the function usage.
fd --extension py | xargs rg -A 5 "search\\("
Length of output: 21025

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 896a2ce and 1c89119.

Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

Files selected for processing (8)

.github/workflows/test_common.yml (4 hunks)
.github/workflows/test_neo4j.yml (3 hunks)
.github/workflows/test_qdrant.yml (2 hunks)
.github/workflows/test_weaviate.yml (2 hunks)
cognee/api/client.py (8 hunks)
cognee/api/v1/search/search.py (4 hunks)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1 hunks)
pyproject.toml (4 hunks)

Files skipped from review due to trivial changes (1)

pyproject.toml

Files skipped from review as they are similar to previous changes (4)

.github/workflows/test_common.yml
.github/workflows/test_neo4j.yml
.github/workflows/test_qdrant.yml
.github/workflows/test_weaviate.yml

Additional context used

Ruff

cognee/api/v1/search/search.py

59-59: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)

cognee/api/client.py

31-31: Module level import not at top of file

(E402)

56-57: Module level import not at top of file

(E402)

59-59: Module level import not at top of file

(E402)

Additional comments not posted (10)

cognee/api/v1/search/search.py (1)

Line range hint 64-90:
LGTM!

The specific_search function is well-implemented and efficient.

cognee/api/client.py (5)

33-38: LGTM!

The lifespan context manager is well-implemented and efficient.

Line range hint 39-54:
LGTM!

The FastAPI application setup and middleware configuration are well-implemented and efficient.

Tools

Ruff

31-31: Module level import not at top of file

(E402)

61-95: LGTM!

The router inclusion for authentication and user management is well-implemented and efficient.

111-122: LGTM!

The get_datasets function is well-implemented and efficient.

291-292: LGTM!

The get_settings function and other endpoints are well-implemented and efficient.

cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (4)

24-25: LGTM!

The addition of max_connection_lifetime to the AsyncGraphDatabase.driver initialization is well-implemented and efficient.

Line range hint 47-57:
LGTM!

The query method is well-implemented and efficient.

Line range hint 81-111:
LGTM!

The add_node and add_nodes methods are well-implemented and efficient.

Line range hint 211-243:
LGTM!

The add_edge and add_edges methods are well-implemented and efficient.

coderabbitai · 2024-08-06T14:42:09Z

cognee/api/v1/search/search.py

+
+    for search_result in search_results:
+        document_id = search_result["document_id"] if "document_id" in search_result else None
+        document_id = UUID(document_id) if type(document_id) == str else document_id


Use isinstance for type comparison.

Replace type(document_id) == str with isinstance(document_id, str) for better readability and performance.

- document_id = UUID(document_id) if type(document_id) == str else document_id + document_id = UUID(document_id) if isinstance(document_id, str) else document_id

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

document_id = UUID(document_id) if type(document_id) == str else document_id

document_id = UUID(document_id) if isinstance(document_id, str) else document_id

Tools

Ruff

59-59: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)

coderabbitai · 2024-08-06T14:42:09Z

cognee/api/client.py

+from cognee.infrastructure.databases.relational import create_db_and_tables
+


Reorganize import statements to the top of the file.

Imports should be at the top of the file to comply with PEP 8 standards.

- from contextlib import asynccontextmanager - from cognee.api.v1.users.routers import get_auth_router, get_register_router,\ - get_reset_password_router, get_verify_router, get_users_router - from cognee.api.v1.permissions.get_permissions_router import get_permissions_router + from contextlib import asynccontextmanager + from cognee.api.v1.users.routers import get_auth_router, get_register_router, get_reset_password_router, get_verify_router, get_users_router + from cognee.api.v1.permissions.get_permissions_router import get_permissions_router

Also applies to: 31-32, 56-60

coderabbitai

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 1c89119 and 6d38bcd.

Files selected for processing (1)

.github/workflows/test_common.yml (4 hunks)

Additional comments not posted (3)

.github/workflows/test_common.yml (3)

21-34: LGTM! The setup_docker job is correctly defined.

The job sets up Docker Buildx for multi-platform builds using appropriate actions.

57-67: LGTM! The PostgreSQL service is correctly defined.

The service uses the latest PostgreSQL image, sets up environment variables, defines a volume for persistent storage, and is exposed on port 5432.

108-113: LGTM! The step to wait for PostgreSQL to be ready is correctly implemented.

The step uses the pg_isready command in a loop until the database is ready, ensuring that PostgreSQL is fully initialized before running tests.

added boilerplate for user management

866270b

coderabbitai bot reviewed Jul 21, 2024

View reviewed changes

add sqlalchemy engine

77e8c1b

coderabbitai bot reviewed Jul 21, 2024

View reviewed changes

borisarzentar reviewed Jul 22, 2024

View reviewed changes

Initial functional user auth

e785b30

coderabbitai bot reviewed Jul 22, 2024

View reviewed changes

Fixes for user flow with group management

14e1eba

Vasilije1990 force-pushed the COG-206 branch from dab5c87 to 14e1eba Compare July 23, 2024 11:43

coderabbitai bot reviewed Jul 23, 2024

View reviewed changes

Fixes to the model and adding the read info to the graph

36e156e

coderabbitai bot reviewed Jul 23, 2024

View reviewed changes

borisarzentar requested changes Jul 23, 2024

View reviewed changes

Update cognee/api/client.py

3f5e665

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Jul 24, 2024

View reviewed changes

Vasilije1990 added 2 commits July 24, 2024 16:59

Fixes to translation services

e7b0e71

Merge remote-tracking branch 'origin/COG-206' into COG-206

a93bd53

coderabbitai bot reviewed Jul 24, 2024

View reviewed changes

Exposed user management

9616545

coderabbitai bot reviewed Jul 26, 2024

View reviewed changes

Fixes to the ACL model

218d322

coderabbitai bot reviewed Jul 27, 2024

View reviewed changes

Vasilije1990 added 2 commits July 27, 2024 12:45

Fixes to the ACL model

b4d1a73

Fixes to the ACL model

7930586

coderabbitai bot reviewed Jul 27, 2024

View reviewed changes

Updates to searches

797e7ba

coderabbitai bot reviewed Aug 6, 2024

View reviewed changes

borisarzentar added 3 commits August 6, 2024 15:23

fix: search results preview

cb9bfa2

Merge remote-tracking branch 'origin/COG-206' into COG-206

4749995

fix: add buildx action

7846291

coderabbitai bot reviewed Aug 6, 2024

View reviewed changes

borisarzentar added 7 commits August 6, 2024 15:36

fix: upgrade setup-buildx-action

acf41ff

fix: add setup_docker job

3f35a45

fix: fix debugpy version

3e3134b

fix: upgrade buildx action

0dad12c

fix: run tests on ubuntu

a590c6a

fix: add postgres service to tests

896a2ce

fix: add env variables to test workflows

e492a18

coderabbitai bot reviewed Aug 6, 2024

View reviewed changes

borisarzentar added 4 commits August 6, 2024 16:11

fix: wait for postgres to be ready before tests

ed6e8eb

fix: update neo4j lib

a34acbc

fix: uuid parsing of search results

1c89119

fix: run buildx on all platforms

8a63aa3

coderabbitai bot reviewed Aug 6, 2024

View reviewed changes

borisarzentar added 2 commits August 6, 2024 16:46

fix: install buildx on all platforms

4f72eb5

fix: add QEMU

6d38bcd

coderabbitai bot reviewed Aug 6, 2024

View reviewed changes

borisarzentar added 7 commits August 6, 2024 17:07

fix: add install_docker step in github action workflow

887d4e1

fix: add steps

fd20fac

fix: add runs-on

8fa572a

fix: move if

9927855

fix: add needs

841e7b5

fix: move install docker job

f808cf1

fix: remove macos test

e3c3f35

Vasilije1990 merged commit 80e4f29 into main Aug 6, 2024
21 of 24 checks passed

borisarzentar deleted the COG-206 branch October 3, 2024 16:47

		from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router

	Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
	Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type


		async with get_async_session_context() as session:

		out = await has_permission_document(user_id, file["id"], "write", session)

	out = await has_permission_document(user_id, file["id"], "write", session)
	await has_permission_document(user_id, file["id"], "write", session)

	print(f"Verification requested for user {user.id}. Verification token: {token}")
	logger.info(f"Verification requested for user {user.id}. Verification token: {token}")

	except:
	except Exception as e:
	logger.error(f"Error checking token: {e}")

	except:
	except Exception as e:
	logger.error(f"Error creating token: {e}")

	from fastapi import Depends, HTTPException, status
	from fastapi import HTTPException, status

	print(f"Error during authentication: {e}")
	logger.error(f"Error during authentication: {e}")

-async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]:
+from typing import List
+async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]:

		created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc))
		updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc))

-    created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc))
-    updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc))
+    from sqlalchemy.sql import func
+    created_at = Column(DateTime(timezone = True), default = func.now())
+    updated_at = Column(DateTime(timezone = True), onupdate = func.current_timestamp())

		from cognee.api.v1.users.routers import get_auth_router, get_register_router,\
		get_reset_password_router, get_verify_router, get_users_router

	document_id = UUID(document_id) if type(document_id) == str else document_id
	document_id = UUID(document_id) if isinstance(document_id, str) else document_id

added boilerplate for user management #123

added boilerplate for user management #123

Conversation

Vasilije1990 commented Jul 21, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Jul 21, 2024 • edited Loading

Rate limit exceeded

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 21, 2024

Choose a reason for hiding this comment

gitguardian bot commented Jul 21, 2024 • edited Loading

⚠️ GitGuardian has uncovered 16 secrets following the scan of your pull request.

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 22, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jul 23, 2024

Choose a reason for hiding this comment

Vasilije1990 commented Jul 21, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jul 21, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

gitguardian bot commented Jul 21, 2024 •

edited

Loading

coderabbitai bot Jul 23, 2024 •

edited

Loading

coderabbitai bot Jul 23, 2024 •

edited

Loading

coderabbitai bot Jul 23, 2024 •

edited

Loading