-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added boilerplate for user management #123
Conversation
Warning Rate limit exceeded@borisarzentar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 3 minutes and 48 seconds before requesting another review. How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe recent enhancements to the FastAPI application significantly improve user management, database interactions, and overall application performance. New authentication routes, along with the Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- cognee/api/client.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/schemas.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
Additional context used
Ruff
cognee/api/client.py
46-46: Module level import not at top of file
(E402)
48-48: Module level import not at top of file
(E402)
50-50: Module level import not at top of file
(E402)
51-51: Module level import not at top of file
(E402)
52-52: Module level import not at top of file
(E402)
114-114: Redefinition of unused
authenticated_route
from line 80(F811)
Additional comments not posted (17)
cognee/infrastructure/databases/relational/user_authentication/schemas.py (3)
6-7
: LGTM!The class
UserRead
correctly inherits fromschemas.BaseUser[uuid.UUID]
.
10-11
: LGTM!The class
UserCreate
correctly inherits fromschemas.BaseUserCreate
.
14-15
: LGTM!The class
UserUpdate
correctly inherits fromschemas.BaseUserUpdate
.cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (5)
11-12
: LGTM!The class
Base
correctly inherits fromDeclarativeBase
.
15-16
: LGTM!The class
User
correctly inherits fromSQLAlchemyBaseUserTableUUID
andBase
.
25-27
: LGTM!The function
create_db_and_tables
correctly creates the database and tables using SQLAlchemy.
30-32
: LGTM!The function
get_async_session
correctly provides an async session for database operations.
35-36
: LGTM!The function
get_user_db
correctly provides a user database using SQLAlchemy.cognee/infrastructure/databases/relational/user_authentication/users.py (6)
18-33
: LGTM!The class
UserManager
correctly inherits fromUUIDIDMixin
andBaseUserManager
. The methods for user registration, password reset, and verification are correctly implemented.
36-37
: LGTM!The function
get_user_manager
correctly provides a user manager usingUserManager
.
43-44
: LGTM!The function
get_jwt_strategy
correctly provides a JWT strategy for authentication.
47-51
: LGTM!The function
auth_backend
correctly sets up an authentication backend using JWT strategy.
53-53
: LGTM!The function
fastapi_users
correctly sets up FastAPI users usingUserManager
andauth_backend
.
55-55
: LGTM!The function
current_active_user
correctly sets up the current active user usingfastapi_users
.cognee/api/client.py (3)
54-76
: LGTM! Verify router functionality.The routers are correctly set up and prefixed. Ensure that they function as expected.
79-81
: LGTM! Verify function usage.The function correctly uses dependency injection for
current_active_user
. Ensure that the function is used as intended.
83-87
: LGTM! Verify database setup.The context manager correctly sets up the database and tables. Ensure that the database setup is functioning as expected.
cognee/api/client.py
Outdated
from contextlib import asynccontextmanager | ||
|
||
from fastapi import Depends, FastAPI | ||
|
||
from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables | ||
from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate | ||
from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager
- from fastapi import Depends, FastAPI
- from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
- from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
- from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
+ from contextlib import asynccontextmanager
+ from fastapi import Depends, FastAPI
+ from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
+ from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
+ from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
Committable suggestion was skipped due to low confidence.
Tools
Ruff
46-46: Module level import not at top of file
(E402)
48-48: Module level import not at top of file
(E402)
50-50: Module level import not at top of file
(E402)
51-51: Module level import not at top of file
(E402)
52-52: Module level import not at top of file
(E402)
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9573981 | Triggered | Generic Password | 77e8c1b | cognee/infrastructure/databases/relational/config.py | View secret |
9573981 | Triggered | Generic Password | e785b30 | docker-compose.yml | View secret |
9573981 | Triggered | Generic Password | e785b30 | cognee/infrastructure/databases/relational/config.py | View secret |
9573981 | Triggered | Generic Password | 401167b | cognee/infrastructure/databases/relational/config.py | View secret |
9573981 | Triggered | Generic Password | 896a2ce | .github/workflows/test_common.yml | View secret |
9573981 | Triggered | Generic Password | 896a2ce | .github/workflows/test_qdrant.yml | View secret |
9573981 | Triggered | Generic Password | 896a2ce | .github/workflows/test_weaviate.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_weaviate.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_qdrant.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_qdrant.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_neo4j.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_common.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_weaviate.yml | View secret |
9573981 | Triggered | Generic Password | e492a18 | .github/workflows/test_common.yml | View secret |
13323556 | Triggered | Username Password | 73dd6c2 | .github/workflows/test_neo4j.yml | View secret |
13323556 | Triggered | Username Password | 4749995 | .github/workflows/test_neo4j.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- cognee/infrastructure/databases/relational/config.py (2 hunks)
- cognee/infrastructure/databases/relational/create_relational_engine.py (1 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
Additional context used
Ruff
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
94-98: f-string without any placeholders
Remove extraneous
f
prefix(F541)
102-102: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Additional comments not posted (13)
cognee/infrastructure/databases/relational/create_relational_engine.py (1)
8-10
: EnumDBProvider
looks good.The enum correctly defines the database types
DUCKDB
andPOSTGRES
.cognee/infrastructure/databases/relational/config.py (1)
14-15
: Configuration changes look good.The
db_provider
configuration option is correctly added and thecreate_relational_engine
function is updated accordingly. Theto_dict
method is also updated to include the newdb_provider
attribute.Also applies to: 34-34
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (11)
5-10
: Initialization method looks good.The
__init__
method correctly initializes theSQLAlchemyAdapter
with the provideddb_type
,db_path
, anddb_name
.
11-20
: Dataset retrieval method looks good.The
get_datasets
method correctly retrieves and filters distinct schema names from the database.
22-25
: File metadata retrieval method looks good.The
get_files_metadata
method correctly retrieves file metadata from the specified dataset.
27-31
: Table creation method looks good.The
create_table
method correctly creates a table with the specified schema name, table name, and table configuration.
33-35
: Table deletion method looks good.The
delete_table
method correctly deletes the specified table.
37-42
: Data insertion method looks good.The
insert_data
method correctly inserts data into the specified table.
44-59
: Data retrieval method looks good.The
get_data
method correctly retrieves data from the specified table and applies filters if provided.
61-64
: Query execution method looks good.The
execute_query
method correctly executes a raw SQL query.
66-80
: Data loading method looks good.The
load_cognify_data
method correctly loads data into thecognify
table.
106-119
: Data deletion method looks good.The
delete_cognify_data
method correctly deletes all data from thecognify
table and drops the table.
121-125
: Database deletion method looks good.The
delete_database
method correctly deletes the database file and its write-ahead log file.
def create_relational_engine(db_path: str, db_name: str, db_provider:str): | ||
LocalStorage.ensure_directory_exists(db_path) | ||
|
||
return DuckDBAdapter( | ||
db_name = db_name, | ||
db_path = db_path, | ||
) | ||
llm_config = get_relationaldb_config() | ||
|
||
provider = DBProvider(llm_config.llm_provider) | ||
|
||
|
||
if provider == DBProvider.DUCKDB: | ||
|
||
return DuckDBAdapter( | ||
db_name = db_name, | ||
db_path = db_path, | ||
) | ||
elif provider == DBProvider.POSTGRES: | ||
return SQLAlchemyAdapter( | ||
db_name = db_name, | ||
db_path = db_path, | ||
db_type = db_provider, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for invalid llm_config.llm_provider
.
The function does not handle the case where llm_config.llm_provider
is not a valid DBProvider
. Consider adding error handling for this scenario.
+ try:
+ provider = DBProvider(llm_config.llm_provider)
+ except ValueError:
+ raise ValueError(f"Invalid database provider: {llm_config.llm_provider}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def create_relational_engine(db_path: str, db_name: str, db_provider:str): | |
LocalStorage.ensure_directory_exists(db_path) | |
return DuckDBAdapter( | |
db_name = db_name, | |
db_path = db_path, | |
) | |
llm_config = get_relationaldb_config() | |
provider = DBProvider(llm_config.llm_provider) | |
if provider == DBProvider.DUCKDB: | |
return DuckDBAdapter( | |
db_name = db_name, | |
db_path = db_path, | |
) | |
elif provider == DBProvider.POSTGRES: | |
return SQLAlchemyAdapter( | |
db_name = db_name, | |
db_path = db_path, | |
db_type = db_provider, | |
) | |
def create_relational_engine(db_path: str, db_name: str, db_provider:str): | |
LocalStorage.ensure_directory_exists(db_path) | |
llm_config = get_relationaldb_config() | |
try: | |
provider = DBProvider(llm_config.llm_provider) | |
except ValueError: | |
raise ValueError(f"Invalid database provider: {llm_config.llm_provider}") | |
if provider == DBProvider.DUCKDB: | |
return DuckDBAdapter( | |
db_name = db_name, | |
db_path = db_path, | |
) | |
elif provider == DBProvider.POSTGRES: | |
return SQLAlchemyAdapter( | |
db_name = db_name, | |
db_path = db_path, | |
db_type = db_provider, | |
) |
|
||
def delete_cognify_data(self): | ||
with self.engine.connect() as connection: | ||
connection.execute(text(""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We create a table here and delete it immediately in the next line. Should we check if table exists and delete if it does?
document_id_target STRING NULL | ||
); | ||
""")) | ||
query = text(f""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we just created the table this query won't return anything, so we can just set records to []. Otherwise we can execute the query.
|
||
|
||
|
||
llm_config = get_relationaldb_config() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llm_config
-> relationaldb_config
. In line 26 as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed this
) | ||
from fastapi_users.db import SQLAlchemyUserDatabase | ||
|
||
from app.db import User, get_user_db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is add.db
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 15
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
Files selected for processing (8)
- cognee/api/client.py (3 hunks)
- cognee/infrastructure/databases/relational/config.py (2 hunks)
- cognee/infrastructure/databases/relational/create_relational_engine.py (1 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
- docker-compose.yml (1 hunks)
- pyproject.toml (1 hunks)
Files skipped from review as they are similar to previous changes (3)
- cognee/infrastructure/databases/relational/config.py
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
- cognee/infrastructure/databases/relational/user_authentication/users.py
Additional context used
Ruff
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
117-121: f-string without any placeholders
Remove extraneous
f
prefix(F541)
125-125: f-string without any placeholders
Remove extraneous
f
prefix(F541)
cognee/api/client.py
29-29: Module level import not at top of file
(E402)
51-51: Module level import not at top of file
(E402)
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
56-56: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
119-119: Redefinition of unused
authenticated_route
from line 85(F811)
Additional comments not posted (7)
cognee/infrastructure/databases/relational/create_relational_engine.py (4)
1-2
: Import statement is correct.The
Enum
import is necessary for defining theDBProvider
enumeration.
3-7
: Import statements are correct.The imported modules and classes are necessary for the functionality of the
create_relational_engine
function.
8-11
: Enumeration definition is correct.The
DBProvider
enumeration is correctly defined and will facilitate the selection of the appropriate database adapter.
12-16
: Function signature and directory existence check are correct.The function signature has been expanded to include additional parameters for database configuration. The
LocalStorage.ensure_directory_exists
method ensures the directory exists.docker-compose.yml (2)
63-75
: Service configuration for PostgreSQL is correct.The service configuration includes necessary environment variables, volume mapping, port exposure, and network connection.
81-83
: Volume declaration for PostgreSQL data is correct.The volume declaration ensures data persistence across container restarts.
pyproject.toml (1)
72-73
: Dependencies are correctly added.The
fastapi-users
andasyncpg
dependencies are necessary for user management and asynchronous database operations.
if provider == DBProvider.DUCKDB: | ||
# return DuckDBAdapter( | ||
# db_name = db_name, | ||
# db_path = db_path, | ||
# ) | ||
return SQLAlchemyAdapter( | ||
db_name = db_name, | ||
db_path = db_path, | ||
db_type = db_provider, | ||
db_host=db_host, | ||
db_port=db_port, | ||
db_user=db_user, | ||
db_password=db_password | ||
) | ||
elif provider == DBProvider.POSTGRES: | ||
return SQLAlchemyAdapter( | ||
db_name = db_name, | ||
db_path = db_path, | ||
db_type = db_provider, | ||
db_host= db_host, | ||
db_port= db_port, | ||
db_user= db_user, | ||
db_password= db_password | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out code and add error handling for invalid db_provider
values.
The function correctly selects the database adapter based on the db_provider
value. However, the commented-out code for DuckDBAdapter
should be removed to maintain code cleanliness. Additionally, error handling for invalid db_provider
values should be added.
- # return DuckDBAdapter(
- # db_name = db_name,
- # db_path = db_path,
- # )
+ try:
+ provider = DBProvider(db_provider)
+ except ValueError:
+ raise ValueError(f"Invalid database provider: {db_provider}")
Committable suggestion was skipped due to low confidence.
async def get_async_session(self): | ||
async_session_maker = self.Session | ||
async with async_session_maker() as session: | ||
yield session |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints and docstrings.
Include type hints and docstrings for better readability and maintainability.
+ async def get_async_session(self) -> AsyncSession:
+ """
+ Yields an asynchronous SQLAlchemy session.
+ """
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def get_async_session(self): | |
async_session_maker = self.Session | |
async with async_session_maker() as session: | |
yield session | |
async def get_async_session(self) -> AsyncSession: | |
""" | |
Yields an asynchronous SQLAlchemy session. | |
""" | |
async_session_maker = self.Session | |
async with async_session_maker() as session: | |
yield session |
def get_session(self): | ||
session_maker = self.Session | ||
with session_maker() as session: | ||
yield session |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints and docstrings.
Include type hints and docstrings for better readability and maintainability.
+ def get_session(self) -> sessionmaker:
+ """
+ Yields a synchronous SQLAlchemy session.
+ """
Committable suggestion was skipped due to low confidence.
def delete_database(self): | ||
from cognee.infrastructure.files.storage import LocalStorage | ||
LocalStorage.remove(self.db_location) | ||
if LocalStorage.file_exists(self.db_location + ".wal"): | ||
LocalStorage.remove(self.db_location + ".wal") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints and docstrings.
Include type hints and docstrings for better readability and maintainability.
+ def delete_database(self) -> None:
+ """
+ Deletes the database file and its write-ahead log file.
+ """
Committable suggestion was skipped due to low confidence.
def delete_cognify_data(self): | ||
with self.engine.connect() as connection: | ||
connection.execute(text(""" | ||
CREATE TABLE IF NOT EXISTS cognify ( | ||
document_id STRING, | ||
layer_id STRING, | ||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, | ||
updated_at TIMESTAMP DEFAULT NULL, | ||
processed BOOLEAN DEFAULT FALSE, | ||
document_id_target STRING NULL | ||
); | ||
""")) | ||
connection.execute(text("DELETE FROM cognify;")) | ||
connection.execute(text("DROP TABLE cognify;")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints, docstrings, and remove redundant table creation logic.
Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before deletion.
+ def delete_cognify_data(self) -> None:
+ """
+ Deletes data from the `cognify` table.
+ """
- connection.execute(text("""
- CREATE TABLE IF NOT EXISTS cognify (
- document_id STRING,
- layer_id STRING,
- created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
- updated_at TIMESTAMP DEFAULT NULL,
- processed BOOLEAN DEFAULT FALSE,
- document_id_target STRING NULL
- );
- """))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def delete_cognify_data(self): | |
with self.engine.connect() as connection: | |
connection.execute(text(""" | |
CREATE TABLE IF NOT EXISTS cognify ( | |
document_id STRING, | |
layer_id STRING, | |
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, | |
updated_at TIMESTAMP DEFAULT NULL, | |
processed BOOLEAN DEFAULT FALSE, | |
document_id_target STRING NULL | |
); | |
""")) | |
connection.execute(text("DELETE FROM cognify;")) | |
connection.execute(text("DROP TABLE cognify;")) | |
def delete_cognify_data(self) -> None: | |
""" | |
Deletes data from the `cognify` table. | |
""" | |
with self.engine.connect() as connection: | |
connection.execute(text("DELETE FROM cognify;")) | |
connection.execute(text("DROP TABLE cognify;")) |
def get_datasets(self): | ||
with self.engine.connect() as connection: | ||
result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;")) | ||
tables = [row['schema_name'] for row in result] | ||
return list( | ||
filter( | ||
lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee", | ||
tables | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints, docstrings, and improve filtering logic.
Include type hints and docstrings for better readability and maintainability. Improve filtering logic for readability.
+ def get_datasets(self) -> List[str]:
+ """
+ Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'.
+ """
- return list(
- filter(
- lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee",
- tables
- )
- )
+ return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"]
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def get_datasets(self): | |
with self.engine.connect() as connection: | |
result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;")) | |
tables = [row['schema_name'] for row in result] | |
return list( | |
filter( | |
lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee", | |
tables | |
) | |
) | |
def get_datasets(self) -> List[str]: | |
""" | |
Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'. | |
""" | |
with self.engine.connect() as connection: | |
result = connection.execute(text("SELECT DISTINCT schema_name FROM information_schema.tables;")) | |
tables = [row['schema_name'] for row in result] | |
return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"] |
def load_cognify_data(self, data): | ||
metadata = MetaData() | ||
cognify_table = Table( | ||
'cognify', metadata, | ||
Column('document_id', String), | ||
Column('layer_id', String), | ||
Column('created_at', TIMESTAMP, server_default=text('CURRENT_TIMESTAMP')), | ||
Column('updated_at', TIMESTAMP, nullable=True, default=None), | ||
Column('processed', Boolean, default=False), | ||
Column('document_id_target', String, nullable=True) | ||
) | ||
metadata.create_all(self.engine) | ||
insert_query = cognify_table.insert().values(document_id=text(':document_id'), layer_id=text(':layer_id')) | ||
with self.engine.connect() as connection: | ||
connection.execute(insert_query, data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints and docstrings.
Include type hints and docstrings for better readability and maintainability.
+ def load_cognify_data(self, data: List[Dict[str, Any]]) -> None:
+ """
+ Loads data into the `cognify` table.
+ """
Committable suggestion was skipped due to low confidence.
def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str): | ||
self.db_location = os.path.abspath(os.path.join(db_path, db_name)) | ||
# self.engine = create_engine(f"{db_type}:///{self.db_location}") | ||
if db_type == "duckdb": | ||
self.engine = create_engine(f"duckdb:///{self.db_location}") | ||
self.Session = sessionmaker(bind=self.engine) | ||
|
||
else: | ||
print("Name: ", db_name) | ||
print("User: ", db_user) | ||
print("Password: ", db_password) | ||
print("Host: ", db_host) | ||
print("Port: ", db_port) | ||
self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}") | ||
self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use logging instead of print statements and conditionally construct self.db_location
.
Replace print statements with logging for better maintainability. Construct self.db_location
only if the database type is DuckDB.
- self.db_location = os.path.abspath(os.path.join(db_path, db_name))
+ if db_type == "duckdb":
+ self.db_location = os.path.abspath(os.path.join(db_path, db_name))
- print("Name: ", db_name)
- print("User: ", db_user)
- print("Password: ", db_password)
- print("Host: ", db_host)
- print("Port: ", db_port)
+ logger.info(f"Name: {db_name}")
+ logger.info(f"User: {db_user}")
+ logger.info(f"Password: {db_password}")
+ logger.info(f"Host: {db_host}")
+ logger.info(f"Port: {db_port}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str): | |
self.db_location = os.path.abspath(os.path.join(db_path, db_name)) | |
# self.engine = create_engine(f"{db_type}:///{self.db_location}") | |
if db_type == "duckdb": | |
self.engine = create_engine(f"duckdb:///{self.db_location}") | |
self.Session = sessionmaker(bind=self.engine) | |
else: | |
print("Name: ", db_name) | |
print("User: ", db_user) | |
print("Password: ", db_password) | |
print("Host: ", db_host) | |
print("Port: ", db_port) | |
self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}") | |
self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False) | |
def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str): | |
if db_type == "duckdb": | |
self.db_location = os.path.abspath(os.path.join(db_path, db_name)) | |
self.engine = create_engine(f"duckdb:///{self.db_location}") | |
self.Session = sessionmaker(bind=self.engine) | |
else: | |
logger.info(f"Name: {db_name}") | |
logger.info(f"User: {db_user}") | |
logger.info(f"Password: {db_password}") | |
logger.info(f"Host: {db_host}") | |
logger.info(f"Port: {db_port}") | |
self.engine = create_async_engine(f"postgresql+asyncpg://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}") | |
self.Session = sessionmaker(bind=self.engine, class_=AsyncSession, expire_on_commit=False) |
def execute_query(self, query): | ||
with self.engine.connect() as connection: | ||
result = connection.execute(text(query)) | ||
return [dict(row) for row in result] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints, docstrings, and use parameterized queries.
Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def execute_query(self, query: str) -> List[Dict[str, Any]]:
+ """
+ Executes the provided SQL query.
+ """
- result = connection.execute(text(query))
+ result = connection.execute(text(":query"), {'query': query})
Committable suggestion was skipped due to low confidence.
def fetch_cognify_data(self, excluded_document_id: str): | ||
with self.engine.connect() as connection: | ||
connection.execute(text(""" | ||
CREATE TABLE IF NOT EXISTS cognify ( | ||
document_id STRING, | ||
layer_id STRING, | ||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, | ||
updated_at TIMESTAMP DEFAULT NULL, | ||
processed BOOLEAN DEFAULT FALSE, | ||
document_id_target STRING NULL | ||
); | ||
""")) | ||
query = text(f""" | ||
SELECT document_id, layer_id, created_at, updated_at, processed | ||
FROM cognify | ||
WHERE document_id != :excluded_document_id AND processed = FALSE; | ||
""") | ||
records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall() | ||
if records: | ||
document_ids = tuple(record['document_id'] for record in records) | ||
update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;") | ||
connection.execute(update_query, {'document_ids': document_ids}) | ||
return [dict(record) for record in records] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hints, docstrings, and remove extraneous f
prefixes.
Include type hints and docstrings for better readability and maintainability. Remove extraneous f
prefixes from f-strings without placeholders.
+ def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]:
+ """
+ Fetches data from the `cognify` table based on the provided `excluded_document_id`.
+ """
- query = text(f"""
- SELECT document_id, layer_id, created_at, updated_at, processed
- FROM cognify
- WHERE document_id != :excluded_document_id AND processed = FALSE;
- """)
+ query = text("""
+ SELECT document_id, layer_id, created_at, updated_at, processed
+ FROM cognify
+ WHERE document_id != :excluded_document_id AND processed = FALSE;
+ """)
- update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")
+ update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def fetch_cognify_data(self, excluded_document_id: str): | |
with self.engine.connect() as connection: | |
connection.execute(text(""" | |
CREATE TABLE IF NOT EXISTS cognify ( | |
document_id STRING, | |
layer_id STRING, | |
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, | |
updated_at TIMESTAMP DEFAULT NULL, | |
processed BOOLEAN DEFAULT FALSE, | |
document_id_target STRING NULL | |
); | |
""")) | |
query = text(f""" | |
SELECT document_id, layer_id, created_at, updated_at, processed | |
FROM cognify | |
WHERE document_id != :excluded_document_id AND processed = FALSE; | |
""") | |
records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall() | |
if records: | |
document_ids = tuple(record['document_id'] for record in records) | |
update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;") | |
connection.execute(update_query, {'document_ids': document_ids}) | |
return [dict(record) for record in records] | |
def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]: | |
""" | |
Fetches data from the `cognify` table based on the provided `excluded_document_id`. | |
""" | |
with self.engine.connect() as connection: | |
connection.execute(text(""" | |
CREATE TABLE IF NOT EXISTS cognify ( | |
document_id STRING, | |
layer_id STRING, | |
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, | |
updated_at TIMESTAMP DEFAULT NULL, | |
processed BOOLEAN DEFAULT FALSE, | |
document_id_target STRING NULL | |
); | |
""")) | |
query = text(""" | |
SELECT document_id, layer_id, created_at, updated_at, processed | |
FROM cognify | |
WHERE document_id != :excluded_document_id AND processed = FALSE; | |
""") | |
records = connection.execute(query, {'excluded_document_id': excluded_document_id}).fetchall() | |
if records: | |
document_ids = tuple(record['document_id'] for record in records) | |
update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;") | |
connection.execute(update_query, {'document_ids': document_ids}) | |
return [dict(record) for record in records] |
Tools
Ruff
117-121: f-string without any placeholders
Remove extraneous
f
prefix(F541)
125-125: f-string without any placeholders
Remove extraneous
f
prefix(F541)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Outside diff range, codebase verification and nitpick comments (1)
cognee/api/client.py (1)
Line range hint
31-59
:
Reorganize import statements to the top of the file.Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager - from fastapi import Depends, FastAPI - from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables - from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate - from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users + from contextlib import asynccontextmanager + from fastapi import Depends, FastAPI + from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables + from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate + from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_usersAlso applies to: 91-123
Tools
Ruff
31-31: Module level import not at top of file
(E402)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- cognee/api/client.py (4 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/routers.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
Additional context used
Ruff
cognee/infrastructure/databases/relational/user_authentication/users.py
12-12: Redefinition of unused
Depends
from line 4Remove definition:
Depends
(F811)
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
122-122: Redefinition of unused
authenticated_route
from line 88(F811)
Additional comments not posted (3)
cognee/infrastructure/databases/relational/user_authentication/users.py (3)
37-38
: LGTM!The function
get_user_manager
is correctly implemented.
44-45
: LGTM!The function
get_jwt_strategy
is correctly implemented.
64-70
: LGTM!The function
has_permission
is correctly implemented.
@permission_router.post("/groups/{group_id}/permissions") | ||
async def assign_permission_to_group(group_id: int, permission: str, db: Session = Depends(get_user_db)): | ||
group = db.query(Group).filter(Group.id == group_id).first() | ||
if not group: | ||
raise HTTPException(status_code=404, detail="Group not found") | ||
perm = db.query(Permission).filter(Permission.name == permission).first() | ||
if not perm: | ||
perm = Permission(name=permission) | ||
db.add(perm) | ||
group.permissions.append(perm) | ||
db.commit() | ||
return {"msg": "Permission added to group"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and logging.
The function should handle database operation errors and log the actions for better traceability.
import logging
logger = logging.getLogger(__name__)
@permission_router.post("/groups/{group_id}/permissions")
async def assign_permission_to_group(group_id: int, permission: str, db: Session = Depends(get_user_db)):
try:
group = db.query(Group).filter(Group.id == group_id).first()
if not group:
raise HTTPException(status_code=404, detail="Group not found")
perm = db.query(Permission).filter(Permission.name == permission).first()
if not perm:
perm = Permission(name=permission)
db.add(perm)
group.permissions.append(perm)
db.commit()
logger.info(f"Permission '{permission}' added to group '{group_id}'")
return {"msg": "Permission added to group"}
except Exception as e:
db.rollback()
logger.error(f"Error assigning permission to group: {e}")
raise HTTPException(status_code=500, detail="Internal Server Error")
@permission_router.post("/users/{user_id}/groups") | ||
async def add_user_to_group(user_id: int, group_id: int, db: Session = Depends(get_user_db)): | ||
user = db.query(User).filter(User.id == user_id).first() | ||
group = db.query(Group).filter(Group.id == group_id).first() | ||
if not user or not group: | ||
raise HTTPException(status_code=404, detail="User or group not found") | ||
user.groups.append(group) | ||
db.commit() | ||
return {"msg": "User added to group"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and logging.
The function should handle database operation errors and log the actions for better traceability.
import logging
logger = logging.getLogger(__name__)
@permission_router.post("/users/{user_id}/groups")
async def add_user_to_group(user_id: int, group_id: int, db: Session = Depends(get_user_db)):
try:
user = db.query(User).filter(User.id == user_id).first()
group = db.query(Group).filter(Group.id == group_id).first()
if not user or not group:
raise HTTPException(status_code=404, detail="User or group not found")
user.groups.append(group)
db.commit()
logger.info(f"User '{user_id}' added to group '{group_id}'")
return {"msg": "User added to group"}
except Exception as e:
db.rollback()
logger.error(f"Error adding user to group: {e}")
raise HTTPException(status_code=500, detail="Internal Server Error")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
Outside diff range, codebase verification and nitpick comments (1)
cognee/infrastructure/data/models/Dataset.py (1)
5-5
: Verify the implications of changing the base class.There are still several references to
ModelBase
in the codebase, indicating that the transition toBase
is incomplete. Ensure all references toModelBase
are updated toBase
and that all necessary migrations are handled.
cognee/infrastructure/pipeline/models/Operation.py
cognee/modules/pipelines/models/Task.py
cognee/modules/pipelines/models/Pipeline.py
cognee/modules/pipelines/models/PipelineTask.py
cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py
Analysis chain
Verify the implications of changing the base class.
The change from
ModelBase
toBase
might have significant implications on how theDataset
class interacts with the database. Ensure that this change is compatible with the rest of the codebase and that all necessary migrations are handled.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage and compatibility of the new base class `Base` in the codebase. # Test: Search for the `Dataset` class usage. Expect: No issues with the new base class. rg --type python -A 5 $'class Dataset(Base)'Length of output: 76
Script:
#!/bin/bash # Description: Verify the usage and compatibility of the new base class `Base` in the codebase. # Test: Search for the `Dataset` class usage. Expect: No issues with the new base class. rg --type py -A 5 $'class Dataset(Base)'Length of output: 41
Script:
#!/bin/bash # Description: Verify the current base class used by the `Dataset` class in the codebase. # Test: Search for the `Dataset` class definition. Expect: Identify the current base class used. rg --type py -A 5 $'class Dataset'Length of output: 963
Script:
#!/bin/bash # Description: Verify that there are no remaining references to `ModelBase` in the codebase. # Test: Search for any references to `ModelBase`. Expect: No references found. rg --type py $'ModelBase'Length of output: 937
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
Files selected for processing (11)
- cognee/api/client.py (4 hunks)
- cognee/infrastructure/data/models/Data.py (1 hunks)
- cognee/infrastructure/data/models/Dataset.py (1 hunks)
- cognee/infrastructure/data/models/DatasetData.py (1 hunks)
- cognee/infrastructure/databases/relational/ModelBase.py (1 hunks)
- cognee/infrastructure/databases/relational/init.py (1 hunks)
- cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py (2 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/routers.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
- pyproject.toml (2 hunks)
Files skipped from review due to trivial changes (1)
- cognee/infrastructure/databases/relational/sqlite/SqliteEngine.py
Files skipped from review as they are similar to previous changes (2)
- cognee/infrastructure/databases/relational/user_authentication/routers.py
- pyproject.toml
Additional context used
Ruff
cognee/infrastructure/databases/relational/user_authentication/users.py
12-12: Redefinition of unused
Depends
from line 4Remove definition:
Depends
(F811)
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
12-12: Redefinition of unused
DeclarativeBase
from line 7Remove definition:
DeclarativeBase
(F811)
12-12: Redefinition of unused
sessionmaker
from line 7Remove definition:
sessionmaker
(F811)
12-12: Redefinition of unused
Session
from line 7Remove definition:
Session
(F811)
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
121-121: Redefinition of unused
authenticated_route
from line 87(F811)
Additional comments not posted (14)
cognee/infrastructure/databases/relational/ModelBase.py (1)
2-6
: LGTM! The new base class definition aligns with modern SQLAlchemy practices.The use of
DeclarativeBase
is a more contemporary approach and improves maintainability.cognee/infrastructure/databases/relational/__init__.py (1)
1-1
: LGTM! The import statement change is consistent with the new base class definition.The update ensures that the module imports the correct base class.
cognee/infrastructure/data/models/DatasetData.py (1)
4-6
: LGTM! The class inheritance change is consistent with the new base class definition.The update ensures that the
DatasetData
class inherits from the correct base class.cognee/infrastructure/data/models/Data.py (1)
5-5
: Verify the implications of changing the base class.The change from
ModelBase
toBase
might have significant implications on how theData
class interacts with the database. Ensure that this change is compatible with the rest of the codebase and that all necessary migrations are handled.cognee/infrastructure/databases/relational/user_authentication/users.py (4)
37-38
: LGTM!The function
get_user_manager
correctly yields an instance ofUserManager
.
44-45
: LGTM!The function
get_jwt_strategy
correctly returns aJWTStrategy
with the specified secret and lifetime.
59-63
: LGTM!The function
get_user_permissions
correctly retrieves and returns the permissions for a user.
65-71
: LGTM!The function
has_permission
correctly checks if a user has a specific permission.cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (2)
44-57
: LGTM!The database configuration and session management are correctly implemented.
51-61
: LGTM!The utility functions for creating the database and tables, and for getting the async session and user database are correctly implemented.
cognee/api/client.py (4)
85-85
: LGTM!The permission router is correctly implemented.
Line range hint
124-139
:
LGTM!The root and health check endpoints are correctly implemented.
Tools
Ruff
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
121-121: Redefinition of unused
authenticated_route
from line 87(F811)
Line range hint
140-261
:
LGTM!The dataset-related endpoints are correctly implemented.
Line range hint
262-349
:
LGTM!The endpoints for settings, adding data, cognitive processing, and searching are correctly implemented.
# Association table for many-to-many relationship between users and groups | ||
user_group = Table('user_group', Base.metadata, | ||
Column('user_id', UUID, ForeignKey('users.id')), | ||
Column('group_id', UUID, ForeignKey('groups.id'))) | ||
|
||
# Association table for many-to-many relationship between groups and permissions | ||
group_permission = Table('group_permission', Base.metadata, | ||
Column('group_id', UUID, ForeignKey('groups.id')), | ||
Column('permission_id', UUID, ForeignKey('permissions.id'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding explicit primary keys for association tables.
While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.
- user_group = Table('user_group', Base.metadata,
- Column('user_id', UUID, ForeignKey('users.id')),
- Column('group_id', UUID, ForeignKey('groups.id')))
+ user_group = Table('user_group', Base.metadata,
+ Column('id', UUID, primary_key=True, index=True),
+ Column('user_id', UUID, ForeignKey('users.id')),
+ Column('group_id', UUID, ForeignKey('groups.id')))
- group_permission = Table('group_permission', Base.metadata,
- Column('group_id', UUID, ForeignKey('groups.id')),
- Column('permission_id', UUID, ForeignKey('permissions.id')))
+ group_permission = Table('group_permission', Base.metadata,
+ Column('id', UUID, primary_key=True, index=True),
+ Column('group_id', UUID, ForeignKey('groups.id')),
+ Column('permission_id', UUID, ForeignKey('permissions.id')))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Association table for many-to-many relationship between users and groups | |
user_group = Table('user_group', Base.metadata, | |
Column('user_id', UUID, ForeignKey('users.id')), | |
Column('group_id', UUID, ForeignKey('groups.id'))) | |
# Association table for many-to-many relationship between groups and permissions | |
group_permission = Table('group_permission', Base.metadata, | |
Column('group_id', UUID, ForeignKey('groups.id')), | |
Column('permission_id', UUID, ForeignKey('permissions.id'))) | |
# Association table for many-to-many relationship between users and groups | |
user_group = Table('user_group', Base.metadata, | |
Column('id', UUID, primary_key=True, index=True), | |
Column('user_id', UUID, ForeignKey('users.id')), | |
Column('group_id', UUID, ForeignKey('groups.id'))) | |
# Association table for many-to-many relationship between groups and permissions | |
group_permission = Table('group_permission', Base.metadata, | |
Column('id', UUID, primary_key=True, index=True), | |
Column('group_id', UUID, ForeignKey('groups.id')), | |
Column('permission_id', UUID, ForeignKey('permissions.id'))) |
class User(SQLAlchemyBaseUserTableUUID, Base): | ||
__tablename__ = 'users' | ||
groups = relationship('Group', secondary=user_group, back_populates='users') | ||
|
||
class Group(Base): | ||
__tablename__ = 'groups' | ||
id = Column(UUID, primary_key=True, index=True) | ||
name = Column(String, unique=True, index=True) | ||
users = relationship('User', secondary=user_group, back_populates='groups') | ||
permissions = relationship('Permission', secondary=group_permission, back_populates='groups') | ||
|
||
class Permission(Base): | ||
__tablename__ = 'permissions' | ||
id = Column(UUID, primary_key=True, index=True) | ||
name = Column(String, unique=True, index=True) | ||
groups = relationship('Group', secondary=group_permission, back_populates='permissions') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add id
column to the User
model.
The User
model is missing the id
column, which is necessary for primary key identification.
- class User(SQLAlchemyBaseUserTableUUID, Base):
- __tablename__ = 'users'
- groups = relationship('Group', secondary=user_group, back_populates='users')
+ class User(SQLAlchemyBaseUserTableUUID, Base):
+ __tablename__ = 'users'
+ id = Column(UUID, primary_key=True, index=True)
+ groups = relationship('Group', secondary=user_group, back_populates='users')
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class User(SQLAlchemyBaseUserTableUUID, Base): | |
__tablename__ = 'users' | |
groups = relationship('Group', secondary=user_group, back_populates='users') | |
class Group(Base): | |
__tablename__ = 'groups' | |
id = Column(UUID, primary_key=True, index=True) | |
name = Column(String, unique=True, index=True) | |
users = relationship('User', secondary=user_group, back_populates='groups') | |
permissions = relationship('Permission', secondary=group_permission, back_populates='groups') | |
class Permission(Base): | |
__tablename__ = 'permissions' | |
id = Column(UUID, primary_key=True, index=True) | |
name = Column(String, unique=True, index=True) | |
groups = relationship('Group', secondary=group_permission, back_populates='permissions') | |
class User(SQLAlchemyBaseUserTableUUID, Base): | |
__tablename__ = 'users' | |
id = Column(UUID, primary_key=True, index=True) | |
groups = relationship('Group', secondary=user_group, back_populates='users') | |
class Group(Base): | |
__tablename__ = 'groups' | |
id = Column(UUID, primary_key=True, index=True) | |
name = Column(String, unique=True, index=True) | |
users = relationship('User', secondary=user_group, back_populates='groups') | |
permissions = relationship('Permission', secondary=group_permission, back_populates='groups') | |
class Permission(Base): | |
__tablename__ = 'permissions' | |
id = Column(UUID, primary_key=True, index=True) | |
name = Column(String, unique=True, index=True) | |
groups = relationship('Group', secondary=group_permission, back_populates='permissions') |
cognee/api/client.py
Outdated
@@ -13,6 +13,8 @@ | |||
from fastapi.middleware.cors import CORSMiddleware | |||
from pydantic import BaseModel | |||
|
|||
from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager
- from fastapi import Depends, FastAPI
- from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
- from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
- from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
+ from contextlib import asynccontextmanager
+ from fastapi import Depends, FastAPI
+ from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router
+ from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
+ from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
+ from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
Also applies to: 31-31, 53-59
cognee/api/client.py
Outdated
@asynccontextmanager | ||
async def lifespan(app: FastAPI): | ||
# Not needed if you setup a migration system like Alembic | ||
await create_db_and_tables() | ||
yield | ||
app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate lifespan
context manager.
The lifespan
context manager is defined twice, which can cause conflicts.
- @asynccontextmanager
- async def lifespan(app: FastAPI):
- # Not needed if you setup a migration system like Alembic
- await create_db_and_tables()
- yield
Also applies to: 90-95
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
cognee/api/client.py
Outdated
@app.get("/authenticated-route") | ||
async def authenticated_route(user: User = Depends(current_active_user)): | ||
return {"message": f"Hello {user.email}!"} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate function authenticated_route
.
The function authenticated_route
is defined twice, which can cause conflicts.
- @app.get("/authenticated-route")
- async def authenticated_route(user: User = Depends(current_active_user)):
- return {"message": f"Hello {user.email}!"}
Also applies to: 120-123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Outside diff range, codebase verification and nitpick comments (2)
cognee/modules/data/processing/process_documents.py (1)
4-4
: Document the new parameters.The new parameters
user
anduser_permissions
should be documented to explain their purpose and usage.+ """ + Process documents and add user-specific data. + + Args: + documents (list[Document]): List of documents to process. + parent_node_id (str, optional): ID of the parent node. Defaults to None. + user (str, optional): User information. Defaults to None. + user_permissions (str, optional): User permissions. Defaults to None. + """cognee/api/v1/cognify/cognify_v2.py (1)
39-39
: Document the newuser_id
parameter.The new parameter
user_id
should be documented to explain its purpose and usage.+ """ + Cognify datasets and add user-specific data. + + Args: + datasets (Union[str, list[str]], optional): List of datasets to process. Defaults to None. + root_node_id (str, optional): ID of the root node. Defaults to None. + user_id (str, optional): User ID. Defaults to "default_user". + """
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- cognee/api/v1/cognify/cognify_v2.py (3 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/modules/data/processing/process_documents.py (2 hunks)
Additional context used
Ruff
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
12-12: Redefinition of unused
DeclarativeBase
from line 7Remove definition:
DeclarativeBase
(F811)
12-12: Redefinition of unused
sessionmaker
from line 7Remove definition:
sessionmaker
(F811)
12-12: Redefinition of unused
Session
from line 7Remove definition:
Session
(F811)
Additional comments not posted (4)
cognee/api/v1/cognify/cognify_v2.py (4)
1-11
: Imports look good.The imports include necessary modules for user authentication and permission management.
34-38
:PermissionDeniedException
class looks good.The class is correctly defined and necessary for handling permission errors.
40-47
: Permission check logic looks good.The permission check logic ensures that only authorized users can proceed with the operations.
83-83
: Pass user information to tasks.The tasks now include user information, which is necessary for user-specific processing.
document_dict = document.to_dict() | ||
document_dict["user"] = user | ||
document_dict["user_permissions"] = user_permissions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the unused document_dict
.
The new dictionary document_dict
is created but not used in the nodes
list. Use document_dict
instead of calling document.to_dict()
again.
- nodes.append((str(document.id), document.to_dict()))
+ nodes.append((str(document.id), document_dict))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
document_dict = document.to_dict() | |
document_dict["user"] = user | |
document_dict["user_permissions"] = user_permissions | |
document_dict = document.to_dict() | |
document_dict["user"] = user | |
document_dict["user_permissions"] = user_permissions | |
nodes.append((str(document.id), document_dict)) |
class User(SQLAlchemyBaseUserTableUUID, Base): | ||
__tablename__ = 'users' | ||
groups = relationship('Group', secondary=user_group, back_populates='users') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add id
column to the User
model.
The User
model is missing the id
column, which is necessary for primary key identification.
- class User(SQLAlchemyBaseUserTableUUID, Base):
- __tablename__ = 'users'
- groups = relationship('Group', secondary=user_group, back_populates='users')
+ class User(SQLAlchemyBaseUserTableUUID, Base):
+ __tablename__ = 'users'
+ id = Column(UUID, primary_key=True, index=True)
+ groups = relationship('Group', secondary=user_group, back_populates='users')
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class User(SQLAlchemyBaseUserTableUUID, Base): | |
__tablename__ = 'users' | |
groups = relationship('Group', secondary=user_group, back_populates='users') | |
class User(SQLAlchemyBaseUserTableUUID, Base): | |
__tablename__ = 'users' | |
id = Column(UUID, primary_key=True, index=True) | |
groups = relationship('Group', secondary=user_group, back_populates='users') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it already does this when it inherits from SQLAlchemyBaseUserTableUUID
from typing import AsyncGenerator, Generator | ||
|
||
from fastapi import Depends | ||
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | ||
from sqlalchemy import create_engine, UUID | ||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | ||
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session | ||
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base | ||
from sqlalchemy import Column, String, ForeignKey, Table, Integer | ||
|
||
|
||
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove redefined imports and organize them.
The imports DeclarativeBase
, sessionmaker
, and Session
are redefined on line 12. Additionally, imports should be organized according to PEP 8 guidelines.
- from typing import AsyncGenerator, Generator
- from fastapi import Depends
- from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase
- from sqlalchemy import create_engine, UUID
- from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session
- from cognee.infrastructure.databases.relational import get_relationaldb_config, Base
- from sqlalchemy import Column, String, ForeignKey, Table, Integer
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship
+ from typing import AsyncGenerator, Generator
+ from fastapi import Depends
+ from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase
+ from sqlalchemy import create_engine, UUID, Column, String, ForeignKey, Table, Integer
+ from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
+ from sqlalchemy.orm import sessionmaker, Session, relationship
+ from cognee.infrastructure.databases.relational import get_relationaldb_config, Base
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from typing import AsyncGenerator, Generator | |
from fastapi import Depends | |
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | |
from sqlalchemy import create_engine, UUID | |
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | |
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session | |
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base | |
from sqlalchemy import Column, String, ForeignKey, Table, Integer | |
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship | |
from typing import AsyncGenerator, Generator | |
from fastapi import Depends | |
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | |
from sqlalchemy import create_engine, UUID, Column, String, ForeignKey, Table, Integer | |
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | |
from sqlalchemy.orm import sessionmaker, Session, relationship | |
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base |
Tools
Ruff
12-12: Redefinition of unused
DeclarativeBase
from line 7Remove definition:
DeclarativeBase
(F811)
12-12: Redefinition of unused
sessionmaker
from line 7Remove definition:
sessionmaker
(F811)
12-12: Redefinition of unused
Session
from line 7Remove definition:
Session
(F811)
cognee/api/client.py
Outdated
@asynccontextmanager | ||
async def lifespan(app: FastAPI): | ||
# Not needed if you setup a migration system like Alembic | ||
await create_db_and_tables() | ||
yield | ||
app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
cognee/api/client.py
Outdated
@app.get("/authenticated-route") | ||
async def authenticated_route(user: User = Depends(current_active_user)): | ||
return {"message": f"Hello {user.email}!"} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
cognee/api/client.py
Outdated
@@ -267,8 +345,8 @@ def start_api_server(host: str = "0.0.0.0", port: int = 8000): | |||
relational_config.create_engine() | |||
|
|||
from cognee.modules.data.deletion import prune_system, prune_data | |||
asyncio.run(prune_data()) | |||
asyncio.run(prune_system(metadata = True)) | |||
# asyncio.run(prune_data()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please uncomment this after you are done.
cognee/api/v1/cognify/cognify_v2.py
Outdated
super().__init__(self.message) | ||
|
||
async def cognify(datasets: Union[str, list[str]] = None, root_node_id: str = None, user_id:str="default_user"): | ||
session: AsyncSession = async_session_maker() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put this login in a function in some module and just use it here.
Something like def check_permissions(user)
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss this live
self.Session = sessionmaker(bind=self.engine) | ||
|
||
else: | ||
print("Name: ", db_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove print statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship | ||
|
||
# Association table for many-to-many relationship between users and groups | ||
user_group = Table('user_group', Base.metadata, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we define this like we did for Data and Dataset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you specify exactly what needs to be done?
class User(SQLAlchemyBaseUserTableUUID, Base): | ||
__tablename__ = 'users' | ||
groups = relationship('Group', secondary=user_group, back_populates='users') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vasilije1990 This one
if not group: | ||
raise HTTPException(status_code=404, detail="Group not found") | ||
perm = db.query(Permission).filter(Permission.name == permission).first() | ||
if not perm: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perm
-> permission
Do we want to create a new permission freely here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't get it, what is the exact problem?
permissions.update(permission.name for permission in group.permissions) | ||
return permissions | ||
|
||
def has_permission(permission: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use this function where we need to check for permissions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where exactly?
@@ -60,8 +60,24 @@ services: | |||
- 3001:3000 | |||
networks: | |||
- cognee-network | |||
postgres: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space between.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Outside diff range, codebase verification and nitpick comments (1)
cognee/api/client.py (1)
345-346
: Uncomment or remove the commented-out code.Please uncomment this code if it is needed, or remove it if it is no longer necessary.
- # asyncio.run(prune_data()) - # asyncio.run(prune_system(metadata = True)) + asyncio.run(prune_data()) + asyncio.run(prune_system(metadata = True))
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- cognee/api/client.py (4 hunks)
Additional context used
Ruff
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
Additional comments not posted (3)
cognee/api/client.py (3)
85-85
: LGTM!The inclusion of
permission_router
appears correct and follows the pattern used for other routers.
61-83
: LGTM!The inclusion of authentication and user management routers appears correct and follows the pattern used for other routers.
Also applies to: 96-117
Line range hint
1-345
:
LGTM!The overall structure and completeness of the file appear to be well-organized and functional.
Tools
Ruff
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
cognee/api/client.py
Outdated
from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager
- from fastapi import Depends, FastAPI
- from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
- from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
- from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
+ from contextlib import asynccontextmanager
+ from fastapi import Depends, FastAPI
+ from cognee.infrastructure.databases.relational.user_authentication.routers import permission_router
+ from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, create_db_and_tables
+ from cognee.infrastructure.databases.relational.user_authentication.schemas import UserCreate, UserRead, UserUpdate
+ from cognee.infrastructure.databases.relational.user_authentication.users import auth_backend, current_active_user, fastapi_users
Also applies to: 31-32, 53-60
cognee/api/client.py
Outdated
@asynccontextmanager | ||
async def lifespan(app: FastAPI): | ||
# Not needed if you setup a migration system like Alembic | ||
await create_db_and_tables() | ||
yield | ||
app = FastAPI(debug = os.getenv("ENV") != "prod", lifespan=lifespan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate lifespan
context manager.
The lifespan
context manager is defined twice, which can cause conflicts.
- @asynccontextmanager
- async def lifespan(app: FastAPI):
- # Not needed if you setup a migration system like Alembic
- await create_db_and_tables()
- yield
Also applies to: 90-95
cognee/api/client.py
Outdated
@app.get("/authenticated-route") | ||
async def authenticated_route(user: User = Depends(current_active_user)): | ||
return {"message": f"Hello {user.email}!"} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate function authenticated_route
.
The function authenticated_route
is defined twice, which can cause conflicts.
- @app.get("/authenticated-route")
- async def authenticated_route(user: User = Depends(current_active_user)):
- return {"message": f"Hello {user.email}!"}
Also applies to: 120-123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- cognee/api/client.py (4 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
Additional context used
Ruff
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
112-116: f-string without any placeholders
Remove extraneous
f
prefix(F541)
120-120: f-string without any placeholders
Remove extraneous
f
prefix(F541)
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
53-53: Module level import not at top of file
(E402)
55-55: Module level import not at top of file
(E402)
57-57: Module level import not at top of file
(E402)
58-58: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
Additional comments not posted (14)
cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (14)
7-16
: Add type hints, docstrings, and conditionally constructdb_location
.Include type hints and docstrings for better readability and maintainability. Construct
db_location
only if the database type is DuckDB.- def __init__(self, db_type: str, db_path: str, db_name: str, db_user:str, db_password:str, db_host:str, db_port:str): + def __init__(self, db_type: str, db_path: str, db_name: str, db_user: str, db_password: str, db_host: str, db_port: str): + """ + Initializes the SQLAlchemyAdapter with the specified database configuration. + """ - self.db_location = os.path.abspath(os.path.join(db_path, db_name)) + if db_type == "duckdb": + self.db_location = os.path.abspath(os.path.join(db_path, db_name))
19-22
: Add type hints and docstrings.Include type hints and docstrings for better readability and maintainability.
+ async def get_async_session(self) -> AsyncSession: + """ + Yields an asynchronous SQLAlchemy session. + """
24-27
: Add type hints and docstrings.Include type hints and docstrings for better readability and maintainability.
+ def get_session(self) -> sessionmaker: + """ + Yields a synchronous SQLAlchemy session. + """
29-38
: Add type hints, docstrings, and improve filtering logic.Include type hints and docstrings for better readability and maintainability. Improve filtering logic for readability.
+ def get_datasets(self) -> List[str]: + """ + Retrieves distinct schema names from the database, excluding staging schemas and 'cognee'. + """ - return list( - filter( - lambda schema_name: not schema_name.endswith("staging") and schema_name != "cognee", - tables - ) - ) + return [schema_name for schema_name in tables if not schema_name.endswith("staging") and schema_name != "cognee"]
40-43
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def get_files_metadata(self, dataset_name: str) -> List[Dict[str, Any]]: + """ + Retrieves file metadata from the specified dataset. + """ - result = connection.execute(text(f"SELECT id, name, file_path, extension, mime_type FROM {dataset_name}.file_metadata;")) + result = connection.execute(text("SELECT id, name, file_path, extension, mime_type FROM :dataset_name.file_metadata;"), {'dataset_name': dataset_name})
45-49
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def create_table(self, schema_name: str, table_name: str, table_config: List[Dict[str, str]]) -> None: + """ + Creates a table with the specified schema and configuration. + """ - connection.execute(text(f"CREATE SCHEMA IF NOT EXISTS {schema_name};")) - connection.execute(text(f"CREATE TABLE IF NOT EXISTS {schema_name}.{table_name} ({', '.join(fields_query_parts)});")) + connection.execute(text("CREATE SCHEMA IF NOT EXISTS :schema_name;"), {'schema_name': schema_name}) + connection.execute(text("CREATE TABLE IF NOT EXISTS :schema_name.:table_name (:fields_query_parts);"), {'schema_name': schema_name, 'table_name': table_name, 'fields_query_parts': ', '.join(fields_query_parts)})
51-53
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def delete_table(self, table_name: str) -> None: + """ + Deletes the specified table. + """ - connection.execute(text(f"DROP TABLE IF EXISTS {table_name};")) + connection.execute(text("DROP TABLE IF EXISTS :table_name;"), {'table_name': table_name})
55-60
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def insert_data(self, schema_name: str, table_name: str, data: List[Dict[str, Any]]) -> None: + """ + Inserts data into the specified table. + """ - insert_query = text(f"INSERT INTO {schema_name}.{table_name} ({columns}) VALUES {values};") + insert_query = text("INSERT INTO :schema_name.:table_name (:columns) VALUES :values;") + with self.engine.connect() as connection: + connection.execute(insert_query, {'schema_name': schema_name, 'table_name': table_name, 'columns': columns, 'values': values, **data})
62-77
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def get_data(self, table_name: str, filters: Optional[Dict[str, Union[str, List[str]]]] = None) -> Dict[str, Any]: + """ + Retrieves data from the specified table based on the provided filters. + """ - query = f"SELECT * FROM {table_name}" + query = "SELECT * FROM :table_name" - filter_conditions = " AND ".join([ - f"{key} IN ({', '.join([f':{key}{i}' for i in range(len(value))])})" if isinstance(value, list) - else f"{key} = :{key}" for key, value in filters.items() - ]) - query += f" WHERE {filter_conditions};" + filter_conditions = " AND ".join([ + f"{key} IN (:values)" if isinstance(value, list) + else f"{key} = :{key}" for key, value in filters.items() + ]) + query += f" WHERE {filter_conditions};" + query = text(query) + results = connection.execute(query, {'table_name': table_name, 'values': ', '.join([f':{key}{i}' for i in range(len(value))]), **filters})
79-82
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def execute_query(self, query: str) -> List[Dict[str, Any]]: + """ + Executes the provided SQL query. + """ - result = connection.execute(text(query)) + result = connection.execute(text(":query"), {'query': query})
84-98
: Add type hints, docstrings, and use parameterized queries.Include type hints and docstrings for better readability and maintainability. Use parameterized queries to prevent SQL injection.
+ def load_cognify_data(self, data: List[Dict[str, Any]]) -> None: + """ + Loads data into the `cognify` table. + """ - insert_query = cognify_table.insert().values(document_id=text(':document_id'), layer_id=text(':layer_id')) + insert_query = cognify_table.insert().values(document_id=:document_id, layer_id=:layer_id)
100-122
: Add type hints, docstrings, and remove redundant table creation logic.Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before fetching data. Remove extraneous
f
prefixes from f-strings without placeholders.+ def fetch_cognify_data(self, excluded_document_id: str) -> List[Dict[str, Any]]: + """ + Fetches data from the `cognify` table based on the provided `excluded_document_id`. + """ - connection.execute(text(""" - CREATE TABLE IF NOT EXISTS cognify ( - document_id STRING, - layer_id STRING, - created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, - updated_at TIMESTAMP DEFAULT NULL, - processed BOOLEAN DEFAULT FALSE, - document_id_target STRING NULL - ); - """)) - query = text(f""" + query = text(""" + SELECT document_id, layer_id, created_at, updated_at, processed + FROM cognify + WHERE document_id != :excluded_document_id AND processed = FALSE; + """) - update_query = text(f"UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;") + update_query = text("UPDATE cognify SET processed = TRUE WHERE document_id IN :document_ids;")Tools
Ruff
112-116: f-string without any placeholders
Remove extraneous
f
prefix(F541)
120-120: f-string without any placeholders
Remove extraneous
f
prefix(F541)
124-137
: Add type hints, docstrings, and remove redundant table creation logic.Include type hints and docstrings for better readability and maintainability. Remove redundant table creation logic before deletion.
+ def delete_cognify_data(self) -> None: + """ + Deletes data from the `cognify` table. + """ - connection.execute(text(""" - CREATE TABLE IF NOT EXISTS cognify ( - document_id STRING, - layer_id STRING, - created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, - updated_at TIMESTAMP DEFAULT NULL, - processed BOOLEAN DEFAULT FALSE, - document_id_target STRING NULL - ); - """))
139-143
: Add type hints and docstrings.Include type hints and docstrings for better readability and maintainability.
+ def delete_database(self) -> None: + """ + Deletes the database file and its write-ahead log file. + """
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Outside diff range, codebase verification and nitpick comments (2)
cognee/api/v1/authenticate_user/authenticate_user.py (1)
1-1
: Theauthenticate_user_method
function does not exist in theauthentication_db
module.
- The file
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
does not contain a function namedauthenticate_user_method
.Please review the import statement in
cognee/api/v1/authenticate_user/authenticate_user.py
and ensure that the correct module and function are being imported.Analysis chain
Verify the import statement.
Ensure that the
authenticate_user_method
function exists in theauthentication_db
module.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence of the `authenticate_user_method` function in the `authentication_db` module. # Test: Search for the function definition. Expect: The function definition exists. ast-grep --lang python --pattern $'def authenticate_user_method($_):'Length of output: 70
Script:
#!/bin/bash # Description: Locate the `authentication_db` module and verify the existence of the `authenticate_user_method` function. # Locate the `authentication_db` module fd 'authentication_db.py' # Search for the function definition within the located file(s) fd 'authentication_db.py' --exec ast-grep --lang python --pattern $'def authenticate_user_method($_):' {}Length of output: 216
cognee/api/v1/create_user/create_user.py (1)
1-1
: Thecreate_user_method
function does not exist in theauthentication_db
module.
- The module
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
exists.- However, the function
create_user_method
was not found within this module.Please review and correct the import statement in
cognee/api/v1/create_user/create_user.py
.Analysis chain
Verify the import statement.
Ensure that the
create_user_method
function exists in theauthentication_db
module.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence of the `create_user_method` function in the `authentication_db` module. # Test: Search for the function definition. Expect: The function definition exists. ast-grep --lang python --pattern $'def create_user_method($_, $_, $_):'Length of output: 72
Script:
#!/bin/bash # Description: Re-verify the existence of the `create_user_method` function in the `authentication_db` module with a broader search pattern. # Test: Search for any function named `create_user_method`. Expect: The function definition exists. ast-grep --lang python --pattern 'def create_user_method($$):' # Additionally, verify the module path to ensure it exists. fd 'authentication_db.py' | xargs -I {} echo "Found module: {}"Length of output: 228
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (8)
- cognee/api/v1/authenticate_user/init.py (1 hunks)
- cognee/api/v1/authenticate_user/authenticate_user.py (1 hunks)
- cognee/api/v1/create_user/init.py (1 hunks)
- cognee/api/v1/create_user/create_user.py (1 hunks)
- cognee/api/v1/reset_user_password/init.py (1 hunks)
- cognee/api/v1/verify_user_token/init.py (1 hunks)
- cognee/api/v1/verify_user_token/verify_user_token.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
Files skipped from review due to trivial changes (4)
- cognee/api/v1/authenticate_user/init.py
- cognee/api/v1/create_user/init.py
- cognee/api/v1/reset_user_password/init.py
- cognee/api/v1/verify_user_token/init.py
Additional context used
Ruff
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
15-15: Redefinition of unused
DeclarativeBase
from line 9Remove definition:
DeclarativeBase
(F811)
15-15: Redefinition of unused
sessionmaker
from line 9Remove definition:
sessionmaker
(F811)
15-15: Redefinition of unused
Session
from line 9Remove definition:
Session
(F811)
149-149: Local variable
user_manager
is assigned to but never usedRemove assignment to unused variable
user_manager
(F841)
158-158: Do not use bare
except
(E722)
174-174: Do not use bare
except
(E722)
Additional comments not posted (10)
cognee/api/v1/verify_user_token/verify_user_token.py (1)
1-1
: Verify the import statement.Ensure that the
user_check_token
function exists in theauthentication_db
module.Verification successful
The
user_check_token
function exists in theauthentication_db
module. The import statement is correct.
- The
user_check_token
function is defined towards the end of theauthentication_db
module.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence of the `user_check_token` function in the `authentication_db` module. # Test: Search for the function definition. Expect: The function definition exists. ast-grep --lang python --pattern $'def user_check_token($_):'Length of output: 62
Script:
#!/bin/bash # Description: List all function definitions in the `authentication_db` module to verify the presence of `user_check_token`. # Locate the `authentication_db` module auth_db_path=$(fd authentication_db.py) # List all function definitions in the `authentication_db` module ast-grep --lang python --pattern $'def $_($_):' $auth_db_pathLength of output: 258
Script:
#!/bin/bash # Description: Display the contents of the `authentication_db` module to verify the presence of `user_check_token`. # Locate the `authentication_db` module auth_db_path=$(fd authentication_db.py) # Display the contents of the `authentication_db` module cat $auth_db_pathLength of output: 7948
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (9)
20-27
: Consider adding explicit primary keys for association tables.While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.
- user_group = Table('user_group', Base.metadata, - Column('user_id', UUID, ForeignKey('users.id')), - Column('group_id', UUID, ForeignKey('groups.id'))) + user_group = Table('user_group', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('user_id', UUID, ForeignKey('users.id')), + Column('group_id', UUID, ForeignKey('groups.id'))) - group_permission = Table('group_permission', Base.metadata, - Column('group_id', UUID, ForeignKey('groups.id')), - Column('permission_id', UUID, ForeignKey('permissions.id'))) + group_permission = Table('group_permission', Base.metadata, + Column('id', UUID, primary_key=True, index=True), + Column('group_id', UUID, ForeignKey('groups.id')), + Column('permission_id', UUID, ForeignKey('permissions.id')))
35-46
: LGTM!The model classes
Group
andPermission
are correctly defined with primary keys and relationships.
60-65
: LGTM!The configuration section is correctly defined.
67-69
: LGTM!The function
create_db_and_tables
is correctly defined.
71-73
: LGTM!The function
get_async_session
is correctly defined.
76-77
: LGTM!The function
get_user_db
is correctly defined.
80-81
: LGTM!The function
hash_password
is correctly defined.
114-122
: LGTM!The function
reset_user_password_method
is correctly defined.
48-57
: Verify the relationship todataset_data
.The
ACL
class includes a relationship todataset_data
, which should be reviewed for correctness.Verification successful
Verify the relationship to
dataset_data
.The
ACL
class includes a relationship todataset_data
, which should be reviewed for correctness.
Verified: The relationship to
dataset_data
is correctly defined.The
ACL
class includes a correctly defined relationship todataset_data
based on the table name specified in theDatasetData
model.
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
: Relationship todataset_data
is correctly defined.cognee/infrastructure/data/models/DatasetData.py
:__tablename__
is set to "dataset_data".Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the relationship to `dataset_data` in the `ACL` class. # Test: Search for the `dataset_data` definition. Expect: Correct relationship definition. rg --type python -A 5 $'dataset_data'Length of output: 67
Script:
#!/bin/bash # Description: Verify the relationship to `dataset_data` in the `ACL` class. # Test: Search for the `dataset_data` definition. Expect: Correct relationship definition. rg --type py -A 5 $'dataset_data'Length of output: 4769
async def verify_user_token(token: str): | ||
|
||
output = await user_check_token(token=token) | ||
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and a docstring.
The function lacks error handling and documentation. Consider adding a try-except block and a docstring.
+ """
+ Verify the user token.
+
+ Args:
+ token (str): The user token to verify.
+
+ Returns:
+ output: The result of the token verification.
+ """
async def verify_user_token(token: str):
+ try:
output = await user_check_token(token=token)
+ except Exception as e:
+ # Handle the exception (e.g., log the error, raise a custom exception, etc.)
+ raise e
return output
Committable suggestion was skipped due to low confidence.
# async def generate_verification_token(email: str, tokens_db: dict) -> str: | ||
# async with get_async_session_context() as session: | ||
# async with get_user_db_context(session) as user_db: | ||
# if not await user_db.get_by_email(email): | ||
# raise ValueError("User does not exist") | ||
# token = str(uuid.uuid4()) | ||
# tokens_db[token] = email | ||
# return token | ||
|
||
# async def verify_user_method(token: str, tokens_db: dict) -> bool: | ||
# async with get_async_session_context() as session: | ||
# async with get_user_db_context(session) as user_db: | ||
# email = tokens_db.get(token) | ||
# if not email or not await user_db.get_by_email(email): | ||
# return False | ||
# user = await user_db.get_by_email(email) | ||
# user.is_verified = True | ||
# await user_db.update(user) | ||
# return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address commented-out functions.
The commented-out functions for generating and verifying tokens should be reviewed for relevance and either implemented or removed.
Do you want me to help implement these functions or open a GitHub issue to track this task?
async def authenticate_user(): | ||
""" | ||
This function is used to authenticate a user. | ||
""" | ||
output = await authenticate_user_method() | ||
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and improve the docstring.
The function lacks error handling and the docstring could be more descriptive. Consider adding a try-except block and enhancing the docstring.
async def authenticate_user():
- """
- This function is used to authenticate a user.
- """
+ """
+ Authenticate a user.
+
+ Returns:
+ output: The result of the user authentication.
+ """
+ try:
output = await authenticate_user_method()
+ except Exception as e:
+ # Handle the exception (e.g., log the error, raise a custom exception, etc.)
+ raise e
return output
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def authenticate_user(): | |
""" | |
This function is used to authenticate a user. | |
""" | |
output = await authenticate_user_method() | |
return output | |
async def authenticate_user(): | |
""" | |
Authenticate a user. | |
Returns: | |
output: The result of the user authentication. | |
""" | |
try: | |
output = await authenticate_user_method() | |
except Exception as e: | |
# Handle the exception (e.g., log the error, raise a custom exception, etc.) | |
raise e | |
return output |
async def authenticate_user_method(email: str, password: str) -> Optional[User]: | ||
try: | ||
async with get_async_session_context() as session: | ||
async with get_user_db_context(session) as user_db: | ||
async with get_user_manager_context(user_db) as user_manager: | ||
credentials = OAuth2PasswordRequestForm(username=email, password=password) | ||
user = await user_manager.authenticate(credentials) | ||
if user is None or not user.is_active: | ||
return None | ||
return user | ||
except Exception as e: | ||
print(f"Error during authentication: {e}") | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve exception handling by logging the exception.
The function authenticate_user_method
is correctly defined, but the exception handling can be improved by logging the exception.
- except Exception as e:
- print(f"Error during authentication: {e}")
+ except Exception as e:
+ print(f"Error during authentication: {e}")
+ raise e
Committable suggestion was skipped due to low confidence.
async def create_user_method(email: str, password: str, is_superuser: bool = False): | ||
try: | ||
async with get_async_session_context() as session: | ||
async with get_user_db_context(session) as user_db: | ||
async with get_user_manager_context(user_db) as user_manager: | ||
user = await user_manager.create( | ||
UserCreate(email=email, password=password, is_superuser=is_superuser) | ||
) | ||
print(f"User created: {user.email}") | ||
except UserAlreadyExists: | ||
print(f"User {email} already exists") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve exception handling by logging the exception.
The function create_user_method
is correctly defined, but the exception handling can be improved by logging the exception.
- except UserAlreadyExists:
- print(f"User {email} already exists")
+ except UserAlreadyExists as e:
+ print(f"User {email} already exists: {e}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def create_user_method(email: str, password: str, is_superuser: bool = False): | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db) as user_manager: | |
user = await user_manager.create( | |
UserCreate(email=email, password=password, is_superuser=is_superuser) | |
) | |
print(f"User created: {user.email}") | |
except UserAlreadyExists: | |
print(f"User {email} already exists") | |
async def create_user_method(email: str, password: str, is_superuser: bool = False): | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db) as user_manager: | |
user = await user_manager.create( | |
UserCreate(email=email, password=password, is_superuser=is_superuser) | |
) | |
print(f"User created: {user.email}") | |
except UserAlreadyExists as e: | |
print(f"User {email} already exists: {e}") |
import hashlib | ||
import uuid | ||
from typing import AsyncGenerator, Generator, Optional | ||
|
||
from fastapi import Depends | ||
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | ||
from sqlalchemy import create_engine, UUID | ||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | ||
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session | ||
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base | ||
from sqlalchemy import Column, String, ForeignKey, Table, Integer | ||
from contextlib import asynccontextmanager | ||
from fastapi_users.exceptions import UserAlreadyExists | ||
from fastapi_users.authentication import JWTStrategy | ||
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean up redundant and redefined imports.
The imports DeclarativeBase
, sessionmaker
, and Session
are redefined on line 15. Additionally, there are redundant imports that can be removed to improve readability and maintainability.
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship
+ from sqlalchemy.orm import relationship
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import hashlib | |
import uuid | |
from typing import AsyncGenerator, Generator, Optional | |
from fastapi import Depends | |
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | |
from sqlalchemy import create_engine, UUID | |
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | |
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session | |
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base | |
from sqlalchemy import Column, String, ForeignKey, Table, Integer | |
from contextlib import asynccontextmanager | |
from fastapi_users.exceptions import UserAlreadyExists | |
from fastapi_users.authentication import JWTStrategy | |
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship | |
import hashlib | |
import uuid | |
from typing import AsyncGenerator, Generator, Optional | |
from fastapi import Depends | |
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | |
from sqlalchemy import create_engine, UUID | |
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | |
from cognee.infrastructure.databases.relational import get_relationaldb_config, Base | |
from sqlalchemy import Column, String, ForeignKey, Table, Integer | |
from contextlib import asynccontextmanager | |
from fastapi_users.exceptions import UserAlreadyExists | |
from fastapi_users.authentication import JWTStrategy | |
from sqlalchemy.orm import relationship |
Tools
Ruff
15-15: Redefinition of unused
DeclarativeBase
from line 9Remove definition:
DeclarativeBase
(F811)
15-15: Redefinition of unused
sessionmaker
from line 9Remove definition:
sessionmaker
(F811)
15-15: Redefinition of unused
Session
from line 9Remove definition:
Session
(F811)
async def user_create_token(user: User) -> Optional[str]: | ||
try: | ||
async with get_async_session_context() as session: | ||
async with get_user_db_context(session) as user_db: | ||
async with get_user_manager_context(user_db) as user_manager: | ||
if user is None: | ||
return None | ||
strategy = get_jwt_strategy() | ||
token = await strategy.write_token(user) | ||
if token is not None: | ||
return token | ||
else: | ||
return None | ||
except: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused variable and improve exception handling.
The function user_create_token
is correctly defined, but the unused user_manager
variable should be removed. Additionally, the bare except
should be replaced with a specific exception.
- async with get_user_manager_context(user_db) as user_manager:
+ async with get_user_manager_context(user_db):
...
- except:
+ except Exception as e:
+ print(f"Error during token creation: {e}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def user_create_token(user: User) -> Optional[str]: | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db) as user_manager: | |
if user is None: | |
return None | |
strategy = get_jwt_strategy() | |
token = await strategy.write_token(user) | |
if token is not None: | |
return token | |
else: | |
return None | |
except: | |
return None | |
async def user_create_token(user: User) -> Optional[str]: | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db): | |
if user is None: | |
return None | |
strategy = get_jwt_strategy() | |
token = await strategy.write_token(user) | |
if token is not None: | |
return token | |
else: | |
return None | |
except Exception as e: | |
print(f"Error during token creation: {e}") | |
return None |
Tools
Ruff
149-149: Local variable
user_manager
is assigned to but never usedRemove assignment to unused variable
user_manager
(F841)
158-158: Do not use bare
except
(E722)
async def user_check_token(token: str) -> bool: | ||
try: | ||
async with get_async_session_context() as session: | ||
async with get_user_db_context(session) as user_db: | ||
async with get_user_manager_context(user_db) as user_manager: | ||
if token is None: | ||
return False | ||
strategy = get_jwt_strategy() | ||
user = await strategy.read_token(token, user_manager) | ||
if user is None or not user.is_active: | ||
return False | ||
else: | ||
return True | ||
except: | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve exception handling.
The function user_check_token
is correctly defined, but the bare except
should be replaced with a specific exception.
- except:
- return False
+ except Exception as e:
+ print(f"Error during token validation: {e}")
+ return False
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def user_check_token(token: str) -> bool: | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db) as user_manager: | |
if token is None: | |
return False | |
strategy = get_jwt_strategy() | |
user = await strategy.read_token(token, user_manager) | |
if user is None or not user.is_active: | |
return False | |
else: | |
return True | |
except: | |
return False | |
async def user_check_token(token: str) -> bool: | |
try: | |
async with get_async_session_context() as session: | |
async with get_user_db_context(session) as user_db: | |
async with get_user_manager_context(user_db) as user_manager: | |
if token is None: | |
return False | |
strategy = get_jwt_strategy() | |
user = await strategy.read_token(token, user_manager) | |
if user is None or not user.is_active: | |
return False | |
else: | |
return True | |
except Exception as e: | |
print(f"Error during token validation: {e}") | |
return False |
Tools
Ruff
174-174: Do not use bare
except
(E722)
async def create_user(email: str, password: str, is_superuser: bool = False): | ||
output = await create_user_method(email=email, password=password, is_superuser=is_superuser) | ||
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling and a docstring.
The function lacks error handling and documentation. Consider adding a try-except block and a docstring.
+ """
+ Create a new user.
+
+ Args:
+ email (str): The email of the user.
+ password (str): The password of the user.
+ is_superuser (bool): Whether the user is a superuser. Defaults to False.
+
+ Returns:
+ output: The result of the user creation.
+ """
async def create_user(email: str, password: str, is_superuser: bool = False):
+ try:
output = await create_user_method(email=email, password=password, is_superuser=is_superuser)
+ except Exception as e:
+ # Handle the exception (e.g., log the error, raise a custom exception, etc.)
+ raise e
return output
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def create_user(email: str, password: str, is_superuser: bool = False): | |
output = await create_user_method(email=email, password=password, is_superuser=is_superuser) | |
return output | |
""" | |
Create a new user. | |
Args: | |
email (str): The email of the user. | |
password (str): The password of the user. | |
is_superuser (bool): Whether the user is a superuser. Defaults to False. | |
Returns: | |
output: The result of the user creation. | |
""" | |
async def create_user(email: str, password: str, is_superuser: bool = False): | |
try: | |
output = await create_user_method(email=email, password=password, is_superuser=is_superuser) | |
except Exception as e: | |
# Handle the exception (e.g., log the error, raise a custom exception, etc.) | |
raise e | |
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (7)
- cognee/api/v1/authenticate_user/authenticate_user.py (1 hunks)
- cognee/api/v1/cognify/cognify_v2.py (3 hunks)
- cognee/api/v1/create_user/create_user.py (1 hunks)
- cognee/api/v1/verify_user_token/verify_user_token.py (1 hunks)
- cognee/infrastructure/data/models/DatasetData.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (1 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
Files skipped from review as they are similar to previous changes (3)
- cognee/api/v1/authenticate_user/authenticate_user.py
- cognee/api/v1/create_user/create_user.py
- cognee/api/v1/verify_user_token/verify_user_token.py
Additional context used
Ruff
cognee/infrastructure/databases/relational/user_authentication/authentication_db.py
15-15: Redefinition of unused
DeclarativeBase
from line 9Remove definition:
DeclarativeBase
(F811)
15-15: Redefinition of unused
sessionmaker
from line 9Remove definition:
sessionmaker
(F811)
15-15: Redefinition of unused
Session
from line 9Remove definition:
Session
(F811)
cognee/api/v1/cognify/cognify_v2.py
83-83: Undefined name
hashed_user_id
(F821)
83-83: Undefined name
user_permissions
(F821)
cognee/infrastructure/databases/relational/user_authentication/users.py
14-14: Redefinition of unused
Depends
from line 5Remove definition:
Depends
(F811)
19-19: Redefinition of unused
JWTStrategy
from line 10Remove definition:
JWTStrategy
(F811)
150-150: Local variable
user_manager
is assigned to but never usedRemove assignment to unused variable
user_manager
(F841)
159-159: Do not use bare
except
(E722)
175-175: Do not use bare
except
(E722)
Additional comments not posted (17)
cognee/infrastructure/data/models/DatasetData.py (4)
8-8
: Ensure consistency with ORM framework.The class now inherits from
Base
instead ofModelBase
. Verify that this change aligns with the ORM framework being used.
13-14
: Primary key adjustment.The primary key definitions for
dataset_id
anddata_id
have been removed. Ensure that the new primary key setup withid
is consistent with the overall data model.
15-17
: Unique constraint added.The unique constraint on
dataset_id
anddata_id
ensures data integrity. This is a good practice to prevent duplicate entries.
19-19
: Relationship with ACL model.The relationship with the
ACL
model establishes a back-reference. Ensure that theACL
model has the corresponding relationship defined.cognee/infrastructure/databases/relational/user_authentication/authentication_db.py (4)
36-42
: Ensure relationships are correctly defined.The
Group
model defines several relationships. Ensure that the corresponding relationships are defined in the related models.
43-47
: Ensure relationships are correctly defined.The
Permission
model defines a relationship with theGroup
model. Ensure that the corresponding relationship is defined in theGroup
model.
49-58
: Ensure relationships are correctly defined.The
ACL
model defines relationships withDatasetData
,User
, andGroup
. Ensure that the corresponding relationships are defined in the related models.
31-34
: Addid
column to theUser
model.The
User
model is missing theid
column, which is necessary for primary key identification.+ id = Column(UUID, primary_key=True, index=True)
Likely invalid or redundant comment.
cognee/api/v1/cognify/cognify_v2.py (2)
34-37
: LGTM!The
PermissionDeniedException
class is correctly implemented.
39-47
: Verify the integration of user permissions.The function now retrieves user permissions and checks for the required "write" permission. Ensure that the
get_user_permissions
function and thefastapi_users
manager are correctly implemented and integrated.cognee/infrastructure/databases/relational/user_authentication/users.py (7)
26-41
: Replace print statements with proper logging.Using print statements for logging is not ideal for production code. Consider using the
logging
module.import logging logger = logging.getLogger(__name__) class UserManager(UUIDIDMixin, BaseUserManager[User, uuid.UUID]): reset_password_token_secret = SECRET verification_token_secret = SECRET async def on_after_register(self, user: User, request: Optional[Request] = None): logger.info(f"User {user.id} has registered.") async def on_after_forgot_password(self, user: User, token: str, request: Optional[Request] = None): logger.info(f"User {user.id} has forgot their password. Reset token: {token}") async def on_after_request_verify(self, user: User, token: str, request: Optional[Request] = None): logger.info(f"Verification requested for user {user.id}. Verification token: {token}")
44-45
: LGTM!The
get_user_manager
function correctly yields theUserManager
instance.
51-52
: LGTM!The
get_jwt_strategy
function correctly returns the JWT strategy.
66-70
: LGTM!The
get_user_permissions
function correctly retrieves and returns the user's permissions.
72-78
: LGTM!The
has_permission
function correctly checks the user's permissions and raises an HTTP exception if the user lacks the required permission.
81-82
: LGTM!The
hash_password
function correctly hashes the password using SHA-256.
115-123
: LGTM!The
reset_user_password_method
function correctly resets the user's password and updates the database.
from fastapi_users.db import SQLAlchemyBaseUserTableUUID, SQLAlchemyUserDatabase | ||
from sqlalchemy import create_engine, UUID | ||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine | ||
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean up redundant and redefined imports.
The imports DeclarativeBase
, sessionmaker
, and Session
are redefined. Additionally, there are redundant imports that can be removed to improve readability and maintainability.
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session
- from sqlalchemy.orm import DeclarativeBase, sessionmaker, Session, relationship
+ from sqlalchemy.orm import relationship
Also applies to: 15-15
user_group = Table('user_group', Base.metadata, | ||
Column('user_id', UUID, ForeignKey('users.id')), | ||
Column('group_id', UUID, ForeignKey('groups.id'))) | ||
|
||
# Association table for many-to-many relationship between groups and permissions | ||
group_permission = Table('group_permission', Base.metadata, | ||
Column('group_id', UUID, ForeignKey('groups.id')), | ||
Column('permission_id', UUID, ForeignKey('permissions.id'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding explicit primary keys for association tables.
While the association tables are correctly defined, adding explicit primary keys can help with performance and data integrity.
- user_group = Table('user_group', Base.metadata,
- Column('user_id', UUID, ForeignKey('users.id')),
- Column('group_id', UUID, ForeignKey('groups.id')))
+ user_group = Table('user_group', Base.metadata,
+ Column('id', UUID, primary_key=True, index=True),
+ Column('user_id', UUID, ForeignKey('users.id')),
+ Column('group_id', UUID, ForeignKey('groups.id')))
- group_permission = Table('group_permission', Base.metadata,
- Column('group_id', UUID, ForeignKey('groups.id')),
- Column('permission_id', UUID, ForeignKey('permissions.id')))
+ group_permission = Table('group_permission', Base.metadata,
+ Column('id', UUID, primary_key=True, index=True),
+ Column('group_id', UUID, ForeignKey('groups.id')),
+ Column('permission_id', UUID, ForeignKey('permissions.id')))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
user_group = Table('user_group', Base.metadata, | |
Column('user_id', UUID, ForeignKey('users.id')), | |
Column('group_id', UUID, ForeignKey('groups.id'))) | |
# Association table for many-to-many relationship between groups and permissions | |
group_permission = Table('group_permission', Base.metadata, | |
Column('group_id', UUID, ForeignKey('groups.id')), | |
Column('permission_id', UUID, ForeignKey('permissions.id'))) | |
user_group = Table('user_group', Base.metadata, | |
Column('id', UUID, primary_key=True, index=True), | |
Column('user_id', UUID, ForeignKey('users.id')), | |
Column('group_id', UUID, ForeignKey('groups.id'))) | |
# Association table for many-to-many relationship between groups and permissions | |
group_permission = Table('group_permission', Base.metadata, | |
Column('id', UUID, primary_key=True, index=True), | |
Column('group_id', UUID, ForeignKey('groups.id')), | |
Column('permission_id', UUID, ForeignKey('permissions.id'))) |
async def create_user_method(email: str, password: str, is_superuser: bool = False): | ||
try: | ||
async with get_async_session_context() as session: | ||
async with get_user_db_context(session) as user_db: | ||
async with get_user_manager_context(user_db) as user_manager: | ||
user = await user_manager.create( | ||
UserCreate(email=email, password=password, is_superuser=is_superuser) | ||
) | ||
print(f"User created: {user.email}") | ||
except UserAlreadyExists: | ||
print(f"User {email} already exists") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace print statements with proper logging.
Using print statements for logging is not ideal for production code. Consider using the logging
module.
import logging
logger = logging.getLogger(__name__)
async def create_user_method(email: str, password: str, is_superuser: bool = False):
try:
async with get_async_session_context() as session:
async with get_user_db_context(session) as user_db:
async with get_user_manager_context(user_db) as user_manager:
user = await user_manager.create(
UserCreate(email=email, password=password, is_superuser=is_superuser)
)
logger.info(f"User created: {user.email}")
except UserAlreadyExists:
logger.warning(f"User {email} already exists")
cognee/api/v1/cognify/cognify_v2.py
Outdated
@@ -60,7 +80,7 @@ async def run_cognify_pipeline(dataset_name: str, files: list[dict]): | |||
root_node_id = "ROOT" | |||
|
|||
tasks = [ | |||
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type | |||
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undefined variables hashed_user_id
and user_permissions
.
The variables hashed_user_id
and user_permissions
are used but not defined in the provided code. Ensure that these variables are correctly defined and passed to the Task
instances.
- Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
+ Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type | |
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type |
Tools
Ruff
83-83: Undefined name
hashed_user_id
(F821)
83-83: Undefined name
user_permissions
(F821)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
Outside diff range, codebase verification and nitpick comments (3)
cognee/api/v1/add/add.py (2)
89-89
: The function call todata_resources
does not match the new signature.
- File:
cognee/api/v1/add/add.py
- Lines: 89-89
- Issue: The function call to
data_resources
does not include theuser_id
parameter, which is now required by the updated function signature.Analysis chain
LGTM! But verify the function usage in the codebase.
The code changes are approved.
However, ensure that all function calls to
data_resources
match the new signature.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify all function calls to `data_resources` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 $'data_resources'Length of output: 69
Script:
#!/bin/bash # Description: Verify all function calls to `data_resources` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type py -A 5 $'data_resources'Length of output: 920
55-55
: Update the function call to match the new signature.The function call to
add_files
incognee/api/v1/add/add.py
does not include theuser_id
parameter. Update this call to match the new function signature.
cognee/api/v1/add/add.py
: Line withreturn await add_files(file_paths, dataset_name)
Analysis chain
LGTM! But verify the function usage in the codebase.
The code changes are approved.
However, ensure that all function calls to
add_files
match the new signature.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify all function calls to `add_files` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 $'add_files'Length of output: 64
Script:
#!/bin/bash # Description: Verify all function calls to `add_files` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type py -A 5 $'add_files'Length of output: 573
cognee/api/v1/cognify/cognify_v2.py (1)
43-43
: Function calls tocognify
need updating to match the new signature.The
cognify
function has been updated to include auser_id
parameter, but several calls to this function in the codebase do not include this new parameter. Please update these calls to match the new function signature.
cognee/tests/test_weaviate.py
cognee/tests/test_qdrant.py
cognee/tests/test_neo4j.py
cognee/tests/test_library.py
Analysis chain
LGTM! But verify the function usage in the codebase.
The code changes are approved.
However, ensure that all function calls to
cognify
match the new signature.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify all function calls to `cognify` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 $'cognify'Length of output: 62
Script:
#!/bin/bash # Description: Verify all function calls to `cognify` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type py -A 5 $'cognify'Length of output: 34467
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- cognee/api/v1/add/add.py (3 hunks)
- cognee/api/v1/cognify/cognify_v2.py (3 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
Additional context used
Ruff
cognee/api/v1/cognify/cognify_v2.py
61-61: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
89-89: Undefined name
hashed_user_id
(F821)
89-89: Undefined name
user_permissions
(F821)
cognee/infrastructure/databases/relational/user_authentication/users.py
14-14: Redefinition of unused
Depends
from line 5Remove definition:
Depends
(F811)
20-20: Redefinition of unused
JWTStrategy
from line 10Remove definition:
JWTStrategy
(F811)
151-151: Local variable
user_manager
is assigned to but never usedRemove assignment to unused variable
user_manager
(F841)
160-160: Do not use bare
except
(E722)
176-176: Do not use bare
except
(E722)
cognee/api/v1/cognify/cognify_v2.py
Outdated
|
||
async with get_async_session_context() as session: | ||
|
||
out = await has_permission_document(user_id, file["id"], "write", session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove assignment to unused variable out
.
The variable out
is assigned but never used.
- out = await has_permission_document(user_id, file["id"], "write", session)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
out = await has_permission_document(user_id, file["id"], "write", session) | |
await has_permission_document(user_id, file["id"], "write", session) |
Tools
Ruff
61-61: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
async def give_permission_document(user: Optional[User], document_id: str, permission: str, | ||
session: AsyncSession): | ||
|
||
acl_entry = ACL( | ||
document_id=document_id, | ||
user_id=user.id, | ||
permission=permission | ||
) | ||
session.add(acl_entry) | ||
await session.commit() | ||
|
||
|
||
if user.is_superuser: | ||
permission = 'all_permissions' # Example permission, change as needed | ||
acl_entry = ACL( | ||
document_id=document_id, | ||
user_id=user.id, | ||
permission=permission | ||
) | ||
session.add(acl_entry) | ||
await session.commit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplify code for superuser permissions.
The code for superuser permissions is redundant and can be simplified.
async def give_permission_document(user: Optional[User], document_id: str, permission: str,
session: AsyncSession):
acl_entry = ACL(
document_id=document_id,
user_id=user.id,
permission=permission
)
session.add(acl_entry)
await session.commit()
if user.is_superuser:
acl_entry = ACL(
document_id=document_id,
user_id=user.id,
permission='all_permissions' # Example permission, change as needed
)
session.add(acl_entry)
await session.commit()
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def give_permission_document(user: Optional[User], document_id: str, permission: str, | |
session: AsyncSession): | |
acl_entry = ACL( | |
document_id=document_id, | |
user_id=user.id, | |
permission=permission | |
) | |
session.add(acl_entry) | |
await session.commit() | |
if user.is_superuser: | |
permission = 'all_permissions' # Example permission, change as needed | |
acl_entry = ACL( | |
document_id=document_id, | |
user_id=user.id, | |
permission=permission | |
) | |
session.add(acl_entry) | |
await session.commit() | |
async def give_permission_document(user: Optional[User], document_id: str, permission: str, | |
session: AsyncSession): | |
acl_entry = ACL( | |
document_id=document_id, | |
user_id=user.id, | |
permission=permission | |
) | |
session.add(acl_entry) | |
await session.commit() | |
if user.is_superuser: | |
acl_entry = ACL( | |
document_id=document_id, | |
user_id=user.id, | |
permission='all_permissions' # Example permission, change as needed | |
) | |
session.add(acl_entry) | |
await session.commit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 13
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- cognee/api/v1/cognify/cognify_v2.py (3 hunks)
- cognee/infrastructure/databases/relational/user_authentication/users.py (1 hunks)
Additional context used
Ruff
cognee/api/v1/cognify/cognify_v2.py
62-62: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
90-90: Undefined name
hashed_user_id
(F821)
90-90: Undefined name
user_permissions
(F821)
cognee/infrastructure/databases/relational/user_authentication/users.py
14-14: Redefinition of unused
Depends
from line 5Remove definition:
Depends
(F811)
20-20: Redefinition of unused
JWTStrategy
from line 10Remove definition:
JWTStrategy
(F811)
154-154: Local variable
user_manager
is assigned to but never usedRemove assignment to unused variable
user_manager
(F841)
163-163: Do not use bare
except
(E722)
179-179: Do not use bare
except
(E722)
Additional comments not posted (11)
cognee/api/v1/cognify/cognify_v2.py (7)
4-13
: Imports look good.The new imports are necessary for the added user management functionalities.
38-41
: Exception class implementation looks good.The
PermissionDeniedException
class is correctly implemented.
42-43
: Function signature update looks good.The
cognify
function now includes auser_id
parameter, which is necessary for user permission checks.
Line range hint
44-48
:
Initialization segment looks good.The segment initializes configurations and creates a task status table, which is necessary for the function.
Line range hint
49-53
:
Dataset handling segment looks good.The segment handles the datasets parameter and calls
cognify
recursively if datasets are not provided.Tools
Ruff
62-62: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
90-90: Undefined name
hashed_user_id
(F821)
90-90: Undefined name
user_permissions
(F821)
65-73
: Task status update segment looks good.The segment handles task status updates using a lock, which is necessary for synchronization.
Line range hint
127-137
:
Existing datasets handling segment looks good.The segment handles existing datasets and calls
run_cognify_pipeline
.Tools
Ruff
62-62: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
90-90: Undefined name
hashed_user_id
(F821)
90-90: Undefined name
user_permissions
(F821)
cognee/infrastructure/databases/relational/user_authentication/users.py (4)
45-46
: Function implementation looks good.The
get_user_manager
function is correctly implemented.
52-53
: Function implementation looks good.The
get_jwt_strategy
function is correctly implemented.
70-74
: Function implementation looks good.The
get_user_permissions
function is correctly implemented.
76-82
: Function implementation looks good.The
has_permission
function is correctly implemented.
cognee/api/v1/cognify/cognify_v2.py
Outdated
|
||
for file in files: | ||
file["id"] = str(uuid.uuid4()) | ||
file["name"] = file["name"].replace(" ", "_") | ||
|
||
async with get_async_session_context() as session: | ||
active_user = await fast_api_users_init() | ||
|
||
out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove assignment to unused variable out
.
The variable out
is assigned but never used.
- out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)
+ await has_permission_document(active_user.current_user(active=True), file["id"], "write", session)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for file in files: | |
file["id"] = str(uuid.uuid4()) | |
file["name"] = file["name"].replace(" ", "_") | |
async with get_async_session_context() as session: | |
active_user = await fast_api_users_init() | |
out = await has_permission_document(active_user.current_user(active=True), file["id"], "write", session) | |
for file in files: | |
file["id"] = str(uuid.uuid4()) | |
file["name"] = file["name"].replace(" ", "_") | |
async with get_async_session_context() as session: | |
active_user = await fast_api_users_init() | |
await has_permission_document(active_user.current_user(active=True), file["id"], "write", session) |
Tools
Ruff
62-62: Local variable
out
is assigned to but never usedRemove assignment to unused variable
out
(F841)
cognee/api/v1/cognify/cognify_v2.py
Outdated
cognee_config = get_cognify_config() | ||
graph_config = get_graph_config() | ||
root_node_id = None | ||
|
||
if graph_config.infer_graph_topology and graph_config.graph_topology_task: | ||
from cognee.modules.topology.topology import TopologyEngine | ||
topology_engine = TopologyEngine(infer=graph_config.infer_graph_topology) | ||
root_node_id = await topology_engine.add_graph_topology(files = files) | ||
elif graph_config.infer_graph_topology and not graph_config.infer_graph_topology: | ||
from cognee.modules.topology.topology import TopologyEngine | ||
topology_engine = TopologyEngine(infer=graph_config.infer_graph_topology) | ||
await topology_engine.add_graph_topology(graph_config.topology_file_path) | ||
elif not graph_config.graph_topology_task: | ||
root_node_id = "ROOT" | ||
|
||
tasks = [ | ||
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type | ||
Task(establish_graph_topology, topology_model = KnowledgeGraph), # Set the graph topology for the document chunk data | ||
Task(expand_knowledge_graph, graph_model = KnowledgeGraph), # Generate knowledge graphs from the document chunks and attach it to chunk nodes | ||
Task(filter_affected_chunks, collection_name = "chunks"), # Find all affected chunks, so we don't process unchanged chunks | ||
Task( | ||
save_data_chunks, | ||
collection_name = "chunks", | ||
), # Save the document chunks in vector db and as nodes in graph db (connected to the document node and between each other) | ||
run_tasks_parallel([ | ||
Task( | ||
summarize_text_chunks, | ||
summarization_model = cognee_config.summarization_model, | ||
collection_name = "chunk_summaries", | ||
), # Summarize the document chunks | ||
Task( | ||
classify_text_chunks, | ||
classification_model = cognee_config.classification_model, | ||
), | ||
]), | ||
Task(remove_obsolete_chunks), # Remove the obsolete document chunks. | ||
] | ||
|
||
pipeline = run_tasks(tasks, [ | ||
PdfDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "pdf" else | ||
AudioDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "audio" else | ||
ImageDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) if file["extension"] == "image" else | ||
TextDocument(title=f"{file['name']}.{file['extension']}", file_path=file["file_path"]) | ||
for file in files | ||
]) | ||
|
||
async for result in pipeline: | ||
print(result) | ||
|
||
update_task_status(dataset_name, "DATASET_PROCESSING_FINISHED") | ||
except Exception as error: | ||
update_task_status(dataset_name, "DATASET_PROCESSING_ERROR") | ||
raise error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve error handling in the task execution segment.
Avoid using bare except
statements to catch specific exceptions.
- except Exception as error:
+ except Exception as e:
+ logger.error(f"Error during task execution: {e}")
Committable suggestion was skipped due to low confidence.
Tools
Ruff
90-90: Undefined name
hashed_user_id
(F821)
90-90: Undefined name
user_permissions
(F821)
cognee/api/v1/cognify/cognify_v2.py
Outdated
root_node_id = "ROOT" | ||
|
||
tasks = [ | ||
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undefined variables hashed_user_id
and user_permissions
.
The variables hashed_user_id
and user_permissions
are used but not defined in the provided code. Ensure that these variables are correctly defined and passed to the Task
instances.
- Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
+ Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }, user_id = hashed_user_id, user_permissions=user_permissions), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type | |
Task(process_documents, parent_node_id = root_node_id, task_config = { "batch_size": 10 }), # Classify documents and save them as a nodes in graph db, extract text chunks based on the document type |
Tools
Ruff
90-90: Undefined name
hashed_user_id
(F821)
90-90: Undefined name
user_permissions
(F821)
from cognee.infrastructure.databases.relational.user_authentication.authentication_db import User, get_user_db, \ | ||
get_async_session, ACL | ||
from fastapi.security import OAuth2PasswordRequestForm | ||
from fastapi_users.authentication import JWTStrategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove redundant import.
The JWTStrategy
import on line 20 is redundant and should be removed.
- from fastapi_users.authentication import JWTStrategy
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from fastapi_users.authentication import JWTStrategy |
Tools
Ruff
20-20: Redefinition of unused
JWTStrategy
from line 10Remove definition:
JWTStrategy
(F811)
print(f"User created: {user.email}") | ||
except UserAlreadyExists: | ||
print(f"User {email} already exists") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace print statements with proper logging.
Using print statements for logging is not ideal for production code. Consider using the logging
module.
- print(f"User {email} already exists")
+ logger.warning(f"User {email} already exists")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
print(f"User created: {user.email}") | |
except UserAlreadyExists: | |
print(f"User {email} already exists") | |
print(f"User created: {user.email}") | |
except UserAlreadyExists: | |
logger.warning(f"User {email} already exists") |
async def on_after_request_verify( | ||
self, user: User, token: str, request: Optional[Request] = None | ||
): | ||
print(f"Verification requested for user {user.id}. Verification token: {token}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace print statements with proper logging.
Using print statements for logging is not ideal for production code. Consider using the logging
module.
- print(f"Verification requested for user {user.id}. Verification token: {token}")
+ logger.info(f"Verification requested for user {user.id}. Verification token: {token}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
print(f"Verification requested for user {user.id}. Verification token: {token}") | |
logger.info(f"Verification requested for user {user.id}. Verification token: {token}") |
return False | ||
else: | ||
return True | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid bare except statements.
Avoid using bare except
statements to catch specific exceptions.
- except:
+ except Exception as e:
+ logger.error(f"Error checking token: {e}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except: | |
except Exception as e: | |
logger.error(f"Error checking token: {e}") |
Tools
Ruff
179-179: Do not use bare
except
(E722)
return token | ||
else: | ||
return None | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid bare except statements.
Avoid using bare except
statements to catch specific exceptions.
- except:
+ except Exception as e:
+ logger.error(f"Error creating token: {e}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except: | |
except Exception as e: | |
logger.error(f"Error creating token: {e}") |
Tools
Ruff
163-163: Do not use bare
except
(E722)
) | ||
from fastapi_users.exceptions import UserAlreadyExists | ||
from fastapi_users.db import SQLAlchemyUserDatabase | ||
from fastapi import Depends, HTTPException, status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove redundant import.
The Depends
import on line 14 is redundant and should be removed.
- from fastapi import Depends, HTTPException, status
+ from fastapi import HTTPException, status
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from fastapi import Depends, HTTPException, status | |
from fastapi import HTTPException, status |
Tools
Ruff
14-14: Redefinition of unused
Depends
from line 5Remove definition:
Depends
(F811)
return None | ||
return user | ||
except Exception as e: | ||
print(f"Error during authentication: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace print statements with proper logging and avoid bare except statements.
Using print statements for logging is not ideal for production code. Consider using the logging
module. Additionally, avoid using bare except
statements to catch specific exceptions.
- print(f"Error during authentication: {e}")
+ logger.error(f"Error during authentication: {e}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
print(f"Error during authentication: {e}") | |
logger.error(f"Error during authentication: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- .github/workflows/test_neo4j.yml (3 hunks)
Additional comments not posted (3)
.github/workflows/test_neo4j.yml (3)
31-42
: LGTM! PostgreSQL service configuration is correctly added.The
services
section is correctly configured to add a PostgreSQL service for the CI workflow.
59-65
: LGTM! Waiting for PostgreSQL to be ready is a good practice.The step ensures that PostgreSQL is ready before running tests, which is crucial for avoiding connection issues.
82-86
: LGTM! PostgreSQL environment variables are correctly added.The environment variables are necessary for the tests to connect to the PostgreSQL instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 15
Outside diff range, codebase verification and nitpick comments (3)
cognee/modules/search/graph/search_similarity.py (1)
Line range hint
3-20
:
Correct the return type hint and update the docstring.The return type hint should be
list[dict]
instead oflist[str, str]
. Additionally, update the docstring to reflect the changes.- async def search_similarity(query: str) -> list[str, str]: + async def search_similarity(query: str) -> list[dict]: """ Parameters: - query (str): The query string to filter nodes by. Returns: - - list(chunk): A list of objects providing information about the chunks related to query. + - list[dict]: A list of dictionaries providing information about the chunks related to query. """cognee/api/v1/cognify/cognify_v2.py (1)
40-50
: Update docstring to reflect the new function signature.The function signature now includes a
user
parameter. Update the docstring to reflect this change.""" Process datasets and manage user permissions. Args: datasets (Union[str, list[str]], optional): The datasets to process. user (User, optional): The user performing the operation. """Tools
Ruff
46-46: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
cognee/api/v1/cognify/cognify.py (1)
Line range hint
61-85
:
Improve error handling in the task execution segment.Avoid using bare
except
statements to catch specific exceptions.- except Exception as error: + except (SQLAlchemyError, ValueError) as error:
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (27)
- cognee-frontend/src/app/page.tsx (1 hunks)
- cognee-frontend/src/modules/datasets/cognifyDataset.ts (1 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css (1 hunks)
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx (4 hunks)
- cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (2 hunks)
- cognee-frontend/src/modules/ingestion/useDatasets.ts (2 hunks)
- cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2 hunks)
- cognee/api/client.py (8 hunks)
- cognee/api/v1/add/add.py (5 hunks)
- cognee/api/v1/cognify/cognify.py (5 hunks)
- cognee/api/v1/cognify/cognify_v2.py (4 hunks)
- cognee/api/v1/datasets/datasets.py (1 hunks)
- cognee/api/v1/search/search.py (4 hunks)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py (1 hunks)
- cognee/modules/data/extraction/data_summary/summarize_text_chunks.py (1 hunks)
- cognee/modules/data/models/Data.py (1 hunks)
- cognee/modules/data/models/Dataset.py (1 hunks)
- cognee/modules/data/operations/get_dataset_data.py (1 hunks)
- cognee/modules/data/operations/retrieve_datasets.py (1 hunks)
- cognee/modules/pipelines/models/PipelineRun.py (1 hunks)
- cognee/modules/pipelines/operations/get_pipeline_status.py (1 hunks)
- cognee/modules/pipelines/operations/log_pipeline_status.py (1 hunks)
- cognee/modules/search/graph/search_similarity.py (1 hunks)
- cognee/modules/topology/topology.py (2 hunks)
- cognee/modules/users/models/Permission.py (1 hunks)
- cognee/modules/users/permissions/methods/check_permissions_on_documents.py (1 hunks)
- cognee/modules/users/permissions/methods/get_document_ids_for_user.py (1 hunks)
Files skipped from review due to trivial changes (2)
- cognee-frontend/src/modules/ingestion/DataView/DataView.module.css
- cognee-frontend/src/modules/ingestion/useDatasets.ts
Files skipped from review as they are similar to previous changes (10)
- cognee-frontend/src/modules/datasets/cognifyDataset.ts
- cognee-frontend/src/modules/ingestion/DataView/DataView.tsx
- cognee/api/v1/datasets/datasets.py
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
- cognee/modules/pipelines/models/PipelineRun.py
- cognee/modules/pipelines/operations/get_pipeline_status.py
- cognee/modules/pipelines/operations/log_pipeline_status.py
- cognee/modules/topology/topology.py
- cognee/modules/users/permissions/methods/check_permissions_on_documents.py
- cognee/modules/users/permissions/methods/get_document_ids_for_user.py
Additional context used
Ruff
cognee/modules/data/models/Dataset.py
19-19: Undefined name
Data
(F821)
cognee/modules/data/models/Data.py
22-22: Undefined name
Dataset
(F821)
cognee/api/v1/cognify/cognify_v2.py
46-46: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
56-57: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
Additional comments not posted (51)
cognee/modules/data/operations/retrieve_datasets.py (1)
1-3
: Imports look good.The necessary modules for database operations are correctly imported.
cognee/modules/users/models/Permission.py (1)
1-5
: Imports look good.The necessary modules for defining the SQLAlchemy model are correctly imported.
cognee/modules/search/graph/search_similarity.py (1)
22-27
: Function looks good.The
parse_payload
function is well-structured and correctly extracts the required fields from the payload.cognee/modules/data/operations/get_dataset_data.py (3)
1-5
: Import statements look good.The necessary modules are imported, including UUID, SQLAlchemy components, and project-specific modules.
6-8
: Good error handling for missing parameters.The function correctly raises a
ValueError
if neitherdataset_id
nordataset_name
is provided.
10-11
: Efficient database engine retrieval.The function retrieves the database engine efficiently using
get_relational_engine()
.cognee/modules/data/models/Dataset.py (5)
9-11
: Correct table name definition.The
__tablename__
attribute correctly defines the table name as "datasets".
12-17
: Proper column definitions.The columns are defined correctly, including UUID primary key and datetime fields with default values.
19-22
: Fix undefined nameData
.The type hint for
data
usesData
, which is not defined in this file. This will cause a runtime error.- data: Mapped[List["Data"]] = relationship( + from .Data import Data + data: Mapped[List[Data]] = relationship(Tools
Ruff
19-19: Undefined name
Data
(F821)
24-31
: Good JSON serialization method.The
to_json
method correctly serializes theDataset
object to a JSON-compatible dictionary.
1-8
: Import statements look good, but missing import forData
.The necessary modules are imported, including UUID, typing, datetime, and SQLAlchemy components. However, the
Data
model is referenced but not imported.+ from .Data import Data
Likely invalid or redundant comment.
cognee/modules/data/extraction/data_summary/summarize_text_chunks.py (5)
Line range hint
1-10
:
Import statements look good.The necessary modules are imported, including asyncio, typing, pydantic, and project-specific modules.
Line range hint
12-13
:
Good initial check for empty data chunks.The function correctly returns early if the
data_chunks
list is empty.
Line range hint
15-17
:
Efficient use ofasyncio.gather
for concurrent processing.The function uses
asyncio.gather
to process text chunks concurrently, which is efficient.
Line range hint
19-21
:
Efficient vector engine retrieval and collection creation.The function retrieves the vector engine and creates a collection efficiently.
Line range hint
23-32
:
Good addition ofdocument_id
to payload.The addition of
document_id
to the payload enhances the data structure by including the document identifier, which is crucial for tracking or referencing the source document.cognee/modules/data/models/Data.py (4)
1-7
: Imports look good.The imports are appropriate and necessary for the functionality of the file.
9-10
: Class declaration looks good.The class declaration and table name definition are correct.
12-20
: Column definitions look good.The columns are appropriately defined with correct data types and default values.
27-37
: Address the commented-outdatasets
field.The
to_json
method is well-defined, but thedatasets
field is commented out. Ensure it is either included or removed based on the requirements.- # "datasets": [dataset.to_json() for dataset in self.datasets] + "datasets": [dataset.to_json() for dataset in self.datasets]cognee-frontend/src/ui/Partials/SearchView/SearchView.tsx (2)
10-10
: Review the change oftext
property toany
.Changing the
text
property toany
can introduce potential type safety issues. Ensure that this change is necessary and justified.
101-103
: Rendering logic looks good.The conditional rendering logic ensures that
message.text
is displayed correctly based on its type, enhancing the component's robustness.cognee/api/v1/search/search.py (4)
Line range hint
1-15
:
Imports look good.The imports are appropriate and necessary for the functionality of the file.
Line range hint
17-39
:
Class declarations look good.The class declarations are well-defined and appropriate for the file's functionality.
45-63
: Function changes look good.The enhancements to the
search
function improve functionality by integrating user context into the search process.
Line range hint
65-89
:
Function looks good.The
specific_search
function is well-defined and leverages asyncio for concurrent execution of search tasks.cognee-frontend/src/modules/ingestion/DatasetsView/DatasetsView.tsx (2)
43-43
: Update state variable to includename
property.The
dataset
state variable has been expanded to include bothid
andname
properties. This change is necessary to support the updated rendering logic.
100-100
: Render dataset name instead of id.The rendering logic within the modal has been updated to display the
name
of the dataset instead of itsid
. This improves the user interface by providing more meaningful information.cognee-frontend/src/app/page.tsx (4)
51-51
: Update function parameter to includename
property.The
onDatasetCognify
function's parameter type has been updated to include bothid
andname
properties. This change is necessary to support the updated notification messages.
52-52
: Update notification message to use dataset name.The notification message has been updated to use the
name
property of the dataset instead of theid
. This improves the clarity of the notifications.
56-56
: Update success notification message to use dataset name.The success notification message has been updated to use the
name
property of the dataset instead of theid
. This improves the clarity of the notifications.
59-59
: Update failure notification message to use dataset name.The failure notification message has been updated to use the
name
property of the dataset instead of theid
. This improves the clarity of the notifications.cognee/api/v1/add/add.py (6)
18-18
: Introduceuser
parameter inadd
function.The
user
parameter allows for user-specific operations during data processing. This change enhances the functionality by enabling permission checks and assignments.
54-54
: Passuser
parameter toadd_files
function.The
user
parameter is passed to theadd_files
function to ensure user-specific operations are handled during file processing.
58-58
: Introduceuser
parameter inadd_files
function.The
user
parameter allows for user-specific operations during file processing. This change enhances the functionality by enabling permission checks and assignments.
107-107
: Introduceuser
parameter indata_resources
function.The
user
parameter allows for user-specific operations during data processing. This change enhances the functionality by enabling permission checks and assignments.
153-154
: Assign read and write permissions to the user.The
give_permission_on_document
function is used to assign read and write permissions to the user for the processed data.
157-157
: Get or create default user if none is provided.If no user is provided, the function retrieves or creates a default user. This ensures that user-specific operations can still be performed.
cognee/api/v1/cognify/cognify_v2.py (6)
25-29
: New imports for user management.The imports for user management are correctly added. Ensure that these modules are available and correctly implemented.
35-39
: LGTM! ThePermissionDeniedException
class is correctly defined.The exception class is well-defined and follows standard practices.
49-50
: Ensure default user retrieval is correct.The default user is retrieved if no user is provided. Ensure that the
get_default_user
function is correctly implemented and handles any potential errors.
66-70
: Ensure permissions are correctly checked.The
check_permissions_on_documents
function is used to check user permissions. Ensure that this function is correctly implemented and handles any potential errors.
154-155
: LGTM! Thegenerate_dataset_name
function is correctly implemented.The function follows best practices and is correctly implemented.
7-7
: Update import statement forget_relational_engine
.The import statement has been updated to import
get_relational_engine
. Ensure that this change aligns with the new database interaction approach.Verification successful
Update import statement for
get_relational_engine
.The import statement has been updated to import
get_relational_engine
. This change aligns with the new database interaction approach and is correctly used across the codebase.
cognee/api/v1/cognify/cognify_v2.py
cognee/modules/users/methods/get_default_user.py
cognee/modules/users/permissions/methods/give_permission_on_document.py
cognee/modules/pipelines/operations/get_pipeline_status.py
cognee/modules/users/permissions/methods/get_document_ids_for_user.py
cognee/modules/users/permissions/methods/check_permissions_on_documents.py
cognee/modules/pipelines/operations/log_pipeline_status.py
cognee/modules/data/deletion/prune_system.py
cognee/modules/data/operations/ensure_dataset_exists.py
cognee/modules/data/operations/retrieve_datasets.py
cognee/modules/users/methods/create_user.py
cognee/modules/data/operations/get_dataset_data.py
cognee/infrastructure/databases/relational/get_relational_engine.py
cognee/infrastructure/databases/relational/create_db_and_tables.py
cognee/infrastructure/databases/relational/__init__.py
cognee/api/v1/add/add.py
cognee/api/v1/datasets/datasets.py
cognee/api/v1/cognify/cognify.py
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of `get_relational_engine` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the new function. rg --type python -A 5 $'get_relational_engine'Length of output: 76
Script:
#!/bin/bash # Description: Verify the usage of `get_relational_engine` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the new function. rg -t py -A 5 'get_relational_engine'Length of output: 18774
cognee/api/client.py (1)
291-292
: LGTM! Theget_settings
function is correctly implemented.The function follows best practices and is correctly implemented.
cognee/api/v1/cognify/cognify.py (6)
49-49
: Ensurecreate_task_status_table
is correctly awaited.The
create_task_status_table
function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.
56-56
: Ensureget_datasets
is correctly awaited.The
get_datasets
function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.
86-94
: Ensureget_dataset_data
is correctly awaited.The
get_dataset_data
function is now awaited. Ensure that this function is correctly implemented and handles any potential errors.
Line range hint
96-138
:
Ensure file processing logic is correct.The function
process_batch
processes batches of files. Ensure that the file processing logic is correctly implemented and handles any potential errors.
Line range hint
170-229
:
Ensure text processing logic is correct.The function
process_text
processes text chunks. Ensure that the text processing logic is correctly implemented and handles any potential errors.
33-36
: Update import statement forget_relational_engine
.The import statement has been updated to import
get_relational_engine
. Ensure that this change aligns with the new database interaction approach.Verification successful
Update import statement for
get_relational_engine
.The import statement for
get_relational_engine
is correctly updated and aligns with the new database interaction approach. The function is widely used across the codebase, indicating its necessity and proper integration.
cognee/api/v1/cognify/cognify.py
: The import and usage ofget_relational_engine
are consistent with the rest of the codebase.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of `get_relational_engine` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the new function. rg --type python -A 5 $'get_relational_engine'Length of output: 76
Script:
#!/bin/bash # Description: Verify the usage of `get_relational_engine` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the new function. rg -A 5 'get_relational_engine'Length of output: 18768
datasets: Mapped[List["Dataset"]] = relationship( | ||
secondary = DatasetData.__tablename__, | ||
back_populates = "data" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix undefined name Dataset
.
The datasets
relationship references Dataset
, which is not defined or imported. Change it to DatasetData
to align with the import.
- datasets: Mapped[List["Dataset"]] = relationship(
+ datasets: Mapped[List["DatasetData"]] = relationship(
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
datasets: Mapped[List["Dataset"]] = relationship( | |
secondary = DatasetData.__tablename__, | |
back_populates = "data" | |
) | |
datasets: Mapped[List["DatasetData"]] = relationship( | |
secondary = DatasetData.__tablename__, | |
back_populates = "data" | |
) |
Tools
Ruff
22-22: Undefined name
Dataset
(F821)
async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]: | ||
db_engine = get_relational_engine() | ||
|
||
async with db_engine.get_async_session() as session: | ||
datasets = (await session.scalars( | ||
select(Dataset).filter(Dataset.name.in_(dataset_names)) | ||
)).all() | ||
|
||
return datasets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling.
The function lacks error handling, which is crucial for robust database operations. Consider adding try-except blocks to handle potential exceptions.
from sqlalchemy.exc import SQLAlchemyError
async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]:
db_engine = get_relational_engine()
try:
async with db_engine.get_async_session() as session:
datasets = (await session.scalars(
select(Dataset).filter(Dataset.name.in_(dataset_names))
)).all()
return datasets
except SQLAlchemyError as e:
# Handle the exception (e.g., log it and return an empty list or re-raise)
print(f"Database error: {e}")
return []
from cognee.infrastructure.databases.relational import get_relational_engine | ||
from ..models import Dataset | ||
|
||
async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the type hint for the parameter.
Use List[str]
instead of list[str]
for consistency with Python type hinting conventions.
- async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]:
+ async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]:
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def retrieve_datasets(dataset_names: list[str]) -> list[Dataset]: | |
from typing import List | |
async def retrieve_datasets(dataset_names: List[str]) -> List[Dataset]: |
class Permission(Base): | ||
__tablename__ = "permissions" | ||
|
||
id = Column(UUID, primary_key = True, index = True, default = uuid4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the default value for id
.
The default value for id
should be a callable (i.e., uuid4
without parentheses).
- id = Column(UUID, primary_key = True, index = True, default = uuid4)
+ id = Column(UUID, primary_key = True, index = True, default = uuid4)
Committable suggestion was skipped due to low confidence.
created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc)) | ||
updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use func.now()
and func.current_timestamp()
for timestamps.
Replace the lambda functions for created_at
and updated_at
with func.now()
and func.current_timestamp()
from SQLAlchemy for better readability and consistency.
- created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc))
- updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc))
+ from sqlalchemy.sql import func
+ created_at = Column(DateTime(timezone = True), default = func.now())
+ updated_at = Column(DateTime(timezone = True), onupdate = func.current_timestamp())
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
created_at = Column(DateTime(timezone = True), default = lambda: datetime.now(timezone.utc)) | |
updated_at = Column(DateTime(timezone = True), onupdate = lambda: datetime.now(timezone.utc)) | |
from sqlalchemy.sql import func | |
created_at = Column(DateTime(timezone = True), default = func.now()) | |
updated_at = Column(DateTime(timezone = True), onupdate = func.current_timestamp()) |
@asynccontextmanager | ||
async def lifespan(app: FastAPI): | ||
# Not needed if you setup a migration system like Alembic | ||
await create_db_and_tables() | ||
yield | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate lifespan
context manager.
The lifespan
context manager is defined twice, which can cause conflicts.
- @asynccontextmanager
- async def lifespan(app: FastAPI):
- # Not needed if you setup a migration system like Alembic
- await create_db_and_tables()
- yield
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@asynccontextmanager | |
async def lifespan(app: FastAPI): | |
# Not needed if you setup a migration system like Alembic | |
await create_db_and_tables() | |
yield |
from contextlib import asynccontextmanager | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager
Committable suggestion was skipped due to low confidence.
Tools
Ruff
31-31: Module level import not at top of file
(E402)
from cognee.api.v1.users.routers import get_auth_router, get_register_router,\ | ||
get_reset_password_router, get_verify_router, get_users_router | ||
|
||
from cognee.api.v1.permissions.get_permissions_router import get_permissions_router |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from cognee.api.v1.permissions.get_permissions_router import get_permissions_router
Committable suggestion was skipped due to low confidence.
Tools
Ruff
59-59: Module level import not at top of file
(E402)
from cognee.api.v1.users.routers import get_auth_router, get_register_router,\ | ||
get_reset_password_router, get_verify_router, get_users_router |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from cognee.api.v1.users.routers import get_auth_router, get_register_router,\
- get_reset_password_router, get_verify_router, get_users_router
Committable suggestion was skipped due to low confidence.
Tools
Ruff
56-57: Module level import not at top of file
(E402)
@app.get("/datasets", response_model = list) | ||
async def get_datasets(): | ||
from cognee.api.v1.datasets.datasets import datasets | ||
return datasets.list_datasets() | ||
try: | ||
from cognee.api.v1.datasets.datasets import datasets | ||
datasets = await datasets.list_datasets() | ||
|
||
return JSONResponse( | ||
status_code = 200, | ||
content = [dataset.to_json() for dataset in datasets], | ||
) | ||
except Exception as error: | ||
raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve error handling in the get_datasets
function.
Avoid using bare except
statements to catch specific exceptions.
- except Exception as error:
+ except (SQLAlchemyError, ValueError) as error:
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@app.get("/datasets", response_model = list) | |
async def get_datasets(): | |
from cognee.api.v1.datasets.datasets import datasets | |
return datasets.list_datasets() | |
try: | |
from cognee.api.v1.datasets.datasets import datasets | |
datasets = await datasets.list_datasets() | |
return JSONResponse( | |
status_code = 200, | |
content = [dataset.to_json() for dataset in datasets], | |
) | |
except Exception as error: | |
raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error | |
@app.get("/datasets", response_model = list) | |
async def get_datasets(): | |
try: | |
from cognee.api.v1.datasets.datasets import datasets | |
datasets = await datasets.list_datasets() | |
return JSONResponse( | |
status_code = 200, | |
content = [dataset.to_json() for dataset in datasets], | |
) | |
except (SQLAlchemyError, ValueError) as error: | |
raise HTTPException(status_code = 500, detail=f"Error retrieving datasets: {str(error)}") from error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (1)
.github/workflows/docker_compose.yml (1)
38-38
: Ensure consistent usage ofdocker compose
ordocker-compose
commands.There are mixed references to
docker compose
anddocker-compose
in the codebase:
docker compose
is used indocs/blog/posts/from-demo-to-production-3.md
.docker-compose
is used inREADME.md
.To avoid confusion or potential errors, ensure consistent usage of either
docker compose
ordocker-compose
throughout the codebase.
- docs/blog/posts/from-demo-to-production-3.md
- README.md
Analysis chain
Verify compatibility with
docker compose
command for down.Ensure that the new command
docker compose -f docker-compose.yml down
is compatible with your workflow and does not introduce any issues.
Verify consistency of
docker compose
command usage.Ensure that the new command
docker compose -f docker-compose.yml down
is consistently used and correctly referenced in the codebase.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify compatibility with `docker compose` command for down. # Test: Check if `docker compose` command is available. Expect: Command found. docker compose versionLength of output: 98
Script:
#!/bin/bash # Search for references to `docker compose` and `docker-compose` in the codebase to verify consistency and correctness. rg "docker compose" -A 2 rg "docker-compose" -A 2Length of output: 536
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
Files selected for processing (6)
- .github/workflows/docker_compose.yml (1 hunks)
- .github/workflows/test_common.yml (2 hunks)
- .github/workflows/test_neo4j.yml (3 hunks)
- .github/workflows/test_qdrant.yml (1 hunks)
- .github/workflows/test_weaviate.yml (1 hunks)
- pyproject.toml (3 hunks)
Files skipped from review as they are similar to previous changes (1)
- .github/workflows/test_neo4j.yml
Additional comments not posted (18)
.github/workflows/docker_compose.yml (3)
24-24
: Verify compatibility withdocker compose
command for build.Ensure that the new command
docker compose -f docker-compose.yml build
is compatible with your workflow and does not introduce any issues.
28-28
: Verify compatibility withdocker compose
command for up.Ensure that the new command
docker compose -f docker-compose.yml up -d
is compatible with your workflow and does not introduce any issues.
20-20
: Verify compatibility with Docker Buildx v3.Ensure that the new version
docker/setup-buildx-action@v3
is compatible with your workflow and does not introduce any breaking changes..github/workflows/test_qdrant.yml (4)
Line range hint
50-72
:
Verify therun_qdrant_integration_test
job steps configuration.Ensure that the job steps are correctly configured and align with best practices.
Verification successful
The
run_qdrant_integration_test
job steps are correctly configured and align with best practices.
- The workflow sets up Python, installs dependencies using Poetry, and runs the integration tests with the necessary environment variables.
- The recent failures are likely due to issues in the test script or environment, not the workflow configuration.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the `run_qdrant_integration_test` job steps configuration. # Test: Check the job steps configuration for errors. Expect: No errors. gh workflow view test_qdrant.ymlLength of output: 804
Script:
#!/bin/bash # Description: View the YAML configuration of the `test_qdrant.yml` workflow. # Command to view the YAML configuration of the workflow. gh workflow view test_qdrant.yml --yamlLength of output: 1558
38-49
: Verify the new PostgreSQL service configuration.Ensure that the new PostgreSQL service is correctly configured and aligns with best practices.
21-27
: Verify the newsetup_docker
job configuration.Ensure that the new job
setup_docker
is correctly configured and aligns with best practices.
32-32
: Verify compatibility withubuntu-latest
environment.Ensure that the new environment
ubuntu-latest
is compatible with your workflow and does not introduce any issues.Verification successful
Environment change to
ubuntu-latest
is not the cause of workflow failures.The recent failures in the workflow are due to missing configuration fields, specifically the 'username' field in the PostgresCredentials configuration. The change from
macos-latest
toubuntu-latest
is not the root cause of these failures.
- Ensure that all necessary configuration values are provided for the workflow to run successfully.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify compatibility with `ubuntu-latest` environment. # Test: Check the environment configuration for errors. Expect: No errors. gh workflow view test_qdrant.ymlLength of output: 804
Script:
#!/bin/bash # Description: Fetch and analyze logs of recent failed runs for the `test_qdrant.yml` workflow. # Get the list of recent runs for the workflow gh run list --workflow test_qdrant.yml --limit 5 --json conclusion,workflowDatabaseId,name,headSha,headBranch,url,status # Fetch logs for the most recent failed run gh run view 10267916576 --logLength of output: 218035
.github/workflows/test_weaviate.yml (4)
Line range hint
50-72
:
Verify therun_weaviate_integration_test
job steps configuration.Ensure that the job steps are correctly configured and align with best practices.
Verification successful
The
run_weaviate_integration_test
job steps are correctly configured and align with best practices.
- The job includes steps for checking out the repository, setting up Python, installing Poetry and dependencies, creating a directory, and running the integration tests with the necessary environment variables.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the `run_weaviate_integration_test` job steps configuration. # Test: Check the job steps configuration for errors. Expect: No errors. gh workflow view test_weaviate.ymlLength of output: 824
Script:
#!/bin/bash # Description: Retrieve the YAML content of the `test_weaviate.yml` workflow file. # Fetch the YAML content of the workflow file to inspect the job steps configuration. gh workflow view test_weaviate.yml --yamlLength of output: 1572
32-32
: Verify compatibility withubuntu-latest
environment.Ensure that the new environment
ubuntu-latest
is compatible with your workflow and does not introduce any issues.
38-49
: Verify the new PostgreSQL service configuration.Ensure that the new PostgreSQL service is correctly configured and aligns with best practices.
21-27
: Verify the newsetup_docker
job configuration.Ensure that the new job
setup_docker
is correctly configured and aligns with best practices.pyproject.toml (5)
29-29
: LGTM! Fixed version forsqlalchemy
ensures stability.Changing the version constraint from
^2.0.21
to2.0.21
helps avoid potential breaking changes from minor updates.
32-32
: LGTM! Updateddebugpy
version ensures compatibility.Updating from
^1.8.0
to1.8.2
likely addresses specific bug fixes or required features.
41-41
: LGTM! Updateddlt
version with PostgreSQL support enhances functionality.Updating from
0.5.1
to{extras = ["postgres"], version = "^0.5.2"}
adds PostgreSQL support, improving the library's capabilities.
72-72
: LGTM! Addedfastapi-users
for user management.Introducing
fastapi-users
withsqlalchemy
extras adds essential user management capabilities to the project.
73-73
: LGTM! Addedasyncpg
for asynchronous PostgreSQL interactions.Introducing
asyncpg
with version^0.29.0
enhances the project's capability to handle asynchronous PostgreSQL database operations..github/workflows/test_common.yml (2)
21-26
: LGTM! Addedsetup_docker
job for multi-platform builds.The new job sets up Docker Buildx using
docker/setup-buildx-action@v3
, enhancing the workflow's capability to handle multi-platform builds efficiently.
50-60
: LGTM! Addedpostgres
service for PostgreSQL database setup.The new service uses the latest PostgreSQL image, sets up environment variables, and defines a volume for persistent storage, improving the testing environment's fidelity and functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (1)
cognee/api/v1/search/search.py (1)
45-47
: Verify the function usage in the codebase.The following instances of the
search
function calls do not include theuser
parameter, which is required by the new function signature. Please update these calls to match the new signature:
cognee/tests/test_weaviate.py
cognee/tests/test_qdrant.py
cognee/tests/test_neo4j.py
cognee/tests/test_library.py
cognee/modules/search/vector/search_traverse.py
cognee/modules/search/graph/search_adjacent.py
cognee/modules/search/graph/search_summary.py
cognee/modules/search/graph/search_similarity.py
cognee/modules/cognify/llm/resolve_cross_graph_references.py
cognee/infrastructure/databases/vector/vector_db_interface.py
cognee/infrastructure/databases/vector/falkordb/FalkorDBAdapter.py
cognee/infrastructure/databases/vector/lancedb/LanceDBAdapter.py
cognee/infrastructure/databases/vector/qdrant/QDrantAdapter.py
cognee/infrastructure/databases/vector/weaviate_db/WeaviateAdapter.py
cognee/api/v1/search/search.py
cognee/api/v1/cognify/cognify.py
cognee/api/client.py
Analysis chain
Verify the function usage in the codebase.
Ensure that all function calls to
search
match the new signature.
Let's correct the regex pattern and re-run the search to verify all function calls to
search
match the new signature.
Let's correct the pattern format and use double quotes to avoid the regex parse error.
Let's try escaping the parentheses in the pattern to resolve the regex parse error.
Let's use
fd
to find Python files and then search for the function calls usingrg
.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify all function calls to `search` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 $'search('Length of output: 98
Script:
#!/bin/bash # Description: Verify all function calls to `search` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 'search('Length of output: 98
Script:
#!/bin/bash # Description: Verify all function calls to `search` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 "search("Length of output: 98
Script:
#!/bin/bash # Description: Verify all function calls to `search` match the new signature. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 "search\\("Length of output: 65
Script:
#!/bin/bash # Description: Verify all function calls to `search` match the new signature. # Find all Python files and search for the function usage. fd --extension py | xargs rg -A 5 "search\\("Length of output: 21025
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
Files selected for processing (8)
- .github/workflows/test_common.yml (4 hunks)
- .github/workflows/test_neo4j.yml (3 hunks)
- .github/workflows/test_qdrant.yml (2 hunks)
- .github/workflows/test_weaviate.yml (2 hunks)
- cognee/api/client.py (8 hunks)
- cognee/api/v1/search/search.py (4 hunks)
- cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (1 hunks)
- pyproject.toml (4 hunks)
Files skipped from review due to trivial changes (1)
- pyproject.toml
Files skipped from review as they are similar to previous changes (4)
- .github/workflows/test_common.yml
- .github/workflows/test_neo4j.yml
- .github/workflows/test_qdrant.yml
- .github/workflows/test_weaviate.yml
Additional context used
Ruff
cognee/api/v1/search/search.py
59-59: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
cognee/api/client.py
31-31: Module level import not at top of file
(E402)
56-57: Module level import not at top of file
(E402)
59-59: Module level import not at top of file
(E402)
Additional comments not posted (10)
cognee/api/v1/search/search.py (1)
Line range hint
64-90
:
LGTM!The
specific_search
function is well-implemented and efficient.cognee/api/client.py (5)
33-38
: LGTM!The
lifespan
context manager is well-implemented and efficient.
Line range hint
39-54
:
LGTM!The FastAPI application setup and middleware configuration are well-implemented and efficient.
Tools
Ruff
31-31: Module level import not at top of file
(E402)
61-95
: LGTM!The router inclusion for authentication and user management is well-implemented and efficient.
111-122
: LGTM!The
get_datasets
function is well-implemented and efficient.
291-292
: LGTM!The
get_settings
function and other endpoints are well-implemented and efficient.cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (4)
24-25
: LGTM!The addition of
max_connection_lifetime
to theAsyncGraphDatabase.driver
initialization is well-implemented and efficient.
Line range hint
47-57
:
LGTM!The
query
method is well-implemented and efficient.
Line range hint
81-111
:
LGTM!The
add_node
andadd_nodes
methods are well-implemented and efficient.
Line range hint
211-243
:
LGTM!The
add_edge
andadd_edges
methods are well-implemented and efficient.
|
||
for search_result in search_results: | ||
document_id = search_result["document_id"] if "document_id" in search_result else None | ||
document_id = UUID(document_id) if type(document_id) == str else document_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use isinstance
for type comparison.
Replace type(document_id) == str
with isinstance(document_id, str)
for better readability and performance.
- document_id = UUID(document_id) if type(document_id) == str else document_id
+ document_id = UUID(document_id) if isinstance(document_id, str) else document_id
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
document_id = UUID(document_id) if type(document_id) == str else document_id | |
document_id = UUID(document_id) if isinstance(document_id, str) else document_id |
Tools
Ruff
59-59: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
from cognee.infrastructure.databases.relational import create_db_and_tables | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganize import statements to the top of the file.
Imports should be at the top of the file to comply with PEP 8 standards.
- from contextlib import asynccontextmanager
- from cognee.api.v1.users.routers import get_auth_router, get_register_router,\
- get_reset_password_router, get_verify_router, get_users_router
- from cognee.api.v1.permissions.get_permissions_router import get_permissions_router
+ from contextlib import asynccontextmanager
+ from cognee.api.v1.users.routers import get_auth_router, get_register_router, get_reset_password_router, get_verify_router, get_users_router
+ from cognee.api.v1.permissions.get_permissions_router import get_permissions_router
Also applies to: 31-32, 56-60
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- .github/workflows/test_common.yml (4 hunks)
Additional comments not posted (3)
.github/workflows/test_common.yml (3)
21-34
: LGTM! Thesetup_docker
job is correctly defined.The job sets up Docker Buildx for multi-platform builds using appropriate actions.
57-67
: LGTM! The PostgreSQL service is correctly defined.The service uses the latest PostgreSQL image, sets up environment variables, defines a volume for persistent storage, and is exposed on port 5432.
108-113
: LGTM! The step to wait for PostgreSQL to be ready is correctly implemented.The step uses the
pg_isready
command in a loop until the database is ready, ensuring that PostgreSQL is fully initialized before running tests.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation