-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implements graph visualization method for cognee #493
feat: implements graph visualization method for cognee #493
Conversation
WalkthroughThe pull request refactors the visualization implementation. In the API module, the Changes
Sequence Diagram(s)sequenceDiagram
participant Main
participant EventLoop
participant VisualizeGraph
participant NetworkVisualization
Main->>EventLoop: Create asyncio event loop
EventLoop->>VisualizeGraph: Run visualize_graph()
VisualizeGraph->>NetworkVisualization: Call cognee_network_visualization(graph_data)
NetworkVisualization-->>VisualizeGraph: Return HTML content
VisualizeGraph-->>EventLoop: Complete execution
EventLoop->>Main: Shutdown loop
sequenceDiagram
participant Test
participant NetworkVisualization
Test->>NetworkVisualization: Invoke with sample graph data
NetworkVisualization-->>Test: Return generated HTML
Test->>Test: Assert HTML structure (e.g., <html>, D3.js script)
Poem
Tip 🌐 Web search-backed reviews and chat
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cognee/shared/utils.py (1)
374-501
: Consider configuring file output path and adding error handling.
The function currently writes the HTML to a file in the user’s home directory. If the user lacks permission or if the path is invalid, this could fail silently. Adding optional parameters for the target file path or introducing more robust exception handling (e.g., logging errors or raising an exception) would improve usability.cognee/api/v1/visualize/visualize.py (1)
10-15
: Add docstring and guard for empty graph data.
The newvisualize_graph()
function currently logs and calls the utility but lacks a docstring describing its purpose or a check ifgraph_data
is empty. Consider adding brief documentation and verifying data before proceeding.+""" +Asynchronously fetches the graph data, logs it, and generates a styleable HTML network visualization. +""" async def visualize_graph(): + if not graph_data or not any(graph_data): + logging.warning("No graph data found; skipping visualization.")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cognee/api/v1/visualize/visualize.py
(1 hunks)cognee/shared/utils.py
(2 hunks)cognee/tests/unit/processing/utils/utils_test.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: docker-compose-test
🔇 Additional comments (3)
cognee/shared/utils.py (1)
340-341
: Function signature appears clear and concise.
The newasync def create_cognee_style_network_with_logo(graph_data)
function properly accepts a singlegraph_data
parameter. Streamlining the signature makes usage more straightforward throughout the codebase.cognee/api/v1/visualize/visualize.py (1)
6-7
: Async support and logging setup.
Importingasyncio
andsetup_logging
aligns well with the new asynchronous approach. This addition is appropriate and helps maintain a robust application structure.cognee/tests/unit/processing/utils/utils_test.py (1)
85-112
: Comprehensive test coverage for HTML output.
This test method thoroughly checks the generated HTML content for crucial elements (e.g.,<html>
, D3 script tags). It effectively validates that the function produces a valid visualization scaffold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
cognee/tests/unit/processing/utils/utils_test.py (2)
89-104
: Consider moving test data to fixtures.Moving the test data to fixtures would improve reusability and make the test more maintainable.
+@pytest.fixture +def graph_data(): + nodes_data = [ + (1, {"pydantic_type": "Entity", "name": "Node1", "updated_at": 123, "created_at": 123}), + (2, { + "pydantic_type": "DocumentChunk", + "name": "Node2", + "updated_at": 123, + "created_at": 123, + }), + ] + edges_data = [(1, 2, "related_to", {})] + return (nodes_data, edges_data) + @pytest.mark.asyncio -async def test_create_cognee_style_network_with_logo(): +async def test_create_cognee_style_network_with_logo(graph_data): from pathlib import Path - - nodes_data = [ - (1, {"pydantic_type": "Entity", "name": "Node1", "updated_at": 123, "created_at": 123}), - ( - 2, - { - "pydantic_type": "DocumentChunk", - "name": "Node2", - "updated_at": 123, - "created_at": 123, - }, - ), - ] - edges_data = [ - (1, 2, "related_to", {}), - ] - graph_data = (nodes_data, edges_data)
108-113
: Add more comprehensive assertions.The current assertions only verify basic HTML structure. Consider adding more specific assertions to validate:
- Node properties in the output
- Edge properties in the output
- Graph styling and configuration
assert isinstance(html_output, str) assert "<html>" in html_output assert '<script src="https://d3js.org/d3.v5.min.js"></script>' in html_output assert "var nodes =" in html_output assert "var links =" in html_output + # Verify node properties + assert '"pydantic_type":"Entity"' in html_output + assert '"name":"Node1"' in html_output + # Verify edge properties + assert '"relation":"related_to"' in html_output + # Verify styling + assert "linear-gradient(90deg, #101010, #1a1a2e)" in html_outputcognee/infrastructure/visualization/cognee_network_visualization.py (1)
12-17
: Make color mapping configurable.Consider moving the color mapping to a configuration file or accepting it as a parameter for better flexibility.
- color_map = { - "Entity": "#f47710", - "EntityType": "#6510f4", - "DocumentChunk": "#801212", - "default": "#D3D3D3", - } + color_map = get_color_mapping() # Load from config or accept as parameter
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cognee/__init__.py
(1 hunks)cognee/api/v1/visualize/visualize.py
(1 hunks)cognee/infrastructure/visualization/cognee_network_visualization.py
(1 hunks)cognee/tests/unit/processing/utils/utils_test.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- cognee/api/v1/visualize/visualize.py
⏰ Context from checks skipped due to timeout of 90000ms (19)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: run_simple_example_test / test
- GitHub Check: run_simple_example_test / test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: docker-compose-test
- GitHub Check: profiler
🔇 Additional comments (2)
cognee/__init__.py (1)
8-10
: LGTM! Good code organization.Moving the visualization code to a dedicated module improves code organization and maintainability.
cognee/infrastructure/visualization/cognee_network_visualization.py (1)
24-27
: Address the TODO comment.The TODO comment indicates a need to decide which properties to show on nodes and edges.
Please review and decide which properties should be displayed. Consider:
- Essential properties for visualization
- Performance impact of including additional properties
- User experience and readability
cognee/infrastructure/visualization/cognee_network_visualization.py
Outdated
Show resolved
Hide resolved
cognee/infrastructure/visualization/cognee_network_visualization.py
Outdated
Show resolved
Hide resolved
cognee/infrastructure/visualization/cognee_network_visualization.py
Outdated
Show resolved
Hide resolved
…ttps://github.com/topoteretes/cognee into feature/cog-1036-improving-visualization-endpoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (3)
cognee/shared/utils.py (3)
30-34
: Add error handling for NLTK downloads.The quiet=True parameter could mask download failures. Consider adding error handling to ensure required NLTK resources are available.
- nltk.download("maxent_ne_chunker", quiet=True) + try: + nltk.download("maxent_ne_chunker", quiet=True) + except Exception as e: + logging.error(f"Failed to download NLTK resource: {e}") + raise
57-71
: Add error handling for file operations.File operations should have proper error handling to gracefully handle permission issues or disk errors.
- if not os.path.isdir(home_dir): - os.makedirs(home_dir, exist_ok=True) - anonymous_id_file = os.path.join(home_dir, ".anon_id") - if not os.path.isfile(anonymous_id_file): - anonymous_id = str(uuid4()) - with open(anonymous_id_file, "w", encoding="utf-8") as f: - f.write(anonymous_id) - else: - with open(anonymous_id_file, "r", encoding="utf-8") as f: - anonymous_id = f.read() + try: + if not os.path.isdir(home_dir): + os.makedirs(home_dir, exist_ok=True) + anonymous_id_file = os.path.join(home_dir, ".anon_id") + if not os.path.isfile(anonymous_id_file): + anonymous_id = str(uuid4()) + with open(anonymous_id_file, "w", encoding="utf-8") as f: + f.write(anonymous_id) + else: + with open(anonymous_id_file, "r", encoding="utf-8") as f: + anonymous_id = f.read() + except OSError as e: + logging.error(f"Failed to handle anonymous ID file: {e}") + anonymous_id = str(uuid4()) # Fallback to temporary ID
96-96
: Add timeout to HTTP request.The requests.post call should include a timeout to prevent hanging on slow responses.
- response = requests.post(proxy_url, json=payload) + response = requests.post(proxy_url, json=payload, timeout=10)
🧹 Nitpick comments (5)
cognee/shared/utils.py (5)
26-27
: Consider making the proxy URL configurable.The proxy URL is hardcoded to a test environment. Consider moving this to a configuration file or environment variable for better maintainability and environment-specific settings.
-# Analytics Proxy Url, currently hosted by Vercel -proxy_url = "https://test.prometh.ai" +# Analytics Proxy Url +proxy_url = os.getenv("ANALYTICS_PROXY_URL", "https://test.prometh.ai")
340-349
: Add return type hint for graph_to_tuple function.The function's return type should be explicitly defined for better code clarity and type checking.
-def graph_to_tuple(graph): +def graph_to_tuple(graph) -> tuple[list, list]:
295-295
: Move logo image to a separate asset file.The base64 encoded logo image should be moved to a separate asset file for better maintainability.
Consider creating an assets directory and loading the image from there:
- logo_url = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." + with open("assets/logo.base64", "r") as f: + logo_url = f"data:image/png;base64,{f.read().strip()}"
313-315
: Remove unused imports.The following imports are not used in the function:
- output_file
- show
- from bokeh.plotting import figure, from_networkx - from bokeh.models import Circle, MultiLine, HoverTool, ColumnDataSource, Range1d - from bokeh.plotting import output_file, show + from bokeh.plotting import from_networkx + from bokeh.models import Circle, MultiLine
352-366
: Improve function docstring with type hints and return value.Add parameter and return type information to the docstring for better documentation.
-def setup_logging(log_level=logging.INFO): - """Sets up the logging configuration.""" +def setup_logging(log_level: int = logging.INFO) -> None: + """Sets up the logging configuration. + + Args: + log_level (int): The logging level to use. Defaults to logging.INFO. + + Returns: + None + """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cognee/shared/utils.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (18)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_simple_example_test / test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: test
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: docker-compose-test
- GitHub Check: profiler
import os | ||
|
||
|
||
async def create_cognee_style_network_with_logo(graph_data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put it in modules
, as it is not related to infrastructure
. Also please rename the file to match the function that is inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -82,27 +84,30 @@ def test_prepare_nodes(): | |||
|
|||
@pytest.mark.asyncio | |||
async def test_create_cognee_style_network_with_logo(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move the test to its own file as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
cognee/tests/unit/modules/visualization/visualization_test.py (1)
8-34
: Enhance test coverage with edge cases.The test only covers the happy path. Add tests for:
- Empty graph data
- Large graphs
- Special characters in node names
- Invalid node/edge data
+@pytest.mark.asyncio +async def test_empty_graph(): + nodes_data = [] + edges_data = [] + graph_data = (nodes_data, edges_data) + html_output = await cognee_network_visualization(graph_data) + assert isinstance(html_output, str) + assert "<html>" in html_output +@pytest.mark.asyncio +async def test_special_characters(): + nodes_data = [(1, {"name": "<script>alert('xss')</script>", "pydantic_type": "Entity"})] + edges_data = [] + graph_data = (nodes_data, edges_data) + html_output = await cognee_network_visualization(graph_data) + assert "<script>alert('xss')</script>" not in html_output
🧹 Nitpick comments (3)
cognee/modules/visualization/cognee_network_visualization.py (1)
24-27
: Consider using a configuration for node properties.The TODO comment suggests uncertainty about which properties to display. Consider moving this configuration to a separate settings file.
+ # Import from configuration + EXCLUDED_NODE_PROPERTIES = ["updated_at", "created_at"] + - del node_info["updated_at"] - del node_info["created_at"] + for prop in EXCLUDED_NODE_PROPERTIES: + node_info.pop(prop, None)cognee/api/v1/visualize/visualize.py (1)
23-31
: Consider using asyncio.run() for main block.The manual event loop management could be simplified using
asyncio.run()
.if __name__ == "__main__": setup_logging(logging.ERROR) - loop = asyncio.new_event_loop() - asyncio.set_event_loop(loop) - try: - loop.run_until_complete(visualize_graph()) - finally: - loop.run_until_complete(loop.shutdown_asyncgens()) + asyncio.run(visualize_graph())cognee/tests/unit/modules/visualization/visualization_test.py (1)
28-33
: Add more specific HTML structure assertions.The current assertions are too basic. Add more specific checks for the graph structure.
assert isinstance(html_output, str) assert "<html>" in html_output assert '<script src="https://d3js.org/d3.v5.min.js"></script>' in html_output assert "var nodes =" in html_output assert "var links =" in html_output + # Check for specific node data + assert '"pydantic_type":"Entity"' in html_output + assert '"name":"Node1"' in html_output + # Check for specific edge data + assert '"relation":"related_to"' in html_output + # Check for SVG elements + assert '<svg style="position: fixed;"' in html_output
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cognee/__init__.py
(1 hunks)cognee/api/v1/visualize/visualize.py
(1 hunks)cognee/modules/visualization/cognee_network_visualization.py
(1 hunks)cognee/tests/unit/modules/visualization/visualization_test.py
(1 hunks)cognee/tests/unit/processing/utils/utils_test.py
(0 hunks)
💤 Files with no reviewable changes (1)
- cognee/tests/unit/processing/utils/utils_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- cognee/init.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: windows-latest
- GitHub Check: test
- GitHub Check: docker-compose-test
🔇 Additional comments (1)
cognee/modules/visualization/cognee_network_visualization.py (1)
45-45
: Verify D3.js source integrity.Using an external CDN without integrity checks poses a security risk.
Description
This PR contains the improvement of the visualization endpoint
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin
Summary by CodeRabbit