
From Natural Language to SQL: Building and Tracking a Multi-Lingual Query Engine #132

Open · wants to merge 2 commits into base: main
Conversation

joanacmesquitaf

Description:

This blog post demonstrates how to build a Multilingual Query Engine that combines Natural Language-to-SQL generation with query execution while fully leveraging MLflow’s features. It explores how to leverage MLflow Models from Code to treat AI workflows as traditional ML models, enabling seamless tracking, versioning, and deployment across diverse serving infrastructures. Additionally, it dives into MLflow’s Tracing feature, which enhances observability by tracking inputs, outputs, and metadata at every intermediate step of the AI workflow.

Additions:

  1. Created a folder for the blog content containing:
     • index.md with the content of the blog.
     • Relevant illustrations.
  2. Added all authors' information and thumbnails to the correct locations.

Additional Information:

Related to: #115


github-actions bot commented Dec 5, 2024

Preview for 3ac4775

  • For faster build, the doc pages are not included in the preview.
  • Redirects are disabled in the preview.
Open in StackBlitz


@djliden djliden left a comment


Thank you for this blog! Great project. I appreciate the detailed code examples and I liked seeing all of the LangGraph nodes reflected in the MLflow traces.

A few general comments:

  1. It took me a little while to figure out exactly what we would be building; I'd like to see a clearer statement of this very early on. "We will build a system that takes user inputs in any(?) language, translates them to SQL, validates, executes, etc..." Ideally with at least one concrete example. Given a database with data about X, a user can ask <natural language query> in natural language and see <output>.
  2. The role of MLflow in the story of the post was a little unclear until the end. It mentions lifecycle management a few times, but we don't really see MLflow used for that purpose much. I would propose emphasizing tracing a bit more, and really highlighting the correspondence between the nodes and the recorded tracing spans, emphasizing that tracing gives a lot of visibility into what happens at each step.
  3. I wonder if it's possible to make the node descriptions section a little more concise. Perhaps instead of having the process/key considerations/examples lists in each section, those could be consolidated into a single table at the beginning/end of the node descriptions section, or into a few sentences per section. It might be even more effective to follow one concrete example through each step instead. E.g. a user starts with "Quantos pedidos foram realizados em Novembro?" — I would be really interested to see what happens with this at each stage (maybe with tracing screenshots at each stage 🙂)


# Multilingual Query Engine using LangGraph

The Multilingual Query Engine leverages LangGraph’s advanced features to create a stateful, multi-agent, and cyclical graph architecture.

I suggest briefly explaining what LangGraph is and why it is the right tool for this purpose (this section gets into features, but I think readers would benefit from a higher-level intro that identifies it as an AI orchestration tool, explains why it was used in this case, etc.)

Comment on lines +68 to +88
## AI Workflow Overview

The Multilingual Query Engine’s advanced AI workflow is composed of interconnected nodes and edges, each representing a crucial stage:

1. **Translation Node**: Converts the user’s input into English.

2. **Safety Checks**: Ensures user input is free from toxic or inappropriate content and does not contain harmful SQL commands (e.g., DELETE, DROP).

3. **Database Schema Extraction**: Retrieves the schema of the target database to understand its structure and available data.

4. **Relevancy Validation**: Validates the user’s input against the database schema to ensure alignment with the database’s capabilities.

5. **SQL Query Generation**: Generates an SQL query based on the user’s input and the current database schema.

6. **SQL Query Validation**: Executes the SQL query in a rollback-safe environment to ensure its validity before running it.

7. **Dynamic State Evaluation**: Determines the next steps based on the current state. If the SQL query validation fails, it loops back to Stage 5 to regenerate the query.

8. **Query Execution and Result Retrieval**: Executes the SQL query and returns the results if it’s a SELECT statement.

The retry mechanism is introduced in Stage 7, where the system dynamically evaluates the current graph state. Specifically, when the SQL query validation node (Stage 6) detects an issue, the state triggers a loop back to the SQL Generation node (Stage 5) for a new SQL generation attempt (up to a maximum of 3 attempts).
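
To make the structure more concrete, here is a rough sketch of how stages like these could be wired together with LangGraph's `StateGraph`, including the conditional retry edge described above. The node functions and state fields below are illustrative placeholders, not the exact implementations shown later in the post:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END


class GraphState(TypedDict):
    # Illustrative subset of the workflow state
    translated_input: str
    error: str
    iterations: int


# Placeholder node functions; each receives and returns the shared state
def translate_input(state): return state
def safety_check(state): return state
def generate_sql(state): return state
def sql_check(state): return state
def run_query(state): return state


def decide_next_step(state):
    # Dynamic state evaluation: retry SQL generation on failure, up to 3 attempts
    if state["error"] == "yes" and state["iterations"] < 3:
        return "generate_sql"
    return "run_query"


workflow = StateGraph(GraphState)
for name, fn in [
    ("translate_input", translate_input),
    ("safety_check", safety_check),
    ("generate_sql", generate_sql),
    ("sql_check", sql_check),
    ("run_query", run_query),
]:
    workflow.add_node(name, fn)

workflow.set_entry_point("translate_input")
workflow.add_edge("translate_input", "safety_check")
workflow.add_edge("safety_check", "generate_sql")
workflow.add_edge("generate_sql", "sql_check")
workflow.add_conditional_edges(
    "sql_check",
    decide_next_step,
    {"generate_sql": "generate_sql", "run_query": "run_query"},
)
workflow.add_edge("run_query", END)

app = workflow.compile()
```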

The intro—first paragraph or two—could use a very brief summary of this. "We will build a system that takes natural language input, such as X, from the user, validates it for safety, and generates correct SQL, informed by context about the database schema." A clear description of the task and an example of the final workflow the project will enable will help readers get their bearings right from the beginning.

Comment on lines +429 to +437
**Examples:**

- Input: _"Quantos pedidos foram realizados em Novembro?"_

- Translated: _"How many orders were made in November?"_

- Input: _"Combien de ventes avons-nous enregistrées en France ?"_

- Translated: _"How many sales did we record in France?"_

Note early on that multilingual refers to taking natural-language inputs in multiple languages, not e.g. supporting multiple SQL dialects or something like that. I wasn't 100% sure until I got to here!

Comment on lines +966 to +1034
```python
def main():
    # Load environment variables from .env file
    load_dotenv()

    # Access secrets using os.getenv
    os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

    # Setup database and vector store
    conn = setup_database()
    cursor = conn.cursor()
    vector_store = setup_vector_store()

    # Load the model
    model_uri = f"models:/{REGISTERED_MODEL_NAME}@{MODEL_ALIAS}"
    model = mlflow.pyfunc.load_model(model_uri)
    model_input = {"conn": conn, "cursor": cursor, "vector_store": vector_store}
    app = model.predict(model_input)

    # save image
    app.get_graph().draw_mermaid_png(
        output_file_path="sql_agent_with_safety_checks.png"
    )

    # Example user interaction
    print("Welcome to the SQL Assistant!")
    while True:
        question = input("\nEnter your SQL question (or type 'exit' to quit): ")
        if question.lower() == "exit":
            break

        # Initialize the state with all required keys
        initial_state = {
            "messages": [("user", question)],
            "iterations": 0,
            "error": "",
            "results": None,
            "generation": None,
            "no_records_found": False,
            "translated_input": "",  # Initialize translated_input
        }

        solution = app.invoke(initial_state)

        # Check if an error was set during the safety check
        if solution["error"] == "yes":
            print("\nAssistant Message:\n")
            print(solution["messages"][-1][1])  # Display the assistant's message
            continue  # Skip to the next iteration

        # Extract the generated SQL query from solution["generation"]
        sql_query = solution["generation"].sql_code
        print("\nGenerated SQL Query:\n")
        print(sql_query)

        # Extract and display the query results
        if solution.get("no_records_found"):
            print("\nNo records found matching your query.")
        elif "results" in solution and solution["results"] is not None:
            print("\nQuery Results:\n")
            for row in solution["results"]:
                print(row)
        else:
            print("\nNo results returned or query did not execute successfully.")

    print("Goodbye!")


if __name__ == "__main__":
    main()
```

Can you show an example invocation of this (not all the code, just one quick example) toward the beginning?


However, a number of challenges remain when building an NL2SQL system, such as semantic ambiguity, schema mapping, error handling, and user feedback. It is therefore important to put guardrails in place when building such systems rather than relying entirely on the LLM.

In this blog post, we’ll walk you through the process of building and managing the lifecycle of a Multilingual Query Engine, encompassing both Natural Language to SQL generation and query execution.

I don't see much about lifecycle management in the post (it mentions lifecycle management a few times, but doesn't show much about how to use MLflow for that purpose). You might de-emphasize that but show a little more about how MLflow tracing gives visibility into the many different components of a setup like this.

I think it's really cool, looking at the gif at the end, how we can see the different nodes laid out in the article reflected in the trace. It might be interesting to show screenshots of that for each section, or at least call that out more clearly in that section—specifically, that the final graph can be a bit of a black box, might be challenging to debug, to figure out what is happening at each step, but tracing gives really clear visibility into that with the one line of code.

Comment on lines +872 to +874
# Logging the Model in MLflow

Now that we have built a Multi-Lingual Query Engine using LangGraph, we are ready to log the model using MLflow's [Models from Code](https://mlflow.org/blog/models_from_code) feature. With this approach, we log the code that represents the model, in contrast with object-based logging, where a model object is created, serialized, and logged as a pickle or JSON object.
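
As a rough illustration of the difference (not the post's exact logging code), code-based logging passes a path to a Python file rather than a serialized object; the file name, wrapper class, and registered model name below are illustrative:

```python
import mlflow

# Inside the model file (e.g. sql_model.py), the workflow object is declared
# as "the model" for code-based logging:
#
#   from mlflow.models import set_model
#   ...build the LangGraph workflow / pyfunc wrapper...
#   set_model(SQLGenerator())
#
# The driver script then logs the file itself instead of a pickled object:
with mlflow.start_run():
    model_info = mlflow.pyfunc.log_model(
        artifact_path="sql_agent",
        python_model="sql_model.py",        # path to the model-as-code file
        registered_model_name="sql_agent",  # illustrative registry name
    )
```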

Motivate this step (logging the model) a little more? e.g. can mention versioning, sharing, packaging for deployment, etc.

Comment on lines +442 to +461
```python
def translate_input(state: GraphState):
    print("---TRANSLATING INPUT---")
    messages = state["messages"]
    user_input = messages[-1][1]  # Get the latest user input

    # Translation prompt for the model
    translation_prompt = f"""
    Translate the following text to English. If the text is already in English, repeat it exactly without any additional explanation.

    Text:
    {user_input}
    """
    # Call the OpenAI LLM to translate the text
    translated_response = llm.invoke(translation_prompt)
    translated_text = translated_response.content.strip()  # Access the 'content' attribute and strip any extra spaces
    state["translated_input"] = translated_text  # Save the translated input
    print(f"Translated Input: {translated_text}")

    return state
```

Does translating to English before translating to SQL improve performance? Or would it work just as well to translate from whatever the input language is to SQL? Worth motivating this step.

Comment on lines +510 to +527
def safety_check(state: GraphState):
    print("---PERFORMING SAFETY CHECK---")
    translated_input = state["translated_input"]
    messages = state["messages"]
    error = "no"

    # List of disallowed SQL operations (e.g., DELETE, DROP)
    disallowed_operations = ['CREATE', 'DELETE', 'DROP', 'INSERT', 'UPDATE', 'ALTER', 'TRUNCATE', 'EXEC', 'EXECUTE']
    pattern = re.compile(r'\b(' + '|'.join(disallowed_operations) + r')\b', re.IGNORECASE)

    # Check if the input contains disallowed SQL operations
    if pattern.search(translated_input):
        print("Input contains disallowed SQL operations. Halting the workflow.")
        error = "yes"
        messages += [("assistant", "Your query contains disallowed SQL operations and cannot be processed.")]
    else:
        # Check if the input contains inappropriate content
        safety_prompt = f"""

@djliden djliden Dec 12, 2024


curious about using pattern search on the (translated) natural language input—wouldn't checking for disallowed SQL operations make more sense after the SQL is generated?

i.e. what would happen if the natural language input is "please get rid of the customers table." It doesn't look like checks for disallowed operations are run again after the sql is generated, so might it be possible that the system would generate and run a drop table command as long as the user didn't explicitly say "drop table"?
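
(For illustration, one way to close that gap would be to re-run the same disallowed-operations pattern against the generated SQL before execution; a sketch only, reusing the pattern from the excerpt above:)

```python
import re

disallowed_operations = ['CREATE', 'DELETE', 'DROP', 'INSERT', 'UPDATE', 'ALTER', 'TRUNCATE', 'EXEC', 'EXECUTE']
pattern = re.compile(r'\b(' + '|'.join(disallowed_operations) + r')\b', re.IGNORECASE)

def generated_sql_is_safe(sql_code: str) -> bool:
    # Reject any generated statement containing a write/DDL keyword,
    # regardless of how the natural-language request was phrased
    return not pattern.search(sql_code)
```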

Comment on lines +745 to +803
The `sql_check` node validates the generated SQL query for safety and integrity before execution.

**Purpose:** Ensure the SQL query adheres to safety and syntactical standards.

**Process:**

- Executes the query within a transactional savepoint to test its validity.

- Rolls back any changes after validation.

- Flags errors and updates the state if validation fails.

**Key Considerations:**

- Detects potentially destructive operations.

- Provides detailed feedback on validation errors.

**Examples:**

- Input SQL: _"SELECT name FROM customers WHERE city = 'New York';"_

- Validation: Query is valid.

- Input SQL: _"SELECT MONTH(date) AS month, SUM(total) AS total_sales FROM orders GROUP BY MONTH(date);"_

- Response: _"Your SQL query failed to execute: no such function: MONTH."_

**Code:**

```python
def sql_check(state: GraphState):
    print("---VALIDATING SQL QUERY---")
    messages = state["messages"]
    sql_solution = state["generation"]
    error = "no"

    sql_code = sql_solution.sql_code.strip()

    try:
        # Start a savepoint for the transaction
        conn.execute('SAVEPOINT sql_check;')
        # Attempt to execute the SQL query
        cursor.execute(sql_code)
        # Roll back to the savepoint to undo any changes
        conn.execute('ROLLBACK TO sql_check;')
        print("---SQL QUERY VALIDATION: SUCCESS---")
    except Exception as e:
        # Roll back in case of error
        conn.execute('ROLLBACK TO sql_check;')
        print("---SQL QUERY VALIDATION: FAILED---")
        print(f"Error: {e}")
        messages += [("user", f"Your SQL query failed to execute: {e}")]
        error = "yes"

    state["error"] = error

    return state
```

This seems to validate that the SQL runs, but I don't see where it checks for safety or detects potentially destructive operations. As far as I can tell, a drop table command would clear this step. See earlier note—it looks like the safety check occurs before the sql is generated.

Comment on lines +1050 to +1064
## Viewing Traces in MLflow

Traces can be easily accessed by navigating to the MLflow experiment of interest and clicking on the "Tracing" tab. Once inside, selecting a specific trace provides detailed execution information.

Each trace includes:

1. **Execution Graphs**: Visualizations of the workflow steps.
2. **Inputs and Outputs**: Detailed logs of data processed at each step.

This granular visibility enables developers to debug and optimize their workflows effectively.

By leveraging MLflow tracing, we ensure that our Multi-Lingual Query Engine remains transparent, auditable, and scalable.

![mlflow_tracing_gif](mlflow_trace.gif)
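
For context, capturing these traces typically takes very little code. A minimal sketch of how tracing could be switched on before invoking the workflow (the experiment name is illustrative; MLflow's LangChain autologging integration also records LangGraph runs):

```python
import mlflow

mlflow.set_experiment("multilingual-query-engine")  # illustrative experiment name
mlflow.langchain.autolog()  # records a trace span for each node and LLM call

# ...then invoke the compiled LangGraph workflow as usual, e.g.:
# solution = app.invoke(initial_state)
```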


I really like this, would love some more emphasis on how tracing lets you visualize the whole graph execution.


We’ll start by demonstrating how to leverage LangGraph’s capabilities to build a dynamic AI workflow. This workflow integrates OpenAI and external data sources, such as a Vector Store and an SQLite database, to process user input, perform safety checks, query databases, and generate meaningful responses.

Throughout this post, we’ll leverage MLflow’s Models from Code feature to manage the lifecycle of the Multilingual Query Engine. This approach allows the AI workflow to be treated like a traditional ML model, enabling tracking, versioning, and deployment across various serving infrastructures.

Might want to do a doc-links directly to this page for the Models from Code reference :) https://mlflow.org/docs/latest/model/models-from-code.html

2. **Multi-Agent Design**: The AI Workflow includes multiple interactions with OpenAI and other external tools throughout the workflow.

3. **Cyclical Graph Structure**: The graph’s cyclical nature introduces a robust retry mechanism. This mechanism dynamically addresses failures by looping back to previous stages when needed, ensuring continuous graph execution. (Details of this mechanism will be discussed later.)
## AI Workflow Overview

Suggested change
## AI Workflow Overview
## AI Workflow Overview

Linting - header sections need spaces on either side

#### Step 1: Load SQL Documentation
The first step in creating a FAISS Vector Store with SQL query generation guidelines is to load SQL documentation from the [W3Schools SQL page](https://www.w3schools.com/sql/) using Langchain's RecursiveUrlLoader. This tool retrieves the documentation, allowing us to use it as a knowledge base for our engine.
#### Step 2: Split the Text into Manageable Chunks
The loaded SQL documentation is a lengthy text, making it difficult to be effectively ingested by the LLM. To address this, the next step involves splitting the text into smaller, manageable chunks using Langchain's RecursiveCharacterTextSplitter. By splitting the text into chunks of 500 characters with a 50-character overlap, we ensure the AI has sufficient context while minimizing the risk of losing important information that spans across chunks. The split_text method applies this splitting process, storing the resulting pieces in a list called documents.
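
A rough sketch of these two steps, assuming Langchain's RecursiveUrlLoader and RecursiveCharacterTextSplitter as described above (the crawl depth and variable names are illustrative):

```python
from langchain_community.document_loaders import RecursiveUrlLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Step 1: pull the SQL documentation pages
loader = RecursiveUrlLoader("https://www.w3schools.com/sql/", max_depth=2)
docs = loader.load()
full_text = "\n".join(doc.page_content for doc in docs)

# Step 2: split into 500-character chunks with a 50-character overlap
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
documents = splitter.split_text(full_text)
```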

Suggested change
The loaded SQL documentation is a lengthy text, making it difficult to be effectively ingested by the LLM. To address this, the next step involves splitting the text into smaller, manageable chunks using Langchain's RecursiveCharacterTextSplitter. By splitting the text into chunks of 500 characters with a 50-character overlap, we ensure the AI has sufficient context while minimizing the risk of losing important information that spans across chunks. The split_text method applies this splitting process, storing the resulting pieces in a list called documents.
The loaded SQL documentation is a lengthy text, making it difficult to be effectively ingested by the LLM. To address this, the next step involves splitting the text into smaller, manageable chunks using Langchain's RecursiveCharacterTextSplitter. By splitting the text into chunks of 500 characters with a 50-character overlap, we ensure the language model has sufficient context while minimizing the risk of losing important information that spans across chunks. The split_text method applies this splitting process, storing the resulting pieces in a list called 'documents'.

### FAISS Vector Store

To build an effective Natural Language to SQL engine capable of generating accurate and executable SQL queries, we leverage Langchain's FAISS Vector Store feature. This setup allows the system to search and extract SQL query generation guidelines from W3Schools SQL documents previously stored in the Vector Database, enhancing the success of SQL query generation.
#### Step 1: Load SQL Documentation

Let's make sure to leave a blank new line on either side of any heading section


Details on OpenAI implementation will be provided later on in the Node implementation section.
### FAISS Vector Store


Might want to mention alternatives to an in-memory Vector Store for permanent / more scalable shared resource embeddings storage that can be shared across projects.


# Save the vector store to disk
vector_store.save_local(vector_store_dir)
print("Vector store created and saved to disk.")

Could we convert the print statements to _logger.info() statements instead to show curious readers how to avoid common linting issues in their code?
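
For reference, a minimal version of that change might look like the following (using Python's standard logging module and reusing the `vector_store` and `vector_store_dir` names from the excerpt above):

```python
import logging

_logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

# Save the vector store to disk
vector_store.save_local(vector_store_dir)
_logger.info("Vector store created and saved to disk.")
```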


### SQLite Database

The SQLite database is a key component of the Multilingual Query Engine, serving as the structured data repository that supports efficient SQL query generation, validation, and execution by enabling:

It might be worth mentioning why this is chosen for this example. Personally, I'm a huge fan of the portability and performance of sqlite and it definitely has many uses far beyond just demonstrations of concepts. As a self-contained data storage layer for an application, it's phenomenal at what it does and can greatly simplify developers' lives who would otherwise assume that they need to spin up a MySQL / PostGres DB for something that a local disk DB would handle much better.


- The corresponding **SQL code** ready for execution.

- **Adaptable and Reliable**: Uses GPT-4 for robust, consistent query generation, minimizing manual effort and errors.
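
As a rough sketch of what such a structured generation call can look like (assuming `langchain_openai`'s `ChatOpenAI` and a hypothetical `SQLSolution` schema; the model name mirrors the text above and could be swapped per the comment below):

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI


class SQLSolution(BaseModel):
    # Hypothetical structured output: an explanation plus executable SQL
    description: str = Field(description="Explanation of the query")
    sql_code: str = Field(description="The SQL statement to execute")


llm = ChatOpenAI(model="gpt-4", temperature=0)
sql_generator = llm.with_structured_output(SQLSolution)

solution = sql_generator.invoke(
    "Given the schema of the 'orders' table, write a SQL query answering: "
    "How many orders were made in November?"
)
print(solution.sql_code)
```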

Should we update this to use gpt-4o-mini? The tool calling functionality with that LLM build is far superior to base gpt-4
