Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when setting PostgreSQL search_path using exec_driver_sql in SQLDatabase class #17231

Closed
4 tasks done
Zafershah24 opened this issue Feb 8, 2024 · 2 comments
Closed
4 tasks done
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: google Primarily related to Google GenAI or VertexAI integrations

Comments

@Zafershah24
Copy link

Zafershah24 commented Feb 8, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.

Example Code

    def _execute(
        self,
        command: str,
        fetch: Literal["all", "one"] = "all",
    ) -> Sequence[Dict[str, Any]]:
        """
        Executes SQL command through underlying engine.

        If the statement returns no rows, an empty list is returned.
        """
        with self._engine.begin() as connection:  # type: Connection
            if self._schema is not None:
                if self.dialect == "snowflake":
                    connection.exec_driver_sql(
                        "ALTER SESSION SET search_path = %s", (self._schema,)
                    )
                elif self.dialect == "bigquery":
                    connection.exec_driver_sql("SET @@dataset_id=?", (self._schema,))
                elif self.dialect == "mssql":
                    pass
                elif self.dialect == "trino":
                    connection.exec_driver_sql("USE ?", (self._schema,))
                elif self.dialect == "duckdb":
                    # Unclear which parameterized argument syntax duckdb supports.
                    # The docs for the duckdb client say they support multiple,
                    # but `duckdb_engine` seemed to struggle with all of them:
                    # https://github.com/Mause/duckdb_engine/issues/796
                    connection.exec_driver_sql(f"SET search_path TO {self._schema}")
                elif self.dialect == "oracle":
                    connection.exec_driver_sql(
                        f"ALTER SESSION SET CURRENT_SCHEMA = {self._schema}"
                    )
                elif self.dialect == "sqlany":
                    # If anybody using Sybase SQL anywhere database then it should not
                    # go to else condition. It should be same as mssql.
                    pass
                elif self.dialect == "postgresql":  # postgresql
                    connection.exec_driver_sql("SET search_path TO %s", (self._schema,))

            cursor = connection.execute(text(command))
            if cursor.returns_rows:
                if fetch == "all":
                    result = [x._asdict() for x in cursor.fetchall()]
                elif fetch == "one":
                    first_result = cursor.fetchone()
                    result = [] if first_result is None else [first_result._asdict()]
                else:
                    raise ValueError("Fetch parameter must be either 'one' or 'all'")
                return result
        return []

Error Message and Stack Trace (if applicable)

SELECT * FROM metadata_sch_stg.company_datasets LIMIT 2←[0m←[36;1m←[1;3mError: (pg8000.exceptions.DatabaseError) {'S': 'ERROR', 'V': 'ERROR', 'C': '42601', 'M': 'syntax error at or near "$1"', 'P': '20', 'F': 'scan.l', 'L': '1180', 'R': 'scanner_yyerror'}
[SQL: SET search_path TO %s]
[parameters: ('metadata_sch_stg',)]

Description

When attempting to set the PostgreSQL search_path using exec_driver_sql within the SQLDatabase class, an error is thrown. The relevant code snippet is as follows:

elif self.dialect == "postgresql":  # postgresql
    connection.exec_driver_sql("SET search_path TO %s", (self._schema,))

This line attempts to set the search_path to the schema defined in the self._schema attribute. However, this results in a syntax error because the parameter substitution (%s) is not supported for the SET command in PostgreSQL.

Expected Behavior:
The search_path should be set to the specified schema without errors, allowing subsequent queries to run within the context of that schema.

Actual Behavior:
A syntax error is raised, indicating an issue with the SQL syntax near the parameter substitution placeholder.

Steps to Reproduce the error:

Instantiate an SQLDatabase object with the PostgreSQL dialect.
Change the Postgres schema to any other schema, other that 'public' Schema.
Observe the syntax error.

System Info

langchain==0.1.4
langchain-community==0.0.16
langchain-core==0.1.17
langchain-google-vertexai==0.0.3
langsmith==0.0.85
pg8000==1.29.8
SQLAlchemy==2.0.16
cloud-sql-python-connector==1.2.4
OS: Windows

Copy link

dosubot bot commented Feb 8, 2024

🤖

Thank you for the detailed report, Zafershah24. The error you're encountering is due to the incorrect usage of parameterized SQL in the PostgreSQL section of the _execute method in the SQLDatabase class. The correct way to use parameterized SQL with PostgreSQL is to use %s as placeholders for parameters, not ?.

Here's the corrected code:

elif self.dialect == "postgresql":  # postgresql
    connection.exec_driver_sql(
        "SET search_path TO %s",
        (self._schema,),
    )

This code sets the search_path for the PostgreSQL connection to the schema specified in self._schema. The %s is a placeholder for the schema name, which is passed as a tuple (self._schema,) to the exec_driver_sql method.

The exec_driver_sql method in the SQLDatabase class is used to execute SQL commands through the underlying engine. It is specifically used in the _execute method to set the search path or schema for the SQL session, depending on the dialect of the SQL engine being used.

For example, if the dialect is 'snowflake', it executes the SQL command "ALTER SESSION SET search_path = %s" where %s is replaced with the schema. Similarly, for 'bigquery' it executes "SET @@dataset_id=?", for 'trino' it executes "USE ?", and for 'postgresql' and other compatible dialects, it executes "SET search_path TO %s".

In the case of 'duckdb' and 'oracle', it directly sets the search_path or CURRENT_SCHEMA to the schema without using parameterized arguments.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@dosubot dosubot bot added 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: google Primarily related to Google GenAI or VertexAI integrations labels Feb 8, 2024
Zafershah24 added a commit to Zafershah24/langchain that referenced this issue Feb 9, 2024
@alex-ber
Copy link

alex-ber commented Mar 6, 2024

See #18512

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 5, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 12, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: google Primarily related to Google GenAI or VertexAI integrations
Projects
None yet
Development

No branches or pull requests

2 participants