[ADAP-549] [CT-2577] [Regression] Error Only %s and %% are supported in the query #441
Comments
Thank you for reporting this, @craigchurch! In dbt 1.5 we switched from using
This is a delightful example of the regression -- 100% guaranteed to get a laugh 😂
|
@dbeatty10 we are actually not laughing, because that's yet another regression in |
@jaklan we do not have a fix for this regression yet, and I would recommend downgrading to 1.4 in the meantime.
I'm sorry
We take it seriously whenever users are affected by a bug in a release. I am very sorry if it seemed that I was laughing at the fact that this regression broke your models, especially if you have experienced multiple issues with dbt-redshift 1.5. I want to clarify that my intention was purely to show appreciation for the clever way that @craigchurch wrote up the following in the original bug report:
I didn't fully consider how my words could be interpreted, and I want to extend my heartfelt apology.
Context
We made a significant change for dbt-redshift between 1.4 and 1.5 by switching from
Prior to the April 27, 2023 1.5 release, we did five public releases and release candidates from Feb 22, 2023 onward, and we fixed all known regressions before the release. This issue has been present since 1.5.0b1, but unfortunately it wasn't covered by our automated test cases, and it wasn't discovered and reported by any end users until just a few days ago. |
@dbeatty10 no worries, we are just a bit tired of addressing one problem after another following the upgrade (among others, this one, dbt-labs/dbt-core#7465, and #427), so that's why I just wanted to bring the discussion back to the issue 😉 |
@dbeatty10 maybe you would consider reverting the switch and returning to |
@jaklan That is totally understandable that you are tired especially experiencing several issues during this upgrade. Thank you for sticking with us, raising these issues, and sharing how they are affecting you. We've got line of sight to resolve all known |
@dbeatty10 got it, do you have any planned timeline for releasing |
@Fleid do you have an expected timeline for dbt-redshift 1.5.2? |
I'm assuming that this issue has the same underlying cause as #432, and we could consider closing one as a duplicate of the other. Also guessing that it has to do with
We should try setting |
@dbeatty10 Thanks for the suggested solution. Working on implementing the suggestion and will open up a PR for this! |
@jiezhen-chen I gave
I suspect there is a bug that affects
comment on table customers is '95% of customer records';
In the meantime, adding the following to here could be a workaround:
{#
  By using dollar-quoting like this, users can embed anything they want into their comments
  (including nested dollar-quoting), as long as they do not use this exact dollar-quoting
  label. It would be nice to just pick a new one but eventually you do have to give up.
#}
{% macro postgres_escape_comment(comment) -%}
  {% if comment is not string %}
    {% do exceptions.raise_compiler_error('cannot escape a non-string: ' ~ comment) %}
  {% endif %}
  {%- set magic = '$dbt_comment_literal_block$' -%}
  {%- if magic in comment -%}
    {%- do exceptions.raise_compiler_error('The string ' ~ magic ~ ' is not allowed in comments.') -%}
  {%- endif -%}
  {#- -- escape % until the underlying issue is fixed -#}
  {%- set comment = comment|replace("%", "%%") -%}
  {{ magic }}{{ comment }}{{ magic }}
{%- endmacro %} |
@dbeatty10 Seems like both pg8000 and redshift_connector have this convert_paramstyle method that is raising this error. Another potential solution would be to replace % with %% in connections.py itself |
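As a rough illustration of the "replace % with %% on the Python side" idea, here is a minimal sketch. The helper name `escape_percent` is hypothetical and is not part of dbt-redshift or redshift_connector; it only shows why doubling works for a format-style placeholder syntax, where `%%` stands for one literal `%`. It doubles every percent sign, so it would also break genuine `%s` placeholders and is only appropriate for SQL that carries no bind parameters.

```python
def escape_percent(sql: str) -> str:
    """Double every percent sign so a format-style parser reads it as a literal.

    Hypothetical helper for illustration only. It assumes the SQL contains
    no real %s placeholders, because those would be doubled as well.
    """
    return sql.replace("%", "%%")


raw = "comment on table customers is '95% of customer records';"
escaped = escape_percent(raw)

# Python's own %-formatting follows the same convention: %% collapses to %.
round_trip = escaped % ()
```

Here `round_trip` equals `raw` again, which is the behavior a format-style driver is expected to reproduce when it sends the statement to the database.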
@jiezhen-chen I saw that! Read a lot more
The InterfaceError is not affecting a bunch of other SQL queries I tried that contain a
It might be that this error only pops up with multi-line comments -- I didn't try both single line and multi-line comments -- only tried multi-line comments using dollar-quoted string constants like:
comment on table customers is $dbt_comment_literal_block$Dianne's 100% horse
$dbt_comment_literal_block$; |
ooooooh, looking at the comment above, we can see why it's in the section of code that is raising that error. It's 'cause it doesn't have any single quotes (or any of the rest of these):
So I'm guessing |
@jiezhen-chen here's some code that roughly matches the coding style of
Code
import typing


def convert_paramstyle(style: str, query: str) -> str:
    # `style` is unused in this demo; only dollar-quoting is tracked here.
    OUTSIDE: int = 0  # outside quoted string
    INSIDE_SQ: int = 1  # inside single-quote string '...'
    INSIDE_QI: int = 2  # inside quoted identifier "..."
    INSIDE_ES: int = 3  # inside escaped single-quote string, E'...'
    INSIDE_PN: int = 4  # inside parameter name eg. :name
    INSIDE_CO: int = 5  # inside inline comment eg. --
    INSIDE_DQ: int = 6  # inside dollar-quoted tag eg. $tag$ inside $tag$
    OPENING_DQ: int = 7  # opening a dollar-quoted tag eg. $tag$
    CLOSING_DQ: int = 8  # closing a dollar-quoted tag eg. $tag$ inside $tag$
    string_constant: str = ""  # TODO - remove
    starting_tag: str = ""
    closing_tag: str = ""
    output_query: typing.List[str] = []
    state: int = OUTSIDE
    for i, c in enumerate(query):
        if state == OUTSIDE:
            if c == "$":
                # starting tag for a dollar-quoted string constant
                output_query.append(c)
                state = OPENING_DQ
                starting_tag = ""
            else:
                output_query.append(c)
        elif state == OPENING_DQ:
            if c == "$":
                # end of a tag for a dollar-quoted string constant
                # and start of the actual string constant
                output_query.append(c)
                state = INSIDE_DQ
            else:
                output_query.append(c)
                starting_tag += c
        elif state == INSIDE_DQ:
            string_constant += c  # TODO - remove
            if c == "$":
                # potential closing tag for a dollar-quoted string constant
                output_query.append(c)
                state = CLOSING_DQ
            else:
                output_query.append(c)
        elif state == CLOSING_DQ:
            string_constant += c  # TODO - remove
            if c == "$" and closing_tag == starting_tag:
                # end of a closing tag for a dollar-quoted string constant
                # TODO - remove the next 3 lines - only for demo purposes
                string_constant = string_constant[:-(len(closing_tag) + 2)]
                print(f"string_constant: {string_constant}")
                string_constant = ""
                output_query.append(c)
                closing_tag = ""
                state = OUTSIDE
            else:
                output_query.append(c)
                closing_tag += c
                if not starting_tag.startswith(closing_tag):
                    # failed potential closing tag
                    state = INSIDE_DQ
                    closing_tag = ""
        else:
            output_query.append(c)
    # The demo returns the query text unchanged; it only tracks the states.
    return "".join(output_query)


query = "$$Dan's cats$$ $<tag>$Dianne's $$horse$$ $<tag$ $<tagg $<tag> $ $<tag>$ $animal$ Dani's dog $animal$"
result = convert_paramstyle('faux style', query)
print(result)

python dollar_quoting.py
|
Re-opening since I believe #441 only mitigates certain use-cases but doesn't resolve this issue. The ultimate solution is probably to fully support dollar-quoted string literals (which will need to be done within |
@dbeatty10 Seems like the only time this is an issue is when '%' is used instead of '%%' in a COMMENT. Is that correct? I haven't been able to reproduce the same error running other types of queries that contain '%'. To consolidate our findings, and correct me if I'm wrong - COMMENT statements that contain one single '%' are triggering the error in
These options can be viable paths forward:
@sathiish-kumar and I are leaning towards option 2, which is to handle this within Python rather than relying on Jinja. Option 3 would introduce unnecessary engineering cost for the Redshift driver team. Considering that I dug around and haven't found any complaints about this issue for redshift_connector, I think this may be a dbt-redshift-specific issue that can be handled within dbt. |
@dbeatty10 Thanks for merging 466 so quickly! While the change in 466 takes care of the issue in comments, you're right about it failing with cases like the one below:
We could try detecting dollar-quoted string literals in add_query in connections.py, and replace instances of |
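To make the idea of detecting dollar-quoted string literals more concrete, here is a minimal sketch, assuming the goal is to double percent signs only inside those literals. The names `DOLLAR_QUOTED` and `escape_percent_in_dollar_quotes` are hypothetical and do not exist in dbt-redshift or redshift_connector. The regular expression is a simplification: it does not know about ordinary quoted strings or SQL comments, so a `$tag$` appearing inside those would be misread.

```python
import re

# Opening delimiter ($$ or $tag$), lazily matched body, then the same delimiter.
# Simplification: \w also allows a leading digit, which real tags do not.
DOLLAR_QUOTED = re.compile(r"(\$\w*\$)(.*?)\1", re.DOTALL)


def escape_percent_in_dollar_quotes(sql: str) -> str:
    """Double % inside dollar-quoted literals and leave the rest untouched.

    Hypothetical helper for illustration; placeholders such as %s outside
    the dollar-quoted sections are deliberately not modified.
    """

    def _double(match: re.Match) -> str:
        delimiter, body = match.group(1), match.group(2)
        return delimiter + body.replace("%", "%%") + delimiter

    return DOLLAR_QUOTED.sub(_double, sql)


comment_sql = (
    "comment on table customers is "
    "$dbt_comment_literal_block$Dianne's 100% horse\n"
    "$dbt_comment_literal_block$;"
)
fixed = escape_percent_in_dollar_quotes(comment_sql)
mixed = escape_percent_in_dollar_quotes("select $$50%$$ as pct where id = %s")
```

In this sketch `fixed` contains `100%% horse`, while the `%s` placeholder in `mixed` is left alone. A character-by-character state machine is more robust than a regular expression once quoted strings and comments have to be handled too.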
@craigchurch, I haven't been able to reproduce the scenario where there's a percent sign in a SQL comment of a model
|
@dataders Seems like inline comments are handled by convert_paramstyle in redshift_connector. |
@jiezhen-chen would you be interested in doing a proof of concept of amalgamating @dbeatty10's code snippet within #441 (comment) into the existing
This seems more and more to be a bug in the connector library as opposed to a bug in dbt-redshift. For example, aws/amazon-redshift-python-driver#156 looks to be the same case. @Brooke-white does this track for you as well? I'd appreciate your help here |
@dbeatty10's change in 466 mitigated this issue for the COMMENT keyword, but this issue persists when parsing a comment within a query, if the comment is wrapped in
I agree that the ultimate solution is to have redshift_connector support dollar-quoted string literals and comments in |
@jiezhen-chen I agree with you that the ultimate solution is having redshift_connector support dollar-quoted string literals and multi-line comments 👍 Unless #466 is reverted first, I'd be leery of doing further replacements of My suggestion would be to focus our effort towards implementing multiline + $$ support directly within redshift_connector or adding the relevant monkey patch within dbt-redshift. The former seems much preferable to the latter. |
I just ran into this issue with a model in which the author included an apostrophe in a multi-line comment:
The Hilariously, if only the author had included a second apostrophe in their comment, things would run fine! |
It doesn't need to be part of a comment. This also happens if you use |
Hi folks, I maintain redshift-connector. This is a long-standing but just recently discovered bug in our convert_paramstyle method like a few of you identified above :) . We are tracking it in aws/amazon-redshift-python-driver#156 and have a fix ready for our next release scheduled for next week (week of June 5th). Thank you all for your patience! |
@craigchurch @jaklan @stlee-aurora @shanehodgkins thanks so much for both your
I'm going to close this issue, but will happily re-open if you report more issues. thanks! |
Is this a regression in a recent version of dbt-core?
Current Behavior
URLs in markdown files that have a percent sign are causing the error
For example this markdown will trigger the error message:
A single percent sign in the .md file will also trigger the error. For example:
A single percent sign in the SQL file will also trigger the error, such as a percent sign in a comment in the SQL like this:
Expected/Previous Behavior
These errors were not present in version dbt 1.2.
Steps To Reproduce
Build the dbt project
Relevant log output
Environment
Which database adapter are you using with dbt?
redshift
Additional Context
No response