TO_NUMBER without format edge case #172

Merged
merged 5 commits into from
Mar 12, 2024

Conversation

sundarshankar89
Collaborator

Closes #166

sundarshankar89 requested a review from a team as a code owner March 11, 2024 10:17

codecov bot commented Mar 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.95%. Comparing base (e1ef05e) to head (54150d9).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #172      +/-   ##
==========================================
+ Coverage   95.92%   95.95%   +0.02%     
==========================================
  Files          19       19              
  Lines        1228     1236       +8     
  Branches      200      200              
==========================================
+ Hits         1178     1186       +8     
  Misses         25       25              
  Partials       25       25              


src/databricks/labs/remorph/snow/databricks.py (outdated, resolved)
src/databricks/labs/remorph/snow/snowflake.py (outdated, resolved)
tests/unit/snow/test_databricks.py (resolved)
sundarshankar89 requested a review from nfx March 12, 2024 04:04
Collaborator

nfx left a comment

Lgtm

nfx merged commit bc089b6 into main Mar 12, 2024
6 checks passed
nfx deleted the patch/to_number_166 branch March 12, 2024 12:35
sundarshankar89 added a commit that referenced this pull request Mar 15, 2024
* Added Pylint Checker ([#149](#149)). This change adds a Pylint checker to enforce a consistent code style and to flag potential bugs and errors in the Python code. The Pylint configuration sets limits such as line length, the maximum number of arguments per function, and the maximum number of lines per module, and loads several plugins that add further checks. It also customizes the naming-convention checks and the handling of constructs such as exceptions, logging statements, and import statements. Various files, including cli.py, morph_status.py, validate.py, and several SQL-related files, were changed to conform to this configuration.
* Fixed edge case where column name is same as alias name ([#164](#164)). This commit fixes edge cases where a column name conflicts with an alias name in a SQL query, addressing issues [#164](#164) and [#130](#130). The `check_for_unsupported_lca` function now uses two helper functions: `_find_aliases_in_select`, which detects aliases that share a name with a column in a SELECT expression, and `_find_invalid_lca_in_window`, which identifies invalid lateral column aliases (LCAs) in window functions. The `find_windows_in_select` function has been refactored and renamed to `_find_windows_in_select` for readability. The `transpile` and `parse` functions in `sql_transpiler.py` now use try-except blocks that handle `ParseError`, `TokenError`, and `UnsupportedError` when a column name matches an alias name. A new unit test, `test_query_with_same_alias_and_column_name`, passes a SQL query whose subquery defines a column alias `ca_zip` that is also used as a column name in the same query, and verifies that the conflict is handled correctly.
* `TO_NUMBER` without `format` edge case ([#172](#172)). This commit addresses an unsupported use of the `TO_NUMBER` function in the Databricks SQL dialect: calling it without the `format` parameter. It introduces the constants `PRECISION_CONST` and `SCALE_CONST` (set to 38 and 0 respectively) as default values for the `precision` and `scale` parameters, and modifies the `_to_number` method to use them. An `UnsupportedError` is now raised when `TO_NUMBER` is called without a `format` parameter, so that users learn that `format` is required. Test cases have been added for `TO_DECIMAL`, `TO_NUMERIC`, and `TO_NUMBER` with format strings, including cases where the format is taken from table columns, and for the error raised when `TO_DECIMAL` is called without a format parameter.
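
The `TO_NUMBER` behavior described in the last item can be illustrated with a small sketch. This is not the project's `_to_number` method: the function name `render_to_number`, its parameters, the local `UnsupportedError` class, and the `CAST(... AS DECIMAL(p, s))` output shape are all assumptions made for illustration. Only the constant names and their values (38 and 0) come from the description above.

```python
from typing import Optional

# Defaults named in the release note above.
PRECISION_CONST = 38
SCALE_CONST = 0


class UnsupportedError(Exception):
    """Hypothetical stand-in for the transpiler's unsupported-construct error."""


def render_to_number(
    value_sql: str,
    fmt: Optional[str] = None,
    precision: Optional[int] = None,
    scale: Optional[int] = None,
) -> str:
    """Render a TO_NUMBER call for the target dialect (illustrative only)."""
    # Without a format string the call is rejected instead of being
    # silently translated into SQL that the target dialect cannot run.
    if fmt is None:
        raise UnsupportedError("TO_NUMBER requires a `format` argument")
    # Fall back to the default precision and scale when none are given.
    precision = PRECISION_CONST if precision is None else precision
    scale = SCALE_CONST if scale is None else scale
    # Assumed output shape: wrap the call in a cast to the requested decimal type.
    return f"CAST(TO_NUMBER({value_sql}, {fmt}) AS DECIMAL({precision}, {scale}))"


# A call with a format string and no explicit precision or scale:
sql = render_to_number("col1", "'$99.99'")
# sql == "CAST(TO_NUMBER(col1, '$99.99') AS DECIMAL(38, 0))"
```

Raising early keeps the failure at transpile time, where the offending expression can be reported, rather than at query execution time.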

Dependency updates:

 * Bump sqlglot from 21.2.1 to 22.0.1 ([#152](#152)).
 * Bump sqlglot from 22.0.1 to 22.1.1 ([#159](#159)).
 * Updated databricks-labs-blueprint[yaml] requirement from ~=0.2.3 to >=0.2.3,<0.4.0 ([#162](#162)).
 * Bump sqlglot from 22.1.1 to 22.2.0 ([#161](#161)).
 * Bump sqlglot from 22.2.0 to 22.2.1 ([#163](#163)).
 * Updated databricks-sdk requirement from <0.21,>=0.18 to >=0.18,<0.22 ([#168](#168)).
 * Bump sqlglot from 22.2.1 to 22.3.1 ([#170](#170)).
 * Updated databricks-labs-blueprint[yaml] requirement from <0.4.0,>=0.2.3 to >=0.2.3,<0.5.0 ([#171](#171)).
 * Bump sqlglot from 22.3.1 to 22.4.0 ([#173](#173)).
sundarshankar89 mentioned this pull request Mar 15, 2024
github-merge-queue bot pushed a commit that referenced this pull request Mar 15, 2024
sundarshankar89 added a commit to sundarshankar89/remorph that referenced this pull request Jan 2, 2025
sundarshankar89 added a commit to sundarshankar89/remorph that referenced this pull request Jan 3, 2025
sundarshankar89 added a commit to sundarshankar89/remorph that referenced this pull request Jan 3, 2025
Development

Successfully merging this pull request may close these issues.

TO_NUMBER should cast as number and use given precision and scale
2 participants