Support INTEGER again in addition to INT in CREATE TABLE and CAST statements #3167

Merged: alamb merged 3 commits into apache:master from alamb/fix_csv on Aug 16, 2022

@alamb alamb commented Aug 15, 2022

Which issue does this PR close?

Closes #3059

Rationale for this change

Fixes a regression: CREATE TABLE (x INTEGER) does not work on master, although it worked previously.

apache/datafusion-sqlparser-rs#525 made INT and INTEGER different types (previously INT parsed to INTEGER). However, DataFusion was not updated.

What changes are included in this PR?

  1. Test
  2. Add support for INTEGER
  3. Explicitly name all other sqlparser types in the match to try and avoid such regressions in the future

Are there any user-facing changes?

Bug is fixed

@alamb alamb marked this pull request as ready for review August 15, 2022 21:29
@github-actions github-actions bot added core Core DataFusion crate sql SQL Planner labels Aug 15, 2022

alamb commented Aug 15, 2022

cc @waitingkuo

@@ -631,7 +631,7 @@ async fn register_aggregate_csv_by_sql(ctx: &SessionContext) {
 c2 INT NOT NULL,
 c3 SMALLINT NOT NULL,
 c4 SMALLINT NOT NULL,
-c5 INT NOT NULL,
+c5 INTEGER NOT NULL,

With this test change alone (before the fix below is applied), many of the sql_integration tests fail.

@@ -483,7 +483,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
 fn make_data_type(&self, sql_type: &SQLDataType) -> Result<DataType> {
     match sql_type {
         SQLDataType::BigInt(_) => Ok(DataType::Int64),
-        SQLDataType::Int(_) => Ok(DataType::Int32),
+        SQLDataType::Int(_) | SQLDataType::Integer(_) => Ok(DataType::Int32),

this is the fix
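
For readers without the crates at hand, here is a minimal, self-contained sketch of the idea behind the fix. It uses stand-in enums rather than the real sqlparser and Arrow types: `SqlType`, `ArrowType`, and the `Uuid` variant are illustrative assumptions, not the crates' actual definitions.

```rust
// Stand-in for sqlparser's data type enum (hypothetical, trimmed down).
// INT and INTEGER are separate variants, as they are after sqlparser-rs#525.
#[derive(Debug, Clone, PartialEq)]
pub enum SqlType {
    Int(Option<u64>),
    Integer(Option<u64>),
    BigInt(Option<u64>),
    Uuid,
}

// Stand-in for the Arrow data type enum (hypothetical, trimmed down).
#[derive(Debug, Clone, PartialEq)]
pub enum ArrowType {
    Int32,
    Int64,
}

// Map a parsed SQL type to an Arrow type.
pub fn make_data_type(sql_type: &SqlType) -> Result<ArrowType, String> {
    match sql_type {
        SqlType::BigInt(_) => Ok(ArrowType::Int64),
        // Both spellings denote the same logical type, so they share one arm.
        SqlType::Int(_) | SqlType::Integer(_) => Ok(ArrowType::Int32),
        // A type this sketch deliberately does not support.
        SqlType::Uuid => Err(format!("Unsupported SQL type {:?}", sql_type)),
    }
}
```

In this sketch, `make_data_type(&SqlType::Integer(None))` and `make_data_type(&SqlType::Int(None))` both return `Ok(ArrowType::Int32)`, which mirrors what the one-line change is meant to restore.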

@@ -498,7 +498,30 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
 SQLDataType::Date => Ok(DataType::Date32),
 SQLDataType::Time => Ok(DataType::Time64(TimeUnit::Nanosecond)),
 SQLDataType::Timestamp => Ok(DataType::Timestamp(TimeUnit::Nanosecond, None)),
-_ => Err(DataFusionError::NotImplemented(format!(
+// Explicitly list all other types so that if sqlparser

This does not change the behavior, but makes the current behavior more explicit in my opinion

It also surfaces some surprising things: several unsigned variants appear not to be supported, and neither is interval, as noted by @waitingkuo in #3166 (comment).
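
To illustrate the point about explicit arms, here is a hedged sketch with stand-in enums (the names `SqlType`, `ArrowType`, `UnsignedInt`, and `Interval` are hypothetical, not the real sqlparser variants). Both functions behave identically today. The difference is that if a new variant were added to `SqlType`, `with_wildcard` would keep compiling and silently reject it, while `explicit` would fail to compile until someone decided how to handle it.

```rust
// Hypothetical, trimmed-down stand-ins for the parser and Arrow type enums.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SqlType {
    Int,
    Integer,
    Date,
    UnsignedInt,
    Interval,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArrowType {
    Int32,
    Date32,
}

// Old style: a catch-all arm hides any variant added to the enum later.
pub fn with_wildcard(t: SqlType) -> Result<ArrowType, String> {
    match t {
        SqlType::Int | SqlType::Integer => Ok(ArrowType::Int32),
        SqlType::Date => Ok(ArrowType::Date32),
        _ => Err(format!("Unsupported SQL type {:?}", t)),
    }
}

// New style: unsupported variants are named, so the match is exhaustive
// without a wildcard and the compiler flags any newly added variant.
pub fn explicit(t: SqlType) -> Result<ArrowType, String> {
    match t {
        SqlType::Int | SqlType::Integer => Ok(ArrowType::Int32),
        SqlType::Date => Ok(ArrowType::Date32),
        SqlType::UnsignedInt | SqlType::Interval => {
            Err(format!("Unsupported SQL type {:?}", t))
        }
    }
}
```

The trade-off is a longer match in exchange for a compile-time reminder whenever the upstream enum grows.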

@alamb alamb changed the title from "Support INTEGER again in addition to INT" to "Support INTEGER again in addition to INT in CREATE TABLE and CAST statements" Aug 15, 2022
@waitingkuo

LGTM. thank you @alamb

@codecov-commenter

Codecov Report

Merging #3167 (8cf1549) into master (36def8f) will decrease coverage by 0.02%.
The diff coverage is 4.54%.

❗ Current head 8cf1549 differs from pull request most recent head 62ba0d4. Consider uploading reports for the commit 62ba0d4 to get more accurate results

@@            Coverage Diff             @@
##           master    #3167      +/-   ##
==========================================
- Coverage   85.87%   85.85%   -0.03%     
==========================================
  Files         291      291              
  Lines       52758    52778      +20     
==========================================
+ Hits        45307    45310       +3     
- Misses       7451     7468      +17     
Impacted Files Coverage Δ
datafusion/core/tests/sql/mod.rs 96.94% <ø> (ø)
datafusion/sql/src/planner.rs 81.23% <4.54%> (-0.77%) ⬇️
datafusion/expr/src/logical_plan/plan.rs 77.77% <0.00%> (+0.34%) ⬆️
datafusion/core/src/physical_plan/metrics/value.rs 87.43% <0.00%> (+0.50%) ⬆️


@alamb alamb merged commit 89bcfc4 into apache:master Aug 16, 2022
@alamb alamb deleted the alamb/fix_csv branch August 16, 2022 00:44

ursabot commented Aug 16, 2022

Benchmark runs are scheduled for baseline = 8e9a8d5 and contender = 89bcfc4. 89bcfc4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Development

Successfully merging this pull request may close these issues.

INTEGER type does't work while importing csv