Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix UNION ALL bug: thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1', ./src/datatypes/schema.rs:165:10 #1088

Merged
merged 1 commit into from
Oct 9, 2021

Conversation

xudong963
Copy link
Member

Which issue does this PR close?

Closes #1064

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the datafusion Changes in the datafusion crate label Oct 8, 2021
@xudong963
Copy link
Member Author

Finally, I found the real bug is at optimizing phrase where union logical plan is optimized.

PTAL,thanks @alamb @Dandandan @houqp

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 🕵️ work @xudong963 👍

Comment on lines 419 to 424
let mut new_schema = Vec::new();
for df_field in schema.fields() {
if union_required_fields.contains(df_field.field()) {
new_schema.push(df_field.clone());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you are into the whole functional programming thing, but you can probably write this without a mut Vec like

            let new_schema = DFSchema::new(
                schema.fields()
                    .iter()
                    .filter(|df_field| {
                        union_required_fields.contains(df_field.field())
                    })
                    .cloned()
                    .collect()
            )?;

I don't think it makes any practical difference (what you have here is fine too) I just figured I would point it out

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer us to go functional ere to keep the style consistent within the code base :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also like the cool and concise functional style. But because I'm not familiar with functional programming style, I chose conservative writing at the beginning. Thanks for your suggestion!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it took me a while to become comfortable with the functional style.

As an aside - I also later realized that using this functional style is more than just about code structure / readability -- it is one of the key ways Rust does fast and safe: in the mut Vec<..> version, every time data is pushed the capacity of the vec needs to be checked and extended if not large enough. In the iter() version often the Vec can be created up front and then written directly into.

Copy link
Member

@houqp houqp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @xudong963 !

Comment on lines 419 to 424
let mut new_schema = Vec::new();
for df_field in schema.fields() {
if union_required_fields.contains(df_field.field()) {
new_schema.push(df_field.clone());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer us to go functional ere to keep the style consistent within the code base :)

@houqp houqp added the bug Something isn't working label Oct 8, 2021
async fn union_all_with_aggregate() -> Result<()> {
let mut ctx = ExecutionContext::new();
let sql =
"SELECT SUM(d) FROM (SELECT 1 as c, 2 as d UNION ALL SELECT 1 as c, 3 AS d) as a";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

…e len is 1 but the index is 1', ./src/datatypes/schema.rs:165:10
@xudong963
Copy link
Member Author

@alamb @houqp Updated, maybe It's time to merge😄.

@houqp houqp merged commit 5a7daa3 into apache:master Oct 9, 2021
@houqp
Copy link
Member

houqp commented Oct 9, 2021

Bug fix merged, thanks @xudong963!

@alamb
Copy link
Contributor

alamb commented Oct 9, 2021

Thanks again @xudong963 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working datafusion Changes in the datafusion crate
Projects
None yet
4 participants