better disallowmissing error message #2966

Merged
merged 5 commits into from
Dec 18, 2021

bkamins
Member

@bkamins bkamins commented Dec 15, 2021

Improve the error message in case column conversion fails, e.g.:

julia> df = DataFrame(x=1:3, y=[1,2,missing])
3×2 DataFrame
 Row │ x      y       
     │ Int64  Int64?  
─────┼────────────────
   1 │     1        1 
   2 │     2        2 
   3 │     3  missing 

julia> disallowmissing(df)
ERROR: ArgumentError: Missing value found in column number 2 

julia> disallowmissing!(df)
ERROR: ArgumentError: Missing value found in column number 2
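To illustrate what "locating the problem" could look like, here is a minimal sketch that uses only Base Julia. The helper `first_missing` and the use of a NamedTuple of vectors in place of a DataFrame are assumptions made for illustration; this is not the code in the pull request.

```julia
# Hypothetical helper, not part of DataFrames.jl: scan named columns and
# report the name and row of the first missing value found.
function first_missing(cols::NamedTuple)
    for (name, col) in pairs(cols)
        row = findfirst(ismissing, col)   # `nothing` when the column has no missing
        row === nothing || return (column = name, row = row)
    end
    return nothing
end

# Same data as the example above, stored as a NamedTuple of vectors.
cols = (x = [1, 2, 3], y = [1, 2, missing])
loc = first_missing(cols)
```

Here `loc` identifies column `:y` and row 3, which is the information a more specific error message could include.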

@bkamins bkamins added this to the patch milestone Dec 15, 2021
@bkamins bkamins linked an issue Dec 15, 2021 that may be closed by this pull request
@bkamins bkamins requested a review from nalimilan December 15, 2021 11:52
@bkamins
Member Author

bkamins commented Dec 15, 2021

The doctest failure is unrelated. I fix it in #2967.

@jtrakk

jtrakk commented Dec 15, 2021

Thanks.

I don't use column numbers much so a name would be more familiar to me.

Maybe reporting a row too would be useful? I'm not sure.

@bkamins
Member Author

bkamins commented Dec 15, 2021

Fixed. Is it better now?

@jtrakk

jtrakk commented Dec 15, 2021

Cheers.

Closes #2965

Comment on lines 2079 to 2084
    if any(ismissing, x)
        y = x
    else
        y = disallowmissing(x)
    end
end
Member

Instead of this, we could just do error || continue in the if row !== nothing branch above.

Member Author

We could not use continue because we still need to execute line 2088, so instead we would need to write the following inside the if row !== nothing branch:

if error
    col = _names(df)[i]
    throw(ArgumentError("Missing value found in column :$col in row $row"))
else
    y = x
end

Do you think that would be simpler for potential readers of the code to understand?
I used this explicit logic because when I read the old condition:

if !error && Missing <: eltype(x) && any(ismissing, x)

I had to spend some time making sure it was OK (although I think I wrote it myself some time ago).

Therefore I would vote for leaving things as they are, since I think it is easier to understand if a user wants to inspect the code.
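The explicit branching argued for here can be sketched as follows. This is an illustration in Base Julia only, not the DataFrames.jl source: `narrow` is an assumed stand-in for `disallowmissing` on a single vector, `narrow_columns` is a hypothetical name, and a NamedTuple of vectors stands in for the data frame. The error text is the one quoted in the comment above.

```julia
# Assumed stand-in for disallowmissing on one vector (Base only).
narrow(x::AbstractVector) = convert(Vector{nonmissingtype(eltype(x))}, x)

# Hypothetical sketch of the per-column logic: find the first missing value,
# then either throw an informative error or keep the column unchanged.
function narrow_columns(cols::NamedTuple; error::Bool=true)
    out = map(keys(cols), values(cols)) do name, x
        row = findfirst(ismissing, x)
        if row !== nothing
            if error
                throw(ArgumentError("Missing value found in column :$name in row $row"))
            else
                y = x          # column contains missing values, leave it as is
            end
        else
            y = narrow(x)      # safe to drop Missing from the element type
        end
        y
    end
    return NamedTuple{keys(cols)}(out)
end

ok = narrow_columns((x = [1, 2, 3], y = Union{Int, Missing}[1, 2, 3]))
kept = narrow_columns((x = [1, 2, 3], y = [1, 2, missing]); error=false)
```

Each branch states its outcome directly, which is the readability argument made in the comment, at the cost of a few more lines than a single combined condition.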

Member

OK. But I'd use try... catch when !error too.
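One possible reading of this suggestion, sketched in Base Julia: attempt the conversion first and only look for the missing value if it fails, for both values of `error`. The function name `narrow_or_keep` and the use of `convert` in place of `disallowmissing` are assumptions for illustration; the merged code may differ.

```julia
# Hypothetical sketch: try the conversion, and handle a failure according
# to the `error` flag instead of scanning the column up front.
function narrow_or_keep(x::AbstractVector, name::Symbol; error::Bool=true)
    try
        return convert(Vector{nonmissingtype(eltype(x))}, x)
    catch
        row = findfirst(ismissing, x)
        # If no missing value is present, the failure had another cause.
        row === nothing && rethrow()
        error && throw(ArgumentError("Missing value found in column :$name in row $row"))
        return x               # error == false: keep the column unchanged
    end
end

converted = narrow_or_keep(Union{Int, Missing}[1, 2, 3], :y)
kept = narrow_or_keep([1, 2, missing], :y; error=false)
```

With this shape, a column without missing values is traversed once by the conversion, and the search for the offending row only happens on the failure path.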

Member Author

I have changed disallowmissing and disallowmissing! in the way I think you wanted.

@bkamins bkamins merged commit 83f0f96 into main Dec 18, 2021
@bkamins bkamins deleted the bl/disallowmissing_error branch December 18, 2021 18:04
@bkamins
Member Author

bkamins commented Dec 18, 2021

Thank you!

Successfully merging this pull request may close these issues.

Locate the problem in disallowmissing error
3 participants