cat_relevel + missing #8

Closed
behinger opened this issue Oct 23, 2023 · 5 comments
behinger commented Oct 23, 2023

cat_relevel doesn't play well with missing values:

@test cat_relevel(CategoricalArray(["A","B",missing]),["B",missing,"A"])|>levels == ["B",missing,"A"]

(This could serve as a unit test for the functionality.)

This results in:

MethodError: no method matching cat_relevel(::CategoricalArrays.CategoricalVector{Union{Missing, String}, UInt32, String, CategoricalArrays.CategoricalValue{String, UInt32}, Missing}, ::Vector{Union{Missing, String}})

Closest candidates are:

cat_relevel(::CategoricalArrays.CategoricalArray, !Matched::Vector{String})

@ TidierCats C:\Users\behinger\.julia\packages\TidierCats\9gKGY\src\TidierCats.jl:26
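For readers unfamiliar with the error above: it comes from method dispatch, not from the data itself. A vector literal containing missing has element type Union{Missing, String}, which does not match an argument annotated as Vector{String}. The following is a minimal sketch of that mechanism only; strict_levels and loose_levels are hypothetical stand-ins, not TidierCats functions, and widening a signature this way says nothing about whether the level handling underneath supports missing.

```julia
# Hypothetical stand-in for a method that only accepts Vector{String}
strict_levels(lvls::Vector{String}) = lvls

# Hypothetical widened signature that also admits missing entries
loose_levels(lvls::AbstractVector{<:Union{String, Missing}}) = lvls

lvls = ["B", missing, "A"]

# The literal is not a Vector{String}, so the strict method cannot match
println(typeof(lvls))                      # Vector{Union{Missing, String}}
println(applicable(strict_levels, lvls))   # false
println(applicable(loose_levels, lvls))    # true

# Plain string vectors still match the widened signature
println(applicable(loose_levels, ["A", "B"]))  # true
```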

Everything else works great so far!! Thanks for the very nice package

drizk1 (Member) commented Oct 24, 2023

Thank you for bringing this up! I'm happy you enjoy the package.

I will work on a fix and get a new version released soon.

behinger (Author) commented

Thanks! It might be as easy as allowing Missing in the type definition, but I didn't check. For now, I have replaced missing with "missing" ;-)

drizk1 (Member) commented Oct 24, 2023

I think so as well. Should have it up by the end of the week

drizk1 (Member) commented Oct 28, 2023

Update: I tried a few different ways and it appears this may not be as straightforward as I originally anticipated, but it will get fixed.

drizk1 (Member) commented Aug 24, 2024

Ok, just a few short weeks later, cat_relevel can now handle missing values.

julia> cx = CategoricalArray(["A","B",missing]);

julia> print(levels(cx, skipmissing = false))
Union{Missing, String}["A", "B", missing]

julia> print(levels(cat_relevel(cx, ["B",missing,"A"]), skipmissing=false))
Union{Missing, String}["B", "A", missing]

To show missing levels, skipmissing=false must be passed to levels, and my understanding from the CategoricalArrays.jl documentation is that missing always gets sorted last.

So I think if you want to put missing somewhere other than last, replacing it with "missing" is probably the easiest way.

The new update will also include cat_replace_missing(cat_array, "string_for_missing_values") as a function. It's a simple wrapper, but it will be available.
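The workaround discussed in this thread (turning missing into the string "missing" so it can sit anywhere in the level order) can be sketched without TidierCats. This is only an illustration, assuming CategoricalArrays.jl is installed; it uses Base's coalesce and CategoricalArrays' levels! rather than cat_replace_missing, whose internals are not shown here.

```julia
using CategoricalArrays

raw = ["A", "B", missing]

# Replace each missing with the placeholder string "missing" (plain Base Julia)
filled = coalesce.(raw, "missing")

# Build the categorical array from the filled data
cx = CategoricalArray(filled)

# "missing" is now an ordinary string level, so it can be placed anywhere
levels!(cx, ["B", "missing", "A"])

println(levels(cx))
```

Note that after this step the placeholder is an ordinary string level, so functions such as ismissing will no longer treat those entries as missing.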

Sorry for the long delay and the imperfect solution.

Please reopen the issue if needed.

drizk1 closed this as completed Aug 25, 2024