New naming scheme for the missing diagnostics / summary functions #38
Comments
I'd like to add two points to the discussion of names in general (as far as things stand in
Thanks for your comments, @seasmith, good to get another opinion on this. I'm not entirely convinced that reordering the columns makes things easier to understand, although I do like the consistency.
library(naniar)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
# Variant of table_missing_var() with n_vars moved to the first column
table_missing_var_2 <- function(x) {
  table_missing_var(x) %>%
    select(n_vars, n_missing_in_var, percent)
}
table_missing_var(airquality)
#> # A tibble: 3 × 3
#> n_missing_in_var n_vars percent
#> <int> <int> <dbl>
#> 1 0 4 66.66667
#> 2 7 1 16.66667
#> 3 37 1 16.66667
summary_missing_var(airquality)
#> # A tibble: 6 × 3
#> variable n_missing percent
#> <chr> <int> <dbl>
#> 1 Ozone 37 24.183007
#> 2 Solar.R 7 4.575163
#> 3 Wind 0 0.000000
#> 4 Temp 0 0.000000
#> 5 Month 0 0.000000
#> 6 Day 0 0.000000
table_missing_var_2(airquality)
#> # A tibble: 3 × 3
#> n_vars n_missing_in_var percent
#> <int> <int> <dbl>
#> 1 4 0 66.66667
#> 2 1 7 16.66667
#> 3 1 37 16.66667
summary_missing_var(airquality)
#> # A tibble: 6 × 3
#> variable n_missing percent
#> <chr> <int> <dbl>
#> 1 Ozone 37 24.183007
#> 2 Solar.R 7 4.575163
#> 3 Wind 0 0.000000
#> 4 Temp 0 0.000000
#> 5 Month 0 0.000000
#> 6 Day 0 0.000000
Do you have any strong opinion about the renaming I proposed?
Thanks again for your input, much appreciated!
I like the new naming scheme. Ought to help make tab completion a lot easier.
OK great, just prepping the release for this now. I'm going to go ahead and use
Thank you again @seasmith for your input! :)
Currently I'm finding it a bit hard to remember which function gives which summary of the missing data.
I am moving towards the format miss_type_value/fun, because it makes more sense to me when tabbing through functions.
miss_* = I want to explore missing values
miss_case_* = I want to explore missing cases
miss_case_pct = I want to find the percentage of cases containing a missing value
miss_case_summary = I want to find the number / percentage of missings in each case
miss_case_table = I want a tabulation of the number / percentage of cases missing
This is more consistent and easier to reason with. I will not be providing .Deprecated for these functions: naniar is still in its early days, and these functions shouldn't break much analysis code and are easy to fix.
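To make the three miss_case_* descriptions concrete, here is a rough base-R sketch of what each one is described as returning. The sketch_* function names and the output column names are hypothetical, chosen only for illustration; this is not naniar's implementation, and the real functions may differ in their return types and column names.

```r
# Base-R sketch of the three case-level summaries described in this issue.
# All sketch_* names and column names are illustrative assumptions.

# Percentage of cases (rows) that contain at least one missing value
sketch_miss_case_pct <- function(data) {
  mean(!complete.cases(data)) * 100
}

# Number and percentage of missing values in each case (row)
sketch_miss_case_summary <- function(data) {
  n_missing <- unname(rowSums(is.na(data)))
  data.frame(
    case = seq_len(nrow(data)),
    n_missing = n_missing,
    percent = n_missing / ncol(data) * 100
  )
}

# Tabulation: how many cases have 0, 1, 2, ... missing values
sketch_miss_case_table <- function(data) {
  counts <- table(rowSums(is.na(data)))
  data.frame(
    n_missing_in_case = as.integer(names(counts)),
    n_cases = as.integer(counts),
    percent = as.numeric(counts) / nrow(data) * 100
  )
}

pct <- sketch_miss_case_pct(airquality)
case_summary <- sketch_miss_case_summary(airquality)
case_table <- sketch_miss_case_table(airquality)
```

The shared miss_case_ prefix is what groups the three together under tab completion; the suffix (pct, summary, table) then selects the shape of the result: a single number, one row per case, or one row per distinct count of missings.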