visdat package #87
Comments
I don't know if this is allowed, but I volunteer to review this package.
I've been wanting to give back to rOpenSci since I had a package reviewed.
|
That's fine @seaaan, I'll assign once I get through editor checks. Could you fill out the form at http://ropensci.org/onboarding/? We've started using that to help keep track of reviewers' various expertises. |
@noamross done!
…On Wed, Jan 4, 2017 at 7:13 AM, Noam Ross ***@***.***> wrote:
That's fine @seaaan <https://github.com/seaaan>, I'll assign once I get
through editor checks. Could you fill out the form at http://ropensci.org/
onboarding/? We've started using that to help keep track of reviewers'
various expertises.
|
Editor checks:
Editor comments
Currently seeking additional reviewers. This package fits well into our visualization/reproducibility categories.
Reviewers: @seaaan @batpigandme |
Reviewers assigned: @seaaan @batpigandme |
Hi Everyone, Thank you very much for this, this is great feedback, and a great process. I'm not sure what my expected role here as submitter is but I just wanted to say that I agree with the feedback given, and have been looking for a good excuse to use I was also wondering if I am able to submit this paper to JOSS? I understand if this is too late to do so. |
Fine to submit to JOSS. To do so, check off the JOSS publication options above, and write a |
Oh, and the comments I put above, e.g., testing, can be addressed once all other reviews are in, too. If they were major I would have held up assigning reviewers until they were addressed. |
OK great, just working on the JOSS paper now, will hopefully have it done this evening. Re your comments @noamross
DESCRIPTION file has been updated to reflect this here
Here is my updated output from
Test coverage is now at 99% using vdiffr. I think I have used Thank you again for the feedback, I will let you know when I add the paper.md doc. |
Ack, sorry if I jumped the gun replying to the comments. |
I've also updated the DESCRIPTION file as well. |
Great, @seaaan @batpigandme you can go ahead with your reviews. I've moved the due date to 1/30. |
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing:
|
Thank you very much @seaaan for the review, and for taking the time to cover it in so much detail. I agree with you very much that this is a great process set up by rOpenSci. @noamross and @sckott, is it OK for me to address these comments now, or should I wait for the second reviewer, to avoid doubling up? I'm really looking forward to addressing these changes, I feel like visdat is going to be much better as a result of this process. |
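Thank you @seaaan for the thorough review. @njtierney you can reply here and make changes, but I suggest you use a branch or work locally so that @batpigandme may have a stable version to review. Two quick notes:
- @seaaan, it looks like @njtierney is using the development version of roxygen2, which allows one to write documentation with markdown syntax which is converted to .Rd syntax. This is fine.
- @arfon, can you comment if the paper.md isn't in JOSS format. I believe images are allowed, correct?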
|
I'm just finishing up a different project, so if you want to push changes
in the next day or so, I'm fine with reviewing with revisions made.
…On Mon, Jan 16, 2017 at 9:23 AM, Noam Ross ***@***.***> wrote:
Thank you @seaaan <https://github.com/seaaan> for the thorough review.
@njtierney <https://github.com/njtierney> you can reply here and make
changes, but I suggest you use a branch or work locally so that
@batpigandme <https://github.com/batpigandme> may have a stable version
to review.
Two quick notes:
- @seaaan <https://github.com/seaaan>, it looks like @njtierney
<https://github.com/njtierney> is using the development version of
*roxygen2*, which allows one to write documentation with markdown
syntax which is converted to .Rd syntax. This is fine.
- @arfon <https://github.com/arfon>, can you comment if the paper.md
<https://github.com/njtierney/visdat/blob/master/paper/paper.md> isn't
in JOSS format. I believe images are allowed, correct?
--
Mara
--------------------------------------------------------
Mara Averick | [email protected]
http://www.linkedin.com/in/maraaverick
|
Cool, I didn't know that roxygen2 would do that. That definitely makes it easier. For some reason, however, the markdown syntax didn't get converted to |
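For readers comparing the two documentation styles discussed here, a minimal sketch of the same roxygen block written once with .Rd markup and once with markdown markup. The function names are hypothetical and not part of visdat. The markdown form assumes a roxygen2 version with markdown support switched on; roxygen2 documents a `Roxygen: list(markdown = TRUE)` field in DESCRIPTION for that purpose.

```r
# Hypothetical helpers, used only to contrast the two markup styles.

#' Count missing values (.Rd markup)
#'
#' Returns the number of \code{NA} values in \code{x}.
#' \strong{Note:} works on any atomic vector.
#'
#' @param x A vector.
#' @return An integer count of missing values.
count_missing_rd <- function(x) {
  sum(is.na(x))
}

#' Count missing values (markdown markup)
#'
#' Returns the number of `NA` values in `x`.
#' **Note:** works on any atomic vector.
#'
#' @param x A vector.
#' @return An integer count of missing values.
count_missing_md <- function(x) {
  sum(is.na(x))
}
```

Both blocks should render to equivalent help pages once markdown processing is enabled; without it, the backticks and asterisks in the second block would appear literally in the generated .Rd file.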
Images are ok. I'm not exactly sure what happens to those |
@batpigandme At this stage I probably won't have time to address @seaaan's comments until the weekend, so no need to wait on me :) Re markdown in the Rd files, @noamross and @seaaan, I need to add one more thing to the DESCRIPTION file to get the markdown syntax to work - [ |
@njtierney apologies — I meant to synchronize my PR and review posting, but if you ignore the PR for a few more hours, you can pretend I did it at the same time |
The changes look good after a quick look. I should be able to respond fully
to the modifications this weekend.
…On Wed, Jul 5, 2017 at 3:22 PM, Noam Ross ***@***.***> wrote:
Thanks for the comprehensive update, @njtierney
<https://github.com/njtierney>.
|
Summary
I respond below to a few specific things, but in general for the sake of brevity I didn't respond when I agreed with your changes. So mentally insert "awesome change! you rock!" everywhere that I didn't respond :) The two points of substance I have, as you'll see, are:
Details
Did you mean to export guess_type()?
Formatting: links, code, and other formatting need to be done with .Rd syntax. For example, for code, use \code{}, not backticks. For bold, use \strong{} instead of asterisks.
The paper doesn't have any references in the "References" section, but does have inline references.
I agree that the rows should be in the order that you see the dataframes. I would also prefer for the column names to be on top of the visualisation
This
Answers to your questions
I agree with @batpigandme's answers to your questions. Q. 3: I agree with Mara. Please do implement the warning and save me from myself: My dinky computer runs out of memory and dies trying to display large data frames. Great job on the changes! |
Thanks for continued great progress, everyone! Let me just chime in to say that it's fine to defer a more efficient |
@batpigandme Thank you again for your comments! :) Q. 2: I prefer the visual analogue to the data frame, but, again that's totally stylistic.
|
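@seaaan Thank you again for your comments! :D Likewise, please insert "Awesome! Thank you!" For each of these :)
Re guess_type - I think that this function, whilst kinda useful, doesn't really belong to the user, so I think the move to make it internal is better, and makes the purpose of the package, pre-exploratory visualisation, clearer.
Re x axis label of "variables in dataset" - I agree, this has been changed in a recent push to the cran branch, great point!
Re vis_compare, I need to think about this some more, for the moment, perhaps we can label this as an upcoming feature?
Re vectorized guess, perhaps we can post an issue on readr and ask the tidyverse team?
@noamross I have just added the warn_large_data param for both vis_dat and vis_miss - which you can see here
Just to clarify - I am working on getting a cut down, preliminary version of visdat onto CRAN that just has vis_dat and vis_miss, then we can go from there, does that work for everyone? |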
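To make the large-data warning requested above concrete, here is a rough sketch of how a guard controlled by a `warn_large_data` argument might work. This is not the visdat implementation: the helper name `guard_large_data`, the `max_cells` argument, and its default threshold are all assumptions made for illustration.

```r
# Hypothetical guard: refuse to plot very large data frames unless the
# caller opts out with warn_large_data = FALSE.
# `max_cells` and its default are assumed values, not taken from visdat.
guard_large_data <- function(x, warn_large_data = TRUE, max_cells = 100000) {
  stopifnot(is.data.frame(x))

  # Convert to double before multiplying to avoid integer overflow.
  n_cells <- as.numeric(nrow(x)) * ncol(x)

  if (warn_large_data && n_cells > max_cells) {
    stop(
      "Data has ", format(n_cells, big.mark = ",", scientific = FALSE),
      " cells, which may be slow to plot or exhaust memory. ",
      "Set warn_large_data = FALSE to plot anyway.",
      call. = FALSE
    )
  }

  # Return the input invisibly so the guard can sit at the top of a
  # plotting function without changing its result.
  invisible(x)
}
```

A plotting function could call this helper first and only build the plot if it returns. Stopping rather than warning is a design choice in this sketch: it prevents the expensive plot from being attempted at all, which matches the reviewer's concern about running out of memory.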
All that sounds great to me.
…On Tue, Jul 11, 2017 at 2:50 AM, Nicholas Tierney ***@***.***> wrote:
@seaaan <https://github.com/seaaan> Thank you again for your comments! :D
Likewise, please insert "Awesome! Thank you!" For each of these :)
Re guess_type - I think that this function, whilst kinda useful, doesn't
really belong to the user, so I think the move to make it internal is
better, and makes the purpose of the package, pre-exploratory
visualisation, clearer.
Re x axis label of "variables in dataset" - I agree, this has been
changed in a recent push to the cran branch, great point!
Re vis_compare, I need to think about this some more, for the moment,
perhaps we can label this as an upcoming feature?
Re vectorized guess, perhaps we can post an issue on readr and ask the
tidyverse team?
@noamross <https://github.com/noamross> I have just added the
warn_large_data param for both vis_dat and vis_miss - which you can see
here
<https://github.com/njtierney/visdat/blob/cran-0-1-0/R/vis_miss.R#L45>
Just to clarify - I am working on getting a cut down, preliminary version
of visdat onto CRAN that just has vis_dat and vis_miss, then we can go
from there, does that work for everyone?
|
Hi @seaaan and @batpigandme. @njtierney has informed me that he has addressed all of the above and harmonized the CRAN changes with the master branch of the repo. Please take a look at the current version and see if it addresses any outstanding concerns. If it does, let us know below and also check off any boxes on your original reviews. If not, let us know any details. |
Thanks @noamross ! Important question - how do I best add you all into the DESCRIPTION file? |
We're not done until we're done @njtierney 😉 We are currently working with CRAN to get a |
OK, sure thing, that makes sense! Would you mind if I added you all to the |
Do you mean to be exporting `label_col_missing_pct`? It is exported but the
documentation states that it is internal. Otherwise, looks good to me! I've
been using the package when I get new data sets lately and it's really
helpful for getting a quick overview of problems.
I'd be happy to be included in the thank yous section :)
…On Thu, Jul 27, 2017 at 10:12 PM, Nicholas Tierney ***@***.*** > wrote:
OK, sure thing, that makes sense!
Would you mind if I added you all to the Thank Yous section at the end of
visdat?
|
Oh, thanks for picking up on that! I've just removed it now and added y'all in the Thank yous in the referenced commits above :D Lemme know what you all think. |
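The mismatch caught here, a function that is exported while its documentation calls it internal, comes down to which roxygen tags sit above the function. A minimal sketch with hypothetical function names (not the visdat source): the user-facing function carries `@export`, while the helper is tagged `@keywords internal` and has no `@export`.

```r
#' Percentage of missing values in each column
#'
#' @param x A data frame.
#' @return A named numeric vector of percentages, one per column.
#' @export
pct_missing_by_col <- function(x) {
  stopifnot(is.data.frame(x))
  vapply(x, pct_missing, numeric(1))
}

#' Percentage of missing values in a vector
#'
#' Internal helper: documented for developers but not exported.
#' Using `@noRd` instead would skip generating a help page entirely.
#'
#' @param v A vector.
#' @return A single number between 0 and 100.
#' @keywords internal
pct_missing <- function(v) {
  100 * mean(is.na(v))
}
```

With this layout the helper still gets a help page for developers, but it is left out of the package index and of the NAMESPACE exports.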
Other than a renegade missing s (just submitted PR), it all looks great. @njtierney and |
woohoo! Changes merged, thanks Mara! :D |
Approved! Thanks @njtierney for submitting and @batpigandme and @seaaan for your reviews and all of you hanging on for the long haul! To-dos:
Welcome aboard! I believe you already have plans for a blog post to follow this up. |
OK so I'm ready to create a Zenodo release, but before I go ahead and do that I was wondering what the status is with adding the rOpenSci onboarding folk (editor/reviews) to the DESCRIPTION? |
Reviewers may be added with the annotated |
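As an illustration of crediting reviewers in DESCRIPTION, one possible shape for an `Authors@R` entry uses `person()` with the `"rev"` (reviewer) role, written here as the R expression that the field would contain. The names come from this thread; the roles assigned and the comment text are assumptions, and the exact wording the editors recommended is not shown in the thread.

```r
# Hypothetical Authors@R value, evaluated as ordinary R code.
# A real maintainer ("cre") entry also needs an email address,
# which is omitted here on purpose.
authors <- c(
  person("Nicholas", "Tierney", role = c("aut", "cre")),
  person(
    "Mara", "Averick",
    role = "rev",  # "rev" is the MARC relator code for a reviewer
    comment = "Reviewed the package for rOpenSci"
  )
)

# Pull out the role recorded for the second entry.
reviewer_role <- authors[2]$role
```

Keeping reviewers under a separate role, rather than listing them as contributors, makes clear that they assessed the package and did not necessarily write code for it.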
Alrighty!
I'm not sure what to do about the badges for appveyor and codecov, can I help with that? Working on submitting visdat to JOSS now, I might do this tomorrow morning as it's getting a bit late 😪 |
OK...I think I'm done! Except I submitted to JOSS using the github repo not the Zenodo repo link: https://zenodo.org/record/837274 Oops! |
Summary
visdat visualises R data frames so that you can quickly identify data structure and data types. This makes it easier to "get a look at the data" and visually identify abnormalities with a dataset.
https://github.com/njtierney/visdat
R users who want to explore their data, particularly when they first receive it.
In terms of visualising missing data as a heatmap, there are a few other packages that have worked on this. The mi package used to have a visualisation method for missing data, missing.pattern.plot - however this is no longer present in the latest versions. The Amelia package has missmap, but the default requires some more work to make the final output easier to read.
The VIM package provides visualisations for missing data, for example, the aggr function provides a histogram of the missingness present in each variable.
In terms of visualising the types of data in a dataset, the wakefield package provides the table_heat function for visualising column data types.
But what makes visdat different?
visdat adheres to the principle that R packages should try to do one thing, it is a simple package that specialises in visualisation of data frames. Amelia and mi focus on multiple imputation and missing data methods. VIM focusses on visualising missingness and imputation in data, and the wakefield package focusses on creating random, reproducible data.
The functionality in visualising missing data for these packages is not the main focus, and so I argue that because visdat is purely about visualising dataframes, it gives it greater scope to work on just one thing.
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md with a high-level description.
Detail
R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:
It succeeds, but there are some notes.
It does not yet have an rOpenSci footer image
I have not set up pre-commit hooks to ensure that README.md is always newer than README.Rmd, as I'm not sure what devtools::use_git_hook does?
I have not added #' @noRd to internal functions as I think it is still useful to have them documented, but I can change this if need be.
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
Maëlle Salmon (@masalmon)
Jenny Bryan (@jennybc)
Andrew MacDonald (@aammd)