feature: Adds imputation and improves missing data summary #29

Merged
merged 1 commit into from
Jul 11, 2024

Owner

@ns-rse ns-rse commented Jul 11, 2024

Closes #28

**IMPORTANT** Currently there is a [bug in the stable release of
Quarto](quarto-dev/quarto-cli#10196) which prevents rendering of the missing data figures. It
is fixed in development version [`v1.6.1`](https://github.com/quarto-dev/quarto-cli/releases/tag/v1.6.1) (currently
available as a pre-release, so if things don't render, upgrade to this version).

+ Uses the [mice](https://amices.org/mice/index.html) package to summarise missing data graphically and undertake three
  different methods of multiple imputation. Functions are defined to aid with plotting the imputed data for
  comparison to the original dataset. Includes notes on tasks that could augment this, such as tabulation.
  This is done in the `sections/_interpolation.qmd` file. Includes a citation for the mice R package.
+ Moves data dictionary to Appendix.
+ Tidies up tables, adding missing captions and removing `print()` calls.
+ Moves tables to [panel-tabset](https://quarto.org/docs/interactive/layout.html#tabset-panel) as the document was getting
  long and cluttered. This makes it shorter and easier to navigate. Also used for the plots that summarise imputation.
+ Introduces caching to the document so that computationally expensive sections of code are not re-run on every render.
+ Some housekeeping, wrapping lines to 120 characters.
+ Moves summary of missing data patterns to `sections/_missing.qmd`.
+ Removes `dark_theme_minimal()` from plot of final lasso.
+ Tidies up `sections/_logistic.qmd` to explicitly use `family = binomial(link = "logit")` (**NB** Previous work ensured
  the `train` data frame is used in all logistic regression rather than the raw `df`, which includes individuals with
  missing `final_pathology`).
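
For reviewers unfamiliar with mice, the imputation workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `sections/_interpolation.qmd`: it uses the `nhanes` example data that ships with mice rather than the project data, and the three methods shown (`pmm`, `norm`, `cart`) are assumptions chosen for the example, since the methods actually used are not listed here.

```r
# Illustrative sketch only: `nhanes` is mice's bundled example dataset,
# standing in for the project data.
library(mice)

# Tabulate the missing-data patterns (plot = TRUE would also draw them)
md.pattern(nhanes, plot = FALSE)

# Hypothetical choice of three imputation methods (an assumption for this example)
methods <- c("pmm", "norm", "cart")

# Run multiple imputation once per method, with five imputed datasets each
imputations <- lapply(methods, function(method) {
  mice(nhanes, m = 5, method = method, seed = 123, printFlag = FALSE)
})
names(imputations) <- methods

# Compare observed and imputed distributions for one variable
densityplot(imputations[["pmm"]], ~ bmi)

# Extract the first completed dataset for downstream use
completed <- complete(imputations[["pmm"]], 1)
```

Keeping the fitted `mids` objects in a named list makes it straightforward to loop over methods when producing comparison plots, for example one tab per method.
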
@ns-rse ns-rse requested a review from mdp21oe July 11, 2024 18:59
@ns-rse ns-rse merged commit 5206872 into mdp21oe/elasticnet Jul 11, 2024
@ns-rse ns-rse deleted the ns-rse/imputation branch July 11, 2024 19:02