nhanesA: R package for browsing and retrieving NHANES data

cjendres1/nhanes

nhanesA

nhanesA is an R package for browsing and retrieving data from the National Health and Nutrition Examination Survey (NHANES). It is designed for research and instructional use.

The functions in the nhanesA package allow fully customizable selection and import of data directly from the NHANES website, so an active network connection is required.

Install from CRAN

install.packages("nhanesA")

Install from the development repository

install.packages("devtools")
devtools::install_github("cjendres1/nhanes")
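
For scripts that are run repeatedly, it can be convenient to install the package only when it is missing. The following is a minimal sketch using base R only; the helper name `ensure_package` and its arguments are hypothetical and not part of nhanesA.

```r
# Hypothetical helper (not part of nhanesA): install a package from CRAN
# only if it is not already available. Returns (invisibly) TRUE if the
# package can be loaded afterwards, FALSE otherwise.
ensure_package <- function(pkg, install = TRUE,
                           repos = "https://cloud.r-project.org") {
  available <- requireNamespace(pkg, quietly = TRUE)
  if (!available && install) {
    # Needs a network connection; the CRAN mirror is an assumption.
    install.packages(pkg, repos = repos)
    available <- requireNamespace(pkg, quietly = TRUE)
  }
  invisible(available)
}

# Usage (not run here, since it may trigger a download):
# ensure_package("nhanesA")
# library(nhanesA)
```

Passing `install = FALSE` turns the helper into a pure availability check, which may be useful in environments where installation is not permitted.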

Use nhanesA in Docker

It is also possible to use the nhanesA package with a suitably configured SQL database that contains a snapshot of the data publicly available at the NHANES website. This eliminates the time spent downloading data and makes operations faster, at the cost of pre-downloading all the data locally. The simplest way to do this is to run a Docker image as described here.

The Docker container includes the data in a SQL database, allowing faster access and manipulation directly from the local Docker environment. The differences between using standard nhanesA and using it inside Docker are summarized below:

Standard nhanesA:

  • When used outside of Docker, the nhanesA functions scrape data directly from the CDC website each time they are invoked.

  • The advantage is simplicity; users only need to install the nhanesA package without any additional setup.

  • However, the response time depends on internet speed and the size of the requested data.

Docker-enhanced nhanesA:

  • The Docker container locally hosts most of the NHANES data, allowing for significantly faster data access and manipulation.

  • Initial setup requires Docker installation and downloading the Docker image.

  • Some data, such as the youth survey, are not present in the Docker database and would need to be fetched from the CDC website.

In essence, the Docker-enhanced version offers fast access to most of the data, and fetches datasets not present in its database in the standard nhanesA manner.

To use the Docker-enhanced version of nhanesA, first install Docker, then run a command similar to the following.

docker run --rm -d -p 8787:8787 -e 'CONTAINER_USER_USERNAME=<USER>' -e 'CONTAINER_USER_PASSWORD=<PASSWORD>' deepayansarkar/nhanes-postgresql:<VERSION>

This command downloads the (fairly large) Docker image the first time it is run. More details, such as the latest version and other useful options, can be found here. Once the command runs successfully, one can log into RStudio Server running inside the Docker container via http://localhost:8787 using the username and password set in the command above. The nhanesA package is already installed and configured to use the database.

Working with nhanesA
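
As an illustration of browsing and retrieving data, a typical session might first list the tables available for a survey cycle and then import one of them. The sketch below assumes the `nhanesTables()` and `nhanes()` functions exported by the package; the wrapper name `explore_demographics`, the survey year, and the table name `DEMO_G` are chosen only for the example.

```r
# A minimal sketch of a browse-then-retrieve session. Requires the nhanesA
# package and, outside Docker, a network connection to the NHANES website.
# The wrapper function is hypothetical; only nhanesTables() and nhanes()
# come from the package.
explore_demographics <- function(year = 2011, table = "DEMO_G") {
  # List the demographics tables available for the given survey year.
  tables <- nhanesA::nhanesTables("DEMO", year)
  # Import one table as a data frame.
  demo <- nhanesA::nhanes(table)
  list(tables = tables, data = demo)
}

# Example call (not run automatically, since it downloads data):
# result <- explore_demographics()
# head(result$tables)
# dim(result$data)
```

Wrapping the calls in a function keeps the download from happening when the script is merely sourced; calling the function directly works the same way in the standard and Docker-enhanced setups described above.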

