This repository includes the datasets that were used throughout my biostatistics course.

MarwaTawfikBadawy/Biostatistics

Biostatistics For Public Health

This course provided a foundational understanding of biostatistical methods and their applications in public health and medical research. It covered key statistical concepts, data analysis techniques, and interpretation of results to support evidence-based decision-making in healthcare, with a focus on statistical analyses using RStudio.

Here's the R Markdown for analyzing the dataset that describes the relationship between mothers' smoking and their babies' health conditions.


library(tidyverse) 

Download the Babies dataset (Babies.rda) from https://goo.gl/7NrADr and save it in your working directory, since the next line loads it by its bare file name.

load("Babies.rda")
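Since load() restores objects under the names they were saved with, it can help to check what actually arrived in the workspace. The sketch below illustrates this with a tiny made-up data frame (toy_babies and its values are hypothetical, not the real Babies data) written to a temporary file.

```r
# Hypothetical stand-in for the real Babies data (values are made up)
toy_babies <- data.frame(smoke = c("never", "now"), weight = c(120, 113))

# Save it to a temporary .rda file, then remove it from the workspace
rda_path <- tempfile(fileext = ".rda")
save(toy_babies, file = rda_path)
rm(toy_babies)

# load() restores objects under their saved names and
# (invisibly) returns a character vector of those names
restored <- load(rda_path)
print(restored)

# The restored object is available again under its original name
str(toy_babies)
```

Capturing the return value of load("Babies.rda") in the same way would confirm the name of the object that the later glimpse() and ggplot() calls rely on.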

Let’s take a glimpse of the dataset

glimpse(Babies)

The first few rows of the dataset

head(Babies)

Mom’s smoking and baby’s weight

ggplot(Babies) +
  geom_boxplot(aes(x = smoke, y = weight))

Figure 1: boxplots of baby’s weight by mom’s smoking status.

Mom’s smoking and baby’s weight with reordered x-axis

ggplot(Babies) +
  geom_boxplot(aes(x = reorder(smoke, weight, FUN = median), y = weight))

Figure 2: boxplots of baby’s weight by mom’s smoking status, with the groups ordered by median weight.
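The reordered plot relies on reorder(), which re-levels a categorical variable by a summary of a second variable. A minimal base R sketch of that behaviour follows; the group labels and weights are hypothetical and are not taken from the Babies data.

```r
# Hypothetical groups and weights, for illustration only
toy <- data.frame(
  smoke  = factor(c("never", "never", "now", "now",
                    "until pregnant", "until pregnant")),
  weight = c(125, 120, 110, 112, 126, 124)
)

# By default, factor levels are sorted alphabetically
default_levels <- levels(toy$smoke)

# Median weight within each group
group_medians <- tapply(toy$weight, toy$smoke, median)

# reorder() sorts the levels by the per-group value of FUN (here the median)
reordered <- reorder(toy$smoke, toy$weight, FUN = median)
reordered_levels <- levels(reordered)

print(default_levels)
print(group_medians)
print(reordered_levels)
```

Because ggplot2 draws a discrete axis in level order, the boxes then appear from the lowest median to the highest.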

Mom’s race and baby’s weight

ggplot(Babies) +
  geom_boxplot(aes(x = reorder(mom.race, weight, FUN = median), y = weight))

Dad’s race and baby’s weight

ggplot(Babies) +
  geom_boxplot(aes(x = reorder(dad.race, weight, FUN = median), y = weight))

The box length is the interquartile range (IQR), which indicates the sample variability, and the line across the box marks the median, showing where the sample is centred. The position of the box between its whiskers and the position of the line within the box also tell us whether the sample is symmetric or skewed, either to the right or to the left.
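The quantities a boxplot displays can also be computed directly. The sketch below uses a small made-up vector (not values from the Babies data) and the common 1.5 × IQR rule for flagging points beyond the whiskers; it is meant as an illustration of the idea rather than a reproduction of any plot in this document.

```r
# Made-up sample, for illustration only
x <- c(100, 110, 115, 120, 125, 130, 160)

# Centre of the sample: the line drawn inside the box
med <- median(x)

# Box edges (first and third quartiles) and box length (IQR)
quartiles <- unname(quantile(x, c(0.25, 0.75)))
iqr <- IQR(x)

# Points more than 1.5 * IQR beyond the box are drawn individually
lower_fence <- quartiles[1] - 1.5 * iqr
upper_fence <- quartiles[2] + 1.5 * iqr
outliers <- x[x < lower_fence | x > upper_fence]

# A mean above the median is one hint of right skew
right_skew_hint <- mean(x) > med

print(c(median = med, Q1 = quartiles[1], Q3 = quartiles[2], IQR = iqr))
print(outliers)
print(right_skew_hint)
```

In this toy sample the single large value pulls the mean above the median, which matches the longer upper tail a boxplot of it would show.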

Mom’s race and baby’s weight and dad’s race

ggplot(Babies) +
  geom_boxplot(aes(x = reorder(mom.race, weight, FUN = median), y = weight)) +
  geom_jitter(aes(x = reorder(mom.race, weight, FUN = median), y = weight, color = dad.race), alpha = 0.5)

Mom’s height and mom’s weight

ggplot(Babies) +
  geom_point(aes(x = mom.height, y = mom.weight, color = mom.race), alpha = 0.5) +
  geom_smooth(aes(x = mom.height, y = mom.weight), method = "lm")

model <- lm(mom.weight ~ mom.height, data = Babies)
summary(model)
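Beyond printing summary(model), the fitted line can be inspected programmatically. The sketch below fits the same kind of simple linear regression to a tiny made-up data set (the heights and weights are hypothetical, with no units implied) and pulls out the coefficients, R-squared, and a prediction.

```r
# Made-up heights and weights, for illustration only
toy <- data.frame(
  height = c(60, 62, 64, 66, 68),
  weight = c(110, 118, 131, 139, 152)
)

# Simple linear regression of weight on height
fit <- lm(weight ~ height, data = toy)

# Intercept and slope of the fitted line
coefs <- coef(fit)

# Share of the variance in weight explained by height
r_squared <- summary(fit)$r.squared

# Predicted weight for a new height value
pred <- predict(fit, newdata = data.frame(height = 65))

print(coefs)
print(r_squared)
print(pred)
```

The slope is the expected change in weight per one-unit increase in height, and it is the same line that geom_smooth(method = "lm") draws through the scatterplot.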

Dad’s height and dad’s weight

ggplot(Babies) +
  geom_point(aes(x = dad.height, y = dad.weight, color = dad.race), alpha = 0.5) +
  geom_smooth(aes(x = dad.height, y = dad.weight), method = "lm")
model <- lm(dad.weight ~ dad.height, data = Babies)
summary(model)

Mom’s weight and dad’s weight

ggplot(Babies) +
  geom_point(aes(x = mom.weight, y = dad.weight, color = mom.race, shape = dad.race), alpha = 0.5)
model <- lm(dad.weight ~ mom.weight, data = Babies)
summary(model)

Mom’s smoking and mom’s education

ggplot(Babies) +
  geom_bar(aes(x = smoke, fill = mom.edu), position = "fill")
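With position = "fill", each bar is rescaled to a height of 1, so the coloured segments show proportions within each smoking group rather than counts. The same proportions can be tabulated in base R, as sketched below with hypothetical category labels and counts (not taken from the Babies data).

```r
# Hypothetical categories, for illustration only
toy <- data.frame(
  smoke   = c("never", "never", "never", "never", "now", "now"),
  mom.edu = c("college", "college", "college", "high school",
              "college", "high school")
)

# Counts of education level within each smoking group
counts <- table(toy$smoke, toy$mom.edu)

# margin = 1 divides each row by its total, so every row sums to 1,
# matching what a filled bar shows for one x-axis group
shares <- prop.table(counts, margin = 1)

print(counts)
print(shares)
```

Reading the table row by row corresponds to reading the filled bar chart bar by bar.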

Mom’s smoking and the family’s income

ggplot(Babies) +
  geom_bar(aes(x = smoke, fill = income), position = "fill")

Mom’s race and mom’s weight

ggplot(Babies %>% na.omit()) +
  geom_boxplot(aes(x = reorder(mom.race, mom.weight, FUN = median), y = mom.weight))

Dad’s race and dad’s weight

ggplot(Babies %>% na.omit()) +
  geom_boxplot(aes(x = reorder(dad.race, dad.weight, FUN = median), y = dad.weight))
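The last two plots pass the whole data frame through na.omit(), which drops every row that has a missing value in any column, not only in the two columns being plotted. Whether this matters for Babies depends on where its missing values fall, which is not shown here. The sketch below contrasts the two approaches on a small made-up data frame (all values hypothetical).

```r
# Made-up rows with missing values in different columns
toy <- data.frame(
  mom.race   = c("group A", "group B", "group A", "group B"),
  mom.weight = c(120, NA, 130, 125),
  income     = c(NA, "low", "high", NA)
)

# na.omit() keeps only rows that are complete in every column
complete_everywhere <- na.omit(toy)

# Keeping rows that are complete in just the plotted columns
plotted_cols <- c("mom.race", "mom.weight")
complete_for_plot <- toy[complete.cases(toy[, plotted_cols]), ]

print(nrow(complete_everywhere))
print(nrow(complete_for_plot))
```

Restricting the check to the plotted columns keeps observations that are only missing unrelated variables such as income.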
