petern48/covid_data_pipeline

covid_data_pipeline

Here's the general outline of the project:

  • Gather data from online sources
  • Design the data model for the database and data warehouse
  • Load the raw data files into cloud storage (AWS S3)
  • Preprocess the data and load it into a structured relational database (AWS RDS)
  • Automate extracting, transforming, and loading (ETL) the data into a data warehouse
    • Transformations clean the data, improve data quality, and restructure the tables for analytics in the data warehouse
  • Write unit tests to ensure the data pipeline behaves correctly and reliably
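
The extract, transform, load, and test steps in the outline above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a made-up CSV layout (`date`, `state`, `cases`), uses an in-memory SQLite database as a stand-in for the AWS-hosted database and data warehouse, and every name in it (`RAW_CSV`, `fact_covid_cases`, `run_pipeline`, and so on) is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw file contents. The real project reads raw files from AWS S3;
# an inline string keeps this sketch self-contained.
RAW_CSV = """date,state,cases
2020-03-01,NY,1
2020-03-01,ny,
2020-03-02,NY,5
2020-03-02,NY,5
"""


def extract(raw_text):
    """Parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Drop rows with a missing case count, normalize state codes, remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        cases = (row.get("cases") or "").strip()
        if not cases:
            continue  # missing case count: skip the row
        record = (row["date"].strip(), row["state"].strip().upper(), int(cases))
        if record in seen:
            continue  # exact duplicate: keep only the first occurrence
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(conn, records):
    """Write cleaned records into a (hypothetical) fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_covid_cases ("
        "date TEXT, state TEXT, cases INTEGER, PRIMARY KEY (date, state))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fact_covid_cases VALUES (?, ?, ?)", records
    )
    conn.commit()


def run_pipeline(conn, raw_text):
    """Run extract -> transform -> load and return the number of rows loaded."""
    records = transform(extract(raw_text))
    load(conn, records)
    return len(records)


def test_transform_drops_missing_and_duplicate_rows():
    """Pytest-style unit test for the transform step."""
    cleaned = transform(extract(RAW_CSV))
    assert cleaned == [("2020-03-01", "NY", 1), ("2020-03-02", "NY", 5)]
```

Keeping each stage a small function with plain inputs and outputs lets the transform logic be unit tested without any cloud resources. In a real deployment, the SQLite connection would be replaced by a connection to the actual database or warehouse.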

About

A data pipeline for analyzing COVID-19 and stock trends
