Generate COVID-19 Reports for the UK and Move Data from PostgreSQL to MongoDB using Airflow and Docker

This repository contains a Docker Compose setup for Airflow, PostgreSQL, pgAdmin, JupyterLab, and MongoDB, along with Airflow DAGs and notebooks that generate COVID-19 reports for the UK and move data from PostgreSQL to MongoDB.

Prerequisites

  • Docker and Docker Compose installed

Installation

Clone the repository and navigate to the docker-airflow-assigment-two directory,
then run Docker Compose:

docker-compose up -d

Files

  • The DAGs are located in the ./notebooks/src directory
  • The generated reports are located in the ./notebooks/output directory
  • The Jupyter notebooks are located in the ./notebooks directory

Usage

Generate COVID-19 Reports for the UK

  • Using pgAdmin, create the Covid_DB database
  • In Airflow, open the covid_data DAG
  • Trigger the DAG manually
  • The following outputs will be generated: uk_scoring_report.png, uk_scoring_report.csv, and uk_scoring_report_NotScaled.csv
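The repository description does not show the scoring logic behind the report files, but the scaled/not-scaled CSV pair suggests a simple transform step. A minimal pandas sketch, assuming hypothetical JHU-style wide input (dates as columns) and min-max scaling as a stand-in for the real scoring:

```python
import io
import pandas as pd

# Hypothetical sample in the JHU-style wide layout (dates as columns).
CSV = """Country/Region,1/22/20,1/23/20,1/24/20
United Kingdom,0,5,12
France,2,3,4
"""

def build_uk_report(csv_text: str) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Return (scaled, not_scaled) report frames for the UK."""
    df = pd.read_csv(io.StringIO(csv_text))
    uk = df[df["Country/Region"] == "United Kingdom"]
    # Melt the date columns into rows: one (date, cases) pair per row.
    long = uk.melt(id_vars="Country/Region", var_name="date", value_name="cases")
    not_scaled = long[["date", "cases"]].reset_index(drop=True)
    scaled = not_scaled.copy()
    # Min-max scaling: a guess at what separates the two CSV outputs.
    rng = scaled["cases"].max() - scaled["cases"].min()
    scaled["cases"] = (scaled["cases"] - scaled["cases"].min()) / rng
    return scaled, not_scaled

if __name__ == "__main__":
    scaled, not_scaled = build_uk_report(CSV)
    # The DAG would then write these under ./notebooks/output, e.g.:
    # scaled.to_csv("uk_scoring_report.csv", index=False)
    # not_scaled.to_csv("uk_scoring_report_NotScaled.csv", index=False)
```

The column names, scaling method, and `build_uk_report` helper are illustrative assumptions, not the project's actual code.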

Move data from Postgres to MongoDB

  • Using pgAdmin, create the Faker_DB database
  • Create a table with any data
  • Trigger the DAG manually
  • The output will be a collection in MongoDB containing the data

Information

The description of each process in both workflows is as follows.

Generate COVID-19 Reports for the UK

The following operations are used to achieve this:

  • Get_uk_data loads the data from the Johns Hopkins University dataset on GitHub, cleans and filters it, and stores it in PostgreSQL
  • report_data generates the results of the process and writes them to the ./notebooks/output directory
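The Get_uk_data step follows a common pandas extract-clean-load pattern. A hedged sketch of that task as a plain callable, using an in-memory SQLite database in place of the project's PostgreSQL instance and hypothetical column names:

```python
import io
import sqlite3
import pandas as pd

# Hypothetical sample standing in for the Johns Hopkins CSV.
RAW = """Country/Region,1/22/20,1/23/20
United Kingdom,0,5
Italy,0,2
"""

def get_uk_data(csv_text: str, conn) -> None:
    """Clean and filter the raw data down to the UK, then store it in SQL."""
    df = pd.read_csv(io.StringIO(csv_text))
    df = df.dropna()                                   # clean
    uk = df[df["Country/Region"] == "United Kingdom"]  # filter
    uk.to_sql("uk_covid", conn, index=False, if_exists="replace")

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL (Covid_DB)
    get_uk_data(RAW, conn)
```

In the real DAG this callable would be wired into Airflow (e.g. as a PythonOperator task) and pointed at the Covid_DB connection; the table name `uk_covid` here is an assumption.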

Move Data from PostgreSQL to MongoDB

The following operation is used to achieve this:

  • extract_load extracts the data from PostgreSQL using pandas and dumps it into MongoDB
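The extract_load step boils down to reading rows into a DataFrame and converting them to the list-of-dicts shape MongoDB stores. A minimal sketch, with an in-memory SQLite table standing in for the Faker_DB table and the MongoDB write shown only as a comment (pymongo and a running service assumed):

```python
import sqlite3
import pandas as pd

def extract(conn, table: str) -> pd.DataFrame:
    """Read the whole table into a DataFrame (PostgreSQL in the real DAG)."""
    return pd.read_sql(f"SELECT * FROM {table}", conn)

def to_documents(df: pd.DataFrame) -> list[dict]:
    """Convert rows to the list-of-dicts shape that insert_many expects."""
    return df.to_dict(orient="records")

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for Faker_DB in PostgreSQL
    conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    conn.execute("INSERT INTO people VALUES ('Ada', 36), ('Alan', 41)")
    docs = to_documents(extract(conn, "people"))
    # With a running MongoDB, the load step would look like:
    # from pymongo import MongoClient
    # MongoClient("mongodb://localhost:27017").mydb.people.insert_many(docs)
```

The table name, columns, and connection string are illustrative; the actual DAG reads whatever table was created in Faker_DB.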