pullz6/ELT_Pipeline_Airflow

Creating an ELT pipeline in Airflow

This project creates an ELT pipeline in Airflow for maintaining supply chain data. You can use the DAGs inside the dag folder to understand how to implement full and incremental data loads, how to connect to Postgres, how to insert data from a CSV file by creating a pandas DataFrame, and how to retrieve data from the Postgres server.

Prerequisites

Please ensure that the following are installed and set up:

  1. Docker
  2. Docker Compose
  3. Postgres
  4. Postgres Server, Database and a Table

Installation

You can pull the Airflow Docker image by following the instructions in the Apache Airflow documentation: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html

Explanation

This project covers two kinds of data load: a full load and an incremental load.

Full Load

A full load writes an entire dataset into the database. For example, when you first create the pipeline, you might want to add historical data to the server. The file Full_load.py performs a full load.
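As a rough illustration of the idea (not the contents of Full_load.py), a full load can be sketched as "read every row of the CSV, insert every row". The table name `orders`, its columns, and the function `full_load` are hypothetical. The sketch uses the standard-library `csv` module instead of pandas to stay dependency-free, and it accepts any DB-API connection. The `placeholder` argument is an assumption made so the same code runs against psycopg2 (`%s`) or a local SQLite stand-in (`?`).

```python
import csv
import io


def full_load(conn, csv_file, table="orders", placeholder="%s"):
    """Insert every row of a CSV file into `table` and return the row count.

    `table` and the CSV header are interpolated into the SQL text, so they
    must come from a trusted source. Only the values are bound as parameters.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames
    rows = [tuple(record[c] for c in columns) for record in reader]

    column_list = ", ".join(columns)
    value_list = ", ".join([placeholder] * len(columns))
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({value_list})"

    cur = conn.cursor()
    cur.executemany(sql, rows)  # one INSERT per CSV row
    conn.commit()
    return len(rows)


# Hypothetical sample data, standing in for a CSV file on disk.
SAMPLE_CSV = "order_id,product,quantity\n1,widget,5\n2,gadget,3\n"
sample_file = io.StringIO(SAMPLE_CSV)
```

With a psycopg2 connection you would call `full_load(conn, open("orders.csv", newline=""))`, where the file name is again only an example.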

Incremental Load

An incremental load writes only part of a dataset into the database. For example, when data keeps being collected after an initial load, you upload only the newly added data. The file Incremental_load.py performs an incremental load.
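This README does not say how Incremental_load.py decides which rows are new. One common approach, shown here purely as an assumption, is a "watermark": remember the latest date already loaded and keep only rows after it. The column name `order_date` and both function names are hypothetical.

```python
from datetime import date


def select_new_rows(rows, last_loaded):
    """Return only the rows whose `order_date` is later than the watermark."""
    return [row for row in rows if row["order_date"] > last_loaded]


def next_watermark(rows, last_loaded):
    """Return the latest `order_date` seen, or the old watermark if no rows."""
    return max((row["order_date"] for row in rows), default=last_loaded)


# Hypothetical source data: two rows were loaded earlier, one is new.
source_rows = [
    {"order_id": 1, "order_date": date(2024, 10, 10)},
    {"order_id": 2, "order_date": date(2024, 10, 11)},
    {"order_id": 3, "order_date": date(2024, 10, 12)},
]
new_rows = select_new_rows(source_rows, last_loaded=date(2024, 10, 11))
```

Only `new_rows` would then be inserted, and the watermark would be advanced with `next_watermark` once the insert succeeds.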

The first screenshot shows the final execution of the incremental load: the final run is successful, and the event logs confirm that the row-by-row inserts from the DataFrame are complete. As a result, the data has been loaded into the Postgres table, as shown in the second screenshot.

image

Screenshot 2024-10-12 at 15 29 35
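The row-by-row inserts and the final check described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the DAG: the `orders` table, its two columns, and the function name are hypothetical, and `placeholder` is `%s` for psycopg2 or `?` for a SQLite stand-in. With pandas, the `rows` argument could come from `df.itertuples(index=False, name=None)`.

```python
import logging

logger = logging.getLogger(__name__)


def insert_rows_one_by_one(conn, rows, placeholder="%s"):
    """Insert each (order_id, product) tuple separately, logging every insert.

    Returns the number of rows in the table afterwards, so the caller can
    confirm that the load actually reached the database.
    """
    sql = (
        "INSERT INTO orders (order_id, product) "
        f"VALUES ({placeholder}, {placeholder})"
    )
    cur = conn.cursor()
    for order_id, product in rows:
        cur.execute(sql, (order_id, product))
        logger.info("Inserted order_id=%s", order_id)
    conn.commit()

    # Read the data back to verify the load.
    cur.execute("SELECT COUNT(*) FROM orders")
    return cur.fetchone()[0]
```

Inserting one row at a time is slower than a bulk insert, but it produces one log line per row, which matches the kind of event log shown in the first screenshot.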

IT IS IMPORTANT TO NOTE THAT YOU CANNOT RUN THE LOADS FOR THE SAME SET OF DATA SEVERAL TIMES, SINCE INSERTING ROWS THAT ALREADY EXIST WILL CAUSE A PRIMARY KEY ERROR!
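If you want a load that can safely be re-run, one option (not used in this repository, as far as this README states) is to let the database skip rows whose key already exists. Postgres 9.5+ supports `INSERT ... ON CONFLICT (...) DO NOTHING`, and so does SQLite 3.24+. The sketch below assumes a hypothetical `orders` table with primary key `order_id`. `placeholder` is `%s` for psycopg2 or `?` for SQLite.

```python
def insert_skipping_duplicates(conn, rows, placeholder="%s"):
    """Insert (order_id, product) tuples, ignoring keys that already exist.

    Returns how many rows were actually added to the table.
    """
    sql = (
        "INSERT INTO orders (order_id, product) "
        f"VALUES ({placeholder}, {placeholder}) "
        "ON CONFLICT (order_id) DO NOTHING"
    )
    cur = conn.cursor()

    cur.execute("SELECT COUNT(*) FROM orders")
    before = cur.fetchone()[0]

    cur.executemany(sql, rows)  # duplicates are skipped instead of raising
    conn.commit()

    cur.execute("SELECT COUNT(*) FROM orders")
    after = cur.fetchone()[0]
    return after - before
```

Note that `DO NOTHING` silently keeps the existing row, so it suits re-runs of identical data but will not pick up corrections to rows that were already loaded.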
