maxjerin/PythonETL

PythonETL

Python ETL samples using docker

Environment

  1. Homebrew: manages macOS dependencies (such as pip and mysql).
     http://brew.sh/
  2. pip: manages Python dependencies.
     http://docs.python-guide.org/en/latest/starting/install/osx/
     sudo easy_install pip
  3. Docker: creates containers.
     https://docs.docker.com/docker-for-mac/
     Install and start Docker.
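
Before moving on, it can help to confirm that the tools listed above are actually installed. The following is a minimal sketch, not part of this repo: the function names and the `REQUIRED_TOOLS` list are hypothetical, and it assumes the executables are named `brew`, `pip`, and `docker`.

```python
import shutil

# Assumed executable names for the tools listed under Environment.
REQUIRED_TOOLS = ["brew", "pip", "docker"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report(missing):
    """Build a one-line, human-readable summary of the check."""
    if not missing:
        return "All required tools were found on PATH."
    return "Missing from PATH: " + ", ".join(missing)
```

Calling `print(report(missing_tools()))` prints the summary. Note that this only checks that the executables exist on PATH; it does not confirm that Docker has been started.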

Prerequisites

  1. Install postgres using Homebrew:
     brew install postgres
  2. Install psycopg2 using pip:
     sudo pip install psycopg2
  3. Start the postgres container:
     postgresql_container.sh
  4. Run the individual Python scripts, in this order:
     python create_db.py
     python create_table.py
     python insert_data.py
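
To give a rough idea of what the table and insert steps involve, here is a minimal psycopg2 sketch. It is an illustration under assumptions, not the contents of the scripts in this repo: the `sample` table, its columns, the default credentials, and all function names are hypothetical. Connection settings are read from the standard `PG*` environment variables, falling back to local defaults.

```python
import os


def connection_settings(env=None):
    """Collect connection settings from PG* environment variables."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        "dbname": env.get("PGDATABASE", "postgres"),  # assumed default
        "user": env.get("PGUSER", "postgres"),        # assumed default
        "password": env.get("PGPASSWORD", ""),
    }


def insert_rows(conn, rows):
    """Insert (id, name) tuples into the hypothetical `sample` table."""
    with conn.cursor() as cur:
        # %s placeholders let the driver handle quoting of the values.
        cur.executemany("INSERT INTO sample (id, name) VALUES (%s, %s)", rows)
    conn.commit()


def main():
    # Imported here so the helpers above can be used without psycopg2 installed.
    import psycopg2

    conn = psycopg2.connect(**connection_settings())
    try:
        with conn.cursor() as cur:
            cur.execute(
                "CREATE TABLE IF NOT EXISTS sample "
                "(id INTEGER PRIMARY KEY, name TEXT)"
            )
        insert_rows(conn, [(1, "alpha"), (2, "beta")])
    finally:
        conn.close()
```

`main()` is meant to be called only once the postgres container is running. Creating the database itself is left out of this sketch; with psycopg2, `CREATE DATABASE` has to run on a connection with `autocommit` enabled, because it cannot execute inside a transaction.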

Guidelines for working on this Repo

  1. Always work off a new branch.
  2. Create pull requests against the "dev" branch.

Tasks

  1. Create 2 sample data files with 10 records.
     These two files will be loaded into the database.
  2. Create Python scripts to generate a large amount of data (do not upload large data files to GitHub).
  3. Create scripts to install and start mysql.
  4. Create scripts to create a database and table in mysql.
  5. Create a Python ETL script.
  6. Create a script to automatically run the Python ETL script.
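
The data-generation and ETL tasks above could start from something like the following sketch. It is one possible shape, not an agreed design: the column names in `FIELDS`, the CSV format, and every function name are assumptions made for illustration. Records are generated one at a time, so a large file can be produced locally on demand instead of being committed.

```python
import csv
import random

# Hypothetical columns; the repo does not define a schema for the sample data.
FIELDS = ["id", "name", "amount"]


def generate_records(count, seed=0):
    """Yield `count` fake records one at a time, keeping memory use flat."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    for record_id in range(1, count + 1):
        yield {
            "id": record_id,
            "name": " User_%d " % rng.randint(1, 1000),  # padded on purpose
            "amount": "%.2f" % rng.uniform(0, 500),
        }


def write_csv(path, count, seed=0):
    """Write generated records to `path`; return the number of rows written."""
    written = 0
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for record in generate_records(count, seed):
            writer.writerow(record)
            written += 1
    return written


def extract(path):
    """Extract step: stream rows from the CSV file as dictionaries."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            yield row


def transform(rows):
    """Transform step: cast types and normalise names, ready for loading."""
    for row in rows:
        yield (int(row["id"]), row["name"].strip().lower(), float(row["amount"]))
```

For example, `write_csv("sample_data.csv", 10)` produces a 10-record file, and `transform(extract("sample_data.csv"))` yields tuples that a load step could pass to the database driver. The file name here is only a placeholder.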
