
A basic ETL job written in Python that stores its transformed data in MongoDB.


Rishabh-Hupr/mongoETLpipeline


Bootstrapping instructions

  1. Create a Python virtual environment with python3 -m venv <path> and activate it, then install the dependencies listed in requirements.txt with pip3 install -r requirements.txt
  2. Install MongoDB and run it locally
  3. Open the mongo shell with the command mongosh
  4. Create and switch to a new database with the command use newDB (the database name used in etl.py, line 43)
  5. Create a new collection called 'etl_output' with the command db.createCollection("etl_output")
  6. Confirm that the collection is currently empty with the command db.etl_output.find(), then exit mongosh
  7. Run the script with python3 etl.py
  8. Return to the mongo shell with mongosh, then switch to the database with the command use newDB
  9. Check your merged data with the command db.etl_output.find()
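
The load and verification steps above can also be sketched in Python with pymongo. This is only an illustrative sketch, not the contents of etl.py: the database name newDB and the collection name etl_output come from the steps above, while the connection URI (a default local MongoDB instance), the helper names merge_records, load and count_loaded, and the sample records are all assumptions made for the example.

```python
# Database and collection names are taken from the bootstrapping steps.
DB_NAME = "newDB"
COLLECTION_NAME = "etl_output"
# Assumption: MongoDB is running locally on its default port.
MONGO_URI = "mongodb://localhost:27017"


def merge_records(left, right, key):
    """Hypothetical transform: merge two lists of dicts on a shared key.

    Fields from `right` overwrite fields with the same name in `left`.
    """
    merged = {row[key]: dict(row) for row in left}
    for row in right:
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


def load(records, uri=MONGO_URI):
    """Insert the records into newDB.etl_output and return how many were written."""
    from pymongo import MongoClient  # imported here so the transform works without pymongo

    if not records:
        return 0  # insert_many rejects an empty list
    with MongoClient(uri) as client:
        result = client[DB_NAME][COLLECTION_NAME].insert_many(records)
        return len(result.inserted_ids)


def count_loaded(uri=MONGO_URI):
    """Count the documents in etl_output, similar to inspecting db.etl_output.find()."""
    from pymongo import MongoClient

    with MongoClient(uri) as client:
        return client[DB_NAME][COLLECTION_NAME].count_documents({})


if __name__ == "__main__":
    # Made-up sample data standing in for the real extract step.
    users = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    cities = [{"id": 2, "city": "x"}, {"id": 3, "city": "y"}]
    written = load(merge_records(users, cities, "id"))
    print(f"inserted {written} documents, collection now holds {count_loaded()}")
```

Running the sketch more than once appends the same sample documents again, so the count grows with each run; clearing the collection first (for example with db.etl_output.deleteMany({}) in mongosh) keeps the check in step 9 easy to read.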
