About this project: it is a streaming event pipeline built around Apache Kafka and its ecosystem that simulates train movement. It uses public data from the Chicago Transit Authority.
Prerequisites:
- Docker
- Python 3.7
- A computer with at least 16 GB of RAM and a 4-core CPU to run the simulation
Our architecture looks like this:
[Architecture diagram]
Reference documentation:
- Confluent Python Client Documentation
- Confluent Python Client Usage and Examples
- REST Proxy API Reference
- Kafka Connect JDBC Source Connector Configuration Options
The project consists of two main directories, producers and consumers.
In the following directory layout, the files that the student is responsible for modifying are marked with a * indicator. Instructions for what is required are present as comments in each file.
├── consumers
│   ├── consumer.py
│   ├── faust_stream.py
│   ├── ksql.py
│   ├── models
│   │   ├── lines.py
│   │   ├── line.py
│   │   ├── station.py
│   │   └── weather.py
│   ├── requirements.txt
│   ├── server.py
│   ├── topic_check.py
│   └── templates
│       └── status.html
└── producers
    ├── connector.py
    ├── models
    │   ├── line.py
    │   ├── producer.py
    │   ├── schemas
    │   │   ├── arrival_key.json
    │   │   ├── arrival_value.json
    │   │   ├── turnstile_key.json
    │   │   ├── turnstile_value.json
    │   │   ├── weather_key.json
    │   │   └── weather_value.json
    │   ├── station.py
    │   ├── train.py
    │   ├── turnstile.py
    │   ├── turnstile_hardware.py
    │   └── weather.py *
    ├── requirements.txt
    └── simulation.py
To run the simulation, you must first start up the Kafka ecosystem on your machine using Docker Compose.
%> docker-compose up
Docker Compose will take 3-5 minutes to start, depending on your hardware. Please be patient and wait for the docker-compose logs to slow down or stop before beginning the simulation.
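As an alternative to watching the logs, readiness can be approximated by polling a service port until it accepts connections. The following is a minimal sketch using only the Python standard library. The helper names `port_is_open` and `wait_until_ready` are hypothetical and not part of this project, and the default timeout and interval are arbitrary choices.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_ready(probe, timeout=300.0, interval=5.0):
    """Call probe() repeatedly until it returns True.

    Returns True as soon as the probe succeeds, or False if `timeout`
    seconds elapse first. The probe is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_until_ready(lambda: port_is_open("localhost", 9092))` would wait up to five minutes for the Kafka host port listed in this README. An open port only shows that the container is listening, so a service may still be initializing; treat this as a rough check rather than a guarantee.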
Once docker-compose is ready, the following services will be available:
Service | Host URL | Docker URL | Username | Password |
---|---|---|---|---|
Kafka | PLAINTEXT://localhost:9092 | PLAINTEXT://broker:29092 | | |
REST Proxy | http://localhost:8082 | http://rest-proxy:8082/ | | |
Schema Registry | http://localhost:8081 | http://schema-registry:8081/ | | |
Kafka Connect | http://localhost:8083 | http://connect:8083 | | |
KSQL | http://localhost:8088 | http://ksqldb-server:8088 | | |
PostgreSQL | jdbc:postgresql://localhost:5432/cta | jdbc:postgresql://postgres:5432/cta | cta_admin | chicago |
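To illustrate how the Host URL and Docker URL columns differ in practice, here is a hedged sketch that builds a Kafka Connect request for a JDBC source connector. The script runs on the host, so it targets the Kafka Connect Host URL, while the connector itself runs inside Docker and therefore uses the PostgreSQL Docker URL and the credentials from the table. The function name `build_connector_request` and its parameters are hypothetical; the table, column, and topic prefix must match your own database and are not taken from this project.

```python
import json
import urllib.request

# Host URL for Kafka Connect, from the services table in this README
KAFKA_CONNECT_URL = "http://localhost:8083/connectors"


def build_connector_request(name, table, incrementing_column, topic_prefix):
    """Build (but do not send) a POST request that creates a JDBC source connector."""
    payload = {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            # Kafka Connect runs inside Docker, so it reaches PostgreSQL
            # through the Docker URL rather than localhost.
            "connection.url": "jdbc:postgresql://postgres:5432/cta",
            "connection.user": "cta_admin",
            "connection.password": "chicago",
            # Assumption: the caller supplies a table that has a strictly
            # increasing column, which "incrementing" mode requires.
            "table.whitelist": table,
            "mode": "incrementing",
            "incrementing.column.name": incrementing_column,
            "topic.prefix": topic_prefix,
        },
    }
    return urllib.request.Request(
        KAFKA_CONNECT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A request built this way, for instance with the hypothetical arguments `("stations", "stations", "stop_id", "connect-")`, can be sent with `urllib.request.urlopen(request)` once Kafka Connect is up. Building the request separately from sending it makes the payload easy to inspect before anything is created.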
If you want to change the configuration, you can do so in config.yml.
There are two pieces to the simulation, the producer and the consumer. Start them with the following commands:
make producer
make faust
make consumer
Once the server is running, you may hit Ctrl+C at any time to exit.