
ds_deploy_gcp

Study and exercise material for deploying data science and machine learning models on top of GCP. All of the content here is exercise material taken from the book Data Science on the Google Cloud Platform by Valliappa Lakshmanan (2017). Some of the code has been modified so that it can be run on a local machine.

For the complete repository, please refer to: https://github.com/GoogleCloudPlatform/data-science-on-gcp

The exercises include:

  1. 01_ingest

    • Ingest data from an external source; here we use BTS flight data.
    • Deploy the ingestion app using Flask.
    • Containerize the app into a Docker container.
  2. 02_streaming

    • Create a streaming transformation with Apache Beam on a local file.
    • Move the local transformation to Google Dataflow and run it there.
    • Simulate streaming data and publish it using Google Pub/Sub.
  3. 03_pyspark

    • Create a simple Bayes model using Apache Spark.
    • Convert the notebook to a .py file from the shell using nbconvert.
    • The code can also be run on Google Dataproc; please check the source's GitHub repository.
  4. 04_sparkml

    • Create a logistic regression model using MLlib from PySpark.
    • Evaluate the model manually.
  5. 05_mlopstf

    • A program to train the model and save it to the local drive.
    • A program that uses the model to make predictions on input data.
    • Both are built with the TensorFlow library.
    • Both programs are already containerized with Docker.
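The 01_ingest step can be sketched as a small Flask app. This is a hypothetical sketch, not the book's exact code: the BTS URL template, route, and helper names are all assumptions, and the real exercise downloads the file rather than just reporting its URL.

```python
# Hypothetical sketch of the 01_ingest app: a Flask endpoint that reports
# (and, in the real exercise, would download) one month of BTS on-time data.
# The URL template and route are illustrative assumptions.

def bts_url(year: int, month: int) -> str:
    # BTS publishes monthly on-time-performance archives; this exact
    # template is an assumption for illustration.
    return ("https://transtats.bts.gov/PREZIP/"
            f"On_Time_Reporting_Carrier_On_Time_Performance_{year}_{month}.zip")

def make_app():
    # Flask is imported lazily so bts_url() stays usable on its own.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/ingest/<int:year>/<int:month>")
    def ingest(year, month):
        # The real app would fetch the archive here and store it;
        # this sketch only reports which URL would be fetched.
        return jsonify({"source": bts_url(year, month)})

    return app
```

Containerizing the app is then a matter of a Dockerfile that installs Flask and starts the server.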
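The 02_streaming transform can be sketched as a plain function that Beam maps over records. The timestamp format and UTC-offset handling below are assumptions; the point is that the same function runs locally with the DirectRunner and unchanged on Google Dataflow.

```python
# Hypothetical sketch of a 02_streaming-style transform: shift a local
# departure timestamp to UTC given the airport's UTC offset in hours.
from datetime import datetime, timedelta

def to_utc(local_ts: str, offset_hours: float) -> str:
    dt = datetime.strptime(local_ts, "%Y-%m-%d %H:%M")
    return (dt - timedelta(hours=offset_hours)).strftime("%Y-%m-%d %H:%M")

def run_local(records):
    # The same function drops into a Beam pipeline; with the default
    # DirectRunner it runs on the local file, and the identical code
    # can later be submitted to Google Dataflow.
    import apache_beam as beam  # imported lazily; only needed here
    with beam.Pipeline() as p:
        (p
         | beam.Create(records)
         | beam.MapTuple(to_utc)
         | beam.Map(print))
```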
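The idea behind the simple Bayes model in 03_pyspark can be shown without Spark: bin flights by departure delay and distance, then estimate the on-time probability per bin. The bin edges and field layout here are illustrative assumptions; on Dataproc the same grouping would be expressed with PySpark operations.

```python
# Hypothetical sketch of a 03_pyspark-style Bayes estimate:
# P(arrives on time | departure-delay bin, distance bin).

def bin_index(value: float, edges: list) -> int:
    # Return the index of the half-open bin that contains value.
    i = 0
    while i < len(edges) and value >= edges[i]:
        i += 1
    return i

def ontime_probability(flights, delay_edges=(10, 20), dist_edges=(500, 1000)):
    # flights: iterable of (dep_delay_minutes, distance_miles, arrived_on_time)
    counts = {}
    for dep_delay, distance, ontime in flights:
        key = (bin_index(dep_delay, list(delay_edges)),
               bin_index(distance, list(dist_edges)))
        total, hits = counts.get(key, (0, 0))
        counts[key] = (total + 1, hits + int(ontime))
    return {k: hits / total for k, (total, hits) in counts.items()}
```

The notebook-to-script step mentioned above is `jupyter nbconvert --to script notebook.ipynb`, run from the shell.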
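The manual evaluation in 04_sparkml amounts to computing confusion-matrix metrics by hand. A sketch, assuming the (label, prediction) pairs have been collected from the MLlib logistic regression model (for example via `collect()` on the predictions):

```python
# Manual evaluation of a binary classifier from (label, prediction) pairs,
# as one would do after collecting MLlib predictions to the driver.

def evaluate(pairs):
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```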
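The two 05_mlopstf programs follow the standard Keras train-and-save / load-and-predict split. This is a minimal sketch under assumptions: the tiny architecture, the feature width, and the save path are all illustrative, not the repository's actual model.

```python
# Hypothetical sketch of the two 05_mlopstf programs: one trains a small
# Keras model and saves it to the local drive, the other reloads it to
# make predictions. Architecture and feature width are assumptions.

N_FEATURES = 4  # hypothetical input width

def train_and_save(x, y, path="saved_model"):
    import tensorflow as tf  # imported lazily; TF is only needed here
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation="relu", input_shape=(N_FEATURES,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    model.fit(x, y, epochs=5, verbose=0)
    model.save(path)  # written to the local drive

def load_and_predict(x, path="saved_model"):
    import tensorflow as tf
    model = tf.keras.models.load_model(path)
    return model.predict(x, verbose=0)
```

Since each program is a self-contained script, containerizing them is a matter of one Dockerfile per program that installs TensorFlow and runs the script.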
