drizham/cs190.1x
BerkeleyX: CS190.1x Scalable Machine Learning

This course introduces the statistical and algorithmic principles required to develop scalable machine learning pipelines, and provides hands-on experience using Apache Spark.

Week 1: Lecture 1 provides a course overview and presents core machine learning and mathematical concepts. Lab 1 reviews lambda functions and introduces Python's scientific computing library (NumPy) to manipulate vectors and matrices.

Week 2: Introduction to Apache Spark. Lab 2 includes a hands-on Spark tutorial and an exercise in which you will count the words in all of Shakespeare's plays. Note: Week 2 is identical to Week 2 of BerkeleyX CS100.1x; if you've completed Lab 1 of CS100.1x, you can submit your completed notebook to receive credit for Lab 2 in this course.

Week 3: Linear regression and distributed machine learning principles.

Lecture 3: Topics include linear regression formulation and closed-form solution, distributed machine learning principles (related to computation, storage, and communication), gradient descent, quadratic features, and grid search.

Lab 3: Millionsong Regression Pipeline. Develop an end-to-end linear regression pipeline to predict the release year of a song given a set of audio features. You will implement a gradient descent solver for linear regression, use Spark's machine learning library (MLlib) to train additional models, tune models via grid search, improve accuracy using quadratic features, and visualize various intermediate results to build intuition.
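
To give a feel for the gradient descent solver mentioned for Lab 3, here is a minimal sketch of batch gradient descent for linear regression. It is not the lab's code: the lab works with Spark and NumPy, whereas this version uses plain Python lists so it runs without dependencies. The names `predict`, `gradient_descent`, `alpha`, `num_iters`, and the toy data are all hypothetical, and the constant step size is an assumption chosen for illustration.

```python
# Hypothetical, dependency-free sketch of batch gradient descent for
# linear regression. None of these names come from the course labs.


def predict(weights, features):
    """Return the dot product of a weight vector and a feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def gradient_descent(data, num_iters=2000, alpha=0.5):
    """Fit weights by minimizing mean squared error.

    data is a list of (label, features) pairs. Include a constant 1.0
    in each features list if an intercept term is wanted.
    alpha is a constant step size (an assumption made for this sketch).
    """
    n = len(data)
    d = len(data[0][1])
    weights = [0.0] * d
    for _ in range(num_iters):
        # Accumulate sum over examples of (w . x - y) * x, which is
        # proportional to the gradient of the squared-error objective.
        grad = [0.0] * d
        for label, features in data:
            error = predict(weights, features) - label
            for j in range(d):
                grad[j] += error * features[j]
        # Step against the averaged gradient.
        weights = [w - alpha * g / n for w, g in zip(weights, grad)]
    return weights


# Toy data generated from y = 2 + 3x; the first feature is the intercept.
toy = [(2.0 + 3.0 * x, [1.0, x]) for x in [0.0, 0.25, 0.5, 0.75, 1.0]]
weights = gradient_descent(toy)
print(weights)
```

On this noise-free toy data the learned weights should approach the intercept 2 and slope 3 used to generate it. In a distributed setting, the inner loop over examples is the part that would be spread across workers, with only the summed gradient sent back to the driver.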