The DS 3001 project for CJ Barcelos, Benjamin Sarkis, and Drew Ciccarelli

sarkisben/ds3001

ds3001

Project Overview

For the project, you will work in teams of three or four on a problem of your choosing that is interesting, significant, and relevant to Data Science. You will have great latitude in what you choose to work on, so take advantage of this opportunity to make a big impact!

The primary requirements of the project are:

- Your project code must live on GitHub. We prefer it to be public. However, if you're too scared to share your code with future employers, you can claim a private GitHub account (with a .edu email address).
- Your project must use some non-trivial data. Here are good starting points: (1) http://www.kdnuggets.com/datasets/index.html, (2) https://github.com/caesar0301/awesome-public-datasets, (3) https://snap.stanford.edu/data/index.html, and (4) https://www.kaggle.com/
- Your project must apply some mining or analytics algorithm.
- Your project must use the "data science loop", ultimately leading to a data product or data visualization that can help guide decision making.

Grading Criteria

The course project officially counts for 30% of your final grade.

- [25%] Project proposal: Due April 8 by 11:59pm
- [25%] Checkpoint: Due April 19 by 11:59pm
- [50%] Project workshop: April 30 and May 1 in-class

Project proposal (April 8) [1 to 2 pages (pdf); Post on Canvas]

Each group should post a 1-2 page project proposal as a PDF to Canvas by April 8 at 11:59pm.

You should include:

- The name of your team and the team members
- What is the need? Who wants or benefits?
- What data (or datasets)?
- What is your "data science" toolkit? You should list specific tools / packages you will use.
- Preliminary sketch of what you hope to build

Checkpoint: Exploratory Data Analysis and Data Visualization (April 19) [4 page MAX (PDF); Post on Canvas]

For the project checkpoint, you must have collected a significant portion of the data that your project will ultimately use. You will post a brief summary of your exploratory data analysis and your prototype visualization. Post your MAX 4-page PDF to Canvas.

You should include:

- Summary and descriptive statistics of your data
- Data cleaning steps taken
- Insights
- Initial screenshots
- Sketch of interaction in your final data product
- Next steps

Project Workshop (April 30 and May 1 in-class; Post on Canvas)

On April 30 and May 1 we will hold the Data Science Project Workshop in class. Each team will give a project overview and a demo. All students are required to participate on both days.
