This visualization work uses the JavaScript library D3.js and its add-on C3.js. I built it while helping one of my teachers develop a modelling tool (the first version uses logistic regression). The goal is to provide charts that help select which variables to include in a model:
http://christopheduong.github.io/Practicing-Data-Discovery/charts/index.html
This second visualization work uses JavaScript and WebGL. It tries to address a "bigger data" problem: displaying 1 million rows of GPS traces from Uber rides with D3.js is not tractable, so using the graphics card's power should help with this type of issue:
http://christopheduong.github.io/Practicing-Data-Discovery/dataviz/uber.html
Basic data manipulation using the Pandas library to clean and transform datasets in order to study a specific question:
First, let's retrieve some data and build a simple case study around the following question: do macroeconomic factors influence car sales?
Next, we can use our favorite statistical tool, R, to analyze the data and perform a linear regression on the resulting dataset:
http://christopheduong.github.io/Practicing-Data-Discovery/macroeconomics/macroeconomics.html
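To make the regression step concrete, here is a minimal sketch of a one-variable ordinary least squares fit in plain Python. It is not the R code from the linked study: the function names `fit_line` and `r_squared` are hypothetical, and the two series are synthetic toy values chosen only so the result is easy to check by hand.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance of y explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic toy values (NOT real macroeconomic or car sales data).
indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
car_sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(indicator, car_sales)
fit_quality = r_squared(indicator, car_sales, slope, intercept)
print(slope, intercept, fit_quality)
```

With real data, the two lists would come from the cleaned and merged dataset, one value per period. A multi-variable regression like the one R's `lm` performs needs a proper statistics library rather than this hand-rolled version.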
An example of using the LinkedIn API to run a simple analysis of the authors of posts in subscribed discussion groups:
Natural language processing using the NLTK library in Python:
The following are just random exercises:
A naive implementation of sentence ranking and extraction applied to summarization:
An example of how to scale ARIMA model fitting for time series analysis using Spark:
An example of a recommender system: http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/recommendations/recommendations.ipynb
Analysis of US census data: http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/us-census/Explore.ipynb
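For the sentence ranking and extraction exercise listed above, the general idea can be sketched as follows. This is a minimal illustration, not the code from the exercise itself: it scores each sentence by the average frequency of its words and keeps the top-scoring ones. It uses only the standard library instead of NLTK, and the stopword list, the regex-based sentence splitter, and the names `split_sentences`, `tokenize`, and `summarize` are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one (e.g. from NLTK) is much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "on", "for"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words of a sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, n=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text drive the sentence scores.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:n]
    return [sentences[i] for i in sorted(ranked)]


sample = "Cats sleep. Cats chase mice. Dogs bark."
print(summarize(sample, n=2))
```

Averaging the word frequencies is one design choice among several; summing them instead would favor longer sentences, and weighting schemes such as TF-IDF are common alternatives.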