ChristopheDuong/Practicing-Data-Discovery

Practicing-Data-Discovery

This first visualization project uses JavaScript libraries (D3.js and its add-on C3.js). I built it while helping one of my teachers develop a modelling tool (the first version uses Logistic Regression). The goal is to provide charts that help select the variables to include in a model:

http://christopheduong.github.io/Practicing-Data-Discovery/charts/index.html

This second visualization project uses JavaScript and WebGL. It tries to address the problem of "Bigger Data": displaying 1 million rows of GPS traces from Uber rides in D3.js is not tractable, so using the graphics card's power should help with this type of issue:

http://christopheduong.github.io/Practicing-Data-Discovery/dataviz/uber.html

Basic data manipulation using the Pandas library to clean and transform datasets in order to study a specific question:

First, let's retrieve some data and build a simple case study on the following question: do macroeconomic factors influence car sales?

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/master/macroeconomics/Some%20macroeconomics.ipynb
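As a rough illustration of this kind of clean-and-transform step, here is a minimal sketch with Pandas. The toy data, the column names (`month`, `car_sales`, `unemployment_rate`) and the helper functions are all hypothetical; the notebook's actual sources and steps may differ.

```python
import pandas as pd

# Hypothetical toy data: the notebook's real sources and columns may differ.
sales = pd.DataFrame({
    "month": ["2014-01", "2014-02", "2014-03", "2014-04"],
    # Raw figures stored as strings with thousands separators, one missing.
    "car_sales": ["1,200", "1,350", None, "1,500"],
})
macro = pd.DataFrame({
    "month": ["2014-01", "2014-02", "2014-03", "2014-04"],
    "unemployment_rate": [7.0, 6.8, 6.7, 6.5],
})


def clean_sales(df):
    """Parse dates, convert sales to numbers, and drop rows without sales."""
    out = df.copy()
    out["month"] = pd.to_datetime(out["month"], format="%Y-%m")
    out["car_sales"] = pd.to_numeric(
        out["car_sales"].str.replace(",", "", regex=False)
    )
    return out.dropna(subset=["car_sales"])


def build_dataset(sales_df, macro_df):
    """Join cleaned sales with the macro indicator on the month column."""
    macro_clean = macro_df.copy()
    macro_clean["month"] = pd.to_datetime(macro_clean["month"], format="%Y-%m")
    merged = clean_sales(sales_df).merge(macro_clean, on="month", how="inner")
    return merged.sort_values("month").reset_index(drop=True)


dataset = build_dataset(sales, macro)
# A first, very rough look at the question: how do the two series move together?
correlation = dataset["car_sales"].corr(dataset["unemployment_rate"])
print(dataset)
print(correlation)
```

A single correlation on a handful of points says nothing about causation; it only shows the shape of the dataset that a later regression step would consume.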

Next, we can use our favorite statistical tool, R, to analyze the resulting dataset and fit a Linear Regression:

http://christopheduong.github.io/Practicing-Data-Discovery/macroeconomics/macroeconomics.html

Example of using the LinkedIn API to run a simple analysis of the authors of posts in subscribed discussion groups:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/master/linkedin/SocialMediaAnalysis.ipynb

Natural Language Processing using the NLTK library in Python:

The following are just random exercises:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/NLP/NLTK.ipynb

A naive implementation of sentence ranking and extraction applied to summarization:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/NLP/summarization.ipynb
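To give an idea of what naive sentence ranking can look like, here is a minimal sketch that scores each sentence by the average frequency of its words and keeps the top ones in their original order. This is one common frequency-based approach, not necessarily the one used in the notebook. It relies only on the standard library rather than NLTK, and the stopword list, function names and sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny hypothetical stopword list; a real one (e.g. from NLTK) would be larger.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "on",
             "for", "that", "this", "are", "was", "with", "as", "by"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favored by length.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


sample = ("Spark makes data processing fast. "
          "Spark runs data jobs on a cluster. "
          "My cat sleeps all day.")
summary = summarize(sample, n_sentences=2)
print(summary)
```

Sentences that share frequent words with the rest of the text rank higher, so the off-topic sentence in the sample is the one that gets dropped.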

An example of how to scale ARIMA model fitting for Time Series Analysis using Spark:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/arima_spark/ARIMA_Spark.ipynb

Example of a recommender system:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/recommendations/recommendations.ipynb

Analysis of US Census data:

http://nbviewer.ipython.org/github/ChristopheDuong/Practicing-Data-Discovery/blob/gh-pages/us-census/Explore.ipynb
