Movie plots by genre tutorial at PyData Berlin 20 May 2016.
See slides for the narrative.
Make sure you have Python 3.
- Clone this repository
git clone
If you don't have git, you can also download a zip of this repo.
- Install virtualenv
(sudo) pip install virtualenv
- Create a virtual env and install all the requirements.
cd movie-plots-by-genre/
virtualenv gensim # if you have both python2 and python3 then use virtualenv -p python3 gensim
source gensim/bin/activate
pip3 install cython gensim scikit-learn pandas matplotlib nltk pyemd jupyter
NOTE: On OS X you might want to download pyemd from GitHub and install it via python3 setup.py install
Download the Google News pre-trained word2vec model (1.5 GB) from here
Download the NLTK punkt tokenizer data
python -m nltk.downloader punkt
- Fire up a jupyter notebook
jupyter notebook
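Once the notebook is running, the downloaded Google News vectors can be loaded with gensim. The snippet below is only a sketch: the file name `GoogleNews-vectors-negative300.bin.gz` and the helper `load_google_news` are assumptions for illustration, and it assumes a recent gensim release where `KeyedVectors.load_word2vec_format` is available (older releases exposed the same loader on `Word2Vec`).

```python
from pathlib import Path

# Assumed file name of the downloaded model; adjust to wherever you saved it.
MODEL_PATH = Path("GoogleNews-vectors-negative300.bin.gz")


def load_google_news(path=MODEL_PATH, limit=None):
    """Load the pre-trained vectors, or return None if the file is not there.

    `limit` caps the number of vectors read, which helps on machines
    with little RAM (hypothetical helper, not part of the tutorial code).
    """
    path = Path(path)
    if not path.exists():
        return None
    # Imported lazily so the check above works even before gensim is installed.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(str(path), binary=True, limit=limit)


if __name__ == "__main__":
    vectors = load_google_news(limit=200000)
    print("model loaded" if vectors is not None else "model file not found")
```

Passing a `limit` keeps only the first (most frequent) words, which is usually enough to try things out without loading the full model into memory.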
If you are short on bandwidth, you can skip the large model download and still follow most of the tutorial with just these libraries:
- Python 3
- pip3 install cython gensim scikit-learn pandas matplotlib nltk pyemd jupyter
- python -m nltk.downloader punkt
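To confirm that the install step worked inside the virtual env, a quick check like the following can help. It is a minimal sketch using only the standard library; the `REQUIRED` list and the `missing_packages` helper are illustrative names, and the list holds import names rather than pip names (scikit-learn is imported as `sklearn`, cython as `Cython`).

```python
import importlib.util

# Import names of the libraries installed above (not the pip package names).
REQUIRED = ["Cython", "gensim", "sklearn", "pandas", "matplotlib", "nltk", "pyemd"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the same interpreter that the virtual env activates, otherwise it reports on a different Python installation.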