A web app that uses data from Twitter combined with sentiment analysis and emotion detection to create a series of data visualisations to illustrate the happy and less happy locations, topics and times.
This project aims to make Twitter data more understandable. It streams real-time tweets, or can fetch tweets about a specific topic or keyword - it then analyses this data using a custom-written sentiment analysis algorithm, and finally displays the results with a series of dynamic D3.js data visualisations.
The aim of the app is to let users find trends between sentiment and other factors, such as geographical location, time of day and related topics.
It has a wide range of uses, from analysing the effectiveness of a marketing campaign, to comparing two competing topics.
Read more about the application here.
The application is fully documented; the documentation can be viewed here
A live demo of the application has been deployed to: http://sentiment-sweep.com
As part of the documentation there is one shot of each screen in its current state. View screenshots here
Below is a sample of the 12 key screens.
Several open source Node modules have been developed and published on npm as part of this project:
- sentiment-analysis - uses the AFINN-111 word list to calculate the overall sentiment of a sentence
- fetch-tweets - fetches tweets from Twitter based on topic, location, timeframe or combination
- stream-tweets - streams live Tweets in real-time
- remove-words - removes all non-key words from a string sentence
- place-lookup - finds the latitude and longitude for any fuzzy place name using the Google Places API
- hp-haven-sentiment-analysis - A Node.js client library for HP Haven OnDemand Sentiment Analysis module
- haven-entity-extraction - Node.js client for HP Haven OnDemand Entity Extraction
- tweet-location - calculates the location from geo-tagged Tweets using the Twitter Geo API
- find-region-from-location - given a latitude and longitude calculates which region that point belongs in
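To illustrate the approach the sentiment-analysis module takes, here is a minimal, self-contained sketch of AFINN-style scoring. The word list below is a tiny hypothetical subset (the real AFINN-111 list has ~2,500 entries), and `analyseSentiment` is an illustration of the technique, not the module's actual implementation:

```javascript
// Minimal AFINN-style sentiment scorer: each known word carries a score
// from -5 (very negative) to +5 (very positive); the sentence score is
// the sum of matched words, normalised into the range -1..1.
// NOTE: illustrative subset only - NOT the real AFINN-111 word list.
const afinnSubset = {
  love: 3, great: 3, happy: 3, good: 3,
  bad: -3, hate: -3, terrible: -3, sad: -2
};

function analyseSentiment(sentence) {
  const words = sentence.toLowerCase().match(/[a-z']+/g) || [];
  let score = 0;
  let hits = 0;
  for (const word of words) {
    if (word in afinnSubset) {
      score += afinnSubset[word];
      hits += 1;
    }
  }
  // Normalise by the maximum possible magnitude of the matched words,
  // so the result always falls between -1 and 1.
  return hits === 0 ? 0 : score / (hits * 5);
}

console.log(analyseSentiment('I love this great app')); // → 0.6
console.log(analyseSentiment('I hate bad weather'));    // → -0.6
```

Sentences containing no scored words come back as 0 (neutral), which is also how tweets with no opinion words end up plotted in the visualisations.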
- A set of user stories with acceptance criteria and complexity estimates has been drawn up, outlining what features the finished solution should have. They are also managed on the Trello Board.
1. Prerequisites - You will need Node.js, MongoDB and git installed on your system. You will also need Gulp and Bower, which (once Node is installed) you can install by running `npm install gulp bower -g`. Yarn is also recommended.
2. Get the files - `git clone https://github.com/Lissy93/twitter-sentiment-visualisation.git`, then navigate into it with `cd twitter-sentiment-visualisation`
3. Install dependencies - `yarn` will install the npm node_modules, and should then automatically kick off a `bower install` (if not, just run it manually). If you are developing, you will need to use `npm install` in order to get the devDependencies too.
4. Set Config - `yarn run config` will generate the `config/src/keys.coffee` file, which you will then need to populate with your API keys and save. Also check that you're happy with the general app config in `config/src/app-config.coffee`.
5. Build Project - `yarn run build` will compile the project from the source.
6. Start MongoDB - `mongod` will start a MongoDB instance (run it in a separate terminal). See instructions: Starting a MongoDB instance.
7. Run the project - Run `yarn start`, then open your browser and navigate to http://localhost:8080

View detailed installation instructions
To run the tests: `npm test`, or see the test strategy for more details.
To run in development mode, use `yarn run dev`. This will use the dev environment variables, and will also watch for changes, then lint, compile and refresh automatically.
TSV uses the Gulp streaming build tool to automate the development workflow.
The key tasks you need to run are:
- `gulp generate-config` - before running the project for the first time, run this command to generate configuration files for your API keys
- `gulp build` - builds the project fully; this includes cleaning the working directory and then running all the CoffeeScript, JavaScript, CSS, image, HTML and Browserify tasks
- `gulp nodemon` - runs the application on the default port (probably 8080)
- `gulp test` - runs all unit and coverage tests, printing a summary of the results to the console and generating more detailed reports in the reports directory
- `gulp` - the default task: it checks the project is configured correctly, builds ALL the files, runs the server, watches for changes, recompiles the relevant files and reloads browsers on change, keeps all browsers in sync, and re-runs tests when a test condition changes - a lot going on!
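As a rough illustration of how tasks like these fit together, a heavily simplified gulpfile might look like the following. This is a hypothetical sketch using the gulp 4 API (the task bodies, glob paths and structure are assumptions, not the project's actual gulpfile.js):

```javascript
// Hypothetical, heavily simplified gulpfile sketch - NOT the project's
// actual build configuration.
const gulp = require('gulp');

function clean(done)   { /* delete the compiled output directory */ done(); }
function scripts(done) { /* compile CoffeeScript/JS, run Browserify */ done(); }
function styles(done)  { /* compile and minify CSS */ done(); }

// `build` cleans first, then compiles scripts and styles in parallel
const build = gulp.series(clean, gulp.parallel(scripts, styles));

function watch() {
  // Recompile the relevant files on change; a browser-sync/livereload
  // step would then refresh and synchronise connected browsers.
  gulp.watch('src/**/*', build);
}

exports.build = build;
exports.default = gulp.series(build, watch);
```

The real gulpfile splits this into many more tasks (config generation, images, HTML, tests), but the series/parallel/watch pattern is the core of the workflow described above.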
To read more about the project setup and gulp process, see build environment in the docs
Twitter Sentiment Visualisation follows the TDD approach and is structured around its unit tests.
To run the tests: `npm test`
Testing Tools
- Framework - Mocha
- Assertion Library - Chai
- Coverage Testing - Istanbul
- Stubs, Spies and Mocking - Sinon.js
- Continuous Integration Testing - Travis CI
- Dependency Checking - David
- Automated Code Reviews - Code Climate
- Headless Browser Testing - PhantomJS
- Testing HTTP services - SuperTest
More details on each of the tools and how they will be implemented along with the pass and fail criteria can be found on the test strategy page of the documentation.
This project wouldn't have been possible without the many open source packages, libraries and frameworks that it makes use of.
I would like to personally thank the hundreds of developers who have worked on open source packages like these.
There is an extensive stack of technologies that were used to develop the final application. The following list is a summary of the key utilities:
The current sentiment analysis scene
Comparison of various sentiment analysis algorithm approaches