This repository has been archived by the owner on May 11, 2020. It is now read-only.
ElasticBeat to download and index tweets of specified screen names

buehler/twitterbeat

TwitterBeat is an Elastic Beat that fetches tweets from the Twitter API (v1.1) and indexes them into Elasticsearch. You can configure which screen names are fetched.

To use TwitterBeat, you need valid Twitter API credentials (consumer key, consumer secret, access key, and access secret), an Elasticsearch instance, and Go. Preferably, use a Docker environment instead, since I made a dockerized version of it.

## Elasticsearch template

To apply the Elasticsearch index template, run:

curl -XPUT 'http://localhost:9200/_template/twitterbeat' -d@etc/twitterbeat.template.json

## Building

For now, I haven't managed to create a Makefile 😉, so you need to build from source by hand. Download the source, install the dependencies with glide (make sure GO15VENDOREXPERIMENT is set to 1), and then run go build or go install.

I recommend using the Docker image to run a containerized version of TwitterBeat.

## Run in docker

### Environment

| ENV variable | Default value | Description |
| --- | --- | --- |
| PERIOD | 60s | Refresh rate in seconds |
| SCREEN_NAMES | ["@smartive", "@elastic"] | Screennames to fetch |
| ES_HOSTS | ["elasticsearch:9200"] | Hosts to index tweets to |
| CONSUMER_KEY | null | Twitter api consumer key |
| CONSUMER_SECRET | null | Twitter api consumer secret |
| ACCESS_KEY | null | Twitter api access key |
| ACCESS_SECRET | null | Twitter api access secret |
| BEAT_NAME | null | Shipper name |
| BEAT_TAGS | null | Shipper tags |
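
The default values above come in two shapes: a duration string (60s) and JSON-style arrays of strings. The following Go sketch only illustrates how such values could be read with fallbacks; it is not taken from the TwitterBeat source or its Docker image, and the helper names envOr and parseList are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// envOr returns the value of the environment variable key,
// or def when the variable is unset or empty.
// (Hypothetical helper, not part of TwitterBeat.)
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// parseList decodes a JSON array of strings such as ["@smartive", "@elastic"].
// Assumption: the array-style defaults in the table are valid JSON.
func parseList(raw string) ([]string, error) {
	var out []string
	err := json.Unmarshal([]byte(raw), &out)
	return out, err
}

func main() {
	// PERIOD uses Go duration syntax, so "60s" parses directly.
	period, err := time.ParseDuration(envOr("PERIOD", "60s"))
	if err != nil {
		panic(err)
	}
	names, err := parseList(envOr("SCREEN_NAMES", `["@smartive", "@elastic"]`))
	if err != nil {
		panic(err)
	}
	hosts, err := parseList(envOr("ES_HOSTS", `["elasticsearch:9200"]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(period, names, hosts)
}
```

The credential variables have no defaults (null), so they must always be supplied when the container is started.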

### Volumes / mount

TwitterBeat has two volumes that can be mounted (both located in /var/twitterbeat):

  • config
  • data

The config folder contains twitterbeat.yml, the configuration file of the beat.

The data folder stores twittermap.json, which the beat generates to keep track of the newest fetched tweet ID for each screen name. Its content looks like this:

{"@smartive":198759827398749873,"@elastic":87572876363772774}

If you delete the content of this file, TwitterBeat will reindex the last 20 tweets of each screen name.
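
The bookkeeping described above can be sketched in Go as follows. This is a minimal illustration that assumes the file format shown in the example; it is not the beat's actual implementation, and the names TwitterMap, parseTwitterMap, and update are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// TwitterMap maps a screen name to the newest tweet ID fetched for it.
// IDs are stored as int64 because tweet IDs exceed 2^53 and would lose
// precision if decoded into float64.
type TwitterMap map[string]int64

// parseTwitterMap decodes the JSON content of twittermap.json.
// Empty content yields an empty map, which corresponds to the case
// where the file's content was deleted.
func parseTwitterMap(data []byte) (TwitterMap, error) {
	m := TwitterMap{}
	if len(bytes.TrimSpace(data)) == 0 {
		return m, nil
	}
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, err
	}
	return m, nil
}

// update stores id for screenName if it is newer than the stored ID
// and reports whether the map changed.
func (m TwitterMap) update(screenName string, id int64) bool {
	if id <= m[screenName] {
		return false
	}
	m[screenName] = id
	return true
}

func main() {
	raw := []byte(`{"@smartive":198759827398749873,"@elastic":87572876363772774}`)
	m, err := parseTwitterMap(raw)
	if err != nil {
		panic(err)
	}
	// Record a newer tweet for one screen name and serialize the map again.
	m.update("@elastic", 87572876363772775)
	out, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With an empty map, every screen name has no stored ID, which matches the documented behaviour that clearing the file causes the most recent tweets to be indexed again.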

### Example

docker run -d \
       -e CONSUMER_KEY=<consumerKey> \
       -e CONSUMER_SECRET=<secret> \
       -e ACCESS_KEY=<accessKey> \
       -e ACCESS_SECRET=<secret> \
       --link elasticsearch:elasticsearch \
       buehler/go-elastic-twitterbeat

You can mount the config volume to supply a custom configuration and the data volume to persist the twittermap.json file.

docker run ... \
       -v $PWD/data:/var/twitterbeat/data \
       -v $PWD/twitterbeat.yml:/var/twitterbeat/config/twitterbeat.yml \
       buehler/go-elastic-twitterbeat

Development

Ensure that this folder is at the following location: ${GOPATH}/github.com/buehler

Getting Started with Twitterbeat

Init Project

To get running with Twitterbeat, run the following commands:

glide update --no-recursive
glide install
make update

To push Twitterbeat to the git repository, run the following commands:

git init
git add .
git commit
git remote set-url origin https://github.com/buehler/twitterbeat
git push origin master

For further development, check out the beat developer guide.

Build

To build the binary for Twitterbeat run the command below. This will generate a binary in the same directory with the name twitterbeat.

make

Run

To run Twitterbeat with debugging output enabled, run:

./twitterbeat -c twitterbeat.yml -e -d "*"

Test

To test Twitterbeat, run the following commands:

make testsuite

Alternatively, run the test targets individually:

make unit-tests
make system-tests
make integration-tests
make coverage-report

The test coverage is reported in the folder ./build/coverage/.

Update

Each beat has a template for the Elasticsearch mapping and documentation for the fields, both generated automatically from etc/fields.yml. To generate etc/twitterbeat.template.json and etc/twitterbeat.asciidoc, run:

make update

Cleanup

To clean up the Twitterbeat source code, run the following commands:

make fmt
make simplify

To clean up the build directory and generated artifacts, run:

make clean

Clone

To clone Twitterbeat from the git repository, run the following commands:

mkdir -p ${GOPATH}/github.com/buehler
cd ${GOPATH}/github.com/buehler
git clone https://github.com/buehler/go-elastic-twitterbeat twitterbeat

For further development, check out the beat developer guide.
