This repository has been archived by the owner on Jan 27, 2020. It is now read-only.

Twitter to Kafka pipeline

Christian Kreutzfeldt edited this page Apr 22, 2015 · 7 revisions

This tutorial shows how to set up a very basic pipeline that reads content from a source and copies it to a destination (here: from the Twitter Streaming API to a Kafka topic).

To learn more about the configuration format, please read the associated documentation on that topic.

We assume that you have a standalone processing node up and running. If not, see the build and deployment instructions for guidance.

General settings

The pipeline set up in this use case is named twitter-to-kafka.

Queues

As this pipeline serves as a bridge between a source and a sink, it has only two components configured, which need a single queue to exchange data. The queue used in this case is named twitter-content, after its contents.

Components

The pipeline configures two components: a source and an emitter. The source component connects to the Twitter Streaming API using the spqr-twitter source. All data is handed over through the queue to the emitter component, which exports the contents to Kafka using the spqr-kafka emitter.
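The wiring described above (one queue, written by the source and read by the emitter) can be checked before deployment. The following is a minimal sketch, not part of SPQR itself: check_wiring is a hypothetical helper, the embedded configuration is a trimmed copy of the one in the Configuration section, and it assumes that an empty fromQueue or toQueue means "no queue on this side", as the configuration on this page suggests.

```python
import json

# Trimmed copy of the pipeline configuration from this page (settings omitted).
CONFIG = json.loads("""
{
  "id": "twitter-to-kafka",
  "queues": [{"id": "twitter-content", "queueSettings": null}],
  "components": [
    {"id": "twitter-stream-reader", "type": "SOURCE",
     "fromQueue": "", "toQueue": "twitter-content"},
    {"id": "kafka-topic-emitter", "type": "EMITTER",
     "fromQueue": "twitter-content", "toQueue": ""}
  ]
}
""")


def check_wiring(config):
    """Return a list of problems found in the queue wiring (hypothetical helper)."""
    queue_ids = {q["id"] for q in config["queues"]}
    problems = []
    for comp in config["components"]:
        for key in ("fromQueue", "toQueue"):
            ref = comp.get(key, "")
            # Assumption: an empty string means the component has no queue on this side.
            if ref and ref not in queue_ids:
                problems.append(f"{comp['id']}: {key} references unknown queue '{ref}'")
    return problems


if __name__ == "__main__":
    print(check_wiring(CONFIG) or "wiring looks consistent")
```

An empty list means every non-empty queue reference points at a queue declared in the "queues" section.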

Deployment

To bring the pipeline to life, we assume that you have a Kafka instance available somewhere and that a processing node is running in standalone mode (to keep things simple).

As the processing node comes with a set of pre-installed components, all you have to do is POST the configuration shown below to the processing node running on localhost at port 7070 (the configuration is assumed to live in the file twitter-to-kafka.json):

curl -H "Content-Type: application/json" -X POST -d @twitter-to-kafka.json http://localhost:7070/pipelines/
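If you prefer to submit the configuration from a script, the same request can be sketched in Python using only the standard library. The URL, content type, and file name are taken from the curl call above; build_request and submit are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request


def build_request(config, url="http://localhost:7070/pipelines/"):
    """Build the POST request; mirrors the curl call on this page."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(config_path="twitter-to-kafka.json"):
    """Read the configuration file, POST it, and return the raw response text."""
    with open(config_path, encoding="utf-8") as fh:
        config = json.load(fh)  # fails early if the file is not valid JSON
    with urllib.request.urlopen(build_request(config), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling submit() requires a processing node listening on port 7070. Unlike curl, this sketch parses and re-serializes the file, so a malformed configuration is rejected locally before anything is sent.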

The response should look like the following, and the Kafka topic should start receiving data:

{
  "state": "OK",
  "msg": "",
  "pid": "twitter-to-kafka"
}
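A script that deploys the pipeline can check this reply automatically. The sketch below assumes the node returns the reply as JSON with the fields state, msg, and pid shown above; pipeline_accepted is a hypothetical helper name.

```python
import json


def pipeline_accepted(response_text, expected_pid="twitter-to-kafka"):
    """Return True if the reply reports state OK for the expected pipeline id."""
    reply = json.loads(response_text)
    return reply.get("state") == "OK" and reply.get("pid") == expected_pid
```

Anything other than state "OK" with the expected pid is treated as a failed deployment; in that case the msg field is the first place to look.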

Configuration

{
  "id" : "twitter-to-kafka",
  "queues" : [ {
    "id" : "twitter-content",
    "queueSettings" : null
  } ],
  "components" : [ {
    "id" : "twitter-stream-reader",
    "type" : "SOURCE",
    "name" : "twitterSource",
    "version" : "0.0.1",
    "settings" : {
      "twitter.consumer.key" : "<your_consumer_key>",
      "twitter.consumer.secret" : "<your_consumer_secret>",
      "twitter.token.key" : "<your_token_key>",
      "twitter.token.secret" : "<your_token_secret>",
      "twitter.tweet.terms" : "fifa,uefa,soccer"
    },
    "fromQueue" : "",
    "toQueue" : "twitter-content"
  }, {
    "id" : "kafka-topic-emitter",
    "type" : "EMITTER",
    "name" : "kafkaEmitter",
    "version" : "0.0.1",
    "settings" : {
      "clientId" : "twitterToKafka",
      "topic" : "twitter",
      "metadataBrokerList" : "localhost:9092",
      "zookeeperConnect" : "localhost:2181",
      "messageAcking" : "false",
      "charset" : "UTF-8"
    },
    "fromQueue" : "twitter-content",
    "toQueue" : ""
  } ]
}