
refactor kafkaMdm to manage its own offsets. #296

Merged
18 commits merged into master from kafkaMdmNoConsumerGroup on Sep 6, 2016

Conversation

woodsaj (Member) commented Aug 25, 2016

  • removes use of sarama-cluster.
  • As kafkaMdm currently always consumes from all partitions, using
    a consumerGroup just adds complexity without any upside.
  • Instead of storing partition offsets in kafka, this change now tracks the
    offsets in a leveldb database. Offsets are flushed to the database every 5 seconds.
  • In addition to tracking the offsets, this commit also makes it
    easier to specify where a consumer should start from. The options
    are:
    • newest: the newest data available.
    • oldest: the oldest data available.
    • last: the last committed offset.
    • <duration>: a time.Duration amount of time ago to start from, e.g.
      "30m" would start consuming from data written 30 minutes ago (see the sketch below).

@Dieterbe (Contributor) commented:

note that DurationVars, if invalid, just get a value of 0ns. So the values still need to be validated (which I did for the es retry-duration at ca0ff63).

case "oldest":
case "newest":
default:
_, err := time.ParseDuration(offset)
why don't we set a package-level variable here, so that we don't have to re-parse in Start()
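
A minimal sketch of that suggestion (hedged: ConfigProcess, Start, offsetStr and offsetDuration are illustrative names, not necessarily the plugin's actual identifiers): validate and parse the duration once at config time, keep the result in a package-level variable, and reuse it in Start() instead of re-parsing.

```go
// Illustrative only: parse the duration once, reuse it later.
package kafkamdm

import (
	"log"
	"time"
)

// Package-level state set once at config time.
var (
	offsetStr      string        // raw value of the "offset" setting, e.g. "newest", "last", "30m"
	offsetDuration time.Duration // parsed form, only meaningful when offsetStr is a duration
)

// ConfigProcess validates the setting and parses the duration exactly once.
func ConfigProcess() {
	switch offsetStr {
	case "oldest", "newest", "last":
		// keywords, nothing to parse
	default:
		var err error
		offsetDuration, err = time.ParseDuration(offsetStr)
		if err != nil {
			log.Fatalf("kafka-mdm: invalid offset %q: %s", offsetStr, err)
		}
	}
}

// Start reuses the already-parsed value instead of calling ParseDuration again.
func Start() {
	if offsetDuration != 0 {
		startFrom := time.Now().Add(-offsetDuration)
		_ = startFrom // look up the kafka offset for this timestamp and consume from there
	}
}
```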

Dieterbe (Contributor) commented Aug 25, 2016

few remarks, but overall 👍

this will fix #236

also don't forget to update the "doesn't work yet" note in the install guides in the kafka section,
and the example config, as well as the docker/package config.

woodsaj and others added 14 commits August 26, 2016 12:55

  • This allows us to use the same offsetMgr for both ingestion and
    the clusterHandler.

  • was always seeing something like:

        ~/g/s/g/r/metrictank ❯❯❯ govendor status
        The following packages are missing or modified locally:
            github.com/shopify/sarama

    note that we, and sarama-cluster, use github.com/Shopify/sarama as the import
    path. vendor.json had both shopify and Shopify in the json, but only
    the one with caps in the vendor dir.

  • by calling os.MkdirAll() on empty directories
    we would get 'no such file or directory'

  • let's default to '' (working dir) for dev builds
    and /var/lib/metrictank for docker and packages (see the sketch below)
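
A minimal Go sketch of what those last two commit messages describe (hedged: openOffsetDB, the dataDir argument and the "kafka-mdm-offsets" subdirectory are made-up for illustration, and the github.com/syndtr/goleveldb binding is assumed): create the directory with os.MkdirAll() before opening the leveldb store, so a fresh setup doesn't fail with 'no such file or directory'.

```go
// Hypothetical sketch: ensure the offset-db directory exists before opening it.
package main

import (
	"log"
	"os"
	"path/filepath"

	"github.com/syndtr/goleveldb/leveldb"
)

// openOffsetDB opens the offset store under dataDir. dataDir would default to
// "" (the working directory) for dev builds and to /var/lib/metrictank for
// docker/package installs, per the commit messages above.
func openOffsetDB(dataDir string) (*leveldb.DB, error) {
	dbPath := filepath.Join(dataDir, "kafka-mdm-offsets")
	// create the directory (and any parents) if it doesn't exist yet
	if err := os.MkdirAll(dbPath, 0755); err != nil {
		return nil, err
	}
	return leveldb.OpenFile(dbPath, nil)
}

func main() {
	db, err := openOffsetDB("") // dev default: working directory
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```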
@Dieterbe merged commit 3b5f787 into master Sep 6, 2016
@Dieterbe deleted the kafkaMdmNoConsumerGroup branch December 15, 2017 19:52