refactor kafkaMdm to manage its own offsets. #296
Merged
Conversation
- removes use of sarama-cluster.
- As kafkaMdm currently always consumes from all partitions, using a consumerGroup just adds complexity without any upside.
- Instead of storing partition offsets in kafka, this change now tracks the offsets in a leveldb database. Offsets are flushed to the index every 5 seconds.
- In addition to tracking the offsets, this commit also makes it easier to specify where a consumer should start from. The options are:
  - newest: the newest data available.
  - oldest: the oldest data available.
  - last: the last committed offset.
  - <duration>: a time.Duration amount of time ago to start from, e.g. "30m" would start consuming from data written 30 minutes ago.
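Conceptually, resolving these options into a concrete starting offset per partition could look something like the sketch below. This is only an illustration: `resolveOffset`, the `offsetStore` interface and the package name are assumptions, not the PR's actual code; the duration case relies on sarama's `Client.GetOffset`, which takes a millisecond timestamp.

```go
package kafkamdm

import (
	"time"

	"github.com/Shopify/sarama"
)

// offsetStore is a stand-in for the PR's leveldb-backed offset manager.
type offsetStore interface {
	Last(topic string, partition int32) (int64, error)
}

// resolveOffset turns the configured "offset" option into a concrete
// starting offset for one partition.
func resolveOffset(client sarama.Client, store offsetStore, offset, topic string, partition int32) (int64, error) {
	switch offset {
	case "oldest":
		return sarama.OffsetOldest, nil
	case "newest":
		return sarama.OffsetNewest, nil
	case "last":
		// last offset committed to the local leveldb store.
		return store.Last(topic, partition)
	default:
		// anything else is treated as a duration, e.g. "30m": start from
		// data written that long ago. sarama's GetOffset expects a
		// timestamp in milliseconds since the epoch.
		d, err := time.ParseDuration(offset)
		if err != nil {
			return 0, err
		}
		ts := time.Now().Add(-d).UnixNano() / int64(time.Millisecond)
		return client.GetOffset(topic, partition, ts)
	}
}
```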
Note that DurationVars, if invalid, just get a value of 0ns, so the values still need to be validated (which I did for es retry-duration at ca0ff63).
case "oldest": | ||
case "newest": | ||
default: | ||
_, err := time.ParseDuration(offset) |
Why don't we set a package-level variable here, so that we don't have to re-parse in Start()?
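A rough sketch of that suggestion (the `offsetDuration` variable, `ConfigProcess` placement and package name are assumed for illustration, not the PR's actual code):

```go
package kafkamdm

import (
	"log"
	"time"
)

var (
	offset         string        // the configured offset option (config flag value)
	offsetDuration time.Duration // parsed once here so Start() can reuse it
)

func ConfigProcess() {
	switch offset {
	case "oldest", "newest", "last":
		// nothing to parse for these options.
	default:
		var err error
		offsetDuration, err = time.ParseDuration(offset)
		if err != nil {
			log.Fatalf("kafka-mdm: invalid offset %q: %s", offset, err)
		}
	}
}
```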
Few remarks, but overall 👍. This will fix #236. Also don't forget to update the "doesn't work yet" note in the install guides in the kafka section.
This allows us to use the same offsetMgr for both ingestion and the clusterHandler
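Such a shared offset manager could be little more than a thin wrapper around a leveldb database keyed by topic and partition. Below is a minimal sketch under the assumption that goleveldb is the backing store; the type, method names and package are illustrative, not necessarily what this PR implements.

```go
package kafka

import (
	"encoding/binary"
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
)

// OffsetMgr stores the last processed offset per topic/partition in a local
// leveldb database, so both ingestion and the clusterHandler can share it.
type OffsetMgr struct {
	db *leveldb.DB
}

func NewOffsetMgr(dir string) (*OffsetMgr, error) {
	db, err := leveldb.OpenFile(dir, nil)
	if err != nil {
		return nil, err
	}
	return &OffsetMgr{db: db}, nil
}

func key(topic string, partition int32) []byte {
	return []byte(fmt.Sprintf("%s:%d", topic, partition))
}

// Commit persists the offset for a topic/partition.
func (o *OffsetMgr) Commit(topic string, partition int32, offset int64) error {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(offset))
	return o.db.Put(key(topic, partition), buf, nil)
}

// Last returns the last committed offset for a topic/partition.
func (o *OffsetMgr) Last(topic string, partition int32) (int64, error) {
	data, err := o.db.Get(key(topic, partition), nil)
	if err != nil {
		return 0, err
	}
	return int64(binary.BigEndian.Uint64(data)), nil
}
```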
Was always seeing something like:

~/g/s/g/r/metrictank ❯❯❯ govendor status
The following packages are missing or modified locally:
github.com/shopify/sarama

Note that we, and sarama-cluster, use github.com/Shopify/sarama as the import path. vendor.json had both shopify and Shopify in the json, but only the one with caps in the vendor dir.
Calling os.MkdirAll() on an empty directory path would give 'no such file or directory'. Let's default to '' (working dir) for dev builds and /var/lib/metrictank for docker and packages.
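This presumably means skipping the MkdirAll call when no directory is configured. A small sketch under that assumption (`ensureDataDir`, `dataDir` and the package name are made up for illustration):

```go
package kafka

import (
	"log"
	"os"
)

// ensureDataDir only creates the data dir when one is actually configured:
// an empty string means "use the working directory", and calling
// os.MkdirAll("") fails with "no such file or directory".
func ensureDataDir(dataDir string) {
	if dataDir == "" {
		return
	}
	if err := os.MkdirAll(dataDir, 0755); err != nil {
		log.Fatalf("failed to create %q: %s", dataDir, err)
	}
}
```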