Migrating from Zipkin Kafka stream #171

Closed
JodeZer opened this issue May 22, 2017 · 25 comments

Labels
help wanted Features that maintainers are willing to accept but do not have cycles to implement

Comments

@JodeZer

JodeZer commented May 22, 2017

Hi, we currently have a Kafka cluster that stores Zipkin spans. Does the Jaeger collector support consuming Zipkin data from Kafka, transforming it to the Jaeger data model, and writing it into Cassandra?

@black-adder added the "help wanted" label on May 22, 2017
@black-adder
Contributor

We do have a transformer that can convert Zipkin to Jaeger, but we don't have a Kafka ingestor built for the collector ATM.

It would be awesome if this could be contributed :)

@yurishkuro
Member

There is a bit of a Wild West going on with Kafka client libraries in Go. A robust solution needs to be able to automatically assign Kafka topic partitions to available collectors, and to reshuffle them when collectors go up or down or when the number of partitions for the topic changes.

One alternative is to run Zipkin's native (Java) Kafka ingester and make it do an HTTP POST to Jaeger collectors.
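
To illustrate the partition-assignment requirement described above, here is a minimal sketch, not Jaeger code. It assumes a simple round-robin plan; the function name `assign_partitions` and the collector names are hypothetical. In a real deployment this coordination is normally handled by Kafka's consumer group protocol rather than by hand.

```python
from typing import Dict, List


def assign_partitions(collectors: List[str], num_partitions: int) -> Dict[str, List[int]]:
    """Spread partitions 0..num_partitions-1 across collectors round-robin."""
    if not collectors:
        return {}
    # Sort so that every node computing the plan arrives at the same result.
    ordered = sorted(collectors)
    assignment: Dict[str, List[int]] = {name: [] for name in ordered}
    for partition in range(num_partitions):
        owner = ordered[partition % len(ordered)]
        assignment[owner].append(partition)
    return assignment


# Two (hypothetical) collectors share six partitions.
before = assign_partitions(["collector-a", "collector-b"], 6)
# A third collector joins, so the plan is recomputed and partitions are reshuffled.
after = assign_partitions(["collector-a", "collector-b", "collector-c"], 6)
```

Recomputing the whole plan whenever membership or the partition count changes keeps the logic simple, at the cost of moving more partitions than strictly necessary.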

@prestonprice57

Where can I find the transformer that can convert Zipkin to Jaeger? I've looked around but can't seem to find where it is.

@black-adder
Contributor

@yurishkuro
Member

@prestonprice57 while there is a transformer, the intention is to have Jaeger accept the Zipkin format out of the box. We already have a Thrift HTTP endpoint that can be used by Zipkin libraries to submit spans to the Jaeger collector.

@prestonprice57

Oh that makes sense. Is there any other configuration required in order for the endpoint to work? Currently I am starting up docker with this command:
docker run -d -p5775:5775/udp -p6831:6831/udp -p6832:6832/udp -p5778:5778 -p16686:16686 jaegertracing/all-in-one:latest
and then I make a POST request at this endpoint:
http://localhost:5775/api/traces?format=Jaeger.thrift
but I am getting the error "Connection refused"

When I do the same request to a Zipkin endpoint it works fine.

@black-adder
Contributor

We're updating the documentation (https://github.com/uber/jaeger/pull/191); we forgot to expose the collector port that receives jaeger.thrift and zipkin.thrift over HTTP. You should use port 14268 to hit the collector directly.

@prestonprice57

That worked. Thanks for the help!
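
As a rough sketch of the fix discussed above, the request below targets the collector on port 14268 instead of 5775. It assumes the container was started with that port published (for example by adding `-p14268:14268` to the `docker run` command). The lowercase `format=jaeger.thrift` value and the `application/x-thrift` content type are assumptions based on the wording in this thread, and `build_span_request` is a hypothetical helper. The request is only built, not sent.

```python
import urllib.request

# Assumptions: the collector runs locally with port 14268 published, and the
# path and query string follow the ones quoted in the thread.
COLLECTOR_URL = "http://localhost:14268/api/traces?format=jaeger.thrift"


def build_span_request(payload: bytes, url: str = COLLECTOR_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying Thrift-encoded spans."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/x-thrift"},  # assumed content type
    )


# Placeholder body; real spans must be Thrift-encoded before sending.
request = build_span_request(b"")
# urllib.request.urlopen(request) would send it once a collector is listening.
```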

@yurishkuro
Member

cc @pavolloffay @objectiser @jpkrohling

I was thinking... I've outlined in #212 why I don't think we should be building ingestion from Kafka. However, to help with Zipkin migration we could create a service based on zipkin-collector that will read spans off Kafka and push them via RPC to Jaeger collector. This way we don't need to deal with Kafka reading logic, at the expense of an extra microservice, which seems a reasonable compromise for people who don't want to upgrade to Jaeger clients directly in the apps.
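
The forwarding service proposed above could be shaped roughly like the sketch below. It deliberately avoids any real Kafka or Jaeger client API: a plain iterable stands in for the Kafka topic and a callable stands in for the RPC to the collector. The names `batches`, `forward`, and `send` are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def batches(spans: Iterable[bytes], batch_size: int) -> Iterator[List[bytes]]:
    """Group an incoming stream of encoded spans into fixed-size batches."""
    batch: List[bytes] = []
    for span in spans:
        batch.append(span)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


def forward(
    spans: Iterable[bytes],
    send: Callable[[List[bytes]], None],
    batch_size: int = 100,
) -> int:
    """Read spans from a source, hand each batch to `send`, and return the span count."""
    forwarded = 0
    for batch in batches(spans, batch_size):
        send(batch)  # hypothetical hook: in practice, an RPC or HTTP call to the collector
        forwarded += len(batch)
    return forwarded


# Stand-ins: a list plays the Kafka topic and another list's append plays the collector.
received: List[List[bytes]] = []
total = forward([b"span-1", b"span-2", b"span-3"], received.append, batch_size=2)
```

Keeping the source and the sink behind plain interfaces reflects the trade-off described in the comment: the Kafka reading logic stays in Zipkin's code base, and only the hand-off to Jaeger is new.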

@objectiser
Contributor

@yurishkuro Would it be better to make this Kafka collector more general so that potentially in the future it could also consume Jaeger traces from Kafka? So kafka-collector or streaming-collector?

@pavolloffay
Member

pavolloffay commented Aug 16, 2017

It sounds good. I am not sure how zipkin-collector works, but maybe it can be initialized with a StorageComponent which just implements an AsyncSpanConsumer that reports data anywhere.

It might be good to give it a try.

@jpkrohling
Contributor

we could create a service based on zipkin-collector that will read spans off Kafka and push them via RPC to Jaeger collector

Similar to the Agent? Sounds good to me!

@yurishkuro
Member

Would it be better to make this Kafka collector more general so that potentially in the future it could also consume Jaeger traces from Kafka? So kafka-collector or streaming-collector?

@objectiser well, my point was that we don't want to write new code in the Jaeger backend for consuming Kafka, because that's simply not the deployment model we recommend for a native Jaeger installation. But for existing Zipkin installations that already collect spans via Kafka, we can use Zipkin's code base to read that and forward to Jaeger.

@objectiser
Contributor

@yurishkuro OK, no problem - was thinking just in case :) But if it is definitely not going to be supported in the future, then making it Zipkin-specific is fine.

@vprithvi
Contributor

In #212, it seems that the main arguments against fully supporting a streaming system like Kafka are the code, infrastructure, and configuration dependencies. But if somebody is already using Zipkin with Kafka, these are solved problems.
The issue of bidirectional communication for adaptive sampling can be solved by emitting sampling parameters into a separate Kafka topic that is consumed by span creators.

That being said, I don't know the reasons why some Zipkin operators chose the Kafka transport over HTTP. Does anyone have any insight into this? Is there some fundamental reason (like extremely bursty traffic, reliability/scalability of collectors, etc) that makes it better to put collectors behind a distributed queue?
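
The suggestion above, emitting sampling parameters into a separate Kafka topic that span creators consume, might look roughly like the following sketch on the client side. The `ProbabilisticSampler` class, the `(service, rate)` message shape, and the service names are all hypothetical, and a plain list stands in for the topic; this is not Jaeger's adaptive sampling implementation.

```python
import random
from typing import Dict, Iterable, Tuple


class ProbabilisticSampler:
    """Samples a trace with a per-service probability that can be updated at runtime."""

    def __init__(self, default_rate: float = 0.001, seed: int = 0) -> None:
        self.default_rate = default_rate
        self.rates: Dict[str, float] = {}
        self._rng = random.Random(seed)

    def apply_updates(self, messages: Iterable[Tuple[str, float]]) -> None:
        """Consume (service, rate) messages, as if read from a sampling topic."""
        for service, rate in messages:
            # Clamp to a valid probability in case a bad value is published.
            self.rates[service] = min(1.0, max(0.0, rate))

    def is_sampled(self, service: str) -> bool:
        """Decide whether to sample, using the service's rate or the default."""
        return self._rng.random() < self.rates.get(service, self.default_rate)


sampler = ProbabilisticSampler()
# Stand-in for messages the backend would publish to the sampling topic.
sampler.apply_updates([("checkout", 1.0), ("search", 0.0), ("billing", 2.5)])
```

Because the topic only flows from the backend to span creators, the clients never need a direct connection to the collectors to receive updated rates.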

@yurishkuro changed the title from "Migrating from Zipkin" to "Migrating from Zipkin Kafka stream" on Sep 1, 2017
@afalko

afalko commented May 17, 2018

@vprithvi Sorry I'm late to the party, but my enterprise has a few reasons why our spans are transported through Kafka. A lot of our environments are locked down in various network buckets, and the only way out for things like spans is to put them on our Kafka bus. As extra bonuses, we get Kafka as a shock absorber and we can move the storage around to our heart's content (we can replay several days of spans residing in Kafka).

If I were to contribute a Zipkin Kafka collector that pushes into Jaeger, would it be enough to have it pump the spans into Jaeger's Zipkin ingest port? Or is it advisable to inject into the storage layer directly?

@yurishkuro
Member

@afalko we just switched to Kafka-based ingestion internally as well, and are planning to release that code into open source soon.

@jmhon08

jmhon08 commented May 23, 2018

Can you specify how soon? We are looking forward to this as well.

@black-adder
Contributor

We hope to get it out by mid-August at the latest.

@marcusdb

In which branch are you guys working on this?

@vprithvi
Contributor

vprithvi commented Jul 16, 2018

@marcusdb We don't have a public branch yet, but work on the Kafka ingester will be tagged with #929. I expect that we'll have initial commits out by the end of the week.

However, Zipkin support might arrive only by mid August or so.

@marcusdb

Thanks for the update. Are you guys actively working on publishing post-processing events at the moment as well?

@vprithvi
Contributor

you guys actively working on publishing post-processing events

Could you elaborate on what you mean by this, and what the use case is?

@yurishkuro
Member

This is now solved by #1256.

@pavolloffay
Member

This was the last issue in https://github.com/jaegertracing/jaeger/milestone/2 so I have closed the milestone.
