Amazon S3 as a storage backend #2633

csp197 · 2020-11-18T20:52:03Z

Update 2021-04-20: there is a plugin https://github.com/muhammadn/jaeger-s3/

Is it feasible to add Amazon S3 as a storage backend for Jaeger?

Related to #638

yurishkuro · 2020-11-18T22:06:49Z

Currently what Jaeger defines as "storage backend" must support (a) trace lookup by ID, and (b) indexing & searching. The former is relatively easy to do with S3 (you may even be able to use Grafana Tempo directly for that), but indexing/searching is not.
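To illustrate why the two requirements above differ in difficulty, here is a minimal sketch (not Jaeger's actual storage API). `ToyTraceStore`, its method names, and the `traces/<id>` key layout are hypothetical, and a plain dict stands in for an S3 bucket. Lookup by ID maps directly onto an object key, while search needs a separate index that must be built and maintained at write time.

```python
from collections import defaultdict


class ToyTraceStore:
    """Hypothetical in-memory stand-in for an object store plus a tag index."""

    def __init__(self):
        self.objects = {}                   # object key -> list of spans (plays the role of a bucket)
        self.tag_index = defaultdict(set)   # (tag key, tag value) -> set of trace IDs

    def write_span(self, trace_id, span):
        # (a) Storing by ID only needs a deterministic key.
        self.objects.setdefault(f"traces/{trace_id}", []).append(span)
        # (b) Searching needs an extra index entry per tag, kept up to date on every write.
        for tag in span.get("tags", {}).items():
            self.tag_index[tag].add(trace_id)

    def get_trace(self, trace_id):
        # Single key lookup; no index involved.
        return self.objects.get(f"traces/{trace_id}", [])

    def find_trace_ids(self, tags):
        # Intersect the index entries for every requested tag.
        matches = [self.tag_index.get(item, set()) for item in tags.items()]
        return set.intersection(*matches) if matches else set()


store = ToyTraceStore()
store.write_span("a1", {"operation": "GET /cart", "tags": {"http.status_code": "500"}})
store.write_span("b2", {"operation": "GET /cart", "tags": {"http.status_code": "200"}})
failed = store.find_trace_ids({"http.status_code": "500"})
```

With a real object store, `get_trace` would be a single GET, whereas `tag_index` has no natural home in the bucket and would have to live in a separate system or in purpose-built index objects.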

csp197 · 2020-11-18T22:21:12Z

Yeah, I was also looking at Grafana Labs' Tempo storage option for S3, which is why I wanted to ask if such an equivalent existed for Jaeger.
It looks like Tempo has decoupled the trace-lookup-by-ID function into its Compactor.

joe-elliott · 2020-11-18T22:26:44Z

In Tempo the querier is responsible for inspecting the backend to find the trace. If you have questions about it, it would make more sense to discuss in that repo.

yurishkuro · 2021-04-20T16:15:26Z

Reopening this to link to @muhammadn's work mentioned in #638 (comment).

muhammadn · 2021-04-21T17:50:16Z

Thanks @yurishkuro! 🙌

@jkowall So far, in testing on my local machine over a 100 Mbit connection to S3, the Jaeger UI is slow but bearable, since we don't look at the trace data frequently.

We had also deployed Thanos on our infrastructure, where it ingests telemetry data, stores it on S3, and reads it back from S3 into Thanos and Grafana, so we expected the same behaviour between the Jaeger-S3 plugin and the Jaeger UI.

We can live with it because the cost savings are substantial. I added tag search to Jaeger-S3 just a few minutes ago, so that's done.

Another thing I have to fix is that the time is somehow skewed. I will fix this in a couple of days; it is the only pending work before we run this in production.

Jaeger-S3 should also be able to store trace data in GCS (Google Cloud Storage) and Azure Blob Storage, and in theory in anything that Cortex supports, including Cassandra (Jaeger already has this built in, but we use Cortex, and Cortex supports Cassandra), Amazon DynamoDB, and Google Bigtable.

joe-elliott · 2021-04-21T18:38:06Z

Very cool! If you are looking for a trace ID, do you pull each DB independently and search them? Are you coordinating parallelism somehow, or do you do it one at a time?

muhammadn · 2021-04-22T04:07:49Z

Hey @joe-elliott! Thanks! I watched your FOSDEM Tempo talk on YouTube, and it's great to hear that you're making progress!

I have not gone through the internals of Cortex to see how it manages data on object storage, but from what I understand, the indexes are stored locally and the data (in chunks) is stored in object storage.

Despite that, I am seeing indexes being stored on S3, even though the boltdb-shipper documentation says indexes are kept locally and data goes to object storage. The data is stored in very small chunks, ~2 KiB per chunk (one file each), so my understanding is that Cortex does not pull the whole DB but only part of it from the chunks (I haven't checked, but probably by timestamp, which is indexed).

The data is usually pulled when I search for it, but I realised that finding by trace ID is pretty snappy because the data has already been fetched earlier. In the current working state, I have set the data to be fetched to be up to 24 hours old, but I can reduce that (in the FindTraces method) to make the fetch a lot faster.

Anyway, I've fixed the clock skew issue. I think I will release the plugin binaries in my repo in a couple of days, once I have minimised the bugs. You can now use Helm/k8s/the Jaeger Operator (I've removed the plugin's dependency on my Jaeger fork).
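As a rough illustration of why a shorter lookback window makes the fetch faster, the sketch below counts how many time-bucketed objects a query would have to touch. Everything here is an assumption made for illustration: the hourly bucketing, the `chunks/...` key format, and the `chunk_keys` helper are hypothetical and are not how Cortex or the plugin actually lays out its ~2 KiB chunks.

```python
from datetime import datetime, timedelta, timezone

CHUNK_SPAN = timedelta(hours=1)  # assumed bucket size, purely illustrative


def chunk_keys(end, lookback):
    """Return the hourly chunk keys that overlap the window [end - lookback, end]."""
    start = end - lookback
    # Align the cursor to the start of the hour that contains `start`.
    cursor = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while cursor <= end:
        keys.append(cursor.strftime("chunks/%Y-%m-%dT%H"))
        cursor += CHUNK_SPAN
    return keys


now = datetime(2021, 4, 22, 4, 30, tzinfo=timezone.utc)
day_window = chunk_keys(now, timedelta(hours=24))
hour_window = chunk_keys(now, timedelta(hours=1))
```

Under these assumptions the 24-hour window touches 25 buckets and the 1-hour window touches 2, so the number of object-store reads shrinks roughly in proportion to the window.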

muhammadn · 2021-04-26T08:10:34Z

@csp197 @joe-elliott v1.0.0 of jaeger-s3 is released! 🎉

https://github.com/muhammadn/jaeger-s3/releases/tag/v1.0.0

johanneswuerbach · 2022-02-18T22:58:16Z

I've created an S3 plugin that uses S3 and Athena https://github.com/johanneswuerbach/jaeger-s3 and we are currently using it successfully in a setup with ~5000 spans/s and 14 days retention.

The plugin supports all querying capabilities of the Jaeger UI and can also generate the dependency graph. Feedback welcome :-)

sherifkayad · 2022-06-14T12:18:30Z

Are there any updates on this issue? Is native support for S3 / object storage planned for Jaeger?

muhammadn · 2022-07-22T20:39:43Z

@sherifkayad I've released version 2. It is a breaking change from version 1: the data is not backwards compatible.

It involved a lot of rewriting and rethinking of the jaeger-s3 architecture. It now uses a lot less memory and can scale better.

I have renamed it to jaeger-objectstorage, as it now supports not just S3 but also Azure Blob Storage and Google Cloud Storage. Storing indexes in DynamoDB is also supported.

https://github.com/flitnetics/jaeger-objectstorage

github-actions bot added the needs-triage label Nov 18, 2020

yurishkuro mentioned this issue Nov 18, 2020 in 💾 Additional storage backends #638 (Open, 20 tasks)

yurishkuro added wontfix and removed needs-triage labels Jan 10, 2021

stale bot removed the wontfix label Jan 10, 2021

yurishkuro closed this as completed Jan 10, 2021

yurishkuro reopened this Apr 20, 2021

yurishkuro added the area/storage label Apr 20, 2021

yurishkuro closed this as completed Nov 4, 2022
