Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Storage API v2 that operates on OTLP batches #5079

Open
3 of 4 tasks
Tracked by #4843
yurishkuro opened this issue Jan 4, 2024 · 8 comments
Open
3 of 4 tasks
Tracked by #4843

Implement Storage API v2 that operates on OTLP batches #5079

yurishkuro opened this issue Jan 4, 2024 · 8 comments

Comments

@yurishkuro
Copy link
Member

yurishkuro commented Jan 4, 2024

Summary

In OTEL-based Jaeger-v2 we don't want to force OTLP trace data to go through unnecessary transformation into Jaeger's current data model. We want to define a V2 version of the storage API so that the OTLP data can be passed through the receiver-exporter pipeline without additional conversions, for better efficiency.

Problem

More discussion in the doc https://docs.google.com/document/d/1s4_6VgAS7qAVp6iEm5KYvpiGw3h2Ja5T5HpKo29iv00/edit#heading=h.hlhu8pegwdcj

Progress

yurishkuro added a commit that referenced this issue Jan 15, 2024
## Which problem is this PR solving?
- Part of #5079
- Avoid triple marshaling due to our use of internally generated OTLP
proto types, which prevents us from directly using the output of
jaeger->otlp conversion from collector contrib.

## Description of the changes
- 🛑 breaking: the JSON format is changed to match
[OTLP-JSON][otlp-json], specifically
- the trace & span IDs are returned as hex-encoded strings, not base64
as required by Proto-JSON
  - enums are returned as integers, not strings
- Use Proposal 1 from
open-telemetry/opentelemetry-collector#9233 (comment)
- API-V3 proto is already declared to use official OTLP types; remove
the overrides to our internally generated OTLP proto types
- Replace `SpansResponseChunk` with official `otlp.TracesData`, but
override it internally to use a custom type
- Depends on jaegertracing/jaeger-idl#103

## How was this change tested?
- Unit tests

[otlp-json]:
https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md#json-protobuf-encoding

---------

Signed-off-by: Yuri Shkuro <[email protected]>
@tarhphilomina
Copy link
Contributor

Can I take this issue ?

@yurishkuro
Copy link
Member Author

you can try - this is a design issue

@yurishkuro yurishkuro moved this to Todo in Jaeger V2 Feb 3, 2024
@pavolloffay pavolloffay added the v2 label Mar 13, 2024
@yurishkuro yurishkuro moved this from Todo to In Progress in Jaeger V2 May 5, 2024
@yurishkuro yurishkuro moved this from In Progress to Done in Jaeger V2 Jul 16, 2024
@yurishkuro yurishkuro removed this from Jaeger V2 Oct 17, 2024
@mahadzaryab1
Copy link
Collaborator

mahadzaryab1 commented Nov 4, 2024

@yurishkuro For the read path implementation, do we just want to expand the v2 adapter to implement spanstore.Reader (in storage_v2) that just wraps the v1 implementation?

@yurishkuro
Copy link
Member Author

Not quite, please see the Google doc.

@yurishkuro yurishkuro changed the title [Feature]: Define v2 storage API that operates on OTLP batches [Feature]: Implement Storage API v2 that operates on OTLP batches Nov 14, 2024
@yurishkuro yurishkuro moved this to In Progress in Roadmap Nov 14, 2024
@yurishkuro yurishkuro moved this to In Progress in Jaeger V2 Nov 14, 2024
@yurishkuro yurishkuro changed the title [Feature]: Implement Storage API v2 that operates on OTLP batches Implement Storage API v2 that operates on OTLP batches Nov 15, 2024
yurishkuro pushed a commit that referenced this issue Nov 19, 2024
…reader (#6221)

## Which problem is this PR solving?
- Towards #5079 

## Description of the changes
- This PR implements the v2 `spanstore.Reader` interface in the factory
adapter through the `TraceReader`, which is currently just a wrapper
around the v1 `spanstore.Reader`. The `TraceReader` provides a function
`GetV1Reader` that exposes the underlying v1 reader which will be used
in #6170.

## How was this change tested?
- Added unit tests for new functions

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
yurishkuro added a commit that referenced this issue Dec 6, 2024
## Which problem is this PR solving?
- Towards #5079

## Description of the changes
- Implemented the read path for the v2 storage interface. This path
currently just wraps a v1 span reader and exposes a static method to
access the v1 reader.
- Change the jaeger query extension to initialize a v2 storage factory
and obtain the v1 span reader from it.
- This will unblock the development of more efficient v2 storage
implementations, like ClickHouse.

## How was this change tested?
- Added unit tests for new functionality

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
@mahadzaryab1
Copy link
Collaborator

@yurishkuro Here are the remaining usages of GetStorageFactory that cannot be removed right now because they do runtime casts to other factories which the factoryadapter does not implement currently:

Do you have any thoughts on how we should proceed with the above?

@yurishkuro
Copy link
Member Author

Good question. We might be missing an abstraction layer. Let's think about what it means to define a cassandra backend. On one hand, there are some connection parameters that would be applicable regardless of what the backend is used for. On the other hand, there are some business-specific parameters, e.g. the Schema structure (heavily extended in #5922) that is mostly specific to trace storage, i.e. a dependency or sampling storage might have a different keyspace and schema, we could even implement metric storage on top of Cassandra. So what exactly does the backend section define - pure connectivity options, or also parameters specific to the type of factory we need? Right now it's both, except for the metric store. This is the root of the inconsistency.

Basically, we have two options:

  1. a backend is a superset of parameters that are required for any role that the storage might take. This is currently the case except for metric store.
  2. a backend is just the connectivity information, and we need a higher level abstraction that includes parameters needed for the role / factory type implementation, such as trace_store, dep_store, sampling_store (with distributed lock), metric_store, etc. Each of those stores would have to either refer to a "backend" that only defines connectivity, or just embed that connectivity config (which is what metric store is doing).

I think to avoid breaking 2.x config compatibility we should stick with option 1 and maybe in the future we can pull the roles apart similar to how metric_store has done already. That means that a backend has to implement all distinct storage factory interfaces, similar to have it was done in v1 API - a (trace store) Factory implementation could optionally implement other factory interfaces.

@mahadzaryab1
Copy link
Collaborator

@yurishkuro Got it - so does this mean that our factoryadapter needs to implement all the required interfaces?

@yurishkuro
Copy link
Member Author

yes, I think so

yurishkuro pushed a commit that referenced this issue Dec 7, 2024
## Which problem is this PR solving?
- Towards #5079 

## Description of the changes
- This PR creates a new factory in the `depstore` package for the
DependencyReader and exposes a `CreateDependencyReader` function in the
`factoryadapter`
- This PR also changes some interface return types to structs because
some structs implement multiple interfaces.
- The new factory was integrated into the query service and the
callsites were changed to use the v2 factory instead of the v1 factory
to initialize the dependency reader.

## How was this change tested?
- CI / Unit Tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
mahadzaryab1 added a commit that referenced this issue Dec 8, 2024
…ry adapter (#6325)

## Which problem is this PR solving?
- Towards #5079 

## Description of the changes
- This PR implements `GetServices` and `GetOperations` in the v2
`factoryadapter`

## How was this change tested?
- Added unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
yurishkuro pushed a commit that referenced this issue Dec 8, 2024
## Which problem is this PR solving?
- Towards #5079

## Description of the changes
- This PR implements `FindTraceIDs` in the v2 factory adapter

## How was this change tested?
- Added unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
mahadzaryab1 added a commit that referenced this issue Dec 8, 2024
## Which problem is this PR solving?
- Towards #5079

## Description of the changes
- This PR implements `FindTraces` in the v2 factory adapter

## How was this change tested?
- Added unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
mahadzaryab1 added a commit that referenced this issue Dec 9, 2024
## Which problem is this PR solving?
- Towards #5079

## Description of the changes
- This PR implements `GetTrace` in the v2 factory adapter

## How was this change tested?
- Added unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Status: In Progress
Development

No branches or pull requests

4 participants