Update json spec for v2 Intake related to distributed tracing #1276

simitt · 2018-08-13T08:51:07Z

At least forbid transaction.spans to be sent in Intake V2.

Check if other changes are necessary, e.g. regarding span.id or span.transaction_id.

see more details on PR ##1303

The text was updated successfully, but these errors were encountered:

simitt · 2018-08-17T09:26:48Z

@elastic/apm-agent-devs and @roncohen as Intake v2 is a breaking change anyways, we could also take the opportunity to apply some breaking changes on the schema specs:

the span.id is currently defined as an integer. Are we deferring to change the format to a string until we implement distributed tracing or should we do this from the beginning of v2? In case we change it, we also need to change the ES schema, similar to what we did in the distributed tracing POC.
will the span.id continue to be optional? The transaction.id is required. In case we generally want to use event.id for any cross referencing with logs etc we should treat them the same. If there are valid use cases to not have an ID on spans keeping it optional is fine with me.
span.transaction_id is optional. For a transaction id is required. Having span.transaction_id optional does only make sense if a span can exist outside a transaction, otherwise I suggest to make it also required.

felixbarny · 2018-08-17T11:32:56Z

Are we deferring to change the format to a string until we implement distributed tracing or should we do this from the beginning of v2?

IIRC, there will be no non-DT version of v2, so changing that to a string sounds fine. Regarding the ES schema, it would still have to be backwards compatible I guess?

What is event.id? Does it come from ECS?

What is span.transaction_id used for again? I think it was for being backwards compatible with the old UI. If we don't need this anymore in version 7 of the stack, we should leave it optional. But maybe I'm missing some of the use-cases for span.transaction_id.

simitt · 2018-08-17T11:50:22Z

Thanks @felixbarny !

I also think we will not release a non-dt v2. I am not sure whether or not to change the format from long to string now or when we implement dt. The agents might want to test against intake v2 already. Also I am not sure if we agreed on how the format for span.id will look exactly. I tend to leave it unchanged for now and change when we introduce dt.
event.id was my attempt to talk about transaction.id, span.id, metric.id, etc in one term as all of them are events.
span.transaction_id is not new and still necessary afaik. It is used to query a transaction's spans. Spans are stored as separate documents in ES and we somehow need to relate them to a transaction.

felixbarny · 2018-08-17T12:12:22Z

The agents might want to test against intake v2 already.

If I'm not mistaken, supporting the non-DT schema with v2 would just complicate things for the agents.

Also I am not sure if we agreed on how the format for span.id will look exactly.

It's a hex-encoded string representing 64 bits of randomness. I would change the span.id to String for the intake API v2 schema now rather than later.

event.id was my attempt to talk about transaction.id, span.id, metric.id, etc in one term as all of them are events.

I see. The id should not be optional for transactions and spans. I don't think we need IDs for metrics.

simitt · 2018-08-17T12:53:05Z

In this case I will

require span.id to be a string then. This requires ES schema changes. I'll store the original value as hex_id and transform it into a long representation for the id field, which will be marked as deprecated. I'm going to apply the changes I did for the distributed tracing POC.
Make span.id a required field from v2 on.
Make span.transaction_id required from v2 on.

roncohen · 2018-08-20T06:39:22Z

this LGTM.

This requires ES schema changes. I'll store the original value as hex_id and transform it into a long representation for the id field.

During a previous discussion it occurrent to me that we very early on put the apm-server version as part of the index name by default (e.g. apm-%{[beat.version]}-transaction-%{+yyyy.MM.dd}), in order to give us more flexibility for changing the schema.

If you have two indices where a specific field has an int type in one index and a keyword type in another, the individual indices will at query time automatically apply the query in a way that makes them work for that index, whether the field is an int or a keyword. This mean you can transparently query across indexes where the same field has different types in different indices.

I'm aware that it's just a default setting, and that people could have altered the settings to remove the version from the index name, but my immediate thought is that it's rather unlikely that anyone has done that at the moment.

The reason i bring it up is that it could mean we can avoid adding the hex_id field, and just change the type. WDYT?

simitt · 2018-08-20T09:02:37Z

@roncohen I have to check how this behaves with the Kibana Index Pattern.

@felixbarny , @roncohen I also suggest to change the transaction.id to be already in distributed tracing format: hex encoded 64 random bits instead of the UUID we have atm. WDYT?

felixbarny · 2018-08-20T09:37:45Z

that makes sense

simitt · 2018-08-21T08:28:25Z

@roncohen regarding your suggestion to simply change the type of span.id from long to keyword. This results in a mapping conflict in Kibana, see screenshot:

You can still generally query on this field, but as pointed out in the Kibana note:

it will be unavailable for functions that require Kibana to know their type.

I don't think that this is a good user experience and suggest to introduce the hex_id field for spans.

roncohen · 2018-08-21T10:10:57Z

thanks for investigating @simitt! Our of curiosity, what are the other conflicts?

watson · 2018-08-21T10:40:46Z

Regarding the ID type check: Is there a performance/storage reason why we want to check that it's a 16 byte hex string? If not, I think it would be safer to just check that it's a string so in case we want to change it in the future, it's not going to be a breaking change.

simitt · 2018-08-21T11:18:22Z

@roncohen the other two conflicts were unrelated to this change.

@watson we have checked the format on the Intake API for IDs so far. That is why I suggested to also check for the new ID format. In case we prefer more flexibility over being strict and @elastic/apm-ui has no requirements to know how an ID looks like but just knowing the type, I am fine with removing the pattern in the json spec.

watson · 2018-08-21T11:41:17Z

@simitt 👍I'm fine either way, I just want to make sure we consider it before we put rules in place that we technically don't need if they have a risk of painting us into a corner

roncohen · 2018-08-21T12:14:41Z

OK @simitt. Let's go with the hex_id.

I'm fine with relaxing the ID format as well.

sorenlouv · 2018-08-21T12:15:38Z

Changing type should be fine with the ui as id's are not transformed but used verbatim.

simitt · 2018-08-21T12:51:57Z

ok, I'll get rid of the validation then and only enforce the regular length limit we have for keywords.

felixbarny · 2018-08-21T15:51:12Z

What about storing the new (hex) id in id as opposed to span.id? That would also comply with ECS.

I personally find it a bit weird that we have (span|transaction).(id|trace_id) as opposed to just id|trace_id. When listing spans and transactions in discover that also looks a bit strange, as one namespace is always empty.

simitt · 2018-08-21T16:27:45Z

I really like the suggestion to use id!

simitt · 2018-09-18T11:20:19Z

update note: sticked with hex_id as the ECS schema suggestion was changed from id to event.id.

roncohen mentioned this issue Aug 13, 2018

Intake v2 support #1237

Closed

30 tasks

simitt mentioned this issue Aug 13, 2018

Intake protocol v2 #1260

Merged

11 tasks

alvarolobato added the [zube]: Ready label Aug 13, 2018

simitt added [zube]: In Progress and removed [zube]: Ready labels Aug 17, 2018

simitt self-assigned this Aug 17, 2018

simitt changed the title ~~Update json spec for v2 Intake~~ Update json spec for v2 Intake related to distributed tracing Aug 20, 2018

roncohen mentioned this issue Aug 20, 2018

Intake v2: Finalize processor/package_tests style tests #1288

Closed

simitt mentioned this issue Aug 21, 2018

Update Intake API spec for v2 #1303

Merged

simitt added [zube]: In Review and removed [zube]: In Progress labels Aug 23, 2018

simitt closed this as completed Aug 28, 2018

simitt added [zube]: Done and removed [zube]: In Review labels Aug 28, 2018

alvarolobato removed the [zube]: Done label Sep 24, 2018

alvarolobato added this to the 6.5 milestone Oct 26, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update json spec for v2 Intake related to distributed tracing #1276

Update json spec for v2 Intake related to distributed tracing #1276

simitt commented Aug 13, 2018 •

edited

Loading

simitt commented Aug 17, 2018

felixbarny commented Aug 17, 2018

simitt commented Aug 17, 2018

felixbarny commented Aug 17, 2018

simitt commented Aug 17, 2018

roncohen commented Aug 20, 2018 •

edited

Loading

simitt commented Aug 20, 2018

felixbarny commented Aug 20, 2018

simitt commented Aug 21, 2018

roncohen commented Aug 21, 2018

watson commented Aug 21, 2018

simitt commented Aug 21, 2018

watson commented Aug 21, 2018

roncohen commented Aug 21, 2018

sorenlouv commented Aug 21, 2018 •

edited

Loading

simitt commented Aug 21, 2018

felixbarny commented Aug 21, 2018

simitt commented Aug 21, 2018

simitt commented Sep 18, 2018

Update json spec for v2 Intake related to distributed tracing #1276

Update json spec for v2 Intake related to distributed tracing #1276

Comments

simitt commented Aug 13, 2018 • edited Loading

simitt commented Aug 17, 2018

felixbarny commented Aug 17, 2018

simitt commented Aug 17, 2018

felixbarny commented Aug 17, 2018

simitt commented Aug 17, 2018

roncohen commented Aug 20, 2018 • edited Loading

simitt commented Aug 20, 2018

felixbarny commented Aug 20, 2018

simitt commented Aug 21, 2018

roncohen commented Aug 21, 2018

watson commented Aug 21, 2018

simitt commented Aug 21, 2018

watson commented Aug 21, 2018

roncohen commented Aug 21, 2018

sorenlouv commented Aug 21, 2018 • edited Loading

simitt commented Aug 21, 2018

felixbarny commented Aug 21, 2018

simitt commented Aug 21, 2018

simitt commented Sep 18, 2018

simitt commented Aug 13, 2018 •

edited

Loading

roncohen commented Aug 20, 2018 •

edited

Loading

sorenlouv commented Aug 21, 2018 •

edited

Loading