-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v2] accept unix epoch timestamps in microseconds #1413
Conversation
fields: | ||
- name: us | ||
type: long | ||
count: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to figure out what this "count: 1" actually does.
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start * adds timestamp.us (long) to span, transaction and error documents (both v1 and v2)
745689a
to
126e9b3
Compare
I've made the changes necessary to keep It adds significant complexity, in particular in the tests. Given the added complexity, i think it's worth re-considering if we shouldn't "just" add The UI will probably not be able to take advantage of this added field for v1 data, because the UI needs to take into account old data that doesn't have the field anyway - so it's easier for it to just assume there no |
If this is not worth it, I'm fine with leaving |
"properties": { | ||
|
||
"timestamp": { | ||
"description": "Recorded time of the span, UTC based and formatted as microseconds since Unix epoch", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of adding something like timestamp_datetime.json
timestamp_epoch.json
and $ref
to those to make it clear where the differences are?
model/error/event.go
Outdated
e.TransactionId = decoder.StringPtr(raw, "id", "transaction") | ||
return e, decoder.Err | ||
if decoder.Err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It started with the fact that we have this:
https://github.com/elastic/apm-server/pull/1413/files#diff-fcd6e0b6a82640c441aa919ee2076b33L199
which means that if there's an error, V1DecodeEvent
must also return nil, err
. As it turns out, previously, we never tested the path in which the decoder would set decoder.Err
here. Modifying the snippet below (from v2
branch) would still pass all tests:
func V1DecodeEvent(input interface{}, err error) (transform.Transformable, error) {
e, raw, err := decodeEvent(input, err)
if err != nil {
return nil, err
}
decoder := utility.ManualDecoder{}
e.TransactionId = decoder.StringPtr(raw, "id", "transaction")
+ if decoder.Err != nil {
+ panic(decoder.Err)
+ }
return e, decoder.Err
}
I'll make sure it gets tested and also simplify the tests to not care about the first return argument if there's an error returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be clear: my first approach was to just align it across the model and forcing the nil return on decoder errors bring it in line with the others. I'll simplify across the board now instead.
model/error/event.go
Outdated
func (e *v2Event) Transform(tctx *transform.Context) []beat.Event { | ||
events := e.Event.Transform(tctx) | ||
for _, e := range events { | ||
utility.Add(e.Fields, "timestamp", utility.TimeAsMicros(e.Timestamp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't sitting well with me, though I see how we got here. Have you considered adding a member to Event
to differentiate between v1 and v2 and only add this timestamp object if v2, inside of Event.Transform()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered creating two different event attributes for the two timestamps? This would allow setting the proper timestamp in the v1/v2 decoding methods. The transformation method could then only transform what is set, without any additional knowledge.
The reason I am suggesting this is that we could keep the whole v1/v2 separation in the decode method, as we do so far, which is also the reason that so far no separate v2 transform tests have been necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see your point. If i understand correctly, you're suggesting to do something like:
type Event struct {
Name string
Type string
Start *float64
Duration float64
Context common.MapStr
Stacktrace m.Stacktrace
- Timestamp time.Time
TransactionId string
// new in v2
HexId string
ParentId string
TraceId string
+ TimestampV2 time.Time
// deprecated in v2
Id *int64
Parent *int64
+ TimestampV1 time.Time
}
func V1DecodeEvent(input interface{}, err error) (transform.Transformable, error) {
...
e.TimestampV1 = decoder.TimeRFC3339(raw, "timestamp")
...
}
func V2DecodeEvent(input interface{}, err error) (transform.Transformable, error) {
...
e.TimestampV2 = decoder.TimeEpochMicro(raw, "timestamp")
...
}
For spans, it seems to me that the transform method needs to know if it is dealing with a v1 span or a v2 span. Part of the new functionality is to set the timestamp to be reqTime + start if no timestamp is given. For v1 we're just setting it to reqTime. Looking at the span
transform method, if none of the timestamps are set (timestamp is optional in both v1 and v2) how will we know if we should apply the v2 logic or not?
Besides the problem above, what i like about this current solution is that the decode methods "decode" to a model that is shared by v1 and v2 in the sense there's one field to denote the timestamp, regardless of which format it came in.
Having two timestamp fields for the sake of deciding the final output seems less clean to me, compared to decoding the different input representations with the same semantics into the same field as it is currently. For example, if there was some functionality that required us to use the timestamp in some logic, if we have two timestamps we would need to check them both even though the semantically mean the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the explanation, agree that it is cleaner this way for spans.
@@ -130,6 +131,19 @@ func TestDecode(t *testing.T) { | |||
} | |||
} | |||
} | |||
func TestDecodeV1(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this parameterizes testDecode
while in "span" the decode tests are simply copied. I'm fine either way, but we should consider aligning them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The span and transaction events have quite a few differences, whereas the metricset only differs in the time. Therefore, for spans I specifically separated v1 and v2 tests, except for the failures, that were expected to be the same for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're OK with the current changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes fine with me
following feedback from @graphaelli things are a bit simpler and I'm OK to move forward with the current functionality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Note that model/transaction/event_test.go
'sTestEventTransform
doesn't cover v2 events specifically - I don't think that needs to hold this up as integration tests do cover it for now, we should at least revisit using coverage tools later on.
docs/spec/timestamp_epoch.json
Outdated
"type": ["object"], | ||
"properties": { | ||
"timestamp": { | ||
"description": "Recorded time of the span, UTC based and formatted as microseconds since Unix epoch", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should say event
rather than span
.
@@ -130,6 +131,19 @@ func TestDecode(t *testing.T) { | |||
} | |||
} | |||
} | |||
func TestDecodeV1(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The span and transaction events have quite a few differences, whereas the metricset only differs in the time. Therefore, for spans I specifically separated v1 and v2 tests, except for the failures, that were expected to be the same for both.
thanks @graphaelli and @simitt ! |
@graphaelli i added your comment as a todo here: #1237 |
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
Note for agent devs: the field which holds the epoch micros is still called |
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
* makes "start" optional if "timestamp" is given for v2 * if "timestamp" is not given, it is set to the sum of req-time + start if data arrives through v2 * adds `timestamp.us` (long) to span, transaction and error documents for v2
Fixes #1340