Duplicate page.url as url.*, to make it searchable #3827
Comments
What about using request.url, which has the full URL structure properly defined in the spec? apm-server/docs/spec/request.json Lines 51 to 96 in 5d93733
Not sure if it's searchable as well. /cc @elastic/apm-server Please let us know your thoughts. |
Hi, I was looking for something similar and found this ticket. I'm quite new to Elastic APM and RUM and am currently investigating the document structure. I also read the ECS recommendations and was wondering whether the URL should be stored under url. Wouldn't url.registered_domain be more appropriate? (https://www.elastic.co/guide/en/ecs/current/ecs-url.html) Or am I misunderstanding something? Thanks! |
I tend to agree, @moix. If we record the page URL as @vigneshshanmugam In theory you could send We can alternatively update the server to store |
This solution sounds good to me. I am also in favour of doing the URL duplication in the server, since duplicating it on the RUM side would increase the payload size. |
Thanks @axw, duplicating the Also, if we plan to remove |
Cool. I'm going to transfer this to apm-server then, and we'll take it from there. |
@axw, two questions:
|
I think we can probably manage that.
It's theoretically possible, but unlikely that we would do that - at least not for 7.9, and likely not in the foreseeable future. We don't have any infrastructure in place for things like this. We could suggest an _update_by_query script to run manually though. |
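For anyone who wants to try the manual backfill mentioned above, here is a rough sketch of what an _update_by_query request body might look like, built in Python. It is only an illustration under assumptions: the `transaction.type: page-load` filter, the `transaction.page.url` source path, and the choice of `url.full` as the target field are guesses at the document layout, not something confirmed in this thread. Check them against your own indices before running anything.

```python
import json

# Painless script: copy the page URL into url.full when it is present,
# otherwise skip the document. The field paths are assumptions.
PAINLESS_SOURCE = """
def t = ctx._source.transaction;
if (t != null && t.page != null && t.page.url != null) {
  if (ctx._source.url == null) { ctx._source.url = new HashMap(); }
  ctx._source.url.full = t.page.url;
} else {
  ctx.op = 'noop';
}
"""

def build_backfill_body() -> dict:
    """Return a request body for POST <index>/_update_by_query (hypothetical usage)."""
    return {
        # page.url is not indexed, so select candidates by transaction type instead.
        "query": {"term": {"transaction.type": "page-load"}},
        "script": {"lang": "painless", "source": PAINLESS_SOURCE},
    }

body = build_backfill_body()
print(json.dumps(body, indent=2))
```

The body only touches documents that actually carry a page URL; everything else is marked as a no-op so it is not rewritten.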
Thanks @axw , I think it's fine to duplicate url only on new data for now. |
What is the usecase for collecting Since the |
Good idea @simitt. So far the use-cases have been about searching for a particular URL, but I think that if we start splitting the URL into its fields (as well as keeping the full URL), it can help address many more use-cases (e.g. ad-hoc search queries based on scheme). |
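To make the splitting idea concrete, here is a minimal Python sketch of breaking a page URL into ECS-style url.* fields while keeping the full URL. This is not the apm-server implementation (which is written in Go); the function name `ecs_url_fields` and the example URL are made up for illustration, and the query string is stored as received rather than normalised.

```python
from urllib.parse import urlsplit

def ecs_url_fields(page_url: str) -> dict:
    """Split a page URL into a flat dict keyed by ECS url.* field names."""
    parts = urlsplit(page_url)
    fields = {
        "url.original": page_url,
        "url.full": page_url,
        "url.scheme": parts.scheme,
        "url.domain": parts.hostname,
        "url.port": parts.port,
        "url.path": parts.path,
        "url.query": parts.query,  # kept as received, no normalisation
        "url.fragment": parts.fragment,
    }
    # Drop components that are absent from the URL.
    return {k: v for k, v in fields.items() if v not in (None, "")}

fields = ecs_url_fields("https://example.com:8443/cart/items?id=42&ref=home#top")
print(fields["url.scheme"], fields["url.domain"], fields["url.path"])
```

With fields like these indexed, a search on url.scheme or url.domain becomes a simple term query instead of a wildcard match on the full URL.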
I've updated the description to apply the same solution for both Transactions and Errors |
I like @simitt's suggestion to break it down as much as possible. Question: would you propose normalising the query strings or storing them as received? |
I was not thinking to normalize them; we could reuse the logic we have for jaeger in place https://github.com/elastic/apm-server/blob/master/processor/otel/consumer.go#L508. |
While on it, we could also map |
where should the duplicated field be stored? |
@jalvz the (previously unwritten) assumption here is that you only ever pass In either case they would be recorded in the |
@jahtalab any reasons to not go with #3827 (comment)? (This was expected at least by 1 customer) |
I don't see any reason why not. Other than the fact that ECS field is |
For RUM page load transactions and errors, index the page URL into ECS url fields so it is full-text searchable. For backwards compatibility (up until 8.0), we would need to continue indexing into page.url as well.
Original description
The current Transaction.page.url is not indexed and therefore not searchable (it's only used in the UI). Since searching over the page url has been requested before we should consider indexing this field.
We should also clean up the URL for indexing, e.g. by removing query strings. Another possibility is to add a fullUrl field that contains the unchanged URL, and to set the cleaned-up URL on page.url.
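As a rough illustration of the clean-up described here, the Python sketch below strips the query string and keeps the untouched value alongside it. The helper name `clean_page_url` and the example URL are hypothetical, and keeping the fragment is an assumption, since the description only mentions query strings.

```python
from urllib.parse import urlsplit, urlunsplit

def clean_page_url(full_url: str) -> str:
    """Return the URL with its query string removed (fragment is kept)."""
    parts = urlsplit(full_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))

full_url = "https://example.com/search?q=apm&page=2"
page = {
    "url": clean_page_url(full_url),  # cleaned-up value for indexing
    "fullUrl": full_url,              # unchanged value
}
print(page)
```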