DB Grouping variables from statement #1053
Comments
@trask I created the issue as we discussed on the last SIG, but I don't have permission to add this to the DB Client Semantic Convention project
I added it to the project now and removed the triage label :) @maryliag will you be working on this? Should I assign it to you?
thank you @joaopgrassi ! And yes, you can assign it to me
Let's add something after #1100 to mention that IN lists MAY be collapsed in some way
Discussed previously in DB semconv meeting: I sent #1243 to address #1053 (comment). After that is merged we can postpone the remaining portions of this issue until after stability. |
Is your change request related to a problem? Please describe.
As part of sanitization, one improvement is to also group the replaced values. Splitting this issue from #717 to focus on the grouping itself.
Describe the solution you'd like
For example, when there is an IN clause, its list of values would be replaced by one of the following placeholders:
__more1_10__
__more10_100__
__more100_200__
__more200_300__
__more300_400__
__more400_500__
__more500_600__
__more600_700__
__more700_800__
__more800_900__
__more900_1000__
__more1000_plus__
That creates a nice balance: it separates groups that would likely use different execution plans, while keeping the cardinality of the possible final strings low, since the lists can be quite big (I have seen cases with 20k+ values in a list).
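A minimal sketch of how such bucketing could look. The function names, the regex, and the bucket boundaries below are illustrative assumptions (mirroring the placeholder list above), not part of any semantic convention or existing instrumentation:

```python
import re

# Hypothetical bucket boundaries mirroring the placeholder list above;
# illustrative only, not defined by any semantic convention.
BUCKETS = [1, 10, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]


def bucket_placeholder(count: int) -> str:
    """Map the number of collapsed IN-list values to a bucketed placeholder."""
    for low, high in zip(BUCKETS, BUCKETS[1:]):
        if count <= high:
            return f"__more{low}_{high}__"
    return "__more1000_plus__"


def collapse_in_lists(statement: str) -> str:
    """Replace the contents of each IN (...) list with a bucketed placeholder."""
    def repl(match: re.Match) -> str:
        values = match.group(1).split(",")
        return f"IN ({bucket_placeholder(len(values))})"

    return re.sub(r"IN\s*\(([^)]*)\)", repl, statement, flags=re.IGNORECASE)


# A statement with 23 parameters collapses into the 10-100 bucket.
stmt = "SELECT * FROM users WHERE id IN (" + ", ".join(["?"] * 23) + ")"
print(collapse_in_lists(stmt))
# SELECT * FROM users WHERE id IN (__more10_100__)
```

However the buckets end up being defined, the key property is that two statements differing only in the length of their IN lists map to a small, fixed set of final strings.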
Describe alternatives you've considered
Another solution would be to always replace with the exact number of values being grouped, such as
__more23__
, but that would increase cardinality, and this level of detail is not that helpful. A solution using buckets makes more sense.
Additional context
No response