-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(audoedit): implement basic analytics logger #6430
Conversation
59b4792
to
f796e21
Compare
f796e21
to
951a386
Compare
started: ['contextLoaded', 'noResponse'], | ||
contextLoaded: ['loaded', 'noResponse'], | ||
loaded: ['suggested'], | ||
suggested: ['read', 'accepted', 'rejected'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we transition from suggestion
to accepted
or rejected
without read
?
If we think there is an intermediate stage where although we suggest a prediction but maybe user typed something so that it was not read
, do you think we should introduce another state something like noRead
so that suggested
can have transition to either read
or noRead
and noRead
would have empty transition ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We consider suggestions as read after a hard-coded timeout if they were not rejected during this time. Upon reviewing the code, I see that we don't need this data to send the existing event payloads. We use displayDuration
, which does not require readAt
to be computed.
I'll remove this logic altogether. We can add it later if we think we need it.
lineCount: number | ||
|
||
/** Total characters in the suggestion. */ | ||
charCount: number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically use lineCount
and charCount
to calculate stats over newly added code in the editor. But for the auto-edits
the values typically include existing code as well, so we may need to adjust the metrics accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, let's adjust in a follow-up PR.
/** The source of the suggestion, e.g. 'network', 'cache', etc. */ | ||
source?: string | ||
|
||
/** True if we fuzzy-matched this suggestion from a local or remote cache. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another field which would be helpful here is:
is code_to_rewrite
same as prediction
. Since auto-edits
trigger suggestion on cursor movement, a high value suggestion could indicate irrelevant suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this in a follow-up PR. We might want an enum field indicating the reason for a suggestion to be hidden.
Omit<CodeGenEventMetadata, 'charsInserted' | 'charsDeleted'> {} | ||
|
||
interface AutoeditRejectedEventPayload extends AutoEditFinalMetadata {} | ||
interface AutoeditNoResponseEventPayload extends AutoeditContextLoadedMetadata {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what cases does no response happen ? Is this logged when the response is empty string from the model ? If yes, I would suspect this to not trigger since unlike autocomplete, the model atleast rewrite the codeToRewrite
and empty response here means that we delete all the code in the codeToRewrite section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In autocomplete, this happens when we hide a suggestion for whatever reason, such as when the suggestion duplicates the document suffix. We can extend this even with extra fields in follow-up PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha !!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this event name to discarded
and added some comments to clarify it. We can adjust it follow-ups based on use cases.
interface ReadState extends Omit<SuggestedState, 'phase'> { | ||
phase: 'read' | ||
/** Timestamp when the suggestion can be considered as read by a user. */ | ||
readAt: number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all the states do you think we should just have
- phase
- payload
All the payload at any state anyways extend the previous state. So we can move the specific fields likestartedAt
,loadedAt
etc in the metadata itself. So, that we don't have to extend both the metadata and the states.
So, after this the metadata would be incorporate additional ields like loadedAt
, readAt
etc and each state would have no need to extend previous state and Omit some fields from the previous state.
Currently, introducing any new start in between requires this duplication of extending metadata/state fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this structure requires a bit of extra code. However, based on our experience with the autocomplete analytics logger, it's necessary always to have a type-safe setup for the fields we send to our analytics backend.
I updated a top-level comment explaining the rationale here. Here are the two most relevant points:
* 3. The `payload` field in each state encapsulates the exact list of fields that we plan to send
* to our analytics backend.
*
* 4. Other top-level `state` fields are saved only for bookkeeping and won't end up at our
* analytics backend. This ensures we don't send unintentional or redundant information to
* the analytics backend.
That's why I'd like to separate payload and other bookkeeping fields (startedAt
, etc).
this.writeAutoeditEvent('error', { | ||
version: 0, | ||
metadata: { count: 1 }, | ||
privateMetadata: { message: error.message, traceId }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the privateMetadata
should we add recordsPrivateMetadataTranscript: 1
in the metadata ?
Also adding @akalia25 for a look here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the docs, it's required if we want to send promptText
or responseText
, which we don't have in this logger yet. Should we add recordsPrivateMetadataTranscript
later when we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed earlier that we log prediction?: string
field which would be responseText
so suggested the change. yes, we can add it later when we log responseText
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I reread this Slack message and agree that we should add it here.
I had an impression that it's required for long string values, so I wasn't concerned about the prediction
field because it's limited to 300 chars, but it looks like we have to add it because autoedit event may contain sensitive data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably means we should add it to autocomplete events containing the insertText
field too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added here: 3570628
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tag, @hitesh-1997!
Yes, when recording sensitive data (e.g., transcripts, or free-form text), we should always add recordsPrivateMetadataTranscript: 1
to the metadata. This helps us keep track of where sensitive data is being logged and route the data correctly. More details here
We need to know which key within privateMetadata
contains the sensitive data, so if the key is not in the list below (i.e. insertText
), then you'd simply need to message the discuss-analytics channel and we'd be happy to add the key.
promptText
responseText
inlineCompletionItemContext
privateContextSummary
diff
query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome 🚀🚀
// TODO: double check with the analytics team | ||
// whether we should be categorizing the different completion event types. | ||
category: action === 'suggested' ? 'billable' : 'core', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @kelsey-brown, @akalia25, could you take a look? The list of currently implemented event names:
type AutoeditEventAction =
| 'suggested'
| 'accepted'
| 'noResponse'
| 'error'
| `invalidTransitionTo${Capitalize<Phase>}`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be categorized as follows:
- suggested: billable
- accepted: core
- noResponse: billable
- error: null (uncategorized)
- invalid: null (uncategorized)
Thanks for checking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments !!
Enabling auto-merge since the logger isn't integrated with the feature yet, and the rollout is still limited to S2 users only. @kelsey-brown and @akalia25, feel free to drop any review comments; I’ll handle them in follow-up PRs. |
@valerybugakov just to clarify, the event names here aren’t changing - just the way they’re logged technically, and some of the metadata? And similarly, we won’t still log events the old way in addition to this - this will be a full replacement?
It's a bit hard for me to tell which metadata fields have been removed based on the code here - there are a lot 😅 Are these events live on S2? If so, I could fire some events there and compare the payloads side by side to what we're getting from other instances. |
@kelsey-brown, I'll share more information in a Slack thread today. |
- Addressing the feedback from [this review comment](#6430 (comment)) by marking `discarded` events as `billable`.
AutoeditAnalyticsLogger
is an improved version of the autocomplete analytics logger without persistence tracking. It's much more type-safe than its predecessor, with explicitly defined state transitions and required attributes for each stage. Plus, many unused attributes and logic were removed.Test plan