feat(audoedit): implement basic analytics logger #6430

valerybugakov · 2024-12-20T08:01:48Z

The AutoeditAnalyticsLogger is an improved version of the autocomplete analytics logger without persistence tracking. It's much more type-safe than its predecessor, with explicitly defined state transitions and required attributes for each stage. Plus, many unused attributes and logic were removed.
At the same time, we want to preserve all the small details that helped us to collect valuable data and filter out the noise. If you see anything important that's missing from the autocomplete logger, please call it out!
Closes CODY-4557: Telemetry: implement basic autoedits logger

Test plan

CI + unit tests
The integration with the actual autoedits provider will be implemented in a follow-up PR.

hitesh-1997 · 2024-12-23T11:59:08Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+    started: ['contextLoaded', 'noResponse'],
+    contextLoaded: ['loaded', 'noResponse'],
+    loaded: ['suggested'],
+    suggested: ['read', 'accepted', 'rejected'],


How can we transition from suggested to accepted or rejected without going through read?
If there is an intermediate case where we show a prediction but the user types something before reading it, should we introduce another state, such as noRead, so that suggested can transition to either read or noRead, with noRead having no outgoing transitions?

We consider suggestions as read after a hard-coded timeout if they were not rejected during this time. Upon reviewing the code, I see that we don't need this data to send the existing event payloads. We use displayDuration, which does not require readAt to be computed.

I'll remove this logic altogether. We can add it later if we think we need it.
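The transition map quoted in this thread can be read as a small state machine. Below is a minimal sketch of how such a map could be checked, assuming a `Phase` union and the entries for `read`, `accepted`, `rejected`, and `noResponse`, none of which appear in the excerpt; `canTransition` is a hypothetical helper, not the logger's actual API.

```typescript
// Assumed phase names: only the first four map entries come from the diff excerpt.
type Phase =
    | 'started'
    | 'contextLoaded'
    | 'loaded'
    | 'suggested'
    | 'read'
    | 'accepted'
    | 'rejected'
    | 'noResponse'

const validTransitions: Record<Phase, readonly Phase[]> = {
    started: ['contextLoaded', 'noResponse'],
    contextLoaded: ['loaded', 'noResponse'],
    loaded: ['suggested'],
    suggested: ['read', 'accepted', 'rejected'],
    // Assumption: the entries below are illustrative and not shown in the excerpt.
    read: ['accepted', 'rejected'],
    accepted: [],
    rejected: [],
    noResponse: [],
}

// Hypothetical helper: returns true when moving from `from` to `to` is allowed.
function canTransition(from: Phase, to: Phase): boolean {
    return validTransitions[from].includes(to)
}
```

With this shape, dropping the `read` phase would only require deleting its key and its mention under `suggested`; the `Record<Phase, ...>` type makes the compiler flag any phase left without an entry.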

hitesh-1997 · 2024-12-23T12:55:18Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+    lineCount: number
+
+    /** Total characters in the suggestion. */
+    charCount: number


We typically use lineCount and charCount to calculate stats over newly added code in the editor. But for the auto-edits the values typically include existing code as well, so we may need to adjust the metrics accordingly.

Agree, let's adjust in a follow-up PR.
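One possible adjustment, sketched under assumptions: count only the lines of the prediction that do not already exist in the code being rewritten. The function name `countAddedLines` and the line-level multiset comparison are hypothetical choices for illustration, not what the follow-up PR does.

```typescript
// Hypothetical helper: counts lines (and their characters) that appear in
// `prediction` but have no matching line left in `codeToRewrite`.
function countAddedLines(
    codeToRewrite: string,
    prediction: string
): { lineCount: number; charCount: number } {
    // Track how many times each existing line is still available to match.
    const remaining = new Map<string, number>()
    for (const line of codeToRewrite.split('\n')) {
        remaining.set(line, (remaining.get(line) ?? 0) + 1)
    }

    let lineCount = 0
    let charCount = 0
    for (const line of prediction.split('\n')) {
        const left = remaining.get(line) ?? 0
        if (left > 0) {
            // The line already existed, so it is not counted as new code.
            remaining.set(line, left - 1)
        } else {
            lineCount += 1
            charCount += line.length
        }
    }
    return { lineCount, charCount }
}
```

This ignores line order and counts a modified line as fully new, so it is only a rough approximation of a real diff.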

hitesh-1997 · 2024-12-23T13:05:48Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+    /** The source of the suggestion, e.g. 'network', 'cache', etc. */
+    source?: string
+
+    /** True if we fuzzy-matched this suggestion from a local or remote cache. */


Another field that would be helpful here: whether code_to_rewrite is the same as the prediction. Since auto-edits trigger suggestions on cursor movement, a high rate of such matches could indicate irrelevant suggestions.

Let's add this in a follow-up PR. We might want an enum field indicating the reason for a suggestion to be hidden.

hitesh-1997 · 2024-12-23T13:10:48Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+        Omit<CodeGenEventMetadata, 'charsInserted' | 'charsDeleted'> {}
+
+interface AutoeditRejectedEventPayload extends AutoEditFinalMetadata {}
+interface AutoeditNoResponseEventPayload extends AutoeditContextLoadedMetadata {}


In which cases does no response happen? Is this logged when the model returns an empty string? If so, I suspect it would not trigger, since unlike autocomplete, the model at least rewrites the codeToRewrite, and an empty response here means that we delete all the code in the codeToRewrite section.

In autocomplete, this happens when we hide a suggestion for whatever reason, such as when the suggestion duplicates the document suffix. We can extend this event with extra fields in follow-up PRs.

I changed this event name to discarded and added some comments to clarify it. We can adjust it in follow-ups based on use cases.

hitesh-1997 · 2024-12-23T13:29:03Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+interface ReadState extends Omit<SuggestedState, 'phase'> {
+    phase: 'read'
+    /** Timestamp when the suggestion can be considered as read by a user. */
+    readAt: number


For all the states, do you think we should just have:

phase
payload

All the payloads at any state extend the previous state anyway, so we can move specific fields like startedAt, loadedAt, etc. into the metadata itself. That way we don't have to extend both the metadata and the states.

After this, the metadata would incorporate additional fields like loadedAt, readAt, etc., and each state would no longer need to extend the previous state and Omit some of its fields.

Currently, introducing any new state in between requires this duplication of extending metadata/state fields.

Yeah, this structure requires a bit of extra code. However, based on our experience with the autocomplete analytics logger, it's necessary to always have a type-safe setup for the fields we send to our analytics backend.

I updated a top-level comment explaining the rationale here. Here are the two most relevant points:

 * 3. The `payload` field in each state encapsulates the exact list of fields that we plan to send
 *    to our analytics backend.
 *
 * 4. Other top-level `state` fields are saved only for bookkeeping and won't end up at our
 *    analytics backend. This ensures we don't send unintentional or redundant information to
 *    the analytics backend.

That's why I'd like to separate payload and other bookkeeping fields (startedAt, etc).
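A minimal sketch of the separation described here, assuming hypothetical payload fields (`languageId`, `latency`) and hypothetical helpers (`markLoaded`, `toEventPayload`); only the idea of keeping `payload` apart from bookkeeping fields such as `startedAt` comes from the discussion.

```typescript
// Hypothetical payload fields; the real logger defines its own.
interface StartedPayload {
    languageId: string
}

interface LoadedPayload extends StartedPayload {
    latency: number
}

interface StartedState {
    phase: 'started'
    /** Bookkeeping only: never sent to the analytics backend. */
    startedAt: number
    /** The exact fields sent to the analytics backend. */
    payload: StartedPayload
}

interface LoadedState extends Omit<StartedState, 'phase' | 'payload'> {
    phase: 'loaded'
    /** Bookkeeping only. */
    loadedAt: number
    payload: LoadedPayload
}

// Hypothetical helper: moves a state forward and derives payload fields
// from bookkeeping timestamps.
function markLoaded(state: StartedState, loadedAt: number): LoadedState {
    return {
        ...state,
        phase: 'loaded',
        loadedAt,
        payload: { ...state.payload, latency: loadedAt - state.startedAt },
    }
}

// Only `payload` leaves the logger; timestamps stay behind.
function toEventPayload(state: StartedState | LoadedState): StartedPayload | LoadedPayload {
    return state.payload
}
```

The cost is the `Omit`/`extends` duplication raised in the review; the benefit is that a bookkeeping field cannot reach the backend unless it is explicitly copied into `payload`.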

hitesh-1997 · 2024-12-23T13:45:23Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+            this.writeAutoeditEvent('error', {
+                version: 0,
+                metadata: { count: 1 },
+                privateMetadata: { message: error.message, traceId },


For the privateMetadata, should we add recordsPrivateMetadataTranscript: 1 to the metadata?
Also adding @akalia25 for a look here.

As per the docs, it's required if we want to send promptText or responseText, which we don't have in this logger yet. Should we add recordsPrivateMetadataTranscript later when we need it?

I noticed earlier that we log the prediction?: string field, which would be responseText, so I suggested the change. Yes, we can add it later when we log responseText.

Ok, I reread this Slack message and agree that we should add it here.

I was under the impression that it's required for long string values, so I wasn't concerned about the prediction field because it's limited to 300 chars. But it looks like we have to add it because an autoedit event may contain sensitive data.

It probably means we should add it to autocomplete events containing the insertText field too.

Added here: 3570628

Thanks for the tag, @hitesh-1997!

Yes, when recording sensitive data (e.g., transcripts, or free-form text), we should always add recordsPrivateMetadataTranscript: 1 to the metadata. This helps us keep track of where sensitive data is being logged and route the data correctly. More details here

We need to know which key within privateMetadata contains the sensitive data, so if the key is not in the list below (e.g., insertText), then you'd simply need to message the discuss-analytics channel and we'd be happy to add the key.

promptText
responseText
inlineCompletionItemContext
privateContextSummary
diff
query
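The guidance in this thread can be sketched as two small helpers. This is an illustration under assumptions: `EventParams`, `withTranscriptFlag`, and `findUnregisteredKeys` are hypothetical names, and only the `recordsPrivateMetadataTranscript: 1` flag and the key list come from the discussion.

```typescript
// Hypothetical shapes for illustration; not the telemetry library's real types.
type Metadata = Record<string, number>
type PrivateMetadata = Record<string, unknown>

interface EventParams {
    metadata: Metadata
    privateMetadata?: PrivateMetadata
}

// Keys listed in the thread as already known to the analytics team.
const KNOWN_SENSITIVE_KEYS: readonly string[] = [
    'promptText',
    'responseText',
    'inlineCompletionItemContext',
    'privateContextSummary',
    'diff',
    'query',
]

// Hypothetical helper: marks an event as carrying sensitive free-form text.
function withTranscriptFlag(params: EventParams): EventParams {
    return {
        ...params,
        metadata: { ...params.metadata, recordsPrivateMetadataTranscript: 1 },
    }
}

// Hypothetical helper: returns the sensitive keys that would still need to be
// registered with the analytics team.
function findUnregisteredKeys(sensitiveKeys: readonly string[]): string[] {
    return sensitiveKeys.filter(key => !KNOWN_SENSITIVE_KEYS.includes(key))
}
```

For example, `findUnregisteredKeys(['insertText', 'diff'])` returns `['insertText']`, the key that would need to be raised with the analytics team.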

hitesh-1997

Awesome 🚀🚀

valerybugakov · 2024-12-24T03:38:08Z

vscode/src/autoedits/analytics-logger/analytics-logger.ts

+                // TODO: double check with the analytics team
+                // whether we should be categorizing the different completion event types.
+                category: action === 'suggested' ? 'billable' : 'core',


Hey @kelsey-brown, @akalia25, could you take a look? The list of currently implemented event names:

type AutoeditEventAction =
    | 'suggested'
    | 'accepted'
    | 'noResponse'
    | 'error'
    | `invalidTransitionTo${Capitalize<Phase>}`

These should be categorized as follows:

suggested: billable

accepted: core

noResponse: billable

error: null (uncategorized)

invalid: null (uncategorized)

Thanks for checking!
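The categorization above could be expressed as a single lookup instead of the inline ternary. A minimal sketch, assuming the `Phase` union and the names `EventCategory` and `getEventCategory`, which are hypothetical:

```typescript
// Assumed phase names; the thread does not list the full union.
type Phase =
    | 'started'
    | 'contextLoaded'
    | 'loaded'
    | 'suggested'
    | 'accepted'
    | 'rejected'
    | 'noResponse'

type AutoeditEventAction =
    | 'suggested'
    | 'accepted'
    | 'noResponse'
    | 'error'
    | `invalidTransitionTo${Capitalize<Phase>}`

// `null` stands for "uncategorized", as in the list above.
type EventCategory = 'billable' | 'core' | null

// Hypothetical helper mapping each event action to its category.
function getEventCategory(action: AutoeditEventAction): EventCategory {
    switch (action) {
        case 'suggested':
        case 'noResponse':
            return 'billable'
        case 'accepted':
            return 'core'
        default:
            // 'error' and every `invalidTransitionTo...` action are uncategorized.
            return null
    }
}
```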

hitesh-1997

Thanks for addressing the comments !!

valerybugakov · 2024-12-24T12:21:25Z

Enabling auto-merge since the logger isn't integrated with the feature yet, and the rollout is still limited to S2 users only. @kelsey-brown and @akalia25, feel free to drop any review comments; I’ll handle them in follow-up PRs.

kelsey-brown · 2025-01-02T16:08:24Z

@valerybugakov just to clarify, the event names here aren’t changing - just the way they’re logged technically, and some of the metadata? And similarly, we won’t still log events the old way in addition to this - this will be a full replacement?

If you see anything important that's missing from the autocomplete logger, please call it out!

It's a bit hard for me to tell which metadata fields have been removed based on the code here - there are a lot 😅 Are these events live on S2? If so, I could fire some events there and compare the payloads side by side to what we're getting from other instances.

valerybugakov · 2025-01-10T14:32:16Z

@valerybugakov just to clarify, the event names here aren’t changing - just the way they’re logged technically

@kelsey-brown, I'll share more information in a Slack thread today.

- Addressing the feedback from [this review comment](#6430 (comment)) by marking `discarded` events as `billable`.

valerybugakov added the autoedit label Dec 20, 2024

valerybugakov self-assigned this Dec 20, 2024

valerybugakov force-pushed the vb/autoedits-analytics-logger branch from 59b4792 to f796e21 Compare December 20, 2024 08:13

feat(audoedit): implement basic analytics logger

951a386

valerybugakov force-pushed the vb/autoedits-analytics-logger branch from f796e21 to 951a386 Compare December 20, 2024 08:14

valerybugakov added 3 commits December 23, 2024 17:40

feat(audoedit): improve tests

b439512

feat(audoedit): updated snapshots

495c8c8

feat(audoedit): add comments

01d7c4b

valerybugakov marked this pull request as ready for review December 23, 2024 10:06

valerybugakov requested a review from hitesh-1997 December 23, 2024 10:07

hitesh-1997 reviewed Dec 23, 2024
