Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autocomplete: Overhaul telemetry and fix various corner cases #412

Merged
merged 6 commits into from
Jul 31, 2023

Conversation

philipp-spiess
Copy link
Contributor

Closes #178

This PR fixes a bunch of issues with the completions cache that should make it much more easy to understand now (although there are a few corner cases). The analytics code is quite tangled with the implementation right now but I want to get the logic right first before we do some cleanup.

Conceptually, the big thing that has changed is that we now always create a new completion entry for new completion requests. The only exception to that is if the cache would resolve to the very same completion that was already displayed. This special case is necessary to not send too many completion requests if we forward-type a completion like so:

Screen.Recording.2023-07-27.at.13.40.19.mov

There are a bunch of other small fixes that I noticed were necessary that I’m going to explain inline.

Test plan

  • Open the Output tabs and start playing with completions while keeping track of how the console looks.
  • We'll have to see how big the difference is in our metrics after pushing this to prod.

@philipp-spiess philipp-spiess requested review from dadlerj, vovakulikov and a team July 27, 2023 11:44
@philipp-spiess philipp-spiess self-assigned this Jul 27, 2023
@@ -160,15 +160,13 @@ export class CodyCompletionItemProvider implements vscode.InlineCompletionItemPr
this.abortOpenCompletions = () => abortController.abort()
}

CompletionLogger.clear()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearing the completion logger will flush all "open" completions, potentially logging completions that are being visible. This is used to measure for how long completions are necessary and thus we want to delay it as long as possible so we can get the most accurate timing.

Previously, every new completion request would immediately flush the queue. Now, however, a cache hit that matches the properties that I outlined in the PR description and is this "completed forward" won't, so we'll have a more accurate number of how long such completions are visible.

@@ -452,3 +512,8 @@ function createTracerForInvocation(tracer: ProvideInlineCompletionItemsTracer):
tracer(data)
}
}

function emptyCompletions(): vscode.InlineCompletionList {
CompletionLogger.clear()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever we early return, we want to flush the completions queue. This conceptually makes sense because if we return an empty array at any point, it means no completions form us are visible anymore anyways.

//
// However we only log a completion as started if it's either served from cache _or_ the
// debounce interval has passed to ensure we don't log too many start events where we end up
// not doing any work at all
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The completion logger now has a separate create and start method. This fixes an issue where we haven't properly measured end to end latencies in our completions because we only started it when we actively start to do some work, which happens after both the pre-defined debounce times (30ms) and context retrieval (few ms).

I don't expect this to have a big impact on the overall latencies, certainly not over 50ms, but we should aim to be as close to the real-user experience as possible with these logs.

This won't change the overall amount of events logged, though, because nothing is being logged until the start method is called. We just start the timers a bit earlier.

Comment on lines +244 to +245
const useLogIdFromPreviousCompletion =
cachedCompletions?.logId && cachedCompletions?.logId === this.previousCompletionLogId
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line makes sure that "forwarding completions" don't get a new analytics event assigned. In the future, we might be able to do that with the new enableForwardStability option

Comment on lines +268 to +276
// When we serve a completion from the cache and create a new log
// id, we want to ensure to only refer to the new id for future
// cache retrievals. If we don't do this, every subsequent cache hit
// would otherwise no longer match the previous completion ID and we
// would log a new completion each time, even if the user just
// continues typing on the currently displayed completion.
if (logId !== cachedCompletions.logId) {
this.config.cache?.updateLogId(cachedCompletions.logId, logId)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much a corner case but updating the ids in our cache is necessary in the following scenario:

  • We start a completion request with ID 1 and display it, cool.
  • Now the user does something else which will start and show other completion requests.
  • Eventually, the prefix matches the first completion request again. The cache will retrieve the item with id 1 however that won't match the previousCompletionLogId.
  • This will cause the above code to create a new ID, say 3.
  • On the next forward request, the cache would again resolve with the id 1 when the previousCompletionLogId would be 3 which would create yet another completion request...

To fix this, we update the cache to refer to the new log id's whenever they're changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is working as you described, I think this comment from me here isn't a blocker, but if we take a step back maybe we should have some centralised system about logId. It looks like that we're trying to sync log and cache IDs in order to avoid double logging when cache is actually the same, maybe we could use some cache encode function for this kind of purpose? In this case there won't be possible to have different ids if prefixes are the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't like this bookkeeping either. It's going to be interesting to see what @sqs will come up in #442 where he's currently seemingly not using a cache at all!

The only reason why we need to keep the log id around is to make sure we can detect cases where the same completion is resolve again after some suffix changes. Maybe we can instead just store the whole completion somewhere and compare that instead of the generated ID?

),
},
})
CompletionLogger.networkRequestStarted(logId, contextResult.logSummary)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We previously appended the context summary in the completion started event. Now, however, we log these events before the context is fetched so I decided to split this into a separate event. We haven't used any of these yet, though.

Comment on lines 426 to 433
let isCompletionVisible = true
if (context.selectedCompletionInfo) {
const currentText = document.getText(context.selectedCompletionInfo.range)
const selectedText = context.selectedCompletionInfo.text
if (!(currentText + results[0].content).startsWith(selectedText)) {
isCompletionVisible = false
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a fun one 😅 In some scenarios, completions that are being returned will still not be shown in VS Code. This case specifically happens when the completion we create does not match the text in an eventual VS Code completion popup. While we have ideas to improve this and use the suggested text to guide the completion, this would require a few more changes so for now, we try to detect when VS Code won't show the completion.

That, however, lead to a different issue. We previously measured the duration for a completion as the time between when the completion was started and when the completion was first seen. Since the completion can now be fully loaded and returned but not immediately seen, we've had to make some changes in the completion logger to add a separate timestamp to calculate these correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just reminder for myself, let's discuss it for our next 1-1. I don't think I fully understand this scenario

Comment on lines +142 to +143
// Log a debug event, this case should not happen in production
logCompletionEvent('acceptedUntrackedCompletion')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to reproduce these cases sometimes. E.g. you accept a completion, then do cmd+z to undo, type the same suffix again (without triggering any other completion in the mean time), and accept the completion again.

I wonder how often this happens in practice though, so this just adds some events for us to see. If this happens often, we might need to ensure that this case creates a new log id (and thus conceptually a new completion request).

@vovakulikov
Copy link
Contributor

@philipp-spiess sorry I didn't make a review for this today. I haven't touched logs yet in the Cody repo so it'll take some time. I will review it today morning first thing.

@dadlerj
Copy link
Member

dadlerj commented Jul 28, 2023

I'm afraid I'm not really qualified to give a review here 😕 . I can check event names if helpful, but I don't think I am familiar enough with the code or logic to validate the call sites or the edge cases, etc. Let me know if there's anything I can do to help.

Copy link
Contributor

@vovakulikov vovakulikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments and ideas.


expect(cache.get('foo \n ', true)?.logId).toBe('id2')
expect(cache.get('foo \n ', true)?.logId).toBe('id2')
expect(cache.get('foo \n', true)?.logId).toBe('id2')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 3 expect calls? Just to make sure that there is no toggle state behind the CompleationsCache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I wanted to make sure all occurrences are updated.

public updateLogId(oldLogId: string, newLogId: string): void {
const entries = this.cache.values()
for (const value of entries) {
if (value && typeof value !== 'string' && 'logId' in value && value.logId === oldLogId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What case do we catch with typeof value !== 'string'? it looks like value can't be string if we trust TS types here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that was a mistake 🙈

Comment on lines +268 to +276
// When we serve a completion from the cache and create a new log
// id, we want to ensure to only refer to the new id for future
// cache retrievals. If we don't do this, every subsequent cache hit
// would otherwise no longer match the previous completion ID and we
// would log a new completion each time, even if the user just
// continues typing on the currently displayed completion.
if (logId !== cachedCompletions.logId) {
this.config.cache?.updateLogId(cachedCompletions.logId, logId)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is working as you described, I think this comment from me here isn't a blocker, but if we take a step back maybe we should have some centralised system about logId. It looks like that we're trying to sync log and cache IDs in order to avoid double logging when cache is actually the same, maybe we could use some cache encode function for this kind of purpose? In this case there won't be possible to have different ids if prefixes are the same.

Comment on lines 426 to 433
let isCompletionVisible = true
if (context.selectedCompletionInfo) {
const currentText = document.getText(context.selectedCompletionInfo.range)
const selectedText = context.selectedCompletionInfo.text
if (!(currentText + results[0].content).startsWith(selectedText)) {
isCompletionVisible = false
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just reminder for myself, let's discuss it for our next 1-1. I don't think I fully understand this scenario

@philipp-spiess
Copy link
Contributor Author

@dadlerj

I'm afraid I'm not really qualified to give a review here 😕 . I can check event names if helpful, but I don't think I am familiar enough with the code or logic to validate the call sites or the edge cases, etc. Let me know if there's anything I can do to help.

Ha, thanks! I tagged you mostly so that you're aware of some of changes that we'll be doing. :)

@philipp-spiess philipp-spiess merged commit 33f97ba into main Jul 31, 2023
8 checks passed
@philipp-spiess philipp-spiess deleted the ps/autocomplete-telemetry branch July 31, 2023 09:56
philipp-spiess added a commit that referenced this pull request Jul 31, 2023
Two quick fixes for regressions I introduced in #412

- Loading indicator felt flaky so this this PR makes sure we start the
indicator a bit earlier and terminate in any case.
- The opt-out for a deletion case was wrong, causing it to skip the
condition if the last action was a deletion and there was no cache
response. This was not intended to be changed for now. The correct
behavior should still be: If there is a deletion, ONLY continue if twe
have an exact cache response. Otherwise return no completions.

## Test plan

- tested locally,

<!-- Required. See
https://docs.sourcegraph.com/dev/background-information/testing_principles.
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Autocomplete: Analytics doesn't play well with the completions cache
3 participants