refact: simplify LSContentAssistProcessor #1112

sebthom · 2024-09-17T21:00:00Z

I spent quite some time analyzing different parts of LSP4E in the context of concurrency and future handling.
I identified a few patterns and challenges that are mainly rooted in limitations of CompletableFutures. The first issue is there is no way to cancel a chain of future stages in reverse from bottom-up which means one needs to manually track all intermediate stages in separate lists and cancel them all if required. This is done in a few places but in most code locations only the last stage is cancelled leaving other stages executing in the background. A second limitation of CompletableFutures is that they do not support thread interruption which prevents effectively canceling long running/hanging stages.

I implemented an extended version of CompletableFutures named ExtendedFuture that amongst other things solve both issues. It is part of a small dependency free EPL2 licensed library called futures4j. Code examples for various use-cases can be found at https://github.com/futures4j/futures4j?tab=readme-ov-file#the-extendedfuture-class

My suggestion is to adopt futures4j and use the ExtendedFuture class in LSP4E to esp. have better control over stage cancellations. In this PR I exemplary refactored the LSContentAssistProcessor to showcase the improved code readability.

In my wildest dreams lsp4j would adopt futures4j at one point so we could seamlessly leverage all the goodies provided by ExtendedFuture.

I would appreciate any feedback on the futures4j project itself and would also welcome additional contributors. So far, only snapshot versions of futures4j are available, but I am ready to publish to Maven Central after receiving some feedback.

mickaelistria · 2024-09-18T06:49:27Z

I have not yet reviewed the code but here are some preliminary comments.

To be honest, I'd rather avoiding an intermediary framework for futures as it's not really making things simpler: it's an extra layer of indirection, it's some more things for contributors to learn about before being efficient, and (that's more a personal taste than an actual concern), it tends to "hide" some existing complexity of LSP4E rather than removing it.
Beyond those generic concerns, here are some more concrete facts about cancellation in LSP4E that may help taking the best decision.

The first issue is there is no way to cancel a chain of future stages in reverse from bottom-up which means one needs to manually track all intermediate stages in separate lists and cancel them all if required.

Instead of cancelling the whole chain of futures, just cancelling the 1st could be enough. It will cause all downstream futures to be cancelled too, none will start, none to interrupt. Indeed it can leave some intermediary work running if 1st future was complete before cancelling it, but in the case of LSP4E, I believe it's usually not an issue as the intermediary steps between the LS request and the UI are usually fast/cheap enough to just leave them pending. Maybe in some rare long operations we can add checks to interrupt based on the future state.
The important part is to really cancel the LS request ASAP, and once it done everything is usually fine. Or maybe you have some analysis outcome that have demonstrated that the processing on client side is a limitation for contentAssistProcessor?
It's a pattern that's already used here and there and that works well. I don't think we need something more complex than keeping the request future, chaining it, and cancelling the request future if necessary.

A second limitation of CompletableFutures is that they do not support thread interruption which prevents effectively canceling long running/hanging stages.

Thread interruption seems relevant for the "intermediary" futures only. For LS requests, we don't need that. And if the request future is marked as cancelled, then further futures won't even start, so there would be no need to interrupt anything.
For the "intermediary" futures that are expensive, it's simple enough to just add a check of the future state to interrupt it when necessary.

sebthom · 2024-09-18T08:26:31Z

Thanks for your quick feedback. While I understand your reservations against adding another dependency just for working with futures I believe that it is actually worth it.

The issue with the current approach to cancelling intermediate stages is that it leads to a tangled, complex and error prone code structure. Additionally, in many cases, it is simply overlooked that cancelling the last stage does not cancel any preceding stages. Moreover, cancelling the top stage is ineffective if that stage has already completed – attempting to cancel it is a no-op, and subsequent stages will still be executed. To truly cancel the chain, you need to track every single stage, which becomes even more cumbersome and confusing with nested futures through thenCompose().

Regarding thread interruption, since CompletableFuture does not support it, and there is no way to know within a future's runnable if the future has been cancelled, a separate state-tracking mechanism is needed to determine when to exit a runnable early.

These issues result in different flavors of workarounds across the code base that introduce additional class fields and objects that are not related to the actual business logic and blur the actual intent.

We're constantly encountering all kinds of intermittent testing failures, and I suspect a significant part of the issues is due to nondeterministic runtime behavior rooted in future stages being randomly cancelled/completed at the right moment or not.

The approach I am laying out here can solve these issues at a fundamental level.

mickaelistria · 2024-09-18T10:24:54Z

My assumption is that we rarely need to cancel the whole chain or intermediary futures because they're pretty fast and cancelling isn't much helping. In most cases, cancelling the request futures is enough and simple enough to implement to do it directly without a library. The cases where we need cancellation of intermediary futures are rare, and in those rare cases, keeping track the the intermediary futures we'd like to cancel is not too hard.

I'm pretty sure that the code can be significantly improved and made cleaner without extra library. For completion, it's typically a case where we don't need to clear intermediate features. Introducing a new API to solve a problem that has not practical case is not so profitable, the net change is more that it adds complexity to people who would like to change the code further because they'd have to learn about another exotic future framework.

sebthom · 2024-09-18T11:29:56Z

To be fair, the ExtendedFuture class is not a framework. It is a drop-in replacement for CompletableFuture (as it extends it) that provides a few additional fluent API methods, which elegantly solve the issue of reverse chain cancellation. I could also have directly contributed it to the LSP4E project, but since it is of general use, I think it should not be buried inside another project. It is a better replacement for https://github.com/eclipse/lsp4e/blob/main/org.eclipse.lsp4e/src/org/eclipse/lsp4e/internal/CancellationSupport.java which was introduced recently because of the necessity for reverse chain cancellation.

Regarding the interruptibility of futures, in the LSP4E code I find many cases where cancel(true) is invoked and many cases where cancel(false) is invoked, so there seems to be either some confusion or a certain expectation about futures actually being interruptible.

mickaelistria · 2024-09-18T12:02:08Z

the necessity for reverse chain cancellation.

I'm still not convinced that "reverse chain cancellation" is something we really need.
And I also believe that CancellationSupport is not so interesting nowadays and could be removed, as the output of

LanguageServers.forDocument(document)
		.withFilter(...) //
		.collectAll((w, ls) -> ...);

is already supposed to be cancellable (assuming that, as specified in collectAll (assuming that the output of the function returned by collectAll is creating directly a (cancellable) LS request.

Overall, the more I look at it, the more I believe that the issue is actually the assumption that we always need to cancel a whole chain of futures and that there are already some patterns to implement that greedily. It might have been necessary some time ago, when we didn't have proper cancellation in the LS requests jobs, but now it's supposed to work, I don't think we need further complex pattern beyond just keeping and cancelling the request future in most cases, and checking future.isCancelled() in some others (similarly to what one would do checking progressMonitor.isCanceled()) when further works has to be skipped

sebthom requested review from mickaelistria, rubenporras and ghentschke September 17, 2024 21:00

refact: simplify LSContentAssistProcessor

46bc8bd

sebthom force-pushed the futures4j branch from d4b44fa to 46bc8bd Compare September 18, 2024 08:17

sebthom closed this Oct 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refact: simplify LSContentAssistProcessor #1112

refact: simplify LSContentAssistProcessor #1112

sebthom commented Sep 17, 2024 •

edited

Loading

mickaelistria commented Sep 18, 2024

sebthom commented Sep 18, 2024

mickaelistria commented Sep 18, 2024

sebthom commented Sep 18, 2024

mickaelistria commented Sep 18, 2024

refact: simplify LSContentAssistProcessor #1112

refact: simplify LSContentAssistProcessor #1112

Conversation

sebthom commented Sep 17, 2024 • edited Loading

mickaelistria commented Sep 18, 2024

sebthom commented Sep 18, 2024

mickaelistria commented Sep 18, 2024

sebthom commented Sep 18, 2024

mickaelistria commented Sep 18, 2024

sebthom commented Sep 17, 2024 •

edited

Loading