Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.5 UI: Workflow list blocked when limit applied #12663

Closed
3 of 4 tasks
alelapi opened this issue Feb 13, 2024 · 9 comments · Fixed by #12672
Closed
3 of 4 tasks

3.5 UI: Workflow list blocked when limit applied #12663

alelapi opened this issue Feb 13, 2024 · 9 comments · Fixed by #12672
Assignees
Labels
area/ui P2 Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important type/bug type/regression Regression from previous behavior (a specific type of bug)

Comments

@alelapi
Copy link

alelapi commented Feb 13, 2024

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

On the workflow list page, when the total number of workflows exceeds the limit applied in the query string:
(eg: total number of workflows = 150 and <base_url>/workflows/product-replacement-engine?&limit=140)

the application enters into a loop, continuously repeating the same calls, and blocking any other requests.

Screenshot 2024-02-13 at 14 34 01

Version

v3.5.4

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.

n/a

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}
n/a

Logs from in your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
n/a
@agilgur5 agilgur5 changed the title Workflow list UI blocked when filter limit applied UI: Workflow list blocked when limit applied Feb 13, 2024
@agilgur5 agilgur5 self-assigned this Feb 13, 2024
@agilgur5 agilgur5 added the P2 Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important label Feb 13, 2024
@agilgur5
Copy link
Contributor

agilgur5 commented Feb 13, 2024

Hmm this might be the same root cause as #12626. I can't tell from looking at the code why this would happen though and haven't had a chance yet to sit down and test it.

@agilgur5 agilgur5 added type/regression Regression from previous behavior (a specific type of bug) area/ui labels Feb 13, 2024
@alelapi
Copy link
Author

alelapi commented Feb 13, 2024

Thanks. Seems the same. I believe the cause is in the dependency array of one of the useEffect

@agilgur5
Copy link
Contributor

agilgur5 commented Feb 13, 2024

Yea that was what I checked first too, but I don't see anything obviously off with it (I even checked to make sure toString was stable).
If you're familiar with React, I refactored this whole component to hooks in #11891, where I actually noticed some existing useEffect calls had stale refs. I tested to make sure everything updated correctly too since it was a pretty sizeable and complex refactor (one of the largest I've done in recent years). I also reduced the number of network calls in that PR too (it was doing unnecessary lists when it already has a list watch).

One thing I did noticed was that I seemed to have written the wrong key for the offset query param, but you're not using offset, so I wouldn't think that would impact this specifically. Will fix that ofc, but that doesn't seem like root cause to me yet 🤔 Could always use more eyes!

@agilgur5
Copy link
Contributor

agilgur5 commented Feb 14, 2024

I tested to make sure everything updated correctly too since it was a pretty sizeable and complex refactor (one of the largest I've done in recent years).

Yea I can't reproduce this locally, I only have one long-running eventsource request (SSE) if I don't change any filters:
Screenshot 2024-02-14 at 1 59 25 PM

On http://localhost:8080/workflows?namespace=argo&limit=50

Let me try building 3.5.4 specifically. #12626 suggests it might be an environment issue. Have you tried clearing your localStorage and similar caches? I wonder if there's a dated cache from 3.4 causing this (I made the localStorage code backward-compatible, but maybe I missed something)

@agilgur5
Copy link
Contributor

agilgur5 commented Feb 14, 2024

Hmm this might be the same root cause as #12626.

Hmm actually your issue might be different. There none of the requests get cancelled, it just opens multiple (meaning the useEffect callback didn't happen) for some reason. In your case, they're quickly opened and closed in succession, ad infinitum. EDIT: actually in your screenshot, they're not cancelled, the SSEs are pending infinitely while the initial request completes very quickly. Were there a few SSEs that started successfully before the pending ones?

@agilgur5
Copy link
Contributor

Let me try building 3.5.4 specifically

Huh I can't reproduce this on 3.5.4 locally either:
Screenshot 2024-02-14 at 5 07 35 PM

@agilgur5
Copy link
Contributor

agilgur5 commented Feb 15, 2024

Ok I was able to partially reproduce this. With pagination on and when moving between pages, I get 2 SSEs and often neither will cancelled after moving pages (sometimes 1 will be cancelled). The second one is probably due to React dev mode's double render.

Screenshot 2024-02-15 at 1 54 28 PM

The bizarre part is that there is no error in the console and I confirmed with a log statement and other observations that the useEffect callback and the listWatch.stop() therein is happening.

I can rewrite the logic so that it ensures there is only ever one listWatch with a ref, but in theory that shouldn't be necessary; I'm super confused why they're not being cancelled each time despite listWatch.stop() being called.

This also doesn't happen when I'm not interacting with the page, I still haven't been able to reproduce an infinite loop, nor one with no interaction required.

@agilgur5
Copy link
Contributor

I can rewrite the logic so that it ensures there is only ever one listWatch with a ref, but in theory that shouldn't be necessary; I'm super confused why they're not being cancelled each time despite listWatch.stop() being called.

The plot thickens -- this doesn't fix it. Even with a singular ref, somehow multiple listWatches proliferate. Even when .stop() is called before any .start(). Even if I set the ref to null temporarily after a .stop()

@agilgur5
Copy link
Contributor

agilgur5 commented Feb 16, 2024

With pagination on and when moving between pages, I get 2 SSEs

Found the main bug, the nextOffset of pagination is set by the ListWatch as well as a dep of its useEffect.
Removing that fixes the 2 SSEs. Here's my waterfall graph now:
Screenshot 2024-02-16 at 12 38 45 AM
I'm not quite sure how you got an infinite loop though, as it should only happen twice since after the second one nextOffset should remain constant 🤔

That also seems to mostly fix the .stop() not working. Doesn't need a ref and works as-is for the most part.
It seems like .stop() may not work properly when called in quick succession after .start() as with the 2 SSEs. It seems to be latency dependent, so that would explain why only some people on certain connections have experienced it (e.g. in #12626 they are using HTTP/1). That one predates 3.5.x though and I'm not entirely sure how to fix it; it might be an RxJS issue or a Server or SSE issue with the quick succession.

So I'll fix the main problem that is a regression and hopefully we won't see the .stop() issue again nor infinite loops, but those may need further investigating. EDIT: Fixed by #12672

@agilgur5 agilgur5 added this to the v3.5.x patches milestone Apr 3, 2024
@agilgur5 agilgur5 changed the title UI: Workflow list blocked when limit applied 3.5 UI: Workflow list blocked when limit applied Jul 12, 2024
@argoproj argoproj locked as resolved and limited conversation to collaborators Jul 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/ui P2 Important. All bugs with >=3 thumbs up that aren’t P0 or P1, plus: Any other bugs deemed important type/bug type/regression Regression from previous behavior (a specific type of bug)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants