Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel demo to test ProgressHandler queue strategy #51

Merged
merged 16 commits into from
Apr 6, 2024

Conversation

garrettmflynn
Copy link
Collaborator

This PR is a sandbox environment to test concerns raised in #49.

While implementing this, it became apparent that we can't use the TQDMProgressHandler directly when working with parallel progress bars because these exist on independent threads. To overcome this while still supporting the new class, all updates coming from the HTTP relay server are registered manually to the handler with the _announce method.

@garrettmflynn garrettmflynn self-assigned this Mar 25, 2024
@CodyCBakerPhD
Copy link
Member

CodyCBakerPhD commented Apr 4, 2024

Thanks @garrettmflynn this strategy makes a lot more sense now and the code reads better (I'll still suggest various improvements for annotations, but I'm pretty sure I follow what's going on) - and is more realistic to our use in the GUIDE to boot

Unfortunately, this is what it looks like on my end (Windows)

video1034276567.mp4

I had seen this behavior a while back with the multiple bar setup, where if there were too many updates going at once the web page would freeze, so I tried slowing the rate of the tasks (which is what you see in the video; all values just multiplied by 10). I even tried reducing down to two tasks to perform and still no dice

Especially unfortunately I cannot exit this task with ctrl+C KeyboardInterrupt, which you see in the last ~15 seconds of the video; at some point, something stops and the web page does a partial update but the only way I can kill the entire thing is to kill that terminal

I'll give it a try and see if the Mac one works

@CodyCBakerPhD
Copy link
Member

Yeah, the Mac one works fine for some reason

Now that I see it running, one thing I'd suggest is the very first progress bar at the top - we want to be tracking the total progress over the length of the outer list of task times (so like, 1/7 after a single job completes a full list of tasks, then 2/7, etc. each will we also have the bars below showing the progress of each individual of the 7 tasks)

Screen.Recording.2024-04-03.at.10.30.12.PM.mov

@CodyCBakerPhD
Copy link
Member

CodyCBakerPhD commented Apr 4, 2024

I wonder, could you let a Flask endpoint run that progress handler (even more like on GUIDE) and see if that perhaps helps? (It's OK for now if the flask server has to be run on a separate terminal window if that helps - or have the demo launch command be another endpoint of the server able to be controlled by a button on the client)

@garrettmflynn
Copy link
Collaborator Author

Alright I've switched the main parallel demo to Flask and added a demo_parallel_ws command to reference the previous implementation.

Just FYI there were two implementations in this demo: one that was not queued, which was enabled by default, and one that is queued.

I've swapped this now so that the queued version (rather than direct forwarding) is enabled by default.

Can you confirm if the Windows issue is still seen on one or both demos?

@CodyCBakerPhD
Copy link
Member

The flask one works great!

The websocket one still has the same problems of not updating and being unable to cancel

Can you add that 'total' top bar for the number of tasks completed? (always fixed in that top position)

@garrettmflynn
Copy link
Collaborator Author

Added!

@CodyCBakerPhD
Copy link
Member

I see the top bar present but it never updates

@garrettmflynn
Copy link
Collaborator Author

garrettmflynn commented Apr 4, 2024

Hmm it's working fine on Mac for both demos here.

Since it relies on n and total being the same, there's a chance the progress bar completes without this being explicitly sent. I'd noticed this several times before,

If this persists after switching to a global TQDM, I'll add an explicit call to pass these format options.

@CodyCBakerPhD
Copy link
Member

I also keep seeing the following like in console while Flask is running

127.0.0.1 - - [04/Apr/2024 23:09:50] "GET / HTTP/1.1" 404 -

pings about every second or so - any idea what this is? It feels like it is the thing preventing keyboard interruption

@garrettmflynn
Copy link
Collaborator Author

pings about every second or so - any idea what this is? It feels like it is the thing preventing keyboard interruption

These are requests to reconnect the Websocket connection. Since we're using the same client for both WS and non-WS servers, the latter will result in 404 errors for these requests.

I've limited the reconnection request to 3 tries so this doesn't go on forever.

Copy link

codecov bot commented Apr 5, 2024

Codecov Report

Attention: Patch coverage is 48.14815% with 14 lines in your changes are missing coverage. Please review.

Project coverage is 52.17%. Comparing base (ac20f9a) to head (11d2519).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #51      +/-   ##
==========================================
- Coverage   54.54%   52.17%   -2.38%     
==========================================
  Files           4        6       +2     
  Lines          66       92      +26     
==========================================
+ Hits           36       48      +12     
- Misses         30       44      +14     
Flag Coverage Δ
unittests 52.17% <48.14%> (-2.38%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/tqdm_publisher/__init__.py 100.00% <100.00%> (ø)
...m_publisher/_demos/_demo_command_line_interface.py 0.00% <0.00%> (ø)
src/tqdm_publisher/_subscriber.py 60.00% <60.00%> (ø)
src/tqdm_publisher/_handler.py 41.17% <41.17%> (ø)

@CodyCBakerPhD CodyCBakerPhD marked this pull request as ready for review April 5, 2024 22:13
@CodyCBakerPhD
Copy link
Member

Everything is working great on my end - happy to merge this as is and make improvements in follow ups

@garrettmflynn
Copy link
Collaborator Author

Alright sounds good to me!

@CodyCBakerPhD CodyCBakerPhD enabled auto-merge (squash) April 6, 2024 21:51
@CodyCBakerPhD CodyCBakerPhD merged commit 1b6afb6 into main Apr 6, 2024
13 of 14 checks passed
@CodyCBakerPhD CodyCBakerPhD deleted the parallel-queue-test branch April 6, 2024 21:51
This was referenced Apr 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants