Menu.py mod Adds resume from .zip, send and receive zips #118
base: master
The code now checks how many CPUs the machine has and sets cpu_num to that number.
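A minimal sketch of that CPU detection, assuming the standard-library `os.cpu_count()` is used. The helper name `detect_cpu_num` and the fallback value are hypothetical and not taken from the patch:

```python
import os


def detect_cpu_num(default=1):
    """Return the number of logical CPUs, or `default` if it cannot be determined."""
    count = os.cpu_count()  # may return None on some platforms
    return count if count else default


# Hypothetical usage: size the number of parallel workers to the machine.
cpu_num = detect_cpu_num()
print(f"cpu_num = {cpu_num}")
```

Note that `os.cpu_count()` reports logical CPUs, so on a shared or containerized machine the usable number may be lower.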
This adds a folder and files to run a Flask app that accepts checkpoint uploads, then sorts and simplifies them.
This adds restore and upload functionality for both remote and local devices.
testing...
This generates a file that displays the most recent session's images and updates every second, so you can watch progress in a web browser.
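One way such a viewer file could be generated is sketched below. It assumes each session is a sub-directory containing image files and that the page reloads through an HTML meta refresh tag. The function names, the `view.html` file name, and the directory layout are all hypothetical, not taken from the patch:

```python
import html
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def newest_session(root):
    """Return the most recently modified sub-directory of `root`, or None."""
    sessions = [p for p in Path(root).iterdir() if p.is_dir()]
    return max(sessions, key=lambda p: p.stat().st_mtime, default=None)


def write_viewer(root, out_name="view.html", refresh_seconds=1):
    """Write an HTML page showing the newest session's images.

    The page asks the browser to reload itself every `refresh_seconds`.
    """
    root = Path(root)
    session = newest_session(root)
    images = []
    if session is not None:
        images = sorted(
            p for p in session.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
    tags = "\n".join(
        f'<img src="{html.escape(p.relative_to(root).as_posix())}" '
        f'alt="{html.escape(p.name)}">'
        for p in images
    )
    page = (
        "<!DOCTYPE html>\n<html><head>\n"
        f'<meta http-equiv="refresh" content="{refresh_seconds}">\n'
        "<title>Latest session</title>\n</head><body>\n"
        f"{tags}\n</body></html>\n"
    )
    out = root / out_name
    out.write_text(page, encoding="utf-8")
    return out
```

The browser only reloads the page itself, so the file has to be regenerated whenever a newer session appears.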
Hey, looks cool! This has quite a bit of new code which I can't review/test very thoroughly right now, but it looks like it operates pretty independently from the existing code? I'd be happy to merge this if you're willing to take responsibility for maintaining and handling any issues related to this feature!
Hey, that sounds great. Yes, the current code does not affect any of your existing files, but it does run and create new files and folders. Rather than merging it to main immediately, would you consider creating a menu branch? That way it's easy to access, the people using it know to expect functionality changes, and we get some feedback. I would also really like some guidance on your preferences for listing the run_ files as the 99, 98, 97 options to make them front and center. A branch would give those questions time to settle and let us see how it's received. In the meantime I'll focus on optimization and documentation before a merge to main.
Newest Menu.py updates
menu.py --info
added git update detection
Checkpoint monitoring scans the newest session for newly created zips and reports them with a timestamp
Allow --URL for a custom external server
match new imports from source
Review need for HTML view file due to new patch
it’s possible tensorboard does not allow “watching”
index.html index information about json and other info
Show highest value of the trained models in index.html
print Map_locations with statistics
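The checkpoint monitoring item in the list above (scanning the newest session for newly created zips and reporting them with a timestamp) could look roughly like the sketch below. It assumes sessions are sub-directories that hold `.zip` checkpoints, and it uses the file modification time as the reported timestamp. All function names and the polling interval are hypothetical, not taken from menu.py:

```python
import time
from datetime import datetime
from pathlib import Path


def newest_session(root):
    """Return the most recently modified sub-directory of `root`, or None."""
    sessions = [p for p in Path(root).iterdir() if p.is_dir()]
    return max(sessions, key=lambda p: p.stat().st_mtime, default=None)


def find_new_zips(session_dir, seen):
    """Return (path, timestamp) pairs for zips not yet in `seen`, and record them."""
    found = []
    for zip_path in sorted(Path(session_dir).glob("*.zip")):
        if zip_path not in seen:
            seen.add(zip_path)
            # Modification time stands in for "created at" in this sketch.
            stamp = datetime.fromtimestamp(zip_path.stat().st_mtime)
            found.append((zip_path, stamp))
    return found


def monitor(root, poll_seconds=5.0, max_polls=None):
    """Poll the newest session and print each new checkpoint once."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        session = newest_session(root)
        if session is not None:
            for zip_path, stamp in find_new_zips(session, seen):
                print(f"[{stamp:%Y-%m-%d %H:%M:%S}] new checkpoint: {zip_path.name}")
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
```

Calling `monitor("sessions_root")` in a second terminal, with the path replaced by wherever sessions are actually stored, would report each zip as it appears, which makes it easier to tell when a training run can be interrupted without losing progress.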
There is a lot of room for cleanup, but only one line of Python that could be buggy.
These patches add a menu.py that displays local checkpoints and downloaded checkpoints. It does not affect any local files, and it offers the checkpoints as default runs.
3 locations are added
app.py hosts index.html and allows uploads and downloads of checkpoints
Hosted checkpoints, downloaded checkpoints and native checkpoints are kept separately.
The Tailwind CDN line can be uncommented to make the page look better.
defaults to 127.0.0.1:5000
The main issue I see now is that training and saving times are so long that you need to watch and wait for a .zip to appear before you cancel out; otherwise you lose a ton of training.