## Changes

### Precomputing
The core training loop of PerSAM_f does a lot of unnecessary computation. It evaluates the SAM mask predictor once every epoch, even though the predictor's inputs never change across epochs, so its output can be precomputed. I modified the training loop to run this step once before training starts (see the sketch below).
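A minimal sketch of the idea, not the script's exact code: the variable names (`predictor`, `topk_xy`, `topk_label`, `gt_mask`) and the optimizer/loss choices are placeholders, and `gt_mask` is assumed to already be 256x256. The point is that SAM's prediction has no dependence on the trainable weights, so it moves out of the epoch loop.

```python
import torch

def train_weights(predictor, topk_xy, topk_label, gt_mask,
                  num_epochs=1000, lr=1e-3, device="cuda"):
    """Hypothetical sketch: SAM's mask prediction is hoisted out of the
    training loop because it does not depend on the trainable weights."""
    # Run SAM once; its inputs are fixed for the whole training run.
    _, _, logits = predictor.predict(
        point_coords=topk_xy,
        point_labels=topk_label,
        multimask_output=True,  # the candidate masks whose fusion is learned
    )
    logits = torch.as_tensor(logits, device=device)  # (3, 256, 256) low-res logits

    weights = torch.ones(3, device=device, requires_grad=True)
    optimizer = torch.optim.AdamW([weights], lr=lr)
    target = gt_mask.to(device).float()  # assumed 256x256, matching the logits

    for _ in range(num_epochs):
        # Only this weighted fusion depends on the trainable parameters.
        fused = (weights[:, None, None] * logits).sum(dim=0)
        loss = torch.nn.functional.binary_cross_entropy_with_logits(fused, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return weights.detach()
```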
### Downsampling
The current training loop also uses extremely high-resolution mask images (some are 1280x1280 pixels). I downsampled these to 256x256 pixels, as sketched below.
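One way to do the resize (a sketch, not necessarily how the script implements it): nearest-neighbor interpolation, which keeps the mask binary instead of introducing blended edge values.

```python
import torch
import torch.nn.functional as F

def downsample_mask(mask: torch.Tensor, size: int = 256) -> torch.Tensor:
    """Downsample an (H, W) binary mask with nearest-neighbor so it stays binary."""
    small = F.interpolate(mask[None, None].float(), size=(size, size), mode="nearest")
    return small[0, 0]
```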
On Google Colab with a T4, these two changes decreased training time from ~9s to ~1s.
### Caching SAM
Currently, the PerSAM_f script reloads the full SAM model for every image it trains on. On Google Colab with a T4, this adds 8-15 seconds per image to the effective "training" time. I modified the script to use a global SAM model that is loaded only once.
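A minimal sketch of one way to do this, here using `functools.lru_cache` rather than a bare module-level global; the checkpoint path and model type are placeholders, and the model-loading calls assume the standard `segment_anything` API.

```python
from functools import lru_cache

import torch
from segment_anything import SamPredictor, sam_model_registry

@lru_cache(maxsize=1)
def get_sam_predictor(checkpoint: str = "sam_vit_h_4b8939.pth",
                      model_type: str = "vit_h") -> SamPredictor:
    """Load SAM once; repeated calls return the same cached predictor."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    sam = sam_model_registry[model_type](checkpoint=checkpoint).to(device)
    return SamPredictor(sam)
```

Each per-image training call then goes through `get_sam_predictor()` instead of constructing a new model, so the 8-15 second load cost is paid only once per session.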
I have applied these same changes to `persam_f_multi_obj.py`. It seems to work, but I haven't put much focus on it, and I haven't evaluated its results.

## Evaluation
Evaluation results on the PerSeg benchmark are nearly identical.
Original:
Modified:
You can find the eval notebook in Colab here.