Simpler bucketing code #59
lopho commented Dec 5, 2022 (edited)
- Buckets are just a dict hashed by image size.
- Optionally, with resize, calculates optimal image size closest to original aspect ratio.
- Without resize it just tries to create buckets from the image sizes as they are.
- It's deterministic, no "spooky floats".
- Works with pre-sized datasets.
- API is the same as the old bucketing code.
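The dict-based bucketing described above can be sketched minimally like this (an illustration only, not the PR's actual code; the `images` list of `(index, width, height)` tuples is a hypothetical input format):

```python
from collections import defaultdict

# Hypothetical input: one (index, width, height) tuple per image.
images = [(0, 512, 512), (1, 512, 512), (2, 640, 448), (3, 512, 768)]

# Buckets are just a dict hashed by image size: images with
# identical sizes land in the same bucket.
buckets = defaultdict(list)
for idx, w, h in images:
    buckets[(w, h)].append(idx)

# Deterministic: the same input order always yields the same buckets.
for size, indices in sorted(buckets.items()):
    print(size, indices)
```

Because the key is just the `(width, height)` tuple, this works unchanged for pre-sized datasets: no float comparisons are involved in deciding bucket membership.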
The only thing I'd improve on this is dropping fewer samples.

Improved migrations code.

Validation is now parallel too.

I think I'll split this PR into another one for parallelism.

Parallelism is now in a separate PR: #60
There seems to be some confusion as to why this exists, so here is how the new approach determines bucket sizes compared to the current bucket implementation.

Optimal sizes are determined by getting a scale factor that scales the sides of the input image so it exactly fills the maximum area:

```python
max_area = 512 * 512
scale = (max_area / (w * h)) ** 0.5
# scaling w and h by this factor, without rounding, gives exactly max_area:
# scale**2 = max_area / (w * h)
# max_area = (w * h) * scale**2
# max_area = (w * scale) * (h * scale)
```

Then round each side to the closest multiple of 64 (or the given side divisor) that would fill max area:

```python
w2 = round((w * scale) / 64) * 64
h2 = round((h * scale) / 64) * 64
```

This results in an image filling as much of the maximum area as possible, while retaining an aspect ratio as close as possible to the input image, but with sides divisible by 64. If rounding pushed the size over the area budget, round down instead:

```python
if w2 * h2 > max_area:
    w2 = int((w * scale) / 64) * 64
    h2 = int((h * scale) / 64) * 64
```

Then bucket the image using the optimal size:

```python
bucket[(w2, h2)] = ...
```

All of this is only done if the user wants to resize images. Without resizing, it just buckets images by their original size:

```python
bucket[(image.width, image.height)] = ...
```

Of course, this needs the image sizes to already have sides valid for training (divisible by 64).

Lastly,
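Putting the steps above together, a self-contained sketch (the `512 * 512` maximum area and 64-pixel divisor follow the snippets above; the helper name `optimal_size` is hypothetical):

```python
def optimal_size(w, h, max_area=512 * 512, div=64):
    # Scale factor so the scaled image exactly fills max_area:
    # (w * scale) * (h * scale) == max_area
    scale = (max_area / (w * h)) ** 0.5
    # Round each side to the closest multiple of the divisor.
    w2 = round((w * scale) / div) * div
    h2 = round((h * scale) / div) * div
    # If rounding up overshot the area budget, round down instead.
    if w2 * h2 > max_area:
        w2 = int((w * scale) / div) * div
        h2 = int((h * scale) / div) * div
    return w2, h2

print(optimal_size(1920, 1080))  # -> (640, 384)
```

For the 16:9 input above, the rounded size (704, 384) would exceed the area budget, so both sides are rounded down, giving (640, 384): sides divisible by 64, area under 512*512, aspect ratio 1.67 vs. the original 1.78.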
Also, training time performance is 2-6x better, measured in a setup where running a step of gradient descent on a batch would take 3 seconds.
Benchmark results: https://pastebin.com/w9GVkZVb

```python
def benchmark(sampler, args, time_samples=1000, output_file=None):
    from time import perf_counter
    print("Benchmarking bucket sampler")
    bs = args.batch_size
    batches = len(sampler)
    total = bs * batches
    tt = time_samples
    results = {
        'name': str(type(sampler)),
        'shuffle': args.shuffle,
        'time_samples': tt,
        'num_batches': batches,
        'num_samples': total,
        'results': {}
    }
    # warmup: exercise the timer
    pself = 0
    for _ in range(tt):
        pself0 = perf_counter()
        pself1 = perf_counter()
        pself += (pself1 - pself0)
    # measure the average self-time of one perf_counter pair,
    # subtracted from every timing below
    pself = 0
    for _ in range(tt):
        pself0 = perf_counter()
        pself1 = perf_counter()
        pself += (pself1 - pself0)
    pself = pself / tt
    # warmup
    x = []
    for _ in range(tt):
        x.append(iter(sampler))
    # time creating the iterator
    x = []
    now = perf_counter()
    for _ in range(tt):
        x.append(iter(sampler))
    end = perf_counter()
    took = (end - now) - pself
    print(len(x))
    results['results']['iterator'] = {
        'total': took,
        'per_epoch': took / tt
    }
    # warmup
    x = 0
    for _ in range(tt):
        for b in sampler:
            x += len(b)
    # time iterating over batches
    x = 0
    now = perf_counter()
    for _ in range(tt):
        for b in sampler:
            x += len(b)
    end = perf_counter()
    took = (end - now) - pself
    print(x)
    results['results']['batches'] = {
        'total': took,
        'per_epoch': took / tt,
        'per_batch': took / (tt * batches)
    }
    # warmup
    x = 0
    for _ in range(tt):
        for b in sampler:
            for idx, w, h in b:
                x += (idx + w + h)
    # time iterating over individual samples
    x = 0
    now = perf_counter()
    for _ in range(tt):
        for b in sampler:
            for idx, w, h in b:
                x += (idx + w + h)
    end = perf_counter()
    took = (end - now) - pself
    print(x)
    results['results']['samples'] = {
        'total': took,
        'per_epoch': took / tt,
        'per_batch': took / (tt * batches),
        'per_sample': took / (tt * total)
    }
    if output_file is not None:
        from json import dump
        with open(output_file, 'w') as f:
            dump(results, f)
    return results
```
- make min/max side optional, infer from max area and side divisor
- use manhattan instead of ratio
- implement iterator interface for imagestore
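The "manhattan instead of ratio" item could look roughly like this (a hypothetical sketch, not code from this PR; `nearest_bucket` is an invented helper name): assign an image to the existing bucket whose `(w, h)` is closest in Manhattan (L1) distance, instead of comparing aspect ratios:

```python
def nearest_bucket(size, buckets):
    # Manhattan (L1) distance between the image size and each bucket size.
    w, h = size
    return min(buckets, key=lambda b: abs(b[0] - w) + abs(b[1] - h))

buckets = [(512, 512), (640, 384), (384, 640)]
print(nearest_bucket((600, 400), buckets))  # -> (640, 384)
```

An L1 distance on pixel dimensions is integer arithmetic throughout, so it keeps the "no spooky floats" property, whereas an aspect-ratio comparison reintroduces float division.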
idc