Refactor generate.py #1948
Conversation
Move the inject-templates machinery into generate.py.
Randomly drawn, in range and per unit, when not given.
@alejoe91 @h-mayorquin @yger @DradeAW ready to review. As a side effect, we will be able to refactor many tests based on this, to avoid generating toy examples and writing them to disk many times.
@h-mayorquin: the test you designed to check that get_traces does not consume more memory than allocated is not passing on macOS... It will be hard to debug from my side...
Can you run pre-commit? Also, tests are failing!
I unfortunately don't have the time to do an extensive review, but I think for my part it's ok!
I highlighted some details :)
Finally I fixed it.
for more information, see https://pre-commit.ci
tests are passing and pre-commit.ci is back
@samuelgarcia good on my side! Great work!!!
b = refractory_sample * 20
shift = a + (b - a) * x**2
spike_times[some] += shift
times0 = times0[(0 <= times0) & (times0 < N)]
The N here is not defined.
Thanks @h-mayorquin
@samuelgarcia can you make a PR to fix it?
Oops, thanks for this check. I will fix this.
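Since the thread leaves the fix implicit, here is a hedged sketch of what the clipping line presumably intends, assuming the undefined `N` stands for the number of samples in the segment. All names below (`num_samples`, the sizes, the 10% jitter fraction) are illustrative, not taken from the actual PR:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
num_samples = 30_000  # assumed stand-in for the undefined `N`
spike_times = np.sort(rng.integers(0, num_samples, size=100))

# jitter a random subset of spikes by a squared-uniform refractory shift,
# mirroring the a/b/x**2 lines quoted in the diff above
refractory_sample = 60
some = rng.random(spike_times.size) < 0.1
x = rng.random(some.sum())
a = refractory_sample
b = refractory_sample * 20
shift = a + (b - a) * x**2
spike_times[some] += shift.astype(spike_times.dtype)

# the fix under discussion: keep only spikes still inside the segment
spike_times = spike_times[(0 <= spike_times) & (spike_times < num_samples)]
```

After the filter, every retained spike index is guaranteed to be a valid sample position, which is what the `(0 <= times0) & (times0 < N)` mask was meant to enforce.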
num_channels: int,
sampling_frequency: float,
durations: List[float],
noise_level: float = 5.0,
I think that 1.0 is the sensible default here. Why 5.0?
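For context on the disagreement, `noise_level` is typically interpreted as the standard deviation of the additive Gaussian noise, so the "right" default depends on the template amplitudes used elsewhere. A minimal sketch of that interpretation (parameter values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
num_channels = 4
num_samples = 1000
noise_level = 5.0  # the contested default: std of the Gaussian noise

# zero-mean Gaussian noise whose per-sample std is noise_level
traces = rng.normal(loc=0.0, scale=noise_level, size=(num_samples, num_channels))
print(traces.std())  # close to noise_level for large sample counts
```

With this reading, 1.0 vs 5.0 is just a choice of noise amplitude relative to the default template peak amplitudes.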
def generate_lazy_recording(
def generate_recording_by_size(
I wish we went all the way and named this "generate_recording_by_memory_size", to be completely specific.
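The naming point matters because "size" here means memory footprint, not duration. A sketch of the size-to-duration computation such a helper presumably performs (the function name and defaults below are illustrative, not the actual API):

```python
import numpy as np

def duration_for_memory_size(full_traces_size_GiB, num_channels=384,
                             sampling_frequency=30_000.0,
                             dtype=np.dtype("float32")):
    # how many seconds of recording fit in the requested memory budget
    total_bytes = full_traces_size_GiB * 1024**3
    num_samples = int(total_bytes / (num_channels * dtype.itemsize))
    return num_samples / sampling_frequency

print(duration_for_memory_size(1.0))  # seconds of float32 traces per GiB
```

Making "memory" explicit in the name would prevent readers from assuming the argument is a duration or a sample count.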
OK, this new mode of NoiseGenerator is terribly slow for long recordings (one hour).
21 times slower (and that's after the improvements in #1948). Script:

```python
import time
import cProfile

from spikeinterface.core.generate import NoiseGeneratorRecording


def generate_noise():
    strategy = "on_the_fly"
    strategy = "tile_pregenerated"
    print(strategy)
    recording = NoiseGeneratorRecording(
        num_channels=32, sampling_frequency=30_000.0, durations=[3600], strategy=strategy
    )
    for i in range(5):
        print(i)
        x = recording.get_traces()
        del x
    return recording


# Profile the execution
profiler = cProfile.Profile()
profiler.enable()

start_time = time.time()
recording = generate_noise()
end_time = time.time()

profiler.disable()
profiler.print_stats(sort="cumulative")

print(f"Execution Time: {end_time - start_time:.5f} seconds")
print(recording)
```

Complete cProfile output:
Thank you @h-mayorquin for taking a deeper look at this. on_the_fly is slower because it randomly generates every sample for every channel!!!
Yeah, I actually like your solution a lot. I am hoping that at some point we can think of an optimization that gives us the advantage of your implementation (not requiring a constant memory block) without the speed cost. But as you say, right now let's keep the two options, as they are useful in two different scenarios.
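To make the trade-off concrete, here is a simplified, illustrative contrast of the two strategies (this is a sketch, not the actual implementation): "tile_pregenerated" draws one noise block up front and recycles it by wrap-around indexing, while "on_the_fly" redraws every sample on each access, which is why hour-long recordings pay such a large RNG cost:

```python
import numpy as np

noise_block_size = 30_000
num_channels = 32

def traces_tile_pregenerated(start, end, block):
    # index into one pregenerated block, wrapping around its length:
    # constant memory, near-zero RNG work per call
    idx = np.arange(start, end) % block.shape[0]
    return block[idx]

def traces_on_the_fly(start, end, seed, num_channels):
    # redraw the samples on every call: no stored block,
    # but O(samples * channels) RNG work each time
    rng = np.random.default_rng(seed)
    return rng.normal(size=(end - start, num_channels))

block = np.random.default_rng(0).normal(size=(noise_block_size, num_channels))
t1 = traces_tile_pregenerated(0, 100, block)
t2 = traces_on_the_fly(0, 100, seed=0, num_channels=num_channels)
```

The pregenerated variant trades a fixed memory block for speed; the on-the-fly variant trades speed for constant memory, matching the two scenarios discussed above.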
after_instanciation_MiB = measure_memory_allocation() / bytes_to_MiB_factor
memory_usage_MiB = after_instanciation_MiB - before_instanciation_MiB
expected_allocation_MiB = dtype.itemsize * num_channels * noise_block_size / bytes_to_MiB_factor
ratio = expected_allocation_MiB / expected_allocation_MiB
@samuelgarcia You modified this test so it always passes? : P
Hé hé, I see it now.
This is some kind of mistake.
Let's fix it later : D
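For the record, a hedged sketch of the intended assertion: the ratio should compare the *measured* memory usage against the expected allocation, not the expected value against itself (which is always exactly 1.0 and passes trivially). The numeric values below are illustrative placeholders:

```python
memory_usage_MiB = 3.7          # illustrative: measured allocation delta
expected_allocation_MiB = 3.66  # illustrative: itemsize * channels * block_size

# was: expected_allocation_MiB / expected_allocation_MiB (always 1.0)
ratio = memory_usage_MiB / expected_allocation_MiB
assert ratio <= 1.05, f"allocated {ratio:.2f}x the expected memory"
```

With the numerator corrected, the test actually fails when the recording allocates meaningfully more memory than the single pregenerated noise block should occupy.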
This big PR is a strong refactor of generate.py:
TODO: add the upsampling concept into InjectTemplatesRecording.
A major improvement will be to use this for testing APIs, algorithms, metrics, and visualization without any disk or memory use.