Seed behavior for generate #1963

h-mayorquin · 2023-09-06T04:57:25Z

I did not have time to participate in the review of #1948.

We added the following auxiliary function to generate:

spikeinterface/src/spikeinterface/core/generate.py

Lines 16 to 24 in 3359647

    
           def _ensure_seed(seed): 
        
               # when seed is None: 
        
               # we want to set one to push it in the Recordind._kwargs to reconstruct the same signal 
        
               # this is a better approach than having seed=42 or seed=my_dog_birthday because we ensure to have 
        
               # a new signal for all call with seed=None but the dump/load will still work 
        
               if seed is None: 
        
                   seed = np.random.default_rng(seed=None).integers(0, 2**63) 
        
               return seed

Which generates a different seed at runtime when no default seed is provided.

I think I understand the rationality, it tries to achieve two things.

The results of the generator should be the same after saving it and loading (@samuelgarcia, thanks for adding tests for this!).
Every call to a generator should generate new traces if the seed is not specified. Let's call this variability by default.

I don't agree completly with 2. When less experienced users utilize let's stay generate_ground_truth_recording with default parameters I want the function to have deterministic behavior. This is better for debugging and error finding as when I run the function on my system it wiill be as similar as possible as their run. With the current behavior given by ensure_seed I will not be able to reproduce their exact results unless I have their specific seed that was generated at runtime. This can be extracted but it will be difficult for new users and it will create one more round of questions and requests when discussing an issue. A more severe version of this problem is presented if users find an error when they run generate_ground_truth_recording (because the specific seed was problematic), but then they run the function again and the error dissappear. This undermines their trust in their algorithm (they just know something is wrong but can't reproduce) and makes it hard for them to report the problem to us.

I think that @samuelgarcia made this decision from a power user point of views as this is more convenient and it makes sense semantically: NoiseGeneratorRecording should generate different noise everytime you call it (just like np.random.rand). But I think error reporting and debugging is an imporant use case of these functions which I think is better served by having a specific seed by default. This is better for less experienced users. Power users that want to actually generate variability are in a better position to know what a seed is, change it themselves and roll a function like ensure_seed by themselves if needed.

Two other minor advantages of having a seed by default instead of generating at runtime:

It is better for provenance, kwargs saves exactly what the user inputed instead of a value generated on the fly.
Less code in our library and easier to document the behavior.

The text was updated successfully, but these errors were encountered:

h-mayorquin · 2024-07-04T23:04:17Z

I haved changed my view somehow and I think that most of the functions that interact with the user should actually have the behavior that @samuelgarcia implemented.

h-mayorquin added the core Changes to core module label Sep 6, 2023

h-mayorquin assigned samuelgarcia Sep 6, 2023

h-mayorquin mentioned this issue Jul 4, 2024

Seeding sorters #3131

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Seed behavior for generate #1963

Seed behavior for generate #1963

h-mayorquin commented Sep 6, 2023 •

edited

Loading

h-mayorquin commented Jul 4, 2024

Seed behavior for generate #1963

Seed behavior for generate #1963

Comments

h-mayorquin commented Sep 6, 2023 • edited Loading

h-mayorquin commented Jul 4, 2024

h-mayorquin commented Sep 6, 2023 •

edited

Loading