Expose `t_start` in BaseRecording #3117
# TODO: deprecate original implementations ###
TODO: this was messing up the diff so left for the end.
`t_start` in BaseRecording, extend tests and a couple of fixes.
We do use the time machinery of spikeinterface in neuroconv. I would like to participate in the review. Among the PRs that you have opened, where should I start?
Thanks @h-mayorquin! That would be great, I think:
Cheers!
What is the purpose of this? I think knowing how you intend this to be used would help clarify the debate and inform the API.
I find the testing a bit complicated to read. I think there is too much indirection in the fixtures. I will have to do a second reading, but I think that's a code smell.
Plus, it seems that many of the tests don't really belong here. Why are the memory tests not in the memory PR? If you are already iterating over the segments in some of the fixtures, you can just set their `t_start` attribute directly; you don't need a special method at the `BaseRecording` level to set `t_start`, since it is a public attribute of `BaseRecordingSegment`. I don't think that the tests for the interface should rely on the interface.
I am requesting changes here because I think some of the testing does not belong in this PR.
Parameters
----------
times : 1d np.array
    The time vector
times : int | float | 1d np.array
I personally would prefer not to overload this function and to create a new one instead: `set_t_start`.
What are the advantages of overloading this? How are you thinking about it?
But I think the kind of API we should have will become clear once I understand how you are envisioning this to be used.
Yes, I couldn't decide on this vs. separate functions, and @samuelgarcia suggested this approach. On balance I think I prefer it, as `t_start` and `time_vector` are mutually exclusive ways of setting custom times, so it makes sense to change them in one place. It would be slightly strange to call `set_t_start()` and have it remove the `time_vector` attribute. But I'm not sure either way.
Interesting. Thanks for sharing, I will add this as something to discuss at the meeting.
I guess that my take is the following:
We use `set_times` when the sampling of the recording is slightly irregular and we want more precision. This is a good name for what the function does; it has a clear purpose and semantics.
Why would we need to set `t_start`?
The use case that I can think of is that we would like to shift the whole recording to the right or the left in time. But if that is the use case, it would be better to have a method that shifts the recording times and works independently of whether times are handled internally with `t_start` and the sampling frequency or with a time vector.
In the first case, you shift `t_start` (in most cases from 0), and in the second you shift the time vector.
If it is not for shifting, I can't think of another use for setting `t_start`.
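The shifting idea described above can be sketched roughly as follows. This is only an illustration: `ToySegment` and `shift_times` are hypothetical names, not SpikeInterface classes or methods, and plain Python lists stand in for NumPy arrays.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySegment:
    """Hypothetical stand-in for a recording segment (not the SpikeInterface class)."""

    sampling_frequency: float
    num_samples: int
    t_start: Optional[float] = None            # regular sampling: start time in seconds
    time_vector: Optional[List[float]] = None  # irregular sampling: one time per sample

    def get_times(self) -> List[float]:
        # A time vector, when present, takes precedence over t_start.
        if self.time_vector is not None:
            return list(self.time_vector)
        start = 0.0 if self.t_start is None else self.t_start
        return [start + i / self.sampling_frequency for i in range(self.num_samples)]


def shift_times(segment: ToySegment, shift: float) -> None:
    """Shift all times of a segment in place, whichever representation it uses."""
    if segment.time_vector is not None:
        # Irregular case: shift every entry of the time vector.
        segment.time_vector = [t + shift for t in segment.time_vector]
    else:
        # Regular case: only t_start moves (it defaults to 0 when unset).
        start = 0.0 if segment.t_start is None else segment.t_start
        segment.t_start = start + shift


regular = ToySegment(sampling_frequency=10.0, num_samples=3)
irregular = ToySegment(sampling_frequency=10.0, num_samples=3, time_vector=[0.0, 0.11, 0.19])

shift_times(regular, 5.0)
shift_times(irregular, 5.0)
```

The caller never has to know which of the two representations a segment holds; that is the property being argued for.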
I think the use case would be if you had separate sessions in a single day, for example a session, a 10 minute break to change some equipment, and another recording session. If these sessions are held as different segments on a recording (or as separate recordings), the researcher may want to hold the true recording times (e.g. session one started at 1 pm, session two at 1.30 pm).
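As a small worked example of that use case, assuming regular sampling and start times expressed as seconds since midnight (the sampling frequency and helper name here are made up for illustration):

```python
SAMPLING_FREQUENCY = 30_000.0  # Hz, illustrative value only

# Session start times as seconds since midnight (1 pm and 1.30 pm).
session_t_starts = [13 * 3600.0, 13 * 3600.0 + 30 * 60.0]


def sample_index_to_time(sample_index: int, t_start: float, sampling_frequency: float) -> float:
    """Time of a sample in a regularly sampled segment that starts at t_start."""
    return t_start + sample_index / sampling_frequency


# The sample one second into each session, in seconds since midnight.
times_one_second_in = [
    sample_index_to_time(30_000, t_start, SAMPLING_FREQUENCY)
    for t_start in session_t_starts
]
```

Each segment keeps its own `t_start`, so the same sample index maps to a different wall-clock time in each session.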
OK, I changed my answer below: I think this type of case is not well served by the current implementation.
spaced timeseries data. Return the original recording,
recording with time vectors added and list including the added time vectors.
"""
times_recording = copy.deepcopy(raw_recording)
We have `clone` for this as an extractor method, but if you really require this, why make the raw recording fixture per session?
The benefit of the `raw_recording` fixture is that `durations` only needs to be defined once; the recording is then copied because `set_times()` works in place. But I agree it is a lot of indirection, and it is probably more readable to incorporate this into the individual fixtures, possibly with `DURATIONS=[...]` set at the top of the script?
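A rough sketch of that layout, with the durations defined once at module level and each helper building its own recording. The names, values, and the dict-based "recording" are all hypothetical stand-ins; in the real test module each helper would presumably be a pytest fixture building a SpikeInterface recording.

```python
# Segment durations in seconds, defined once for the whole test module.
DURATIONS = [10.0, 15.0]
SAMPLING_FREQUENCY = 100.0  # Hz, illustrative value only


def make_raw_recording():
    """Hypothetical toy 'recording': one dict per segment, with no custom times."""
    return [
        {"num_samples": int(duration * SAMPLING_FREQUENCY), "t_start": None, "time_vector": None}
        for duration in DURATIONS
    ]


def make_t_start_recording():
    """Build a fresh recording and set t_start directly on each segment."""
    recording = make_raw_recording()
    t_starts = [100.0, 200.0]
    for segment, t_start in zip(recording, t_starts):
        segment["t_start"] = t_start
    return recording, t_starts
```

Because every helper builds a fresh recording, no deep copy of a shared fixture is needed, and `t_start` is set on the segments without going through the interface under test.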
Thanks @h-mayorquin! The problem is that
Agree the tests are kind of messed up, in adding these for the
Tomorrow I can try to remove some of the indirection in the tests and remove tests that do not correspond directly to this PR, or at least are failing if included in this PR. WDYT?
Yes, I think this is at the core of why I don't like this approach. It requires the final user to be aware of internal details of spikeinterface. I feel it mixes how spikeinterface handles time internally with what the user wants from their recording object. Let me elaborate: I think that a good interface will abstract those implementation details away from the user and will allow them to express the things that they want. Let me illustrate a different approach with an example:
What I claim is that this should be as easy as:
set_times(times_from_ttl)
set_segment_start_time(t=time_when_the_second_segment_started)
The interface proposed in this PR will say: sorry, those things are exclusive because of this internal spikeinterface reason; what you need to do is compose the time_vector on your own, shift it, and then use set_times differently for each segment. What I am proposing is that we have a
Advanced users or testing cases like the one in this PR can modify the attributes directly for stuff that is not covered.
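To make the proposed composition concrete, here is a minimal toy model in which the two calls do not exclude each other. `ToySegmentTimes` is hypothetical, `set_segment_start_time` is the method name proposed in this comment and is not an existing SpikeInterface method, and the TTL times are made up. The toy always stores a full list of times, so it ignores the memory-efficiency reason for keeping `t_start` internally.

```python
from typing import List


class ToySegmentTimes:
    """Hypothetical model of one segment's times; not the SpikeInterface implementation."""

    def __init__(self, num_samples: int, sampling_frequency: float):
        # Default times: regular sampling starting at 0.
        self._times = [i / sampling_frequency for i in range(num_samples)]

    def set_times(self, times: List[float]) -> None:
        if len(times) != len(self._times):
            raise ValueError("times must have one entry per sample")
        self._times = list(times)

    def set_segment_start_time(self, t: float) -> None:
        # Re-anchor the segment at t while keeping the spacing between samples.
        offset = t - self._times[0]
        self._times = [x + offset for x in self._times]

    def get_times(self) -> List[float]:
        return list(self._times)


times_from_ttl = [0.0, 0.5, 1.25, 2.0]  # made-up, slightly irregular
segment = ToySegmentTimes(num_samples=4, sampling_frequency=2.0)
segment.set_times(times_from_ttl)
segment.set_segment_start_time(t=1800.0)
```

In this model the irregular spacing from `set_times` survives the later change of start time, which is the behaviour the comment asks for.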
Thanks Herberto, I like the semantics of `set_segment_start_time`, but I think this still requires the user to understand how SpikeInterface is representing time internally, and creates some hidden dependencies that could be even more confusing. At present there is a hidden dependency, but at least the dependency is mutual exclusivity, so the two concepts can't interact and you only really need to track the one you want to use. In the proposed case there are some interactions going on under the hood which could end up in confusing behaviour. For example:
I think this is a very important discussion to have but it would require a major reworking of how SpikeInterface handles time which I think is outside the scope of the PR. I think we should make an issue to discuss this, I'd be very keen as it is important for #3072. For now, let me better motivate this PR with respect to the existing implementation of times:
Let me know what you think of the above. I think there is definitely room for further optimisation of the API generally, but I think this PR represents a digestible improvement on the current situation (but alternatively would agree with removing
Yes, that's how I am thinking about it. This is the way it is now. I don't understand the first two examples, but maybe they rely on this behavior not being available? Maybe not? If not, can you illustrate the examples?
This one I agree is confusing. The current behavior is that
Yeah, maybe it is an improvement, but I don't think it is a big one. Setting the t_start at the segment level is fine for the purposes of this PR, for example, which is testing, but if we are going to discuss how to expose a simpler API to the end user then I want to have the more general discussion that we are having. We should not try to remove the concept of handling the internal timings with t_start, because that is there for memory efficiency reasons, but maybe the user does not need to interact with it. Another possibility, besides the one I am proposing, is only setting times with a time vector but transforming it to t_start and sampling_frequency internally if the times are regular enough. In general I think we should separate internal handling from how the user interacts with it and avoid a power-user bias.
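The last possibility (accept only a time vector and collapse it to `t_start` plus a sampling frequency when it is regular enough) could look roughly like this. The function name and the tolerance are assumptions for illustration, not anything in SpikeInterface, and a plain list stands in for a NumPy array.

```python
from typing import List, Optional, Tuple


def to_regular_representation(
    times: List[float], rtol: float = 1e-6
) -> Optional[Tuple[float, float]]:
    """Return (t_start, sampling_frequency) if times are evenly spaced, else None."""
    if len(times) < 2:
        return None
    mean_step = (times[-1] - times[0]) / (len(times) - 1)
    if mean_step <= 0:
        return None
    # Every step must match the mean step within the relative tolerance.
    for previous, current in zip(times, times[1:]):
        if abs((current - previous) - mean_step) > rtol * mean_step:
            return None
    return times[0], 1.0 / mean_step


regular = to_regular_representation([10.0, 10.5, 11.0, 11.5])
irregular = to_regular_representation([0.0, 0.5, 1.25])
```

When the function returns `None`, the caller would keep the full time vector; otherwise only two numbers need to be stored.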
Thanks! Agree these can be split into two separate discussions. Would you agree with:
Mmm, this is just my opinion; I don't know if I have convinced you. I think that Sam will side with you if you want to make your life more convenient here. But I think not using this machinery for the tests is a good idea regardless of whether we make this a user interface or not.
I think it's worthwhile discussing further: if we pick this lane (i.e. what is introduced in this PR) and write the documentation, it will be hard to undo, and there may be better options. I definitely agree there is room to make this cleaner and more intuitive; for me, though, it is safer to keep
This PR exposes `t_start` by allowing it to be set through `set_times()`. Now the `times` parameter of `set_times` can take a vector (`np.ndarray`) or `int | float`. If the former, it is treated as a time vector, and if the latter, as `t_start`. Previously `t_start` was only available if set when loading from file.

The PR extends the tests to cover the `t_start` cases and another couple of cases such as `save_to_memory()`, which currently has a bug (#2515). In creating these tests another couple of small issues were found; I cherry-picked the fixes to different PRs to keep the diffs easier to review. For ease, though, if everyone is happy, all can stay here. They will fail until this branch is rebased back on master once the other PRs are reviewed.
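The overloading described in the PR summary can be illustrated with a toy model. This is a sketch under assumptions, not the code in the PR: `ToySegment` is hypothetical, a plain sequence stands in for `np.ndarray`, and the mutual exclusivity follows the discussion in the review thread.

```python
from numbers import Real
from typing import Sequence, Union


class ToySegment:
    """Hypothetical segment holding either t_start or a time vector, never both."""

    def __init__(self, num_samples: int):
        self.num_samples = num_samples
        self.t_start = None
        self.time_vector = None

    def set_times(self, times: Union[int, float, Sequence[float]]) -> None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(times, Real) and not isinstance(times, bool):
            # Scalar: treat as t_start and drop any existing time vector.
            self.t_start = float(times)
            self.time_vector = None
        else:
            # Sequence: treat as a time vector and drop any existing t_start.
            times = [float(t) for t in times]
            if len(times) != self.num_samples:
                raise ValueError("time vector must have one entry per sample")
            self.time_vector = times
            self.t_start = None


segment = ToySegment(num_samples=3)
segment.set_times(5)                # scalar, stored as t_start
segment.set_times([0.0, 0.1, 0.2])  # vector, replaces t_start
```

The second call silently discards the earlier `t_start`, which is the coupling between the two representations that the reviewers debate.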