-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
timeseries that can handle strings #43
Conversation
Hi, thank you very much for the PR. This PR is quite large; for historical reason, this library currently does not have unit tests, hence a large change like this is quite risky. Is it possible to split this into multiple smaller PRs? This would make review much easier and much less risky. Thank you very much in advance. |
xdf.h
Outdated
@@ -174,7 +175,7 @@ class Xdf | |||
* | |||
* \sa offsets | |||
*/ | |||
void detrend(); | |||
//void detrend(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please uncomment.
Please also address the CI run issues above; seems only the ubuntu run finished successfully. Thank you very much! |
I looked through the details of the CI run issues, and I think they are running older versions of C++ that cannot handle std::variant. Both machines I have been using to develop/test this are M-series Macs, and it compiles just fine on both of them - but I verified that both used compilers with C++17 (or later) first. This is my first experience with a PR, and I'm not entirely sure what the CI process consists of. (Apologies for your needing to deal with an amateur.) If I'm reading the run issues right, the two options are either (1) try again with a newer compiler or (2) use a template instead of std::variant. Let me know your thoughts. I'm happy to assist in any way I can (just realize that I'm learning on the job here). |
Thank you very much! This library was written many years ago, and looking back from today's perspective, there is a lot that can be improved. Thank you for making the step forward! Regarding C++17: I am all for a newer version, potentially even targeting C++20 if possible. Regarding std::variant: I personally think template would probably be more suitable (but I would need to take a closer look on feasibility). Would you mind making a separate PR bumping the C++ version to 20? Alternatively I can make one later today as well. Thank you very much. |
xdf.cpp
Outdated
} | ||
} else { | ||
//read the data | ||
if(streams[index].info.channel_format.compare("float32") == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can extract a const to reuse.
xdf.cpp
Outdated
float data; | ||
Xdf::readBin(file, &data); | ||
streams[index].time_series[v].emplace_back(data); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be better to extract a function to re-use code (inside an anonymous namespace):
namespace {
// Reads data of type T up to `channel_count` and appends
// to `stream.time_series`.
templace <typename T>
void append_data(Stream& stream) {
for (int v = 0; v < stream.info.channel_count; ++v) {
T data;
Xdf::readBin(file, &data);
stream.time_series[v].push_back(std::move(data));
}
}
} // namespace
Xdf::Xdf ...
and use it like
append_data<double>(streams[index]);
xdf.cpp
Outdated
|
||
// Fill inbuf with the numeric values from the row | ||
for(auto& val : row) { | ||
std::visit([&inbuf, &read](auto&& arg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These probably would be more readable with std::transform
together with template.
I only have a general comment on the structure of these changes (since my C++ is very limited these days). The PR is quite large mainly because it simplifies control flow by removing or summarizing redundant parts. Would it be clearer if the overall structure remained the same and only the string-related parts were adjusted? This would probably also make it easier to review. Also, some changes don't follow the library's four-space indentation style. Please make sure that each block is indented by four spaces (a few are only indented by two). |
Out of curiosity, can you check to see if this works now that it's running C++20? My C++ is very limited, and while I understand that a template solution is better than what I've done, it's not something I have the bandwidth to figure out right now, unfortunately. |
Hello, I just tried running CI again and seems there are some compiling errors. Would you mind helping taking a look? Once it compiles I'd be happy to approve the change; we can figure out template and other improvements in the future. Thank you very much. |
Before merging this PR, would it be possible to test the library with the XDF files available here: There are several XDF files in the corresponding folder, which should work with the current release. Since we don't have unit tests, we should at least make sure that we do not introduce regressions that lead to errors loading these files. |
I can look into the errors tomorrow and check to make sure it works on the test cases. Thanks! |
I believe this should be working now. Let me know if there are still issues - I did test the various files from https://osf.io/uc7wn/, with the only issue being the one noticed in PR#26. |
I should also add - I am about 99% sure that event_map is no longer being used in this version. I have not removed that code, but it may be extraneous at this point. |
Thank you; I was traveling in the last few days and will take a closer look this week. |
Sorry for the delay; looks great! Thank you very much for your contribution. |
@Yida-Lin |
Also, I now get errors when building SigViewer with these changes:
It seems related to the |
Thanks @Yida-Lin, I think templates would indeed be a preferable solution here. |
Hi, sorry it took a while, I am looking into addressing @cbrnr 's issue above with compiling with SigViewer. But I just realized an issue: why did we want to store strings in Because SigViewer relies on the If we just want to fix #19 , then I think the fix should be on If I understood correctly, #19 was not complaining about libxdf not supporting strings, but rather the dimension is incorrect (only reading N instead of M x N). Sorry I overlooked this earlier, would appreciate some clarifications. Thanks! @melissakey |
Yes, that's exactly right. |
Thanks @cbrnr . If that's the case, unfortunately we'd need to revert this PR. The fix should be applied to the usage with regard to Please let me know if I misunderstood anything; otherwise I will apply the revert and the new fix some time this week. Thanks! |
I would think so, yes, because this PR also explicitly mentioned to fix #19 (which BTW is still open). Can you confirm @melissakey? |
The challenge with eventMap goes beyond reading N instead of M x N. The eventMap object appears to be designed for stream-agnostic data such as notes. In fact, in the demonstrations of LSL that my data originates from, multiple notes were produced that belong in a 1-channel, stream-agnostic location. eventMap is perfect for that, and if all the strings were notes, the original code would work fine. In addition to those notes, I also have M x N streams coming from a cognitive task where some of the fields are strings. In the current version of the output, the cog task produces 4 streams, each with a different number of columns, 2 of which contain strings. In each of these streams, both M and N are stream-specific, which means that eventMap, as currently conceived, is not appropriate - partially because of the M X N issue, but mostly because it's not stream-specific. For this task, there is no scenario in which we can remove strings altogether. One of the fields will contain a randomly generated list of numbers/letters where the length is determined by the difficulty of the task. More generally, we pair a lot of cognitive assessment data containing a mixture of strings and numeric data with physiological data that is typically numeric. Looking at the test case from #19, it's not clear whether that string data should be treated as stream-agnostic event data requiring multiple fields (e.g. appropriate to eventMap, but with more flexibility) or a stream (more similar to what I have). I don't know much about the interface with SigViewer, so I'm not entirely sure what to say on that account. How would it/should it handle Pandas DataFrames with mixed-type data? |
I see. Thanks @melissakey . So how about this: looks to me that Let me know what you think, cc @cbrnr |
I trust your expertise @Yida-Lin! If SigViewer needs to be updated, it would be great if you could do that too! |
This fix allows the libxdf package to support strings (issue # 19). I also cleaned up "case 3" in the load_xdf file. part is required to support strings, but a lot of it is just reducing repetitive code.
Note that string support requires C++17 for std::variant