-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds the StreamingBinWriter class that allows bin data to be streamed… #29
Conversation
… to disk instead of stored in memory. To facilitate this, the Writer class was updated as follows: 1. The output filename and extension are now member variables initialized with the an object of the class. 2. Creation of the directory was moved to an externally visible method to allow a user of objects of the Writer class to create the directory explicitly (StreamingBinWriter needs a valid path to write to on initialization).
Hey @mscf, the code looks solid to me, however when I run the tests on the
I checked out the line that caused the error
Could it be that |
I think you're right @ephathaway - @mscf this could be because Justus developed this project using mypy and enforcing stricter typing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting some comments on your review to start with - thanks for busting this out so quick @mscf !
mffpy/tests/test_writer.py
Outdated
@@ -110,3 +111,58 @@ def test_writer_exports_JSON(): | |||
remove(filename) | |||
except BaseException: | |||
raise AssertionError(f"""Clean-up failed of '{filename}'.""") | |||
|
|||
##START## |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start streaming bin writer testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added that to denote where I started adding tests and shortly thereafter stopped seeing it at all. I will delete it.
mffpy/tests/test_writer.py
Outdated
assert layout.name == device | ||
# cleanup | ||
try: | ||
remove(join(dirname, 'info.xml')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit picking - but would rmtree
on the dirname
accomplish the same thing in a single line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I had just copied the test case for the non-streaming and only changed those things I needed to. I can make this change to.
Subclass of BinWriter to support streaming bin file to disk. | ||
""" | ||
|
||
def __init__(self, sampling_rate: int, mffdir: str, data_type: str = 'EEG'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are then any limitations / expectations for sampling_rate
, mffdir
, or data_type
that are non-obvious from the casual read?
(i.e. We don't support smb file paths or some such)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments.
Specifically, as regards the mffdir value, the directory must exist before the object is created, and this is noted. As far as limitations to where the folder can exist: this utilizes the same underlying functionality as the builtin open() and so has the same characteristics. These characteristics are shared by the BinWriter class as it uses open() directly to write the file.
mffpy/tests/test_writer.py
Outdated
num_channels = 256 | ||
sampling_rate = 128 | ||
# create an mffpy.Writer and add a file info, and the binary file | ||
W = Writer(dirname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meaning is obvious within this context, but let's avoid single letter variable names as a matter of practice (as annoying and self-referential that kind of is for such aptly named classes). Maybe
test_writer
for W
, str_bin_writer
for b
, test_reader
for R
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, another case of just copying the testcase and only changing what needed to be changed.
@@ -34,25 +34,31 @@ def __init__(self, filename: str): | |||
self.filename = filename | |||
self.files: Dict[str, Any] = {} | |||
self._bin_file_added = False | |||
self.mffdir, self.ext = splitext(self.filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in guessing that the intent here is to populate / prepare the filename variable at the time the Writer
is instantiated rather than when a call is made to actually write the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because in order to stream a file to a directory that directory needs to be created first, so I needed some means of accessing the name of the directory to pass to the bin writer.
mffdir += '.mff' | ||
makedirs(mffdir, exist_ok=False) | ||
|
||
self.create_directory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not disagreeing, but why break out this functionality into its own function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because currently it only creates the folder when write is called. Write needs to be called only after all of the data is collected, but in order to stream the EEG data to a file, the directory needs to already exist.
Updates code/comments to address issues raised by code review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the tests on the updated branch and they are all passing! Besides the one comment I left, I would not make any changes.
@@ -39,6 +39,7 @@ def __init__(self, filename: str): | |||
self.file_created = False | |||
|
|||
def create_directory(self): | |||
"""Creates the directory for the recording.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the .mff directory contains a lot of extra info files besides just the signals, we might want to edit the description to say we are creating the directory for all parts of the .mff file.
… to disk instead of stored in memory.
To facilitate this, the Writer class was updated as follows: