Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-memory MDF #35

Merged
merged 15 commits into from
May 1, 2021
Merged

In-memory MDF #35

merged 15 commits into from
May 1, 2021

Conversation

jonschumacher
Copy link
Member

@jonschumacher jonschumacher commented Apr 23, 2021

I am currently working on an implementation of an in-memory MDF.

Missing things:

Since I strictly enforce a "missing" on every non-optional key combined with checking data integrity against the MDF specs, this PR should also stop issues like #34 turning up.

@tknopp: I don't really get the failing test. The two system matrices are equal but the test fails anyways. I will further investigate this next week, but if you have a spontaneous idea, I would be glad to hear it.

@tknopp
Copy link
Member

tknopp commented Apr 24, 2021

I have no spontaneous idea. Might be something with the transfer function.

Need to review the PR in more detail but on a first sight this all looks pretty good. The validation of MDF files is something we wanted for a long time. My feeling is that the entire conversion infrastructure should work on the inMemory MDF and we might then want to give up the dict infrastructure. I suppose that this is the plan anyway since we want the inMemory type to be used directly in MPIMeasurements.

Fixed deviation error by removing side-effect in systemMatrix()
@jonschumacher jonschumacher marked this pull request as ready for review April 27, 2021 14:58
@jonschumacher
Copy link
Member Author

jonschumacher commented Apr 27, 2021

I solved the issue of the failing test. This line

blockLen = measBGFrameBlockLengths( invpermute!(measIsBGFrame(f), measFramePermutation(f)) )

had a side-effect which I solved using deepcopy(). I hope this is not a big performance issue. By the way: Why is there
function systemMatrix(f::MDFFileV2, rows, bgCorrection=true)
and
function getSystemMatrix(f::MPIFile, frequencies=nothing;
. They seem to do partly the same.

I hope, I considered everything and this is now ready to merge.

@jonschumacher jonschumacher changed the title WIP: In-memory MDF In-memory MDF Apr 27, 2021
@tknopp
Copy link
Member

tknopp commented Apr 28, 2021

perfect Jonas. Give me please some time to let me go over the PR and then let me merge it.

Regarding getSystemMatrix: This definitely needs some cleanup and one can see that the code evolved over time. We probably want to get rid of getSystemMatrix. But on the other hand getSystemMatrix is useful for all MPIFile implementations while systemMatrix aims to isolate the type specific part. So it seems we need both but I agree that the function names are really confusing right now. Maybe systemMatrix for the public interface and systemMatrixInternal or systemMatrixLowLevel for the internal interface.

@tknopp
Copy link
Member

tknopp commented May 1, 2021

This looks overall very good. I testet MPIReco which also still runs fine and thus merge this now.

@tknopp tknopp merged commit 5389925 into master May 1, 2021
@tknopp
Copy link
Member

tknopp commented May 1, 2021

@jonschumacher please delete the branch on your own if you don't need it anymore.

@tknopp
Copy link
Member

tknopp commented May 1, 2021

regarding all the smaller hacks with existing MDF files. In the long rund we should just fix the issues in the underlying test files. It seems that only the MDF1 files are problematic here.

Regarding the interface I found it interesting that the in-memory version was initialized from the lazy one. While this definitely makes sense internally we might want to make the an interface where one creates the in-memory version from a filename. This would also imply that the HDF5 handle is directly closed. Or could be explicitly closed without the need to wait for the garbage collector (we have some issue with HDF5 files being closed too late due to garbage collection being asynchrone and non-deteministic).

@jonschumacher jonschumacher deleted the JS/mdf-in-memory branch May 5, 2021 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants