
Realtime recording note analysis #6

Open

wants to merge 12 commits into default
Conversation

@novikov-alexander commented Jul 11, 2024

This PR is a rough draft for a real-time note analysis feature. Such a feature would be helpful for vocal and instrumental real-time performance analysis, particularly in music performance teaching or studio recording quality assurance.

The idea of the PR is to create temporary layers that are filled via the pYIN plugin each time the record target emits recordDurationChanged. The filled layer models are then copied into layers attached to the view. Analysis is done with some overlap so that notes are not split at analysis window boundaries; since consecutive windows overlap, I propose merging the notes found in the overlapping regions.

This PR serves as a proof of concept which works amazingly well. However, I'm not satisfied with the constant creation of temporary layers. I believe it is possible to reuse the same temporary layers again and again. Moreover, I'm unsure if there's a way to continuously retrieve features from pYIN. If continuous retrieval is possible, my current approach might not be the best, and it could be better to establish a reactive stream of features from pYIN to the note layers.

I acknowledge that there is some buggy code that needs refactoring, and the UI in the menu should be improved. I'm ready to split the PR into smaller chunks or perform additional refactoring as necessary.

Additionally, I'm uncertain whether the new functionality should be implemented within analyseNow, reanalyseSelection, or as a separate analyseRecording function, as I have done.

I would appreciate any recommendations on this topic and am prepared to enhance this feature or related aspects, as I have spare time over the next couple of months.

@novikov-alexander (Author)

I apologize for the formatting changes. I noticed that some indentations were made with tabs and others with spaces. Additionally, there were extra spaces on new lines. Visual Studio automatically standardized it, but I can attempt to revert it back.

@cannam (Member) commented Jul 12, 2024

This is very interesting! There's a lot to digest here but I've given it a quick test and it clearly works.

pYIN can produce streamed output if initialised with the fixed-lag option set, though I think only for fundamental frequency data - the note segmentation is always produced at the end, if I remember correctly.

I think the SV feature-extraction model code is prepared to "stall" while providing input to the plugin, waiting for the record input to supply more data. So in theory the whole thing might run as a continuous stream through to a single output layer - only without the note segmentation, which would presumably require running another extraction at the end. But I'm not confident about that.

It's possible that your approach might work better even if it does look clumsier, just because it produces all output layers and doesn't risk changing its mind about their contents at the end.

@novikov-alexander (Author)

That's very pleasant to read!
Let's stick with the current approach for now, and we can change it later.
I know you are busy, but I propose we discuss a rough plan at a comfortable pace. I can try to dedicate a few coding sessions next month to complete it.

I think I need to:
1. Improve the merging algorithm in case more than two notes overlap.
2. Calculate note value (frequency) as a weighted sum of merged parts.
3. Delete duplicated pitch tracks in overlapped areas.
4. Delete temporary layers on each iteration, or even better, reuse them.
5. Create a selector between "analyze after recording" and "analyze during recording."

What other important points would you suggest we consider?

Thank you!
