[ENHANCEMENT] Default Modulators #205
Comments
I've created an MD file with the proper proposal: |
Just to clarify: The proposal suggests that DMOD contains modulators that would be applied on instrument level only? Not on preset level? |
I worded that poorly. It just works exactly like the stock SF2 default modulators list. I've changed the MD now. |
Some details are still not clear to me. (Sorry for abusing this issue as discussion).
Current state as per SF-spec
The SF Spec essentially defines 3 levels of modulators:
Number 3. can override 2., and 2. can override 1. How that works in detail is clearly defined by the spec.
When reading your proposal, the role of the DMOD chunk is unfortunately not yet clear to me. One possible interpretation: You're trying to introduce an additional level of modulators between 1. and 2. Another possible interpretation: The DMOD chunk shall fully replace or supersede the modulators mentioned in no. 1. Yet another possible interpretation: The DMOD chunk shall only serve for informational purposes to SoundFont editors, allowing them to more cleanly separate "general-apply-to-all-instrument" modulators from "instrument-specific" modulators.
The section "Default modulator behavior" is not really helpful, because it doesn't add anything to what the spec already says (whether you remove a zero-amount modulator or keep the amount at zero is IMO just an implementation detail). The final sentence
suggests that my third interpretation is unlikely, but it doesn't help me to decide between my first and second interpretation. So I think you need to update the description of the DMOD chunk by pointing out its role, its purpose, and how it differs from the default modulators dictated by the SF spec (no. 1). The "stakeholder" of this chunk must become clear, i.e. is it just SoundFont editor apps, or do synths also have to account for it?
And a final remark: Please keep in mind that SF2 is an old yet well-established standard. Things will not change. Implementations will not change. If you want this feature to be generally accepted and usable, this customization must be backward-compatible. By that I mean that applying, i.e. structurally saving, those DMOD modulators to each and every instrument zone in IMOD for backward compatibility reasons should be seriously considered, IMO. |
@derselbst, what I meant is replacing the level 1 modulators. Essentially, with or without DMOD chunk, the default SF2 modulators are there. When the DMOD chunk is read, the modulators are added to that list, and the identical ones override the sf2 default ones. For example, assume a DMOD chunk of 2 modulators:
The default modulators for this soundfont will be:
I hope this clears things up. |
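The override rule described in this comment can be sketched roughly as follows. This is only an illustration, not code from any existing synth: the `Modulator` struct, `same_modulator` and `merge_default_modulators` are hypothetical names, and the identity test (everything except the amount must match) is an assumption about what "identical" means here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified modulator record, loosely modeled on the SF2 modulator fields. */
typedef struct {
    uint16_t src;       /* source operator */
    uint16_t dest;      /* destination generator */
    int16_t  amount;    /* modulation amount */
    uint16_t amt_src;   /* amount source operator */
    uint16_t transform; /* transform operator */
} Modulator;

/* Assumption: two modulators are "identical" when all fields except the amount match. */
static int same_modulator(const Modulator *a, const Modulator *b) {
    return a->src == b->src && a->dest == b->dest &&
           a->amt_src == b->amt_src && a->transform == b->transform;
}

/*
 * Build the effective default list: copy the built-in SF2 defaults, then apply
 * each DMOD entry, overriding an identical built-in or appending a new one.
 * `out` must have room for n_builtin + n_dmod entries. Returns the entry count.
 */
static size_t merge_default_modulators(const Modulator *builtin, size_t n_builtin,
                                       const Modulator *dmod, size_t n_dmod,
                                       Modulator *out) {
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < n_builtin; i++) out[n++] = builtin[i];
    for (size_t i = 0; i < n_dmod; i++) {
        size_t j = 0;
        while (j < n && !same_modulator(&out[j], &dmod[i])) j++;
        if (j < n) out[j].amount = dmod[i].amount; /* override the existing entry */
        else out[n++] = dmod[i];                   /* append a new default */
    }
    return n;
}
```

With two built-in defaults and a DMOD chunk that zeroes one of them and adds one new modulator, the merged list would hold three entries: the untouched built-in, the overridden one, and the new one.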
Ok, thanks for the example. So it turns out my first interpretation is the correct one. The purpose of DMOD is to introduce an extra level of modulators. This leads to the following modulator hierarchy:
Yet, for best compatibility, I would again like to recommend saving all the modulators of the DMOD chunk to each and every instrument zone in the IMOD chunk. Old synthesizers would play the soundfont correctly, while SoundFont editors can more easily recognize file-specific default modulators. IIRC, Polyphone already goes through all modulators in IMOD to figure out which are meant as default ones. So, if my last backward compatibility idea is considered, one might ask what the added value of a DMOD chunk would be. If it is not considered (since that is essentially "The Problem" you're trying to solve with DMOD), the question remains whether this solution would be adopted by enough SF2 implementations to ultimately find acceptance among users. I'm having twisted minds here... |
Well, that's what I'm hoping for. This is essentially an extension like the vorbis sf3 extension. Many players (like meltysynth, SF2Lib or tinysoundfont, for example) don't support it at all. Actually, I know only 3 players with sf3 support: fluid, bass, and spessasynth... Since most people probably only use BASS or fluid anyway, these three (poly, fluid, bass) implementing this should be enough for it to get widely adopted. After all, MuseScore invented the sf3 format, and since MuseScore is popular, the format became widely supported. And that's what I would like to happen here: the three major sf2 tools supporting this chunk would make other players add support for it. It's also what I hope will happen with the SF2 RMIDI format, but that's unrelated. Maybe we could only use DMOD with dwMajor set to 3? Since the sf3 format can contain uncompressed samples, it will act like a regular sf2, but since the dwMajor is 3, it automatically rules out incompatible synths. |
Sorry for not being more reactive on this interesting subject. I had quite a lot of work with the previous version of Polyphone and I am currently busy with various life projects...
So globally there is a need for a soundfont format update and I completely agree with this. Some years ago I started this (based on the sfz capabilities and the different user feedback): And now I am discovering: And I still need to correctly understand this ticket:
Aside from this, MuseScore created the sf3 format, which is sf2 with compressed sample data, as you know well. On my side I recently added the "release mode" for samples inside an instrument, so that playback starts when the key is released (this is a personal wish since I use soundfonts to play organ). Viena (SynthFont) also added a property but I don't remember it well (something like the velocity modifying the attack).
This context shows that the different actors should agree on a common target. My position is that I will show no resistance to upgrading the format, but I unfortunately lack the time to manage the whole process and am also a bit afraid of implementing updates and forcing others to follow the movement, thus creating tensions.
Now, back to the default modulator subject: Polyphone could display them when clicking on the soundfont header, as you proposed by email. This is a very good idea, so that we know exactly how a soundfont is played. It could be possible to change the default modulators without changing the 2.04 format though, with extra processing for displaying the content of a soundfont within Polyphone. If all instruments have all default modulators defined as the first modulators, Polyphone can gather all common instrument modulators and then display them at the soundfont level instead of the instrument level. Other soundfont editors would however display all modulators for each instrument, and it might be harder to distinguish the default modulators from the others, but this may not be that important if Polyphone remains in use. This system has the advantage of staying supported by all soundfont readers, and I thus have the same recommendation as @derselbst: changing the sf2 format for this particular purpose is not needed.
The other and proper way is to update the soundfont format as you propose, but I need a common, well-specified target (including other upgrade needs) with at least the agreement of the fluidsynth team. Maybe we should still use the .sf extension and simply increase the internal version number while progressively supporting more chunks and more sound properties? |
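The "gather all common instrument modulators" idea could look roughly like the sketch below. It is a guess at the general technique, not Polyphone's actual code: `Modulator`, `ModList` and `common_modulators` are hypothetical names, each instrument is reduced to one flat modulator list (real files store modulators per zone), and a modulator only counts as common when every field, including the amount, matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified modulator record, loosely modeled on the SF2 modulator fields. */
typedef struct {
    uint16_t src;
    uint16_t dest;
    int16_t  amount;
    uint16_t amt_src;
    uint16_t transform;
} Modulator;

/* One instrument's modulators (flattened for this sketch). */
typedef struct {
    const Modulator *mods;
    size_t count;
} ModList;

static int equal_modulator(const Modulator *a, const Modulator *b) {
    return a->src == b->src && a->dest == b->dest && a->amount == b->amount &&
           a->amt_src == b->amt_src && a->transform == b->transform;
}

static int list_contains(const ModList *list, const Modulator *m) {
    for (size_t i = 0; i < list->count; i++)
        if (equal_modulator(&list->mods[i], m)) return 1;
    return 0;
}

/*
 * Copy into `out` every modulator of the first instrument that also appears,
 * unchanged, in all other instruments. `out` must hold lists[0].count entries.
 */
static size_t common_modulators(const ModList *lists, size_t n_lists, Modulator *out) {
    size_t n = 0;
    if (n_lists == 0) return 0;
    assert(lists != NULL && out != NULL);
    for (size_t i = 0; i < lists[0].count; i++) {
        const Modulator *candidate = &lists[0].mods[i];
        int everywhere = 1;
        for (size_t k = 1; k < n_lists && everywhere; k++)
            everywhere = list_contains(&lists[k], candidate);
        if (everywhere) out[n++] = *candidate;
    }
    return n;
}
```

An editor could show the returned entries at the soundfont level and hide them from the per-instrument view. The detection silently stops working as soon as one instrument's copy is edited on its own.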
Thanks for this response, Davy.
About the Default modulators:
Derselbst's approach
It relies on Polyphone grouping modulators from all instrument zones into default modulators
pros
cons
Spessasus's approach
It relies on a custom [...]
pros
cons
To be fair, if Polyphone remains the dominant sf editor (which it probably will), option 1 might be the best approach. If that gets implemented, I'll remove the proposal from the repo.
Stgiga's wBank proposal
Here's how I understand it: SF2 always had the problem of lacking support for the bank LSB message, making it incompatible with the XG and GM2 bank selection systems. Some synthesizers include hacks to circumvent this, like reacting to bank MSB as a drum toggle and to LSB as the soundfont's bank select. But this solution isn't perfect. What #179 discovers is that the [...] So what stgiga suggested was to use the top byte of [...]
    // storing bank LSB 60 and MSB 5: 7685 = 60 * 128 + 5
    int wBank = 7685;
    char bankMSB = wBank & 127; // low 7 bits: 5
    char bankLSB = wBank >> 7;  // remaining high bits: 60
And that's it. The drum toggle means that bank 128 (either one) still means a drum channel. I hope this helps, @davy7125 |
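As a sketch, the 7-bit split from the snippet above can be wrapped in small pack/unpack helpers. The names `pack_bank`, `bank_msb` and `bank_lsb` are made up for illustration, and the extra `& 127` mask on the LSB is an added safeguard that the original snippet does not have. The bit layout is an assumption: the byte-wise description later in this thread may imply shifting by 8 instead of 7.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 7-bit bank select MSB and LSB values into one 16-bit wBank value. */
static uint16_t pack_bank(uint8_t msb, uint8_t lsb) {
    assert(msb < 128 && lsb < 128); /* MIDI data bytes are 7-bit */
    return (uint16_t)(((unsigned)lsb << 7) | msb);
}

/* Low 7 bits hold the bank select MSB. */
static uint8_t bank_msb(uint16_t wBank) {
    return (uint8_t)(wBank & 127);
}

/* The next 7 bits hold the bank select LSB. */
static uint8_t bank_lsb(uint16_t wBank) {
    return (uint8_t)((wBank >> 7) & 127);
}
```

For the values used in the comment, `pack_bank(5, 60)` yields 7685, and unpacking 7685 gives MSB 5 and LSB 60 again.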
These two proposals achieve the most needed features (default modulators and bank LSB) without needing a new format or file structure. So adding them as soon as possible would extend the life of the sf2 format while giving sf v3 (or SFe) time to develop. |
Hello! My name is Sylvia. I am the lead developer of the SFe standard that you linked here. I've got a silver badge on the Polyphone forums but haven't been posting there for a while due to health issues. However even so, I'm very happy to help you understand what's going on with the soundfont enhancement projects that we (and other people) have been working on!
Default modulators
The first thing that we're talking about is the default modulator issue, right? Well, Spessasus's solution involves adding a DMOD subchunk. This allows the bank developer to define a few modulators that apply to all instruments and/or presets in the bank. This can make life much easier for the bank developer, because they won't need to define the same modulator multiple times. This also solves the problems with the default modulator system found in legacy SF2.0x, for example "Velocity -> Filter Cutoff".
I'm not too familiar with the situation that Derselbst has suggested, but from Spessasus's summary of the solutions, it seems that they suggest that Polyphone would "intelligently" detect modulators that are in all instrument zones, and then list them as additional "default modulators" that can be added to or removed. The main disadvantage listed is that it would break if edited with a non-Polyphone editor. However, because non-Polyphone editors (such as Viena or Swami) remain popular, I don't think that it's an acceptable tradeoff. The last thing that we want are proprietary extensions that only work properly with one soundfont editor or player; this has happened already with other features. By formally defining custom default modulators, we can prevent this issue. Therefore, I'm giving my support to spessasus's DMOD subchunk proposal.
We can of course use both strategies; the "intelligent" default modulator detection maximises compatibility with legacy players, while the DMOD subchunk reduces modulator complexity and simplifies modulator parsing for programs that implement the feature.
If we can get fluidsynth to adopt it then it would likely be good enough!
As an aside, we've seen other proposals for similar "intelligent" features that would auto-detect when data is formatted in a particular way, but none of these features have seen much success. One of these was a way to randomise samples. We were evaluating "intelligent" features, but we concluded that formally defined structures will always be a better solution than attempting to "unpick" implicitly-defined data.
Ultimately, it depends on whether or not you want to implement this chunk. If you don't think that there is enough use for a DMOD chunk, then we can just not implement it.
Two bank selects on one file
It looks like another thing is the bank select LSB implementation. When stgiga and I were looking through SFSPEC24.PDF, we noticed that the value used for banks (wBank) was a 16-bit value (WORD) instead of an 8-bit value (BYTE/CHAR), as spessasus said. Therefore, we concluded that it would be possible to achieve the 16384 bank system consisting of both bank select MSB and LSB without significant modifications to the SF format.
Spessasus explains very well that the used (in SF2.04) byte of the wBank would be used as the MSB value and the unused (in SF2.04) byte as the LSB value. Because the unsigned WORD value can be up to 65535, we can easily represent any of the 16384 combinations of bank select MSB and LSB using just the wBank. No extra fields need to be declared.
The elegant property of this solution that spessasus didn't mention is that to an SF2.04 player, any bank number above 128 is ignored. In other words, the unused byte is ignored. Therefore, banks that use LSB bank selects are completely transparent to legacy players that don't support the feature. Additional programs could also be used to remove such unused presets to reduce preset generator usage.
User interface modifications in Polyphone for this feature would be very simple.
All you would have to do is to split the bank field into two bank fields, "Bank (MSB)" and "Bank (LSB)". The preset listings would be slightly modified from "Bank:Preset" to "MSB:LSB:Preset".
File extensions
This is something that I've not got one single answer on. While keeping the .SF2 file extension for updated versions of the format is simple and doesn't need any effort, it may mislead an end-user into attempting to play an enhanced SF bank on a non-enhanced player. This simply causes complaints that the bank developer must address. With the current loose definition of legacy SF2.0x, there is already a problem of bank developers saying that their bank only runs on one player (for example fluidsynth or bassmidi). All keeping the .SF2 file extension does is exacerbate this issue. Therefore, we suggest using a different file extension unless the features included are completely backwards compatible with legacy players, i.e. the lack of these features does not affect the rest of the bank.
The file extension would be selectable. If the features used by the bank are completely backwards compatible, then the extension .SF2 can be used, otherwise the new extension would be used.
Internal version number
As I've mentioned before, I propose that our enhancements start with the wMajor version of 4. The wMajor version of 3 used to indicate WernerSF3 may not be sufficient, as there may be a program that implements the SF3 format compression, but can't use any enhanced features that we come up with.
You mentioned that we can keep one extension but increase the internal version number when extra fields, chunks and subchunks are added. This is a good idea; we might want to add some more chunks to the "hydra structure", but this is disallowed by legacy SF2.04. However, SFSPEC24.PDF never states that we can't make such changes at all.
My understanding of what SFSPEC24.PDF says is that as long as the wMajor value is changed, we can do whatever the **** we want with the structure as long as the ifil version is still formatted in the same way. If we do so, then this will have to be combined with a different file extension.
My proposal for this is that we just keep the 32-bit version of any enhanced SF format mostly compatible with SF2.04, with a subset being fully compatible. Structural changes that aren't compatible with SF2.04 are limited to 64-bit banks, which is something that stgiga has been researching for a long time.
A "common well-specified target"
Guess what? We've got this common well-specified target that you request. It's called SFe! The name is short for "SF enhanced".
I'll start by saying that it seems that there was a misunderstanding about the purpose of SFe. It is not a competitor to anyone's proposals, but rather a way to combine as many of the known SF extensions together to create a complete specification that if followed, will allow programs to be compatible with any bank that may use existing extensions.
The current draft specification should be sent to FluidSynth for approval, and if we can make the necessary changes to make the specification implementable in a practical amount of time, then we can promote the draft specification to a final specification. However, before the final specification can be released, FluidSynth would need to complete the Werner SF3 specification, as it is an integral part of SFe.
The initial version of the SFe specification, 4.00, includes many things, including some topics that are beyond what's discussed here:
The default modulators feature is planned for the next version after 4.00, 4.01. We did a feature freeze a few months ago precisely to ensure that FluidSynth or similar playback programs would not have to implement a massive number of features. Overwhelming the program developers with a ton of features that must be added would not be a good idea. Therefore, it may be a good idea to get a few more opinions on what solution we should use for default modulators once the first version of SFe is implemented in an SF player.
Other features that will be coming in future versions of SFe include MIDI lyrics, SynthFont Custom Features (if Kenneth Rundt cooperates), preset library management systems, true 64-bit support with expanded or removed field size limits, and round robin sampling. If you think that we should include more or fewer features in the draft specification, please tell us and then we'll move forward or back some of the features that we're planning to include.
Right now, we are planning to develop a reference implementation of SFe based on SpessaSynth; however, if another program implements SFe, then this would negate the need for such a reference implementation!
What do we need to do?
Questions to be answered
Sorry for this wall of text! If you have any concerns, please reply. |
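The version gating discussed in this comment (wMajor 2 for legacy SF2, 3 for Werner SF3, 4 proposed for the enhanced format) could be checked along these lines. This is a sketch under assumptions, not part of any specification: `IfilVersion`, `SfFlavor`, `parse_ifil`, `classify_ifil` and `legacy_reader_can_load` are hypothetical names, and the ifil payload is assumed to be two little-endian 16-bit values, wMajor followed by wMinor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t wMajor;
    uint16_t wMinor;
} IfilVersion;

typedef enum { SF_LEGACY, SF_WERNER_SF3, SF_ENHANCED, SF_UNKNOWN } SfFlavor;

/* Decode the 4-byte ifil payload: wMajor then wMinor, both little-endian. */
static IfilVersion parse_ifil(const uint8_t *data) {
    IfilVersion v;
    assert(data != NULL);
    v.wMajor = (uint16_t)(data[0] | (data[1] << 8));
    v.wMinor = (uint16_t)(data[2] | (data[3] << 8));
    return v;
}

/* Map the major version to the format families named in the discussion. */
static SfFlavor classify_ifil(IfilVersion v) {
    switch (v.wMajor) {
        case 2:  return SF_LEGACY;     /* SF2.0x */
        case 3:  return SF_WERNER_SF3; /* compressed-sample variant */
        case 4:  return SF_ENHANCED;   /* proposed starting point for enhancements */
        default: return SF_UNKNOWN;
    }
}

/* A reader that only knows SF2.0x should reject everything else up front. */
static int legacy_reader_can_load(IfilVersion v) {
    return classify_ifil(v) == SF_LEGACY;
}
```

Rejecting on the major version keeps a legacy reader from misinterpreting chunks it does not know, which is the same reasoning given above for pairing a new wMajor with a different file extension.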
Thanks for the detailed reply Sylvia! My suggested answers to your questions:
Let me know what you think! |
Nice to e-meet you, Sylvia silver Leaf! I just realized that I have had help from the beginning on the subject of the soundfont format update. Below are my comments on what I could envision.
Next versions of Polyphone
The .SFe format
I would use .sf as the extension for the format update, since all files having this extension seem to have disappeared, and the "enhanced" will make less sense in a couple of years if this is the new standard. I would just use the word "soundfont": a sound mapping, like a font does with glyphs. 4.0 can be used as the internal format version number (ifil).
First implementation
I would first start with a list of features that could simply be ignored by existing soundfont readers, to allow compatibility with minor changes. The different features I am listing come from this page, which gathers features that would be useful for a complete sample-based synthesizer.
Then, I would highly recommend using a 64-bit version of the format. 32-bit systems are legacy systems. For compatibility, existing soundfont readers would need extra processing of the indexes, but this is less work than implementing the sf3 format (from my point of view). At this stage, the conversion between sfz and sf2 would be more complete.
Further features
These features could be delayed to a subsequent version of the soundfont format:
My answers to your questions
Question
I read in the SFe specifications
Where is it written? |
Thank you for your responses! The first thing that I'm going to say is that I'm very floored to hear that the plan is to implement the format in a future version of Polyphone. I'll give my thoughts on the answers given to the questions I asked. Let's start by summarising our opinions:
What file extensions should be used and when?
I'm glad that some of you have come to an agreement! We could use the [...] Also, Davy suggests we use [...] Should such an audit reveal another commonly-used format that uses [...]
How do we handle the ifil value changes for new SF versions?
This is another problem that I've been working on a solution to. So, we now have multiple answers to the same question. Let's start with my answer(s):
Now, let's compare it to your solutions:
Should the SFe specification be full or partial?
Davy said that partial specifications are good enough for people who are familiar with the content of SFSPEC24.PDF, with a full specification being good if SFe is widely adopted.
What SFe features should be moved forward or backward?
We've got differing opinions on this topic. Spessasus says that we should start implementing "structurally compatible" features before SFe is even implemented:
Davy says that there should be a wider set of features, including:
The plan is to prioritise the features that spessasus says are "structurally compatible", while adding many of Davy's wider set of suggested features at a later date to avoid overwhelming program developers.
If we can't get a formal specification for SFCF, can we reverse engineer it?
It looks like we've got some confusion about SFCF. It stands for SynthFont Custom Features. It was mentioned in the original Polyphone soundfont enhancement proposal:
It includes the features:
When should the final specification be ready?
This is a point of contention, but I think that if we do things carefully and don't rush them (as spessasus said), then we will have a much higher chance of success. And of course, we will have some kind of reference implementation before a final specification is released, to allow people to ensure that their programs are compatible with SFe.
Follow-up things
So, from what has been said, here are some things to do in the next draft milestone (4.00.9) of the SFe specification:
I've sent invitations to the SFe-team-was-taken organisation to spessasus and Davy, and if you want an invitation, I'll gladly give you one on request. We'd suggest that all future discussion on SFe be done on the SFe repository, because after all, this is the Polyphone repository and not the SFe one. Thank you for your feedback, and sorry for another wall of text! |
EDIT
New version of the proposal is here:
https://github.com/spessasus/soundfont-proposals/blob/main/default_modulators.md