[ENHANCEMENT] Default Modulators #205

Open
spessasus opened this issue Jul 29, 2024 · 14 comments

Comments

@spessasus
Contributor

spessasus commented Jul 29, 2024

EDIT

New version of the proposal is here:
https://github.com/spessasus/soundfont-proposals/blob/main/default_modulators.md

@spessasus
Contributor Author

I've created an MD file with the proper proposal:
https://github.com/spessasus/soundfont-proposals/blob/main/default_modulators.md

@derselbst

Just to clarify: The proposal suggests that DMOD contains modulators that would be applied on instrument level only? Not on preset level?

@spessasus
Contributor Author

Just to clarify: The proposal suggests that DMOD contains modulators that would be applied on instrument level only? Not on preset level?

I worded that poorly. It just works exactly like the stock SF2 default modulators list. I've changed the MD now.

@derselbst

Some details are still not clear to me. (Sorry for abusing this issue as discussion).

Current state as per SF-spec

The SF Spec essentially defines 3 levels of modulators:

  1. Default modulators dictated by the SF Spec. These are applied to every instrument. By any SF2 synth. Always.
  2. Instrument level modulators.
  3. Preset level modulators.

Number 3. can override 2., and 2. can override 1. How that works in detail is clearly defined by the spec.


When reading your proposal, the role of the DMOD chunk is unfortunately not yet clear to me.

One possible interpretation: You're trying to introduce an additional level of modulators between 1. and 2.

Another possible interpretation: DMOD chunk shall fully replace or supersede modulators mentioned in no. 1.

Yet another possible interpretation: The DMOD chunk shall only serve for informational purposes to Soundfont editors, allowing them to more cleanly separate "general-apply-to-all-instrument"-modulators, and "instrument-specific" modulators.

The section "Default modulator behavior" is not really helpful, because it doesn't add anything to what the spec already says (whether you remove a zero-amount modulator or keep it with an amount of zero is IMO just an implementation detail). The final sentence

The default modulator list is altered at load time and then it acts exactly like the default SF2 modulator list.

suggests that my third interpretation is unlikely, but it doesn't help me to decide between my first or second interpretation.

So I think you need to update the description of the DMOD chunk, by pointing out its role, its purpose, and how it differs from the default modulators dictated by the SF spec (no. 1). The "stakeholder" of this chunk must become clear, i.e. is it just SoundFont Editor apps, or do synths also have to account for it.

And a final remark: Pls. keep in mind that SF2 is an old but yet well established standard. Things will not change. Implementations will not change. If you want this feature to be generally accepted and usable, this customization must be backward-compatible. By that I mean, applying, i.e. structurally saving, those DMOD modulators to each and every instrument zone in IMOD for backward compatibility reasons should be seriously considered, IMO.

@spessasus
Contributor Author

spessasus commented Sep 19, 2024

@derselbst, what I meant is replacing the level 1 modulators.

Essentially, with or without DMOD chunk, the default SF2 modulators are there. When the DMOD chunk is read, the modulators are added to that list, and the identical ones override the sf2 default ones.

For example, assume a DMOD chunk of 2 modulators:

  • MIDI CC 1 to vibratoToPitch, linear unipolar positive, no controller, amount 100.
  • Poly Pressure to vibratoToPitch, linear unipolar positive, no controller, amount 50.

The default modulators for this soundfont will be:

  • Velocity to attenuation (unchanged)
  • Velocity to filter (unchanged, though some synths disable that, like mine and yours)
  • Channel pressure to vibrato (unchanged)
  • Volume to attenuation (unchanged)
  • Expression to attenuation (unchanged)
  • Pan to pan (unchanged)
  • Pitch wheel by pitch wheel range to initialPitch or fineTune (unchanged)
  • reverb to reverb (unchanged)
  • chorus to chorus (unchanged)
  • Mod wheel to vibrato will change the amount from the default 50 cents to 100 cents, since the DMOD modulator is identical to it, overriding its amount.
  • A new modulator, poly pressure to vibrato, 50 cents depth.
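For illustration, the load-time merge described above could be sketched in C roughly as follows. This is only a sketch under assumptions: the `Modulator` struct mirrors the 10-byte SF2 modulator record, `apply_dmod` and `mod_identical` are hypothetical names, and "identical" is assumed to mean that source, destination, amount source and transform all match (the amount is what gets overridden).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one SF2 modulator record. */
typedef struct {
    uint16_t src;      /* sfModSrcOper */
    uint16_t dest;     /* sfModDestOper */
    int16_t  amount;   /* modAmount */
    uint16_t amt_src;  /* sfModAmtSrcOper */
    uint16_t trans;    /* sfModTransOper */
} Modulator;

/* Assumption: two modulators are "identical" when everything except
 * the amount matches. */
static int mod_identical(const Modulator *a, const Modulator *b)
{
    return a->src == b->src && a->dest == b->dest &&
           a->amt_src == b->amt_src && a->trans == b->trans;
}

/* Merges the DMOD list into the default list in place.
 * An identical default gets its amount overridden; anything else is
 * appended. Returns the new count, or (size_t)-1 if `capacity` is
 * too small to hold an appended modulator. */
size_t apply_dmod(Modulator *defaults, size_t count, size_t capacity,
                  const Modulator *dmod, size_t dmod_count)
{
    for (size_t i = 0; i < dmod_count; i++) {
        size_t j = 0;
        while (j < count && !mod_identical(&defaults[j], &dmod[i]))
            j++;
        if (j < count) {
            defaults[j].amount = dmod[i].amount;   /* override */
        } else {
            if (count == capacity)
                return (size_t)-1;
            defaults[count++] = dmod[i];           /* new default */
        }
    }
    return count;
}
```

With the two-modulator DMOD from the example, the mod wheel entry would keep its position in the list with its amount changed to 100, and the poly pressure entry would be appended at the end.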

I hope this clears things up.

@derselbst

Ok, thanks for the example. So it turns out my first interpretation is the correct one. The purpose of DMOD is to introduce an extra level of modulators. This leads to the following modulator hierarchy:

  1. Default modulators dictated by the SF Spec. These are applied to every instrument. By any SF2 synth. Always.
  2. Default modulators defined in a specific SoundFont file, and only applied in the scope of that particular file.
  3. Instrument level modulators.
  4. Preset level modulators.

Yet, for best compatibility, I would again recommend saving all the modulators of the DMOD chunk to each and every instrument zone in the IMOD chunk. Old synthesizers would play the soundfont correctly, while Soundfont editors can more easily recognize file-specific default modulators.
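A rough C sketch of that backward-compatible save step, assuming a zone's own modulators should win over file-level defaults. The names `Modulator`, `same_identity` and `bake_defaults_into_zone` are made up for this example, and the identity rule (everything but the amount matches) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one SF2 modulator record. */
typedef struct {
    uint16_t src, dest;
    int16_t  amount;
    uint16_t amt_src, trans;
} Modulator;

/* Assumption: identity ignores the amount. */
static int same_identity(const Modulator *a, const Modulator *b)
{
    return a->src == b->src && a->dest == b->dest &&
           a->amt_src == b->amt_src && a->trans == b->trans;
}

/* Appends to `zone` every file-level default that the zone does not
 * already define itself, so a synth that ignores DMOD still sees the
 * modulator via IMOD. Returns the new zone modulator count, or
 * (size_t)-1 if `capacity` is too small. */
size_t bake_defaults_into_zone(Modulator *zone, size_t zone_count,
                               size_t capacity,
                               const Modulator *defaults,
                               size_t default_count)
{
    const size_t own = zone_count;  /* only compare against the zone's own entries */
    for (size_t i = 0; i < default_count; i++) {
        size_t j = 0;
        while (j < own && !same_identity(&zone[j], &defaults[i]))
            j++;
        if (j < own)
            continue;               /* the zone already overrides this one */
        if (zone_count == capacity)
            return (size_t)-1;
        zone[zone_count++] = defaults[i];
    }
    return zone_count;
}
```

An editor that understands DMOD could run the inverse when loading, stripping zone modulators that exactly match a file-level default.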

IIRC, Polyphone does already go through all modulators in IMOD to figure out which are meant as default ones. So, if my last backward compatibility idea is considered, one might raise the question what the added value of a DMOD chunk would be. If it is not considered (since that is essentially "The Problem" you're trying to solve with DMOD) the question remains if this solution would be adopted by a range of SF2 implementations such that it ultimately finds acceptance by the users. I'm having twisted minds here...

@spessasus
Contributor Author

the question remains if this solution would be adopted by a range of SF2 implementations such that it ultimately finds acceptance by the users. I'm having twisted minds here...

Well, that's what I'm hoping for. This is essentially an extension like the vorbis sf3 extension. Many players (like meltysynth, SF2Lib or tinysoundfont for example) don't support it at all. Actually, I know only 3 players with sf3 support: fluid, bass, and spessasynth...

Since most people probably only use BASS or fluid anyway, these three (poly, fluid, bass) implementing this should be enough for it to get widely adopted. After all, musescore invented the sf3 format, and since musescore is popular, sf3 became widely supported.

And that's what I would like to happen. The three major sf2 tools supporting this chunk would make other players add support for that. It's also what I hope will happen with the SF2 RMIDI format, but that's unrelated.

Maybe we could only use DMOD with wMajor set to 3? Since the sf3 format can contain uncompressed samples, it will act like a regular sf2, but since wMajor is 3, it automatically rules out incompatible synths.

@davy7125
Owner

Sorry for not being more reactive on this interesting subject. I had quite a lot of work with the previous version of Polyphone and I am currently busy with various life projects...

So globally there is a need for a soundfont format update and I completely agree with this. Some years ago I started this (based on the sfz capabilities and various user feedback):
https://github.com/davy7125/soundfont-standard-v3

And now I am discovering:
https://github.com/SFe-Team-was-taken/SFe

And I still need to correctly understand this ticket:
#179

Aside from this, MuseScore created the sf3 format, which is sf2 with compressed sample data, as you know well. From my side I recently added the "release mode" for samples inside an instrument, so that the playback starts when the key is released (this is a personal wish since I use soundfonts to play organ). Viena (SynthFont) also added a property but I don't remember it well (something like the velocity modifying the attack).

This context shows that the different actors should agree on a common target. My position is that I will not resist upgrading the format, but I unfortunately lack the time to manage the whole process, and I am also a bit afraid of implementing updates and forcing others to follow, thus creating tensions.

Now, back to the default modulator subject: Polyphone could display them when clicking on the soundfont header, as you proposed by email. This is a very good idea so that we know exactly how a soundfont is played. It would be possible, though, to change the default modulators without changing the 2.04 format, with extra processing for displaying the content of a soundfont within Polyphone. If all instruments have all default modulators defined as the first modulators, Polyphone can gather all common instrument modulators and then display them at the soundfont level instead of the instrument level. Other soundfont editors would, however, display all modulators for each instrument, and it might be harder to distinguish the default modulators from the others, but this may not matter much if Polyphone remains in use. This system has the advantage of staying supported by all soundfont readers, so I have the same recommendation as @derselbst: changing the sf2 format for this particular purpose is not needed.
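To make the "gather all common instrument modulators" idea concrete, here is a small C sketch. It is not Polyphone's actual code: `Modulator`, `ModList` and `common_prefix_len` are invented for the example, and it assumes that "defined as the first modulators" means the shared modulators form an exactly matching prefix (amount included) of every instrument's list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one SF2 modulator record. */
typedef struct {
    uint16_t src, dest;
    int16_t  amount;
    uint16_t amt_src, trans;
} Modulator;

/* One instrument's modulator list (hypothetical container). */
typedef struct {
    const Modulator *mods;
    size_t count;
} ModList;

static int mod_equal(const Modulator *a, const Modulator *b)
{
    return a->src == b->src && a->dest == b->dest &&
           a->amount == b->amount &&
           a->amt_src == b->amt_src && a->trans == b->trans;
}

/* Returns how many leading modulators are exactly the same in every
 * list. Those could be shown once at the soundfont level instead of
 * once per instrument. */
size_t common_prefix_len(const ModList *lists, size_t n)
{
    if (n == 0)
        return 0;
    size_t len = lists[0].count;
    for (size_t i = 1; i < n; i++) {
        size_t k = 0;
        while (k < len && k < lists[i].count &&
               mod_equal(&lists[0].mods[k], &lists[i].mods[k]))
            k++;
        len = k;   /* the shared prefix can only shrink */
    }
    return len;
}
```

The prefix rule keeps the detection cheap, but it also shows the fragility discussed in this thread: an editor that reorders modulators would break the grouping.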

The other and proper way is to update the soundfont format as you propose, but I need a common, well-specified target (including other upgrade needs) with at least the agreement of the fluidsynth team. Maybe we should still use the .sf extension and simply increase the internal version number while progressively supporting more chunks and more sound properties?

@spessasus
Contributor Author

Thanks for this response, Davy.

About the Default modulators:

Derselbst's approach

It relies on Polyphone grouping modulators from all instrument zones into default modulators

pros

  • all synthesizers that support modulators will work fine

cons

  • using a soundfont editor other than polyphone will mess up default modulators

Spessasus's approach

It relies on a custom DMOD chunk

pros

  • Other soundfont editors will not mess up the default modulators

cons

  • Requires a synthesizer that supports the DMOD chunk

To be fair, if Polyphone remains the dominant sf editor (which it probably will), option 1 might be the best approach. If that gets implemented, I'll remove the proposal from the repo.

Stgiga's wBank proposal

Here's how I understand it:

SF2 always had a problem of lacking support for the bank LSB message, making it incompatible with XG and GM2 bank selection systems. Some synthesizers include hacks to circumvent this, like reacting to bank MSB as a drum toggle and LSB as the soundfont's bank select. But this solution isn't perfect.

What #179 discovers is that the wBank field in the preset selection is a WORD. This means that there are two bytes used, despite only one being needed for storing bank select.

So what stgiga suggested, was to use the top byte of wBank field as the LSB bank select and the bottom one as MSB:

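// storing bank LSB 60 and MSB 5, packed here as two 7-bit fields: (60 << 7) | 5
unsigned short wBank = 7685;          // wBank is a 16-bit WORD
unsigned char bankMSB = wBank & 127;  // low 7 bits  -> 5
unsigned char bankLSB = wBank >> 7;   // upper bits  -> 60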

And that's it. The drum toggle means that bank 128 (either one) still means a drum channel.

I hope this helps, @davy7125

@spessasus
Contributor Author

These two proposals achieve the most needed features (default modulators and bank LSB) without needing a new format or file structure.

So adding them as soon as possible would extend the life of the sf2 format while giving sf v3 (or SFe) time to develop.

@sylvia-leaf

sylvia-leaf commented Nov 13, 2024

Hello! My name is Sylvia. I am the lead developer of the SFe standard that you linked here. I've got a silver badge on the Polyphone forums but haven't been posting there for a while due to health issues.

However even so, I'm very happy to help you understand what's going on with the soundfont enhancement projects that we (and other people) have been working on!


Default modulators

The first thing that we're talking about is the default modulator issue, right? Well, Spessasus's solution involves adding a DMOD subchunk. This allows the bank developer to define a few modulators that apply to all instruments and/or presets in the bank. This can make life much easier for the bank developer, because they won't need to define the same modulator multiple times. This also solves the problems with the default modulator system found in legacy SF2.0x, for example "Velocity -> Filter Cutoff".

I'm not too familiar with the situation that Derselbst has suggested, but from Spessasus's summary of the solutions, it seems that they suggest that Polyphone would "intelligently" detect modulators that are in all instrument zones, and then list them as additional "default modulators" that can be added to or removed. The main disadvantage listed is that it would break if edited with a non-Polyphone editor. However, because non-Polyphone editors (such as Viena or Swami) remain popular, I don't think that it's an acceptable tradeoff.

The last thing that we want is proprietary extensions that only work properly with one soundfont editor or player; this has happened already with other features. By formally defining custom default modulators, we can prevent this issue. Therefore, I'm giving my support to spessasus's DMOD subchunk proposal. We can of course use both strategies; the "intelligent" default modulator detection maximises compatibility with legacy players, while the DMOD subchunk reduces modulator complexity and simplifies modulator parsing for programs that implement the feature. If we can get fluidsynth to adopt it then it would likely be good enough!

As an aside, we've seen other proposals for similar "intelligent" features that would auto-detect when data is formatted in a particular way, but none of these features has had much success. One of these was a way to randomise samples. We were evaluating "intelligent" features, but we concluded that formally defined structures will always be a better solution than attempting to "unpick" implicitly-defined data.

Ultimately, it depends on whether or not you want to implement this chunk. If you don't think that there is enough use for a DMOD chunk, then we can just not implement it.


Two bank selects on one file

It looks like another thing is the bank select LSB implementation. When stgiga and I were looking through SFSPEC24.PDF, we noticed that the value used for banks (wBank) was a 16-bit value (WORD) instead of an 8-bit value (BYTE/CHAR), as spessasus said. Therefore, we concluded that it would be possible to achieve the 16384 bank system consisting of both bank select MSB and LSB without significant modifications to the SF format.

Spessasus explains very well that the used (in SF2.04) byte of the wBank would be used as the MSB value and the unused (in SF2.04) byte the LSB value. Because the unsigned WORD value can be up to 65535, we can easily represent any of the 16384 combinations of bank select MSB and LSB using just the wBank. No extra fields need to be declared.

The elegant property of this solution that spessasus didn't mention is that to an SF2.04 player, any bank number above 128 is ignored. In other words, the unused byte is ignored. Therefore, banks that use LSB bank selects are completely transparent to legacy players that don't support the feature. Additional programs could also be used to remove such unused presets to reduce preset generator usage.
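As a worked example of the whole-byte split described above, here is a hedged C sketch. It assumes the low byte of wBank carries the bank select MSB and the high byte carries the LSB; note that this differs from the 7-bit shift used in the snippet earlier in this thread, and neither layout is final. The function names are invented for the example.

```c
#include <stdint.h>

/* Assumed layout: low byte = bank select MSB (the byte SF2.04 already
 * uses), high byte = bank select LSB (unused in SF2.04). */
uint16_t pack_bank(uint8_t msb, uint8_t lsb)
{
    return (uint16_t)(((uint16_t)lsb << 8) | msb);
}

uint8_t bank_msb(uint16_t wBank)
{
    return (uint8_t)(wBank & 0xFF);
}

uint8_t bank_lsb(uint16_t wBank)
{
    return (uint8_t)(wBank >> 8);
}

/* A legacy player only looks at bank numbers 0..128, so under this
 * layout any preset with a non-zero LSB (wBank >= 256) falls outside
 * the range it would ever select. */
int visible_to_legacy_player(uint16_t wBank)
{
    return wBank <= 128;
}
```

Under this layout, MSB 5 with LSB 60 is stored as 15365, while MSB 5 with LSB 0 stays 5 and remains a plain SF2.04 bank number.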

User interface modifications in Polyphone for this feature would be very simple. All you would have to do is to split the bank field into two bank fields, "Bank (MSB)" and "Bank (LSB)". The preset listings would be slightly modified from "Bank:Preset" to "MSB:LSB:Preset".


File extensions

This is something that I've not got one single answer on. While keeping the .SF2 file extension for updated versions of the format is simple and doesn't need any effort, it may mislead an end-user into attempting to play an enhanced SF bank on a non-enhanced player. This simply causes complaints that the bank developer must address.

With the current loose definition of legacy SF2.0x, there is currently a problem of bank developers saying that their bank only runs on one player (for example fluidsynth or bassmidi). All keeping the .SF2 file extension does is exacerbate this issue. Therefore, we suggest using a different file extension unless the features included are completely backwards compatible with legacy players, i.e. the lack of these features does not affect the rest of the bank.

The file extension would be selectable. If the features used by the bank are completely backwards compatible, then the extension .SF2 can be used, otherwise the new extension would be used.


Internal version number

As I've mentioned before, I propose that our enhancements start with the wMajor version of 4. The wMajor version of 3 used to indicate WernerSF3 may not be sufficient, as there may be a program that implements the SF3 format compression, but can't use any enhanced features that we come up with.

You mentioned that we can keep one extension but increase the internal version number when extra fields, chunks and subchunks are added. This is a good idea; we might want to add some more chunks to the "hydra structure", but this is disallowed by legacy SF2.04. However, SFSPEC24.PDF never states that we can't make such changes at all. My understanding of what SFSPEC24.PDF says is that as long as the wMajor value is changed, we can do whatever the **** we want with the structure as long as the ifil version is still formatted in the same way. If we do so, then this will have to be combined with a different file extension.

My proposal for this is that we just keep the 32-bit version of any enhanced SF format mostly compatible with SF2.04, with a subset being fully compatible. Structural changes that aren't compatible with SF2.04 are limited to 64-bit banks, which is something that stgiga has been researching for a long time.


A "common well-specified target"

Guess what? We've got this common well-specified target that you request. It's called SFe! The name is short for "SF enhanced". I'll start by saying that it seems that there was a misunderstanding about the purpose of SFe. It is not a competitor to anyone's proposals, but rather a way to combine as many of the known SF extensions together to create a complete specification that if followed, will allow programs to be compatible with any bank that may use existing extensions.

The current draft specification should be sent to FluidSynth for approval, and if we can make the necessary changes to make the specification implementable in a practical amount of time, then we can promote the draft specification to a final specification.

However, before the final specification can be released, FluidSynth would need to complete the Werner SF3 specification, as it is an integral part of SFe.

The initial version of the SFe specification, 4.00, includes many things, including some topics that are beyond what's discussed here:

  • ifil versioning rules
    • wMajor=2 and wMajor=3 are still planned to get some use
  • isng rules
    • beyond EMU8000, so we can assume different sound engine parameters
  • UTF-8 support
    • many of the text string fields in SF2.04 were designed to work with ASCII
    • however, it is simple to switch to the UTF-8 format
    • this allows users to use kana/CJK characters (kanji) in these fields
  • an extra ISFe chunk in the info-chunk
    • including everything that's in SFe to make it easier to tell if something is an enhanced feature
    • also includes feature flags so the player can communicate to the end-user what features are supported
    • programs can warn the user if they are not able to play the bank with 100% accuracy
    • more features will come up
    • right now, the DMOD subchunk is not in this sub-chunk, but we are thinking of moving it into the sub-chunk
  • WernerSF3 compression
    • initially OGG only but with more compression formats soon
    • according to WernerSF3 draft specification, any compression format can be used
    • examples of other formats in the future that could be supported include FLAC, OPUS or BWTC32Key
    • proprietary compression formats are forbidden, but read-only support and conversion to WernerSF3 is permitted
  • sm32 sub-chunk
    • 32-bit samples likely aren't going to be used in 32-bit banks, but will be good for 64-bit banks
  • 8-bit samples
    • if only the sm24 subchunk is used, then samples can be stored in 8-bit
  • Bank select LSB support
    • something that we mentioned here
    • there are some things related to the unused bit of wPreset that will be added soon (planned for version 4.04)
  • Program specification
    • well-defined guidelines for program developers to meet
  • Compatibility specification
    • information list to describe the differences between SFe and legacy SF2.04
  • Optional AWE ROM emulator plus reference samples

The default modulators feature is planned for the next version after 4.00, 4.01. We did a feature freeze a few months ago precisely to ensure that FluidSynth or similar playback programs would not have to implement a massive number of features. Overwhelming the program developers with a ton of features that must be added would not be a good idea. Therefore, it may be a good idea to get a few more opinions on what solution we should use for default modulators once the first version of SFe is implemented in an SF player.

Other features that will be coming in future versions of SFe include MIDI lyrics, SynthFont Custom Features (if Kenneth Rundt cooperates), preset library management systems, true 64-bit support with expanded or removed field size limits and round robin sampling.

If you think that we should include more or fewer features in the draft specification, please tell us and then we'll move forward or back some of the features that we're planning to include.

Right now, we are planning to develop a reference implementation of SFe based on SpessaSynth; however, if another program implements SFe, then this would negate the need for such a reference implementation!


What do we need to do?

  • Decide on a file extension to use for banks that aren't fully compatible with SF2.04
  • Decide on how we're going to change the ifil value
  • Communicate with FluidSynth to get a well-specified target specification for WernerSF3
  • Agree with FluidSynth about a version of SFe to implement
  • Listen to feedback and release the next draft milestone
  • Repeat until the final specification is ready

Questions to be answered

  • What file extensions should be used and when?
  • How do we handle the ifil value changes for new SF versions?
  • Should the SFe specification be full or partial?
  • What SFe features should be moved forward or backward?
  • If we can't get a formal specification for SFCF, can we reverse engineer it?
  • When should the final specification be ready?

Sorry for this wall of text! If you have any concerns, please reply.

@spessasus
Contributor Author

Thanks for the detailed reply Sylvia!

My suggested answers to your questions:

  • I agree with Davy, using .sf would be simple and effective. It would be used when the format is fundamentally incompatible with sf2, e.g. a different file structure that would make an ordinary SF2 synth error out.
  • both major and minor versions are WORDs. This means values up to 65k. My suggestion is to keep the wMajor as 4 to indicate SFe and only increase wMinor. 65 thousand versions should be enough ;-)
  • not sure
  • I consider the LSB and DMOD extensions to be structurally compatible with the SF2 format. At worst, these will simply be ignored by a regular SF2 synth and nothing else. So I suggest implementing them in all major SF2 related software (fluid, bass, polyphone) ASAP. They should be easy enough to implement and will make the SF2 format bearable for a little longer while proper SFe is being developed.
  • I'm not sure what SFCF brings to the table, but reverse engineering would be simple I think: just create a soundfont, use Viena to enable the features, then compare the binary data and see what has changed. I'm assuming it also uses the info chunk.
  • it should definitely not be rushed, so not determined.

Let me know what you think!

@davy7125
Owner

davy7125 commented Nov 14, 2024

Nice to e-meet you, Sylvia silver Leaf!

I just realized that I got help from the beginning on the subject of soundfont format update.
The problem I am facing is time and I have to prioritize, that's why I missed opportunities and remained silent. The complete redesign of the website delayed everything for a year, for instance, and now I am trying to optimize the sound engine inside Polyphone. The good news is that it shouldn't take too much longer since I'm currently "only" 1.5 times slower than FluidSynth (it was 10 times in version 2.1).

Below my comments for what I could envision.


Next versions of Polyphone

  • I'll first release version 2.5 of Polyphone with an optimized sound engine, for finishing what I started. This version will be able to read the DMOD chunk but not write it so that the 2.04 standard is still used. The custom default modulators will be copied in every instrument if a save occurs. I'll not split the bank number into MSB / LSB yet.
  • Then, let's say version 3.0 of Polyphone, will use the SFe format by default, sf2 being an export. Polyphone 2.5 will still be alive and will possibly receive bug fixes.

The .SFe format

I would use .sf as the extension for the format update since all files having this extension seem to have disappeared, and the "enhanced" will make less sense in a couple of years if this is the new standard. I would just use the word "soundfont", a sound mapping like a font would do with glyphs.

4.0 can be used as the internal format version number (ifil).

First implementation

I would first start with a list of features that could simply be ignored by existing soundfont readers, for allowing compatibility with minor changes. The different features I am listing come from this page, which gathers features that would be useful for a complete sample-based synthesizer.

  • default modulators
  • initial value of MIDI CC (not sure it is useful though - Polyphone initializes them depending on the curves found in modulators and maybe it is good enough)
  • sample storage: soundfont readers seem to support the sf3 format so there is not that much to do
  • extra parameters would be added at the instrument / preset levels:
    • round-robin
    • independent envelopes and LFOs for pitch / attenuation / filter (currently we have two envelopes and LFOs that can have multiple targets - quite confusing)
  • additional sample modes:
    • release (already in Polyphone 2.4)
    • one-shot
    • back and forth
  • MSB / LSB bank

Then, I would highly recommend using a 64-bit version of the format. 32-bit systems are legacy systems. For compatibility, existing soundfont readers would need extra processing of the indexes but this is less work than implementing the sf3 format (from my point of view).

At this stage, the conversion between sfz / sf2 would be more complete.

Further features

These features could be delayed to subsequent versions of the soundfont format:

  • multiple loops at the sample level (maybe hard to handle - I'm thinking of the loop offsets)
  • different kinds of filters
  • using other kinds of modulator inputs
  • conditional starts / key switches
  • exclusive class normal release
  • maximum length of different fields (title, comment, ...)
  • negative attenuation (amplification)
  • attenuation in real dB
  • sm32 chunk is a good idea but that means that all uncompressed samples are 32-bit encoded - maybe it is better to implement a storage scheme in which each sample can have its own sample rate

My answers to your questions

  • What file extensions should be used and when?
    .sf, even with version 2.04

  • How do we handle the ifil value changes for new SF versions?
    Maybe 4.x while we are in the "first implementations", that are almost supported by existing soundfont readers.
    Then 5.x when extra features are added (Further features above), further breaking compatibility with soundfont readers only supporting 2.04.

  • Should the SFe specification be full or partial?
    A partial specification is probably enough for people knowing the 2.04 version of the specifications, or for people adapting an existing software. The document would explain the changes / additions.
    A full version could be written later if the new format is adopted.

  • What SFe features should be moved forward or backward?
    Loading compressed samples => already done
    Reading 64 bit indexes should be moved backward so that existing soundfont readers can read soundfont with the new format.
    Existing soundfont readers must also ignore unknown chunks / attributes instead of rejecting the file.

  • If we can't get a formal specification for SFCF, can we reverse engineer it?
    I don't know what SFCF is

  • When should the final specification be ready?
    We need to complete the features to add in it first (my attempt in "First implementation") and then it will be updated as the implementation progresses.
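On the point above that readers must ignore unknown chunks instead of rejecting the file: a minimal C sketch of a tolerant RIFF sub-chunk scan might look like the following. It works on an in-memory buffer of sibling chunks; `find_chunk` and `read_le32` are hypothetical helpers, not taken from any existing reader. The only facts relied on are the generic RIFF rules (4-byte ID, 4-byte little-endian size, payload padded to an even length).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scans the sibling chunks stored in data[0..len) and returns a
 * pointer to the payload of the first chunk whose ID equals `id`,
 * writing its size to *out_size. Chunks with any other ID are simply
 * skipped. Returns NULL if the chunk is absent or the data is
 * truncated. */
const uint8_t *find_chunk(const uint8_t *data, size_t len,
                          const char id[4], uint32_t *out_size)
{
    size_t pos = 0;
    while (pos <= len && len - pos >= 8) {
        uint32_t size = read_le32(data + pos + 4);
        size_t payload = pos + 8;
        if (size > len - payload)
            return NULL;                      /* truncated chunk */
        if (memcmp(data + pos, id, 4) == 0) {
            *out_size = size;
            return data + payload;
        }
        pos = payload + size + (size & 1u);   /* skip payload + pad byte */
    }
    return NULL;
}
```

A reader built this way treats an optional chunk such as DMOD as "present or not" and never fails just because a newer file carries something it does not recognise.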

Question

I read in the SFe specifications

  • New options for the SFGenerator enum => These options are listed elsewhere in the specification
  • New options for the SFModulator enum => These options are listed elsewhere in the specification

Where is it written?

@sylvia-leaf

Thank you for your responses!

The first thing that I'm going to say is that I'm very floored to hear that the plan is to implement the format in a future version of Polyphone. I'll give my thoughts on the answers given to the questions I asked.

Let's start by summarising our opinions:


What file extensions should be used and when?

I'm glad that some of you have come to an agreement! We could use the .sf file extension, but I think we should first make sure that it isn't already used by another common file format, as sharing an extension with something else can cause issues.

Also, Davy suggests we use .sf even with legacy SF banks. While this simplifies the use of banks for us SFe users, we cannot guarantee that everyone will switch to using such a file extension. Therefore, while using .sf for legacy SF would be allowed, we'd suggest that use of .sf2 be continued if legacy SF players are one of the target platforms.

Should such an audit reveal another commonly-used format that uses .sf, there are two possible solutions that we can use:

  • the current draft of the SFe specification, version 4.00.8, specifies the file extensions .sfe32, .sfe64l and .sfe64.
    • However, these formats have the problem of being incompatible with the 8.3 filename format used in file systems such as FAT.
    • While most modern FAT implementations (such as the one used in Yamaha arranger keyboards) support LFN, it isn't guaranteed to be there.
    • Maybe, this is not too much of a problem, and these file extensions may be ok.
  • Spessasus has also suggested the file extension of .sft (for soundfont) in previous proposals.
    • This would be compliant with the 8.3 filename format, unlike the extensions found in the current draft of SFe.
    • The disadvantage is that it would require the program to "intelligently" identify the correct variant of SFe to use.
    • The ifil version, the isng value or a suitable sub-chunk in the ISFe chunk can be used as a hint to the SFe variant used by the file.
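To make the 8.3 concern concrete, here is a minimal sketch that checks only the length limits of an 8.3 short name (up to 8 characters for the base name, up to 3 for the extension). The function name `fits_8_3` and the example file names are made up for illustration; it does not check for characters that FAT forbids.

```python
def fits_8_3(filename: str) -> bool:
    """Return True if filename fits the 8.3 length limits (lengths only)."""
    base, dot, ext = filename.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the base, with an empty extension.
        base, ext = filename, ""
    # 8.3 names allow a single dot, 1-8 base characters, 0-3 extension characters.
    return "." not in base and 1 <= len(base) <= 8 and len(ext) <= 3


if __name__ == "__main__":
    for name in ("bank.sf", "bank.sf2", "bank.sft", "bank.sfe32", "bank.sfe64l"):
        print(name, fits_8_3(name))
```

Under this check, .sf, .sf2 and .sft pass, while .sfe32 and .sfe64l only work on file systems with long filename support.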

How do we handle the ifil value changes for new SF versions?

This is another problem that I've been working on a solution to. So, we now have multiple answers to the same question. Let's start with my answer(s):

  • the current draft of the SFe32 specification, version 4.00.8, specifies two methods for setting the correct ifil version.
    • the first method uses wMajor=2 (or 3 if Werner compression is used) with wMinor=128, increasing from there.
    • the second method uses wMajor=4, wMinor=0, and increases with the SFe specification version.
    • the first method is designed to be compatible with legacy SF players, while the second method is more intuitive.
  • for the next planned version (4.00.9), the first method will change.
    • the wMajor value remains the same (2 for uncompressed banks and 3 for Werner-compressed banks).
    • the MSB of the wMinor value corresponds to the major version of SFe, while the LSB corresponds to the minor version.
    • so, ifil=2.1024 for uncompressed SFe 4.00, ifil=3.1025 for compressed SFe 4.01, ifil=2.1280 for uncompressed SFe 5.00, etc.
    • this will last until SFe version 255.255 releases; it should last for the reasonable lifetime of the format.
    • the previous plan was to have a table of SFe specification versions corresponding to ifil versions; this eliminates the need for such a table.
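The planned 4.00.9 scheme for the first method can be sketched as follows. This is only an illustration of the arithmetic described above, assuming "MSB" and "LSB" mean the high and low byte of the 16-bit wMinor; the function names are hypothetical and not part of any specification. The packing helper assumes the usual ifil layout of two little-endian 16-bit words (wMajor, then wMinor).

```python
import struct


def sfe_ifil(sfe_major: int, sfe_minor: int, compressed: bool = False):
    """Map an SFe version to (wMajor, wMinor) under the proposed scheme."""
    if not (0 <= sfe_major <= 255 and 0 <= sfe_minor <= 255):
        raise ValueError("SFe major and minor versions must each fit in one byte")
    w_major = 3 if compressed else 2          # 3 = Werner-compressed, 2 = uncompressed
    w_minor = (sfe_major << 8) | sfe_minor    # high byte = major, low byte = minor
    return w_major, w_minor


def sfe_version(w_minor: int):
    """Recover (SFe major, SFe minor) from a wMinor value."""
    return w_minor >> 8, w_minor & 0xFF


def pack_ifil(w_major: int, w_minor: int) -> bytes:
    """Pack the two version words as little-endian 16-bit values."""
    return struct.pack("<HH", w_major, w_minor)


if __name__ == "__main__":
    print(sfe_ifil(4, 0))                    # uncompressed SFe 4.00
    print(sfe_ifil(4, 1, compressed=True))   # compressed SFe 4.01
    print(sfe_ifil(5, 0))                    # uncompressed SFe 5.00
    print(sfe_version(1025))
```

Because each version component occupies one byte, the largest representable version is 255.255, which matches the limit mentioned above.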

Now, let's compare it to your solutions:

  • Spessasus proposed that wMajor remains 4 and only wMinor increases up to 65535.

    • It's similar to SFe versioning method 1, but with the difference that the wMajor value is equal to 4 instead of 2 or 3.
    • The main advantage of this method over SFe versioning method 1 is that it corresponds to the SFe specification version for 4.x.
    • However, when SFe 5.00 is released, the wMajor value remaining 4 may confuse some.
    • If we change the wMajor value every time the SFe major version changes, then it becomes SFe versioning method 2.
  • Davy proposed that wMajor starts at 4 for the foundations, while 5 and higher values are used for incompatible changes.

    • This is actually part of the plan for SFe, but only for the 64-bit version (SFe64).
    • We don't plan on making incompatible changes for SFe32, because if we make the format completely incompatible with legacy players, we may as well go 64-bit.

Should the SFe specification be full or partial?

Davy said that partial specifications are good enough for people who are familiar with the content of SFSPEC24.PDF, with a full specification being good if SFe is widely adopted.


What SFe features should be moved forward or backward?

We've got differing opinions on this topic.

Spessasus says that we should start implementing "structurally compatible" features before SFe is even implemented:

  • Dual bank select for 16384 variations.
  • User-defined default modulators with DMOD subchunk
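For reference, the 16384 figure comes from combining the two 7-bit MIDI Bank Select values (CC 0 for the MSB, CC 32 for the LSB): 128 × 128 = 16384. The sketch below shows only that arithmetic; the function names are hypothetical, and how SFe would store or match the combined value in a bank's preset headers is not covered here.

```python
def combined_bank(msb: int, lsb: int) -> int:
    """Combine Bank Select MSB (CC 0) and LSB (CC 32) into one 14-bit number."""
    if not (0 <= msb <= 127 and 0 <= lsb <= 127):
        raise ValueError("MIDI data bytes must be in the range 0-127")
    return (msb << 7) | lsb


def split_bank(bank: int):
    """Split a 14-bit bank number back into (MSB, LSB)."""
    if not 0 <= bank <= 0x3FFF:
        raise ValueError("bank number must fit in 14 bits")
    return bank >> 7, bank & 0x7F


if __name__ == "__main__":
    print(combined_bank(127, 127) + 1)   # number of distinct bank values
    print(split_bank(combined_bank(5, 3)))
```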

Davy says that there should be a wider set of features, including:

  • MIDI CC initial values
  • round-robin or sampling randomisation
  • independent envelopes/LFOs for each property
  • sample playback modes
  • 64-bit format

The plan is to prioritise the features that spessasus says are "structurally compatible", while adding many of Davy's wider set of suggested features at a later date to avoid overwhelming program developers.


If we can't get a formal specification for SFCF, can we reverse engineer it?

It looks like we've got some confusion about SFCF. It stands for SynthFont Custom Features. It was mentioned in the original Polyphone soundfont enhancement proposal:

(Viena - while still using the sf2 v2.xx format - extended some of the sf2 capabilities.)

It includes the features:

  • Always Play Sample to End
  • DLS Velocity-to-Volume Envelope Attack
  • DLS Velocity-to-Modulator Envelope Attack
  • Vibrato LFO to Volume

When should the final specification be ready?

This is a point of contention, but I think that if we do things carefully and don't rush them (as spessasus said), then we will have a much higher chance of success. And of course, we will have some kind of reference implementation before a final specification is released, so that people can check that their programs are compatible with SFe.


Follow-up things

So, from what has been said, here are some things to do in the next draft milestone (4.00.9) of the SFe specification:

  • define the .sft file extension as the file extension used by SFe.
    • We'll also allow .sf if an audit determines that no other program is using .sf.
    • A sub-chunk in the ISFe subchunk will be defined that declares the SFe type.
    • This completes the first task listed in my previous comment.
  • change how the ifil versioning works if it makes sense.
    • Your feedback is appreciated!
    • As long as there are no problems with the current proposal, then the second task is completed.
  • the specification will remain partial.
    • The plan is to turn it into a full specification for the final version.
  • we'll move many of the most important features forward to 4.0x versions of SFe, and move other features backwards.
    • I'll likely prioritise Davy's feature suggestions.
    • However, anyone can make a suggestion to us and we'll consider it for a future version.
    • This is subject to change.

I've sent spessasus and Davy invitations to the SFe-team-was-taken organisation, and if anyone else wants an invitation, I'll gladly give you one on request. We'd suggest that all future discussion on SFe be done on the SFe repository, because after all, this is the Polyphone repository and not the SFe one.

Thank you for your feedback, and sorry for another wall of text!
