tests + readme update
Natooz committed Sep 21, 2021
1 parent 08b381f commit 4c722e2
Showing 3 changed files with 52 additions and 28 deletions.
18 changes: 11 additions & 7 deletions README.md
@@ -90,7 +90,11 @@ generated_midi.dump('path/to/save/file.mid') # could have been done above by gi

## Encodings

_In the figures, yellow tokens are additional tokens, and tokens are vertically stacked at index 0 from the bottom up to the top._
The figures below show how the following music sheet is represented as token sequences by each encoding.

![Music sheet example](https://github.com/Natooz/MidiTok/blob/assets/assets/music_sheet.png?raw=true "Music sheet example")

_In the figures, tokens are stacked vertically, starting at index 0 at the bottom._

### MIDI-Like

@@ -99,7 +103,7 @@ Strategy used in the first symbolic music generative transformers and RNN / LSTM
NOTES:
* Rests act exactly like Time-Shifts. It is therefore recommended to choose a minimum rest range equal to your first beat resolution, so that time is shifted with the same accuracy. For instance, if your first beat resolution is ```(0, 4): 8```, you should choose a minimum rest of ```8```.
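
As a quick illustration, a configuration following this recommendation could look like the sketch below. The additional-token keys, in particular `rest_range`, are assumed here for the example; check the tokenizer's documented parameters for the exact names and semantics.

```python
import miditok

# Finest beat resolution: 8 samples per beat for positions between 0 and 4 beats
beat_res = {(0, 4): 8, (4, 12): 4}

# Assumed additional-token settings for this example: Rest tokens enabled,
# with a minimum rest value of 8 to match the finest beat resolution above
additional_tokens = {'Chord': False, 'Rest': True, 'Tempo': False,
                     'rest_range': (8, 2),  # assumed: (min rest resolution, max rest in beats)
                     'nb_tempos': 32, 'tempo_range': (40, 250)}

tokenizer = miditok.MIDILikeEncoding(beat_res=beat_res, additional_tokens=additional_tokens)
```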

![MIDI-Like figure](https://github.com/Natooz/MidiTok/blob/assets/assets/midi_like.png?raw=true "Three notes played together with different durations")
![MIDI-Like figure](https://github.com/Natooz/MidiTok/blob/assets/assets/midi_like.png?raw=true "MIDI-Like token sequence, with Time-Shifts and Note-Off tokens")

### REMI

@@ -110,7 +114,7 @@ NOTES:
* Including tempo tokens in a multitrack task with REMI is not recommended, as generating several tracks would lead to multiple and ambiguous tempo changes. Hence, in MidiTok only the tempo changes of the first track are kept in the final created MIDI.
* Position tokens always follow Rest tokens, to make sure the positions of the following notes are explicitly stated. Bar tokens can follow Rest tokens depending on their respective values and your parameters.

![REMI figure](https://github.com/Natooz/MidiTok/blob/assets/assets/remi.png?raw=true "Time is tracked with Bar and position tokens")
![REMI figure](https://github.com/Natooz/MidiTok/blob/assets/assets/remi.png?raw=true "REMI sequence, time is tracked with Bar and position tokens")

### Compound Word

@@ -121,15 +125,15 @@ You can combine them in your model the way you want. CP Word authors concatenate

At decoding, the easiest way to predict multiple tokens (employed by the original authors) is to project the output vector of your model with several projection matrices, one for each token type.
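
As a rough sketch, assuming a PyTorch model and hypothetical vocabulary sizes, these per-type projections could look like:

```python
import torch
from torch import nn

class MultiOutputProjection(nn.Module):
    """Projects a model's hidden states into one logit vector per token type."""
    def __init__(self, hidden_size: int, vocab_sizes: dict):
        super().__init__()
        # one projection matrix per token type (Family, Pitch, Velocity, Duration...)
        self.heads = nn.ModuleDict({name: nn.Linear(hidden_size, size)
                                    for name, size in vocab_sizes.items()})

    def forward(self, hidden: torch.Tensor) -> dict:
        return {name: head(hidden) for name, head in self.heads.items()}

# hypothetical hidden size and vocabulary sizes, one entry per token type
proj = MultiOutputProjection(512, {'Family': 4, 'Pitch': 88, 'Velocity': 32, 'Duration': 64})
logits = proj(torch.randn(2, 16, 512))  # each value has shape (2, 16, vocab_size)
```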

![Compound Word figure](https://github.com/Natooz/MidiTok/blob/assets/assets/cp_word.png?raw=true "Tokens of the same family are grouped together")
![Compound Word figure](https://github.com/Natooz/MidiTok/blob/assets/assets/cp_word.png?raw=true "CP Word sequence, tokens of the same family are grouped together")

### Structured

Introduced with the [Piano Inpainting Application](https://arxiv.org/abs/2107.05944), it is similar to the MIDI-Like encoding but uses _Duration_ tokens instead of _Note-Off_ tokens.
The main advantage of this encoding is the consistent token type transitions it imposes, which can greatly speed up training. The structure is: _Pitch_ -> _Velocity_ -> _Duration_ -> _Time Shift_ -> ... (then _Pitch_ again)
To keep this property, no additional tokens can be inserted in MidiTok's implementation.
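
For instance, this fixed cycle can be exploited at generation time by masking the logits of any token whose type is not the one expected next. A minimal sketch, assuming a `token_types` list mapping each vocabulary index to its token type:

```python
import numpy as np

TYPE_CYCLE = ['Pitch', 'Velocity', 'Duration', 'Time-Shift']

def mask_logits(logits: np.ndarray, token_types: list, last_type: str) -> np.ndarray:
    """Keeps only the logits of the token type expected after last_type."""
    expected = TYPE_CYCLE[(TYPE_CYCLE.index(last_type) + 1) % len(TYPE_CYCLE)]
    masked = logits.copy()
    for i, token_type in enumerate(token_types):
        if token_type != expected:
            masked[i] = -np.inf  # this token can never be sampled
    return masked
```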

![Structured figure](https://github.com/Natooz/MidiTok/blob/assets/assets/structured.png?raw=true "The token types always follow the same transition pattern")
![Structured figure](https://github.com/Natooz/MidiTok/blob/assets/assets/structured.png?raw=true "Structured MIDI encoding, the token types always follow the same transition pattern")

### Octuple

@@ -143,7 +147,7 @@ NOTES:
* Time signature tokens are not implemented in MidiTok.
* [Octuple Mono](miditok/octuple_mono.py) is a modified version with no program embedding at each time step.

![Octuple figure](https://github.com/Natooz/MidiTok/blob/assets/assets/octuple.png?raw=true "Sequence with notes from two different tracks, with a bar and position embeddings")
![Octuple figure](https://github.com/Natooz/MidiTok/blob/assets/assets/octuple.png?raw=true "Octuple sequence, with a bar and position embeddings")

### MuMIDI

@@ -157,7 +161,7 @@ NOTES:
* This implementation uses _Program_ tokens to distinguish tracks, based on their MIDI program. Hence, two tracks with the same program will be treated as the same one.
* As in the original MuMIDI implementation, MidiTok distinguishes pitch tokens of drums from pitch tokens of other instruments. More details in the [code](miditok/mumidi.py).

![MuMIDI figure](https://github.com/Natooz/MidiTok/blob/assets/assets/mumidi.png?raw=true "Sequence with notes from two different tracks, with a bar and position embeddings")
![MuMIDI figure](https://github.com/Natooz/MidiTok/blob/assets/assets/mumidi.png?raw=true "MuMIDI sequence, with a bar and position embeddings")

### Create your own

33 changes: 23 additions & 10 deletions tests/tests_multitrack.py
@@ -15,7 +15,7 @@
"""

import time
from sys import stdout
from copy import deepcopy
from pathlib import Path, PurePath
from typing import Union
@@ -37,7 +37,7 @@


def multitrack_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = './Maestro_MIDIs',
saving_midi: bool = True):
saving_erroneous_midis: bool = True):
""" Reads a few MIDI files, convert them into token sequences, convert them back to MIDI files.
The converted back MIDI files should identical to original one, expect with note starting and ending
times quantized, and maybe a some duplicated notes removed
@@ -47,12 +47,22 @@ def multitrack_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = '.
files = list(Path(data_path).glob('**/*.mid'))

for i, file_path in enumerate(files):
print(f'Converting MIDI {i+1} / {len(files)} - {file_path}')
bar_len = 60
filled_len = int(round(bar_len * i / len(files)))
percents = round(100.0 * i / len(files), 2)
bar = '=' * filled_len + '-' * (bar_len - filled_len)
prog = f'\r{i} / {len(files)} [{bar}] {percents:.1f}% ...Converting MIDIs to tokens: {file_path}'
stdout.write(prog)
stdout.flush()

# Reads the MIDI
midi = MidiFile(file_path)
try:
midi = MidiFile(PurePath(file_path))
except Exception as _: # ValueError, OSError, FileNotFoundError, IOError, EOFError, mido.KeySignatureError
continue
if midi.ticks_per_beat % max(BEAT_RES_TEST.values()) != 0:
continue

t0 = time.time()
for encoding in encodings:
tokenizer = getattr(miditok, encoding)(beat_res=BEAT_RES_TEST,
additional_tokens=deepcopy(ADDITIONAL_TOKENS_TEST))
@@ -78,21 +88,22 @@ def multitrack_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = '.
# Checks notes
errors = midis_equals(midi_to_compare, new_midi)
if len(errors) > 0:
print(f'Failed to encode/decode MIDI with {encoding[:-8]} ({sum(len(t) for t in errors)} errors)')
print(f'MIDI {i} - {file_path} failed to encode/decode with '
f'{encoding[:-8]} ({sum(len(t) for t in errors)} errors)')
# return False

# Checks tempos
tempo_errors = []
if tokenizer.additional_tokens['Tempo'] and encoding != 'MuMIDIEncoding': # MuMIDI doesn't decode tempos
tempo_errors = tempo_changes_equals(midi_to_compare.tempo_changes, new_midi.tempo_changes)
if len(tempo_errors) > 0:
print(f'Failed to encode/decode TEMPO changes with {encoding[:-8]} ({len(tempo_errors)} errors)')
'''print(f'MIDI {i} - {file_path} failed to encode/decode TEMPO changes with '
f'{encoding[:-8]} ({len(tempo_errors)} errors)')'''

if saving_midi:
if saving_erroneous_midis and (len(errors) > 0 or len(tempo_errors) > 0):
new_midi.dump(PurePath('tests', 'test_results', f'{file_path.stem}_{encoding[:-8]}')
.with_suffix('.mid'))

t1 = time.time()
print(f'Took {t1 - t0} seconds')
return True


@@ -105,6 +116,8 @@ def midi_to_tokens_to_midi(tokenizer: miditok.MIDITokenizer, midi: MidiFile) ->
:return: The converted MIDI object
"""
tokens = tokenizer.midi_to_tokens(midi)
if len(tokens) == 0: # no track after notes quantization, this can happen
return MidiFile()
inf = miditok.get_midi_programs(midi) # programs of tracks
new_midi = tokenizer.tokens_to_midi(tokens, inf, time_division=midi.ticks_per_beat)

29 changes: 18 additions & 11 deletions tests/tests_one_track.py
@@ -10,7 +10,7 @@
"""

import time
from sys import stdout
from copy import deepcopy
from pathlib import Path, PurePath
from typing import Union
@@ -32,26 +32,32 @@


def one_track_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = './Maestro_MIDIs',
saving_midi: bool = True) -> bool:
saving_erroneous_midis: bool = True) -> bool:
""" Reads a few MIDI files, convert them into token sequences, convert them back to MIDI files.
The converted back MIDI files should identical to original one, expect with note starting and ending
times quantized, and maybe a some duplicated notes removed
:param data_path: root path to the data to test
:param saving_midi: whether to save the results in a MIDI file
:param saving_erroneous_midis: whether to save MIDIs that were converted back with errors, to help debugging
"""
encodings = ['MIDILikeEncoding', 'StructuredEncoding', 'REMIEncoding', 'CPWordEncoding', 'OctupleEncoding',
'OctupleMonoEncoding', 'MuMIDIEncoding']
files = list(Path(data_path).glob('**/*.mid'))

for i, file_path in enumerate(files):
print(f'Converting MIDI {i + 1} / {len(files)} - {file_path}')
bar_len = 60
filled_len = int(round(bar_len * i / len(files)))
percents = round(100.0 * i / len(files), 2)
bar = '=' * filled_len + '-' * (bar_len - filled_len)
prog = f'\r{i} / {len(files)} [{bar}] {percents:.1f}% ...Converting MIDIs to tokens: {file_path}'
stdout.write(prog)
stdout.flush()

# Reads the midi
midi = MidiFile(file_path)
tracks = [deepcopy(midi.instruments[0])]
has_errors = False

t0 = time.time()
for encoding in encodings:
add_tokens = deepcopy(ADDITIONAL_TOKENS_TEST)
if encoding == 'MIDILikeEncoding':
@@ -77,11 +83,13 @@ def one_track_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = './
# Checks it's good
errors = track_equals(midi.instruments[0], track)
if len(errors) > 0:
has_errors = True
if errors[0][0] != 'len':
for err, note, exp in errors:
midi.markers.append(Marker(f'ERR {encoding[:-8]} with note {err} (pitch {note.pitch})',
note.start))
print(f'Failed to encode/decode MIDI with {encoding[:-8]} ({len(errors)} errors)')
print(f'MIDI {i} - {file_path} failed to encode/decode MIDI with '
f'{encoding[:-8]} ({len(errors)} errors)')
# return False
track.name = f'encoded with {encoding[:-8]}'
tracks.append(track)
@@ -90,12 +98,11 @@ def one_track_midi_to_tokens_to_midi(data_path: Union[str, Path, PurePath] = './
if tempo_changes is not None and tokenizer.additional_tokens['Tempo']:
tempo_errors = tempo_changes_equals(midi.tempo_changes, tempo_changes)
if len(tempo_errors) > 0:
print(f'Failed to encode/decode TEMPO changes with {encoding[:-8]} ({len(tempo_errors)} errors)')
has_errors = True
print(f'MIDI {i} - {file_path} failed to encode/decode TEMPO changes with '
f'{encoding[:-8]} ({len(tempo_errors)} errors)')

t1 = time.time()
print(f'Took {t1 - t0} seconds')

if saving_midi:
if saving_erroneous_midis and has_errors:
midi.instruments[0].name = 'original quantized'
tracks[0].name = 'original not quantized'

