Make Annotation.write_rttm follow most of RTTM specs #75

Open
wants to merge 1 commit into
base: develop

Conversation

JMasr

@JMasr JMasr commented Jan 5, 2022

The method write_rttm() only allows the type SPEAKER in the first field of the RTTM file.

This pull request adds the type NON-SPEECH in field 1 when the label of the segment is one of the 3 subtypes allowed in the RTTM File Format Specification (noise, music or other).
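
As a rough illustration of the rule described above (a sketch, not the actual diff in this PR), the type field could be chosen from the label as follows. The names `NON_SPEECH_SUBTYPES` and `rttm_type_for_label` are hypothetical and do not exist in pyannote.core.

```python
# Hypothetical helper, not part of pyannote.core.
# Subtypes that the proposal maps to the NON-SPEECH type.
NON_SPEECH_SUBTYPES = {"noise", "music", "other"}


def rttm_type_for_label(label: str) -> str:
    """Return the value proposed for RTTM field 1 given a segment label."""
    return "NON-SPEECH" if label in NON_SPEECH_SUBTYPES else "SPEAKER"
```

Any label outside the three listed subtypes falls back to SPEAKER, which matches the current behaviour of write_rttm().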

@hbredin
Member

hbredin commented Jan 5, 2022

Thanks. Would you mind sharing a link to the RTTM file format specification?

@JMasr
Author

JMasr commented Jan 5, 2022

Thank you for sharing and building this project.
Of course: you can find the RTTM File Format Specification in Appendix A of this NIST paper.

@hbredin
Member

hbredin commented Jan 18, 2022

(sorry for the delay in getting back to you)

It looks like RTTM files may contain much more than just SPEAKER and NON-SPEECH (column Type of Table A.2).
Also, there is no clear correspondence between pyannote.core.Annotation labels and the RTTM type, subtype, and name fields.

Therefore, unless you convince me otherwise and we find a way to really map Annotation to the RTTM specs, I probably won't merge this PR.

@JMasr
Author

JMasr commented Jan 18, 2022

Hi, @hbredin. Don't worry about the delay, and thanks for taking the time to reply.

I'm with you. Maybe this request is too limited. I think pyannote.core.Annotation is very useful for VAD, SAD, and speaker diarization. If we figure out a better way to map it to the RTTM specs, it could be equally useful for Acoustic Event Detection or Rich Transcription.

The issue for me is that if the method pyannote.core.Annotation.write_rttm only writes the type SPEAKER, I can't include acoustic events such as music in the annotation. Maybe a refactoring that covers all the specs would be better. What do you think?

@hbredin
Member

hbredin commented Jan 19, 2022

I'd definitely consider a PR that covers all the specs (or at least STT and MDE categories).

RTTM specs vs. Annotation

There is not a 100% correspondence between RTTM specs and what Annotation can handle.

for segment, track, label in annotation.itertracks(yield_label=True):
    pass
RTTM    Annotation
type    see below
file    annotation.uri
chnl    see below
tbeg    segment.start
tdur    segment.duration
ortho   N/A
stype   see below
name    label when type is SPEAKER
conf    N/A

N/A = information is not provided by Annotation
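
To make the table concrete, here is a minimal sketch of how one (segment, track, label) triple might be rendered as a nine-field RTTM line, in the column order of the table. It is not the current write_rttm implementation: `format_rttm_line` is a hypothetical function, plain floats stand in for segment.start and segment.duration, and the defaults (type SPEAKER, channel "1", "<NA>" for every field Annotation cannot provide) are assumptions.

```python
def format_rttm_line(uri, start, duration, label, rtype="SPEAKER", chnl="1"):
    """Render one RTTM line: type file chnl tbeg tdur ortho stype name conf."""
    fields = [
        rtype,               # type
        uri,                 # file  <- annotation.uri
        chnl,                # chnl  (assumed default: "1")
        f"{start:.3f}",      # tbeg  <- segment.start
        f"{duration:.3f}",   # tdur  <- segment.duration
        "<NA>",              # ortho (not provided by Annotation)
        "<NA>",              # stype (SPEAKER case only in this sketch)
        str(label),          # name  <- label when type is SPEAKER
        "<NA>",              # conf  (not provided by Annotation)
    ]
    return " ".join(fields)


print(format_rttm_line("file1", 1.5, 2.25, "spk_a"))
```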

About type

While track is used to differentiate two identical segments (think: perfect overlap between two speakers), we could try to repurpose it to provide a cue about the type (while still allowing it to differentiate two identical segments). Note, however, that track is expected to be either a string or an int.

For instance, we could use track with the convention {type}_{original_track}, where type can be any type between LEXEME and SPEAKER (see column Type of Table A.2) and original_track keeps the original role of differentiating identical segments.
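
A possible way to decode that convention is sketched below. Everything here is hypothetical: `split_track` is not a pyannote.core function, `KNOWN_TYPES` is only an illustrative subset of the Type column (the types mentioned in this thread), and falling back to SPEAKER for int tracks or unprefixed strings is an assumption.

```python
# Illustrative subset only; the full list is in column Type of Table A.2.
KNOWN_TYPES = ("SPEAKER", "NON-SPEECH", "LEXEME", "A/P")


def split_track(track, default_type="SPEAKER"):
    """Split a '{type}_{original_track}' track into (type, original_track).

    Int tracks and strings without a known type prefix are returned
    unchanged, paired with the default type.
    """
    if isinstance(track, str):
        for rtype in KNOWN_TYPES:
            prefix = rtype + "_"
            if track.startswith(prefix):
                return rtype, track[len(prefix):]
    return default_type, track
```

Matching on a known prefix rather than splitting on the first underscore keeps the sketch usable if a type name itself contains an underscore.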

About subtype

Once we infer type from track,

  • if type is A/P or SPEAKER, subtype should be "<NA>"
  • otherwise, subtype should be label.
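
The two rules above amount to a one-line function. `rttm_subtype` is a hypothetical name used only for illustration.

```python
def rttm_subtype(rtype, label):
    """Return the stype field: '<NA>' for A/P and SPEAKER, else the label."""
    return "<NA>" if rtype in ("A/P", "SPEAKER") else str(label)
```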

About chnl

We could make annotation.uri carry channel information (e.g. using a {file}:{chnl} convention).
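
A sketch of how such a uri might be decoded follows. `split_uri` is a hypothetical helper, and defaulting to channel "1" when no ":" is present is an assumption.

```python
def split_uri(uri, default_chnl="1"):
    """Split a '{file}:{chnl}' uri into (file, chnl).

    The last ':' is used as the separator, so file names that
    themselves contain ':' still work.
    """
    file, sep, chnl = uri.rpartition(":")
    if not sep:
        # No channel encoded in the uri: fall back to the default.
        return uri, default_chnl
    return file, chnl
```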

What do you think?

@hbredin hbredin changed the title Adding the type NON-SPEECH to the RTTM file writer method Make Annotation.write_rttm follows most of RTTM specs Jan 19, 2022
@hbredin hbredin changed the title Make Annotation.write_rttm follows most of RTTM specs Make Annotation.write_rttm follow most of RTTM specs Jan 19, 2022