Supplementary Information for the paper "Different languages, similar encoding efficiency: comparable information rates across the human communicative niche"

These supplementary materials contain:

  • the primary data, as two TAB-separated files (with a .csv extension):
    • InfoRateData.csv: contains most of the primary data, with the following columns:
      • Speaker: the unique speaker IDs
      • Language: the language code
      • Text: the text ID
      • Sex: the speaker's sex (F=female, M=male)
      • Duration: the text's duration (in seconds)
      • NS: the text's canonical number of syllables
      • ShE: the language's Shannon Entropy
      • ID: the language's Information Density
      • Age: the speaker's age (in years)
    • AutomaticSylDetect.csv: contains the results of the automatic syllable detection algorithm, with the following columns:
      • soundname: the unique identifier of the soundfile (= a text produced by a given speaker in a given language)
      • nsyll: the detected number of syllables
      • npause: the number of detected pauses (longer than 150 ms)
      • dur: the total duration (in seconds)
      • phonationtime: the duration of the actual phonation (in seconds)
  • the full Rmarkdown analysis and plotting script InfoRate.Rmd
  • the resulting HTML output InfoRate.html
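
Because the data files are TAB-separated despite their .csv extension, a reader that assumes commas will misparse them. The following is a minimal sketch of loading such a file with Python's standard library, not the authors' method (their analysis is in InfoRate.Rmd). The helper names `read_tsv` and `syllable_rate`, the sample row, and the derived quantity NS / Duration (canonical syllables per second) are illustrative assumptions; the sketch also assumes that the first row of each file holds the column names listed above.

```python
import csv
import io


def read_tsv(source):
    """Read a TAB-separated file with a header row into a list of dicts.

    `source` is any open text file object (hypothetical helper).
    """
    return list(csv.DictReader(source, delimiter="\t"))


def syllable_rate(row):
    """Canonical syllables per second for one row (assumed: NS / Duration)."""
    return float(row["NS"]) / float(row["Duration"])


# Made-up sample row using the documented column names; these values are
# NOT taken from InfoRateData.csv.
sample = (
    "Speaker\tLanguage\tText\tSex\tDuration\tNS\tShE\tID\tAge\n"
    "S01\tXXX\tT1\tF\t20.0\t100\t8.0\t5.0\t30\n"
)

rows = read_tsv(io.StringIO(sample))
rates = [syllable_rate(r) for r in rows]
print(rates)

# With the real file, assuming it sits in the working directory:
# with open("InfoRateData.csv", newline="", encoding="utf-8") as f:
#     rows = read_tsv(f)
```

All values are read as strings, so only the columns actually needed are converted to numbers; this avoids failing on any non-numeric entries that other columns might contain.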