Releases
v1.3.0
Under data/
Added
Big scrape for 2022. (#464)
Added the --fresh flag to data/scrape/scrape.py to facilitate running the big scrape in batches. (#464)
Added the --exclude flag for excluding one or more languages in data/scrape/scrape.py. (#460)
Added data/src/normalize.py. (#356)
Updated README.md. (#360)
Added data/cg/tsv/geo.tsv. (#367)
Added data/morphology. (#369)
Added SIGMORPHON 2021 morphology data. (#375)
Added data/cg/tsv/jpn_hira.tsv. (#384)
Enforced final newlines. (#387)
Added all UniMorph languages to morphology. (#393)
Added data/covering_grammar/tsv/fre_latn_phonemic.tsv (#398)
Added data/covering_grammar/lib/make_test_file.py (#396, #399)
Added Komi-Zyrian (kpv). (#400)
Added Makasar (mak). (#415, #419)
Added Zou (zom). (#421)
Added Wiyot (wiy). (#422)
Added Sidamo (sid). (#423)
Added Central Atlas Tamazight (tzm). (#429)
Added Chibcha (chb). (#430)
Added Kashmiri (kas). (#431)
Added Malayalam (mal). (#434)
Added Dhivehi (div). (#437)
Added Akkadian (akk). (#441)
Added Central Nahuatl (nhn). (#443)
Added Etruscan (ett). (#444)
Added Gujarati (guj). (#445)
Added Kannada (kan). (#446)
Added Karelian (krl). (#447)
Added Romagnol (rgn). (#448)
Added Southern Yukaghir (yux). (#449)
Added Urak Lawoi' (urk). (#451)
Added Hausa (ha). (#452)
Added Kashubian (csb). (#453)
Added Tabaru (tby). (#455)
Added West Makian (mqs). (#457)
Added Amharic (amh). (#458)
Added Livvi (olo). (#459)
Added Kalmyk (xal). (#472)
Added Ternate (tft). (#473)
Added Abkhaz (abk). (#474)
Added Farefare (gur). (#475)
Added Iban (iba). (#476)
Added Laz (lzz). (#477)
Changed
Switched to ISO 639-3 language codes. (#468)
Updated scraped data in preparation for the SIGMORPHON 2022 shared task: swe nno ger dut ita rum ukr bel tgl ceb ben asm per pus tha lwl. (#461)
Made scripts under data/frequencies/ and data/morphology/ more flexible, especially for the purposes of preparing data for a shared task. (#461)
Fixed the --restriction flag for specifying multiple languages in data/scrape/scrape.py. (#460)
Added covering grammar coverage error log and specified error_type in error_analysis.py. (#424)
Added error log writing in error_analysis.py. (#420)
Added new columns in summary tables. (#365)
Fixed broken paths in data/src/generate_phones_summary.py and in data/phones/HOWTO.md. (#352)
Added Atong (India) (aot). (#353)
Added Egyptian Arabic (arz). (#354)
Added Lolopo (ycl). (#355)
Fixed Unicode normalization in data/phones/slv_phonemic.phones and re-scraped Slovenian data. (#356)
Updated data/phones/HOWTO.md to include instructions on applying the NFC Unicode normalization (#357)
Updated data/src/normalize.py to be more efficient. (#358)
Fixed inaccuracies in data/phones/geo_phonemic.phones. (#367)
Fixed typo in data/cg/tsv/geo.tsv and added missing character. (#370)
Morphology URLs are now provided as a list. (#376)
Configured and scraped Yamphu (ybi). (#380)
Configured and scraped Khumi Chin (cnk). (#381)
Made summary generation in common_characters.py optional. (#382)
Fixed phone counting in data/src/generate_phones_summary.py (#390, #392)
Reorganized scraping scripts under data/scrape (#394)
Reorganized .phones files and related scripts under data/phones (#395)
Reorganized CG files and related scripts under data/covering_grammar (#395)
Reorganized data/phones/phones/fre_phonemic.phones (#398)
Removed data/src/ (#401)
Renamed TSV files and phonelists to use the terms "broad"/"narrow" instead of "phonemic"/"phonetic" (#389, #402, #405)
Fixed typo in README.md (#407)
Fixed column ordering of the test file read by the script in data/covering_grammar/lib/error_analysis.py (#411)
Fixed common character collection in common_characters.py (#419)
Fixed scraping test for blt. (#436)
Changed URLs to point at CUNY-CL repo, where applicable. (#438)
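As a minimal sketch of the NFC Unicode normalization referred to in #356 and #357, the snippet below uses Python's standard unicodedata module. The function name normalize_nfc is hypothetical and chosen only for illustration; it is not taken from data/src/normalize.py or data/phones/HOWTO.md.

```python
import unicodedata


def normalize_nfc(text: str) -> str:
    """Returns the NFC (canonical composition) form of the text."""
    return unicodedata.normalize("NFC", text)


# "e" followed by U+0301 (combining acute accent) composes to the single
# code point U+00E9 under NFC.
decomposed = "e\u0301"
composed = normalize_nfc(decomposed)
print(len(decomposed), len(composed))  # prints "2 1"
```

Applying the same form to both scraped data and .phones files keeps visually identical strings comparable code point by code point.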
Under wikipron/ and elsewhere
Added
Added ckb in languagecodes.py. (#464)
Added support for Python 3.10. (#462)
Added test of phones list generation in test_data/test_summary.py (#363)
Added Min Nan extraction function. (#397)
Added Tai Dam extraction function, configuration and initial scrape. (#435)
Added test of casefold value for languages in data/scrape/lib/languages.json (#442)
Added support for Python 3.11. (#479)
Added checks for the Python source distribution and wheel on CI. (#479)
Turned on tests for Windows on CI. (#479)
Removed
Dropped support for Python 3.6. (#462)
Dropped support for Python 3.7. (#479)
Changed
Switched to ISO 639-3 language codes. (#468)
Converted setup.py to pyproject.toml. (#479)