Releases: ETCBC/nestle1904
Clauses, phrases, frames, and subj refs
There are more edges now: wherever the XML refers to other parts of the XML by id, that reference is turned into an edge.
Phrases and clauses have been added as well.
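The id-to-edge idea can be illustrated with a minimal sketch. It is not the converter's actual code: the attribute names `id` and `subjref`, the sample XML, and the function `collect_edges` are all assumptions made for illustration, and nodes are simply numbered in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the attribute names `id` and `subjref` are assumptions.
SAMPLE = """
<sentence>
  <w id="w1">a</w>
  <w id="w2" subjref="w1">b</w>
  <w id="w3" subjref="w1">c</w>
</sentence>
"""


def collect_edges(xml_text, id_attr="id", ref_attr="subjref"):
    """Return (from_node, to_node) pairs for every id reference found."""
    root = ET.fromstring(xml_text)

    # First pass: number elements in document order and remember their ids.
    node_of = {}
    for n, elem in enumerate(root.iter(), start=1):
        ident = elem.get(id_attr)
        if ident is not None:
            node_of[ident] = n

    # Second pass: each reference to a known id becomes one edge.
    edges = []
    for n, elem in enumerate(root.iter(), start=1):
        refs = elem.get(ref_attr)
        if refs is None:
            continue
        for target in refs.split():
            if target in node_of:
                edges.append((n, node_of[target]))
    return edges


edges = collect_edges(SAMPLE)
```

Two passes are used so that a reference may point forward as well as backward in the document.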
Tweaked metadata and some features
- Attributes that only ever have the value `true` get value `1` (int).
- Generated numbers for books, sentences, words are now all in feature `num`.
- Features are not split between the `w` and `wg` elements; if they have values for both, they have slightly different meanings.
- The descriptions of the features are a little bit clearer.
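As a rough illustration of the first tweak, the sketch below replaces the string `true` with the integer `1` for attributes that never take any other value. The record layout, the attribute names `flag` and `kind`, and the function `normalize_flags` are hypothetical; the real conversion may work differently.

```python
def normalize_flags(records):
    """Map 'true' to 1 for attributes whose only observed value is 'true'."""
    # Collect every value seen per attribute across all records.
    seen = {}
    for rec in records:
        for name, value in rec.items():
            seen.setdefault(name, set()).add(value)

    # An attribute counts as a flag only if 'true' is its sole value.
    flags = {name for name, values in seen.items() if values == {"true"}}

    return [
        {name: (1 if name in flags else value) for name, value in rec.items()}
        for rec in records
    ]


# Hypothetical records: `flag` is only ever 'true', `kind` has other values too.
records = [
    {"flag": "true", "kind": "true"},
    {"kind": "other"},
    {"flag": "true"},
]
normalized = normalize_flags(records)
```

Here `flag` is converted because `true` is its only value, while `kind` keeps its strings because it also takes the value `other`.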
Good word order
This version of the data has the words in proper order.
This was achieved by a new version of the TF walker converter, which can now reorder slots.
First working dataset
Data version 0.1.1