Releases: spencermountain/compromise
13.0.0
major changes to .export() and [capture] group match-syntax.
- [breaking] move .export() and .load() methods to plugin (compromise-export)
- change .export() format - this hasn't worked properly since v12. (mis-parsed contractions) see #669
- [breaking] split compromise-output into compromise-html and compromise-hash plugins
- [breaking] .match('foo [bar]') no longer returns 'bar'. (use .match('foo [bar]', 0))
- [breaking] capture groups are no longer merged. .match('[foo] [bar]') returns two groups accessible with the new .groups() function
- [breaking] change .sentences() method to return only full-sentences of matches (use .all() instead)
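The capture-group changes above could be handled roughly as follows. This is a minimal sketch, not the library's own code: the helper names firstCapture and captureTexts are hypothetical, the nlp constructor is passed in rather than loaded here, and the shape of the .groups() result (a plain object of Doc values) is an assumption.

```javascript
// Sketch only: `nlp` is the compromise constructor, passed in by the caller
// so this snippet makes no assumption about how the library is loaded.

// v12 habit: .match('foo [bar]') returned just the captured term.
// v13: ask for the capture group explicitly by index.
function firstCapture(nlp, text, pattern) {
  return nlp(text).match(pattern, 0).text()
}

// v13: several capture groups stay separate; .groups() exposes them.
// Assumption: the result is a plain object whose values are Doc objects.
function captureTexts(nlp, text, pattern) {
  const groups = nlp(text).match(pattern).groups()
  const out = {}
  for (const key of Object.keys(groups)) {
    out[key] = groups[key].text()
  }
  return out
}
```

With the real library this might be called as firstCapture(nlp, 'foo bar', 'foo [bar]'), where nlp comes from the compromise package.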
modifications:
- fix nlp.clone() - hasn't worked properly since v12. (@Drache93)
- fix issues with greedy capture [*] and [.+] - (@Drache93) 💛
- add whitespace properties (pre+post) to default json output (suppress with .json({ whitespace: false }))
- .lookup({ key: val }) with an object now returns an object back ({val: Doc})
- add nlp constructor as a third param to .extend()
- support lexicon object param in tokenize - .tokenize('my word', { word: 'tag' })
- clean-up of scripts and tooling
- improved typescript types
- add support for some french contractions like j'aime -> je aime
- allow null results in .map() function
new things:
- add new named-match syntax, with .groups() method (@Drache93)
- add nlp.fromJSON() method
- add a new compromise-tokenize.js build, without the tagger, or data included.
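The two API additions above might be used along these lines. This is a hedged sketch: the helper names personParts and roundTrip are hypothetical, the [<name>pattern] form of the named-match syntax is an assumption (these notes do not spell it out), and nlp.fromJSON() is assumed to accept the output of .json().

```javascript
// Sketch only: `nlp` is the compromise constructor, supplied by the caller.

// Named capture groups. Assumption: the named syntax is [<name>pattern];
// the release notes do not spell it out.
function personParts(nlp, text) {
  const groups = nlp(text)
    .match('[<first>#FirstName] [<last>#LastName]')
    .groups()
  return {
    first: groups.first ? groups.first.text() : null,
    last: groups.last ? groups.last.text() : null,
  }
}

// Round trip through JSON with the new nlp.fromJSON() method.
// Assumption: it accepts the array produced by .json().
function roundTrip(nlp, text) {
  const json = nlp(text).json()
  return nlp.fromJSON(json).text()
}
```

Passing nlp in as a parameter keeps the sketch independent of how the library is loaded (full build or the tokenize-only build).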
12.3.0
- prefer @titleCase instead of #TitleCase tag
- update dependencies
- fix case-sensitive paths
- fix greedy-start match condition regression #651
- fix single period sentence runtime error
- fix potentially-unsafe regexes
- improved tagging for '-ed' verbs (#616)
- improve support for auxiliary-pastTense ('was lifted') verb-phrases
- more robust number-tagging regexes
- setup typescript types for plugins #661 (thanks @Drache93!)
- verb conjugation and tagger bugfixes
- disambiguate acronyms & yelling
12.2.1
- fix 'aint' contraction
- make Doc.world writable
- update deps
- more tests
- fix shared period with acronym at end of sentence
- fix some mis-classification of contraction
- fix over-active emoji regex
- tag 'cookin', 'hootin' as Gerund
- support unicode single-quote symbols in contractions
12.2.0
- improved splitting in .nouns()
- add .nouns().adjectives() method
- add concat param to .pre() and .post()
- allow ellipses at start of term "....so" in @hasEllipses
- fix matches with optional-end foo?$ match syntax
- add typescript types for subsets
12.1.0
- add 'sideEffect:false' flag to build
- considerable speedup (20%) in tagger
- ensure trimming of whitespace for root/clean/reduced text formats
- fix client-side logging
- more flexible params to replace() and replaceWith()
12.0.0
compromise is a modest library that does natural-language processing in javascript.
it was built to make searching and transforming human-text easy and playful.
I'm very proud to release compromise v12, our strongest, fastest, and smallest release yet.
v10
v10 has about a dozen small changes to the api:
Values:
- cleanup & rename some .value() methods
- new .lessThan(), .greaterThan(), .equalTo() methods
Tagging:
- change lumping behaviour of lexicon terms with multiple words
- keep more former tags after a term replace method
Matching:
- new prefix/suffix/infix matches with _ffix syntax
- .match 'range' queries now use a real match - #Adverb{2,4}
- new .before() and .after() match methods
Misc:
- new .random() method
- tag() supports a sequence of tags for a sequence of terms
- removes .lexicon() method for many-lexicons concept
- changes params of .replaceWith() method to a 'keyTags' boolean
- improved .debug() and logging on clientside
v9.1.0
uses es5 prototype inheritance instead of es6 classes, for a significant reduction of babel-build filesize
v9.0.0
benchmarked between 30-50% faster on parsing+tagging
- renames Term.tag object to Term.tags so the .tag() method can work throughout more-consistently
- fixes 'Auxillary' tag typo to 'Auxiliary'
- optimisation of .match(), and tagset - significant speedup!
- adds .tagger() method and cleanup extra params
- adds wordStart and wordEnd offsets to .out('offset') for whitespace+punctuation
- new .has() method for faster lookups
first v7 release
v1.1.0
tag bower release
v1.0.0
added name genders and beginning of co-reference resolution ('Tony' -> 'he') API.
small breaking change on Noun.is_plural and Noun.is_entity, affording significant pos() speedup. Bumped Major version for these changes.
v0.5.2
tag bower release