Release date: May 5, 2021
- Added dense retrieval support for DistilBERT KD, SBERT, and ANCE.
- Added support for KILT, including integration tests.
- Added support for pre-tokenized collections and tokenization using sentencepiece and other models.
- Added support and guide to reproduce Elasticsearch multi-field experiments.
- Added integration tests to reproduce text classification pseudo-relevance feedback experiments.
- Added option to make cache directory configurable.
- Improved `pyserini.fusion`.
- Improved support on Windows.
- Improved dense retrieval integration test cases.
- Organized integration tests into `sparse`, `dense`, and `clprf`.
- Refactored dense query encoders.
- Fixed bugs related to iteration order of topics.
- Cleaned up LTR related code.
- Changed terminology in documentation from "replicate" to "reproduce", per the latest ACM policy.
Contributors to this release, sorted by number of commits:
- Jimmy Lin (lintool)
- Xueguang Ma (MXueguang)
- Stephanie Hu (stephaniewhoo)
- Yuqi Liu (yuki617)
- Arthur Chen (ArthurChen189)
- Kai Sun (KaiSun314)
- Andrew Guo (andrewyguo)
- Chris Kamphuis (Chriskamphuis)
- Julie Tibshirani (jtibshirani)
- Mayank Anand (mayankanand007)
- Sailesh Nankani (saileshnankani)
- Shengyao Zhuang (ArvinZhuang)
- Vinay Damodaran (vrdn-23)
- Yuxuan Ji (yuxuan-ji)
- Calvin Wang (printfCalvin)
All contributors, sorted by number of commits according to GitHub:
- Jimmy Lin (lintool)
- Xueguang Ma (MXueguang)
- Johnson Han (x65han)
- Yuqi Liu (yuki617)
- Stephanie Hu (stephaniewhoo)
- Chris Kamphuis (Chriskamphuis)
- Zeynep Akkalyoncu Yilmaz (zeynepakkalyoncu)
- Xinyu Mavis Liu (x389liu)
- Pepijn Boers (PepijnBoers)
- Ronak Pradeep (ronakice)
- Hang Cui (HangCui0510)
- Qing Guo (qguo96)
- Tommaso Teofili (tteofili)
- Kai Sun (KaiSun314)
- Marko Arezina (mrkarezina)
- Arthur Chen (ArthurChen189)
- Dahlia Chehata (Dahlia-Chehata)
- Rodrigo Nogueira (rodrigonogueira4)
- Larry Li (larryli1999)
- Jiarui Zhang (jrzhang12)
- Yuxuan Ji (yuxuan-ji)
- Tim Hatch (thatch)
- Alireza Mirzaeiyan (amirzaeiyan)
- Julie Tibshirani (jtibshirani)
- Adam Yang (adamyy)
- Mayank Anand (mayankanand007)
- Vinay Damodaran (vrdn-23)
- Yue Zhang (nsndimt)
- Calvin Wang (printfCalvin)
- Sailesh Nankani (saileshnankani)
- Rakeeb Hossain (rakeeb123)
- Jerry Huang (jhuang265)
- Jeffrey Chen (JeffreyCA)
- Shengyao Zhuang (ArvinZhuang)
- Andrew Guo (andrewyguo)
- Hector (Xinhai) Wei (HEC2018)
- Emily Ye (yemiliey)