Pyserini provides a number of pre-built Lucene indexes. To list what's available in code:
from pyserini.search.lucene import LuceneSearcher
LuceneSearcher.list_prebuilt_indexes()
from pyserini.index.lucene import IndexReader
IndexReader.list_prebuilt_indexes()
It's easy to initialize a searcher from a pre-built index:
searcher = LuceneSearcher.from_prebuilt_index('robust04')
You can use this simple Python one-liner to download the pre-built index:
python -c "from pyserini.search.lucene import LuceneSearcher; LuceneSearcher.from_prebuilt_index('robust04')"
The downloaded index will be in ~/.cache/pyserini/indexes/.
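To check which pre-built indexes have already been downloaded, you can simply list that cache directory. Here is a minimal sketch, assuming the default cache location mentioned above; the helper name list_cached_indexes is hypothetical and not part of Pyserini:

```python
from pathlib import Path


def list_cached_indexes(cache_dir="~/.cache/pyserini/indexes"):
    """Return the sorted names of sub-directories in the index cache.

    Hypothetical helper (not part of Pyserini). Returns an empty list
    if the directory does not exist, i.e. nothing has been downloaded yet.
    """
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    for name in list_cached_indexes():
        print(name)
```

Pass a different path if your cache lives somewhere else.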
It's similarly easy to initialize an index reader from a pre-built index:
index_reader = IndexReader.from_prebuilt_index('robust04')
index_reader.stats()
The output will be:
{'total_terms': 174540872, 'documents': 528030, 'non_empty_documents': 528030, 'unique_terms': 923436}
Note that unless the underlying index was built with the -optimize option (i.e., merging all index segments into a single segment), unique_terms will show -1.
Nope, that's not a bug.
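Code that consumes these statistics may want to treat -1 as "not available" rather than as a real count. Here is a minimal sketch that works on a dictionary shaped like the output above; the helper unique_terms_or_none is hypothetical and not part of Pyserini:

```python
def unique_terms_or_none(stats):
    """Return the unique-term count, or None when it is reported as -1.

    `stats` is a dict shaped like the output of index_reader.stats().
    Hypothetical helper, not part of Pyserini.
    """
    value = stats.get("unique_terms", -1)
    return None if value == -1 else value


# Statistics copied from the robust04 output shown above.
robust04_stats = {
    "total_terms": 174540872,
    "documents": 528030,
    "non_empty_documents": 528030,
    "unique_terms": 923436,
}

print(unique_terms_or_none(robust04_stats))  # 923436
# Simulate an index that was not built with -optimize.
print(unique_terms_or_none({**robust04_stats, "unique_terms": -1}))  # None
```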
Below is a summary of the pre-built indexes that are currently available.
Detailed configuration information for the pre-built indexes is stored in pyserini/prebuilt_index_info.py.
- cacm: Lucene index of the CACM corpus
- robust04 [readme]: Lucene index of TREC Disks 4 & 5 (minus Congressional Records), used in the TREC 2004 Robust Track
- msmarco-passage-ltr [readme]: Lucene index of the MS MARCO passage corpus with four extra preprocessed fields for LTR
- msmarco-doc-per-passage-ltr: Lucene index of the MS MARCO document per-passage corpus with four extra preprocessed fields for LTR
- msmarco-document-segment-ltr: Lucene index of the MS MARCO document segmented corpus with four extra preprocessed fields for LTR
- msmarco-v1-doc [readme]: Lucene index of the MS MARCO V1 document corpus.
- msmarco-v1-doc-slim [readme]: Lucene index of the MS MARCO V1 document corpus ('slim' version).
- msmarco-v1-doc-full [readme]: Lucene index of the MS MARCO V1 document corpus ('full' version).
- msmarco-v1-doc-d2q-t5 [readme]: Lucene index of the MS MARCO V1 document corpus with doc2query-T5 expansions.
- msmarco-v1-doc-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V1 document corpus with doc2query-T5 expansions.
- msmarco-v1-doc-segmented [readme]: Lucene index of the MS MARCO V1 segmented document corpus.
- msmarco-v1-doc-segmented-slim [readme]: Lucene index of the MS MARCO V1 segmented document corpus ('slim' version).
- msmarco-v1-doc-segmented-full [readme]: Lucene index of the MS MARCO V1 segmented document corpus ('full' version).
- msmarco-v1-doc-segmented-d2q-t5 [readme]: Lucene index of the MS MARCO V1 segmented document corpus with doc2query-T5 expansions.
- msmarco-v1-doc-segmented-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V1 segmented document corpus with doc2query-T5 expansions.
- msmarco-v1-passage [readme]: Lucene index of the MS MARCO V1 passage corpus.
- msmarco-v1-passage-slim [readme]: Lucene index of the MS MARCO V1 passage corpus ('slim' version).
- msmarco-v1-passage-full [readme]: Lucene index of the MS MARCO V1 passage corpus ('full' version).
- msmarco-v1-passage-d2q-t5 [readme]: Lucene index of the MS MARCO V1 passage corpus with doc2query-T5 expansions.
- msmarco-v1-passage-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V1 passage corpus with doc2query-T5 expansions.
- msmarco-v2-doc [readme]: Lucene index of the MS MARCO V2 document corpus.
- msmarco-v2-doc-slim [readme]: Lucene index of the MS MARCO V2 document corpus ('slim' version).
- msmarco-v2-doc-full [readme]: Lucene index of the MS MARCO V2 document corpus ('full' version).
- msmarco-v2-doc-d2q-t5 [readme]: Lucene index of the MS MARCO V2 document corpus with doc2query-T5 expansions.
- msmarco-v2-doc-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V2 document corpus with doc2query-T5 expansions.
- msmarco-v2-doc-segmented [readme]: Lucene index of the MS MARCO V2 segmented document corpus.
- msmarco-v2-doc-segmented-slim [readme]: Lucene index of the MS MARCO V2 segmented document corpus ('slim' version).
- msmarco-v2-doc-segmented-full [readme]: Lucene index of the MS MARCO V2 segmented document corpus ('full' version).
- msmarco-v2-doc-segmented-d2q-t5 [readme]: Lucene index of the MS MARCO V2 segmented document corpus with doc2query-T5 expansions.
- msmarco-v2-doc-segmented-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V2 segmented document corpus with doc2query-T5 expansions.
- msmarco-v2-passage [readme]: Lucene index of the MS MARCO V2 passage corpus.
- msmarco-v2-passage-slim [readme]: Lucene index of the MS MARCO V2 passage corpus ('slim' version).
- msmarco-v2-passage-full [readme]: Lucene index of the MS MARCO V2 passage corpus ('full' version).
- msmarco-v2-passage-d2q-t5 [readme]: Lucene index of the MS MARCO V2 passage corpus with doc2query-T5 expansions.
- msmarco-v2-passage-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V2 passage corpus with doc2query-T5 expansions.
- msmarco-v2-passage-augmented [readme]: Lucene index of the MS MARCO V2 augmented passage corpus.
- msmarco-v2-passage-augmented-slim [readme]: Lucene index of the MS MARCO V2 augmented passage corpus ('slim' version).
- msmarco-v2-passage-augmented-full [readme]: Lucene index of the MS MARCO V2 augmented passage corpus ('full' version).
- msmarco-v2-passage-augmented-d2q-t5 [readme]: Lucene index of the MS MARCO V2 augmented passage corpus with doc2query-T5 expansions.
- msmarco-v2-passage-augmented-d2q-t5-docvectors [readme]: Lucene index (+docvectors) of the MS MARCO V2 augmented passage corpus with doc2query-T5 expansions.
- enwiki-paragraphs: Lucene index of English Wikipedia for BERTserini
- zhwiki-paragraphs: Lucene index of Chinese Wikipedia for BERTserini
- trec-covid-r5-abstract: Lucene index for TREC-COVID Round 5: abstract index
- trec-covid-r5-full-text: Lucene index for TREC-COVID Round 5: full-text index
- trec-covid-r5-paragraph: Lucene index for TREC-COVID Round 5: paragraph index
- trec-covid-r4-abstract: Lucene index for TREC-COVID Round 4: abstract index
- trec-covid-r4-full-text: Lucene index for TREC-COVID Round 4: full-text index
- trec-covid-r4-paragraph: Lucene index for TREC-COVID Round 4: paragraph index
- trec-covid-r3-abstract: Lucene index for TREC-COVID Round 3: abstract index
- trec-covid-r3-full-text: Lucene index for TREC-COVID Round 3: full-text index
- trec-covid-r3-paragraph: Lucene index for TREC-COVID Round 3: paragraph index
- trec-covid-r2-abstract: Lucene index for TREC-COVID Round 2: abstract index
- trec-covid-r2-full-text: Lucene index for TREC-COVID Round 2: full-text index
- trec-covid-r2-paragraph: Lucene index for TREC-COVID Round 2: paragraph index
- trec-covid-r1-abstract: Lucene index for TREC-COVID Round 1: abstract index
- trec-covid-r1-full-text: Lucene index for TREC-COVID Round 1: full-text index
- trec-covid-r1-paragraph: Lucene index for TREC-COVID Round 1: paragraph index
- cast2019: Lucene index for TREC 2019 CaST
- wikipedia-dpr [readme]: Lucene index of Wikipedia with DPR 100-word splits
- wikipedia-dpr-slim [readme]: Lucene index of Wikipedia with DPR 100-word splits (slim version, document text not stored)
- wikipedia-kilt-doc [readme]: Lucene index of Wikipedia snapshot used as KILT's knowledge source.
- mrtydi-v1.1-arabic [readme]: Lucene index for Mr.TyDi v1.1 (Arabic).
- mrtydi-v1.1-bengali [readme]: Lucene index for Mr.TyDi v1.1 (Bengali).
- mrtydi-v1.1-english [readme]: Lucene index for Mr.TyDi v1.1 (English).
- mrtydi-v1.1-finnish [readme]: Lucene index for Mr.TyDi v1.1 (Finnish).
- mrtydi-v1.1-indonesian [readme]: Lucene index for Mr.TyDi v1.1 (Indonesian).
- mrtydi-v1.1-japanese [readme]: Lucene index for Mr.TyDi v1.1 (Japanese).
- mrtydi-v1.1-korean [readme]: Lucene index for Mr.TyDi v1.1 (Korean).
- mrtydi-v1.1-russian [readme]: Lucene index for Mr.TyDi v1.1 (Russian).
- mrtydi-v1.1-swahili [readme]: Lucene index for Mr.TyDi v1.1 (Swahili).
- mrtydi-v1.1-telugu [readme]: Lucene index for Mr.TyDi v1.1 (Telugu).
- mrtydi-v1.1-thai [readme]: Lucene index for Mr.TyDi v1.1 (Thai).
- msmarco-passage [readme]: Lucene index of the MS MARCO passage corpus (deprecated; use msmarco-v1-passage instead).
- msmarco-passage-slim [readme]: Lucene index of the MS MARCO passage corpus (slim version, document text not stored) (deprecated; use msmarco-v1-passage-slim instead).
- msmarco-doc [readme]: Lucene index of the MS MARCO document corpus (deprecated; use msmarco-v1-doc instead).
- msmarco-doc-slim [readme]: Lucene index of the MS MARCO document corpus (slim version, document text not stored) (deprecated; use msmarco-v1-doc-slim instead).
- msmarco-doc-per-passage [readme]: Lucene index of the MS MARCO document corpus segmented into passages (deprecated; use msmarco-v1-doc-segmented instead).
- msmarco-doc-per-passage-slim [readme]: Lucene index of the MS MARCO document corpus segmented into passages (slim version, document text not stored) (deprecated; use msmarco-v1-doc-segmented-slim instead).
- msmarco-passage-expanded [readme]: Lucene index of the MS MARCO passage corpus with docTTTTTquery expansions (deprecated; use msmarco-v1-passage-d2q-t5 instead)
- msmarco-doc-expanded-per-doc [readme]: Lucene index of the MS MARCO document corpus with per-doc docTTTTTquery expansions (deprecated; use msmarco-v1-doc-d2q-t5 instead)
- msmarco-doc-expanded-per-passage [readme]: Lucene index of the MS MARCO document corpus with per-passage docTTTTTquery expansions (deprecated; use msmarco-v1-doc-segmented-d2q-t5 instead)
- beir-v1.0.0-trec-covid-flat [readme]: Lucene flat index of BEIR (v1.0.0): TREC-COVID
- beir-v1.0.0-bioasq-flat [readme]: Lucene flat index of BEIR (v1.0.0): BioASQ
- beir-v1.0.0-nfcorpus-flat [readme]: Lucene flat index of BEIR (v1.0.0): NFCorpus
- beir-v1.0.0-nq-flat [readme]: Lucene flat index of BEIR (v1.0.0): NQ
- beir-v1.0.0-hotpotqa-flat [readme]: Lucene flat index of BEIR (v1.0.0): HotpotQA
- beir-v1.0.0-fiqa-flat [readme]: Lucene flat index of BEIR (v1.0.0): FiQA-2018
- beir-v1.0.0-signal1m-flat [readme]: Lucene flat index of BEIR (v1.0.0): Signal-1M
- beir-v1.0.0-trec-news-flat [readme]: Lucene flat index of BEIR (v1.0.0): TREC-NEWS
- beir-v1.0.0-robust04-flat [readme]: Lucene flat index of BEIR (v1.0.0): Robust04
- beir-v1.0.0-arguana-flat [readme]: Lucene flat index of BEIR (v1.0.0): ArguAna
- beir-v1.0.0-webis-touche2020-flat [readme]: Lucene flat index of BEIR (v1.0.0): Webis-Touche2020
- beir-v1.0.0-cqadupstack-android-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-android
- beir-v1.0.0-cqadupstack-english-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-english
- beir-v1.0.0-cqadupstack-gaming-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-gaming
- beir-v1.0.0-cqadupstack-gis-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-gis
- beir-v1.0.0-cqadupstack-mathematica-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-mathematica
- beir-v1.0.0-cqadupstack-physics-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-physics
- beir-v1.0.0-cqadupstack-programmers-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-programmers
- beir-v1.0.0-cqadupstack-stats-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-stats
- beir-v1.0.0-cqadupstack-tex-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-tex
- beir-v1.0.0-cqadupstack-unix-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-unix
- beir-v1.0.0-cqadupstack-webmasters-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-webmasters
- beir-v1.0.0-cqadupstack-wordpress-flat [readme]: Lucene flat index of BEIR (v1.0.0): CQADupStack-wordpress
- beir-v1.0.0-quora-flat [readme]: Lucene flat index of BEIR (v1.0.0): Quora
- beir-v1.0.0-dbpedia-entity-flat [readme]: Lucene flat index of BEIR (v1.0.0): DBPedia
- beir-v1.0.0-scidocs-flat [readme]: Lucene flat index of BEIR (v1.0.0): SCIDOCS
- beir-v1.0.0-fever-flat [readme]: Lucene flat index of BEIR (v1.0.0): FEVER
- beir-v1.0.0-climate-fever-flat [readme]: Lucene flat index of BEIR (v1.0.0): Climate-FEVER
- beir-v1.0.0-scifact-flat [readme]: Lucene flat index of BEIR (v1.0.0): SciFact
- beir-v1.0.0-trec-covid-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): TREC-COVID
- beir-v1.0.0-bioasq-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): BioASQ
- beir-v1.0.0-nfcorpus-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): NFCorpus
- beir-v1.0.0-nq-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): NQ
- beir-v1.0.0-hotpotqa-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): HotpotQA
- beir-v1.0.0-fiqa-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): FiQA-2018
- beir-v1.0.0-signal1m-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): Signal-1M
- beir-v1.0.0-trec-news-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): TREC-NEWS
- beir-v1.0.0-robust04-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): Robust04
- beir-v1.0.0-arguana-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): ArguAna
- beir-v1.0.0-webis-touche2020-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): Webis-Touche2020
- beir-v1.0.0-cqadupstack-android-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-android
- beir-v1.0.0-cqadupstack-english-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-english
- beir-v1.0.0-cqadupstack-gaming-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-gaming
- beir-v1.0.0-cqadupstack-gis-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-gis
- beir-v1.0.0-cqadupstack-mathematica-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-mathematica
- beir-v1.0.0-cqadupstack-physics-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-physics
- beir-v1.0.0-cqadupstack-programmers-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-programmers
- beir-v1.0.0-cqadupstack-stats-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-stats
- beir-v1.0.0-cqadupstack-tex-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-tex
- beir-v1.0.0-cqadupstack-unix-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-unix
- beir-v1.0.0-cqadupstack-webmasters-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-webmasters
- beir-v1.0.0-cqadupstack-wordpress-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): CQADupStack-wordpress
- beir-v1.0.0-quora-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): Quora
- beir-v1.0.0-dbpedia-entity-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): DBPedia
- beir-v1.0.0-scidocs-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): SCIDOCS
- beir-v1.0.0-fever-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): FEVER
- beir-v1.0.0-climate-fever-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): Climate-FEVER
- beir-v1.0.0-scifact-multifield [readme]: Lucene multifield index of BEIR (v1.0.0): SciFact
- hc4-v1.0-zh [readme]: Lucene index for HC4 v1.0 (Chinese).
- hc4-v1.0-fa [readme]: Lucene index for HC4 v1.0 (Persian).
- hc4-v1.0-ru [readme]: Lucene index for HC4 v1.0 (Russian).
- neuclir22-zh [readme]: Lucene index for NeuCLIR '22 (Chinese).
- neuclir22-fa [readme]: Lucene index for NeuCLIR '22 (Persian).
- neuclir22-ru [readme]: Lucene index for NeuCLIR '22 (Russian).
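Several of the MS MARCO indexes listed above are deprecated in favor of msmarco-v1-* names. For scripts that still use the old names, a small lookup table can translate them before calling from_prebuilt_index. This is a sketch with the mapping read off the deprecation notes above; resolve_index_name is a hypothetical helper, not part of Pyserini:

```python
# Deprecated Lucene index names and the replacements named in their
# deprecation notes.
DEPRECATED_LUCENE_INDEXES = {
    "msmarco-passage": "msmarco-v1-passage",
    "msmarco-passage-slim": "msmarco-v1-passage-slim",
    "msmarco-doc": "msmarco-v1-doc",
    "msmarco-doc-slim": "msmarco-v1-doc-slim",
    "msmarco-doc-per-passage": "msmarco-v1-doc-segmented",
    "msmarco-doc-per-passage-slim": "msmarco-v1-doc-segmented-slim",
    "msmarco-passage-expanded": "msmarco-v1-passage-d2q-t5",
    "msmarco-doc-expanded-per-doc": "msmarco-v1-doc-d2q-t5",
    "msmarco-doc-expanded-per-passage": "msmarco-v1-doc-segmented-d2q-t5",
}


def resolve_index_name(name):
    """Return the current name for a deprecated index, else `name` unchanged.

    Hypothetical helper, not part of Pyserini.
    """
    return DEPRECATED_LUCENE_INDEXES.get(name, name)


print(resolve_index_name("msmarco-doc-per-passage"))  # msmarco-v1-doc-segmented
print(resolve_index_name("robust04"))  # robust04
```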
- msmarco-v1-passage-unicoil [readme]: Lucene impact index of the MS MARCO V1 passage corpus for uniCOIL.
- msmarco-v1-passage-unicoil-noexp [readme]: Lucene impact index of the MS MARCO V1 passage corpus for uniCOIL (noexp).
- msmarco-v1-doc-segmented-unicoil [readme]: Lucene impact index of the MS MARCO V1 segmented document corpus for uniCOIL.
- msmarco-v1-doc-segmented-unicoil-noexp [readme]: Lucene impact index of the MS MARCO V1 segmented document corpus for uniCOIL (noexp) with title prepended.
- msmarco-v2-passage-unicoil-0shot [readme]: Lucene impact index of the MS MARCO V2 passage corpus for uniCOIL.
- msmarco-v2-passage-unicoil-noexp-0shot [readme]: Lucene impact index of the MS MARCO V2 passage corpus for uniCOIL (noexp).
- msmarco-v2-doc-segmented-unicoil-0shot [readme]: Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL.
- msmarco-v2-doc-segmented-unicoil-0shot-v2 [readme]: Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL, with title prepended.
- msmarco-v2-doc-segmented-unicoil-noexp-0shot [readme]: Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL (noexp).
- msmarco-v2-doc-segmented-unicoil-noexp-0shot-v2 [readme]: Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL (noexp) with title prepended
- msmarco-passage-deepimpact [readme]: Lucene impact index of the MS MARCO passage corpus encoded by DeepImpact
- msmarco-passage-unicoil-tilde [readme]: Lucene impact index of the MS MARCO passage corpus encoded by uniCOIL-TILDE
- msmarco-passage-distill-splade-max [readme]: Lucene impact index of the MS MARCO passage corpus encoded by distill-splade-max
- msmarco-v2-passage-unicoil-tilde [readme]: Lucene impact index of the MS MARCO V2 passage corpus encoded by uniCOIL-TILDE
- msmarco-passage-unicoil-d2q [readme]: Lucene impact index of the MS MARCO passage corpus encoded by uniCOIL-d2q (deprecated; use msmarco-v1-passage-unicoil instead).
- msmarco-doc-per-passage-unicoil-d2q [readme]: Lucene impact index of the MS MARCO doc corpus per passage expansion encoded by uniCOIL-d2q (deprecated; use msmarco-v1-doc-segmented-unicoil instead).
- msmarco-v2-passage-unicoil-noexp-0shot-deprecated [readme]: Lucene impact index of the MS MARCO V2 passage corpus encoded by uniCOIL (zero-shot, no expansions) (deprecated; use msmarco-v2-passage-unicoil-noexp-0shot instead).
- msmarco-v2-doc-per-passage-unicoil-noexp-0shot [readme]: Lucene impact index of the MS MARCO V2 document corpus per passage encoded by uniCOIL (zero-shot, no expansions) (deprecated; use msmarco-v2-doc-segmented-unicoil-noexp-0shot instead).
- beir-v1.0.0-trec-covid-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): TREC-COVID encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-bioasq-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): BioASQ encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-nfcorpus-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): NFCorpus encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-nq-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): NQ encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-hotpotqa-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): HotpotQA encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-fiqa-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): FiQA-2018 encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-signal1m-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): Signal-1M encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-trec-news-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): TREC-NEWS encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-robust04-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): Robust04 encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-arguana-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): ArguAna encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-webis-touche2020-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): Webis-Touche2020 encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-android-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-android encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-english-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-english encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-gaming-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-gaming encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-gis-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-gis encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-mathematica-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-mathematica encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-physics-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-physics encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-programmers-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-programmers encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-stats-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-stats encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-tex-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-tex encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-unix-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-unix encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-webmasters-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-webmasters encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-cqadupstack-wordpress-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): CQADupStack-wordpress encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-quora-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): Quora encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-dbpedia-entity-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): DBPedia encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-scidocs-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): SCIDOCS encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-fever-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): FEVER encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): Climate-FEVER encoded by SPLADE-distill CoCodenser-medium
- beir-v1.0.0-scifact-splade_distil_cocodenser_medium [readme]: Lucene impact index of BEIR (v1.0.0): SciFact encoded by SPLADE-distill CoCodenser-medium
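The BEIR entries in the lists above follow a regular naming pattern: the prefix beir-v1.0.0-, a dataset key, and a suffix for the index type (flat, multifield, or splade_distil_cocodenser_medium). When looping over many BEIR datasets it can be handy to build the names programmatically. A small sketch; beir_index_name is a hypothetical helper, not part of Pyserini, and the dataset keys are the ones that appear in the names above:

```python
# Index-type suffixes that appear in the BEIR index names listed above.
BEIR_VARIANTS = ("flat", "multifield", "splade_distil_cocodenser_medium")


def beir_index_name(dataset, variant="flat"):
    """Build a BEIR pre-built index name such as 'beir-v1.0.0-scifact-flat'.

    Hypothetical helper, not part of Pyserini. `dataset` is the dataset
    key as it appears in the index names (e.g. 'dbpedia-entity').
    """
    if variant not in BEIR_VARIANTS:
        raise ValueError(f"unknown BEIR index variant: {variant!r}")
    return f"beir-v1.0.0-{dataset}-{variant}"


for dataset in ("trec-covid", "nfcorpus", "scifact"):
    print(beir_index_name(dataset), beir_index_name(dataset, "multifield"))
```

The helper only formats a string; it does not check that the dataset key itself is valid, so a typo will surface only when the index is requested.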
- msmarco-passage-tct_colbert-hnsw: Faiss HNSW index of the MS MARCO passage corpus encoded by TCT-ColBERT
- msmarco-passage-tct_colbert-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by TCT-ColBERT
- msmarco-doc-tct_colbert-bf: Faiss FlatIP index of the MS MARCO document corpus encoded by TCT-ColBERT
- msmarco-doc-tct_colbert-v2-hnp-bf: Faiss FlatIP index of the MS MARCO document corpus encoded by TCT-ColBERT-V2-HNP
- wikipedia-dpr-multi-bf: Faiss FlatIP index of Wikipedia encoded by the DPR doc encoder trained on multiple QA datasets
- wikipedia-dpr-single-nq-bf: Faiss FlatIP index of Wikipedia encoded by the DPR doc encoder trained on NQ
- wikipedia-bpr-single-nq-hash: Faiss binary index of Wikipedia encoded by the BPR doc encoder trained on NQ
- msmarco-passage-ance-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the ANCE MS MARCO passage encoder
- msmarco-doc-ance-maxp-bf: Faiss FlatIP index of the MS MARCO document corpus encoded by the ANCE MaxP encoder
- wikipedia-ance-multi-bf: Faiss FlatIP index of Wikipedia encoded by the ANCE-multi encoder
- msmarco-passage-sbert-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the SBERT MS MARCO passage encoder
- msmarco-passage-distilbert-dot-margin_mse-T2-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the distilbert-dot-margin_mse-T2-msmarco passage encoder
- msmarco-passage-distilbert-dot-tas_b-b256-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the msmarco-passage-distilbert-dot-tas_b-b256 passage encoder
- msmarco-passage-tct_colbert-v2-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2 passage encoder
- msmarco-passage-tct_colbert-v2-hn-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2-hn passage encoder
- msmarco-passage-tct_colbert-v2-hnp-bf: Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2-hnp passage encoder
- cast2019-tct_colbert-v2-hnsw [readme]: Faiss HNSW index of the CAsT2019 passage corpus encoded by the tct_colbert-v2 passage encoder
- mrtydi-v1.1-arabic-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-bengali-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-english-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-finnish-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-indonesian-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-japanese-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-korean-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-russian-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-swahili-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-telugu-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-thai-mdpr-nq [readme]: Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- wikipedia-dpr-dkrr-nq: Faiss FlatIP index of Wikipedia DPR encoded by the retriever model from 'Distilling Knowledge from Reader to Retriever for Question Answering' trained on NQ
- wikipedia-dpr-dkrr-tqa: Faiss FlatIP index of Wikipedia DPR encoded by the retriever model from 'Distilling Knowledge from Reader to Retriever for Question Answering' trained on TriviaQA
- mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-english-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-korean-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-russian-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-thai-mdpr-tied-pft-msmarco [readme]: Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
- mrtydi-v1.1-arabic-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-bengali-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-english-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-finnish-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-indonesian-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-japanese-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-korean-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-russian-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-swahili-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-telugu-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-thai-mdpr-tied-pft-nq [readme]: Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
- mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.
- mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all [readme]: Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all Mr.TyDi languages.