Denserank - add support for RepBERT and FAISS #181
base: master
@@ -2,11 +2,9 @@ include capreolus/data/antique.json
include capreolus/data/dummy/data/dummy_trec_doc
include capreolus/data/dummy_folds.json
include capreolus/data/dummy.yaml
include capreolus/data/msmarcopassage.folds.json
I think these will be re-added through Crystina's PR
capreolus/benchmark/msmarco.py
Outdated
PACKAGE_PATH = constants["PACKAGE_PATH"]
This will be handled through Crystina's PR
generator_type = "DefaultLuceneDocumentGenerator"
config_keys_not_in_path = ["path"]
config_spec = [ConfigOption("path", "Aquaint-TREC-3-4", "path to corpus")]
dependencies = [Dependency(key="task", module="task", name="robust04passages")]
Using the `robust04passages` collection/benchmark will automatically invoke the task that chunks the docs into passages.
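For anyone unfamiliar with what "chunk the docs into passages" means here, a rough sketch of the general idea follows. This is not the Capreolus task itself: the function name `chunk_into_passages`, the whitespace tokenization, and the `passage_len` and `stride` parameters are all assumptions made for illustration.

```python
def chunk_into_passages(text, passage_len=150, stride=100):
    """Split text into overlapping passages of whitespace tokens.

    Hypothetical helper for illustration only; the actual passage task
    in this PR may tokenize and window documents differently.
    """
    if passage_len <= 0 or stride <= 0:
        raise ValueError("passage_len and stride must be positive")

    tokens = text.split()
    passages = []
    for start in range(0, max(len(tokens), 1), stride):
        window = tokens[start:start + passage_len]
        if not window:
            break
        passages.append(" ".join(window))
        # Stop once this window has reached the end of the document.
        if start + passage_len >= len(tokens):
            break
    return passages


if __name__ == "__main__":
    for passage in chunk_into_passages("a b c d e f g", passage_len=4, stride=2):
        print(passage)
```

With a stride smaller than the passage length, consecutive passages overlap, so text near a passage boundary still appears whole in at least one passage.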
@Encoder.register
class CLEAREncoder(Encoder):
I did not manage to reproduce the results - should I keep this nevertheless?
pickle.dump(state_dict, f, protocol=-1)

def get_tf_feature_description(self):
`berttext` is not only used by denserank, and hence the TF-specific things can be removed. `bertpassage.py` is used for TFKNRM etc.
7a27a76 to 5b4dcb2
This pull request introduces 31 alerts when merging 5b4dcb2 into 1767d5a - view on LGTM.com new alerts:
This pull request introduces 28 alerts when merging 8e07c90 into 1767d5a - view on LGTM.com new alerts:
afe19f1 to a347624
This pull request introduces 29 alerts when merging a347624 into 1767d5a - view on LGTM.com new alerts:
This pull request introduces 28 alerts when merging c2af812 into 3521171 - view on LGTM.com new alerts:
This pull request introduces 28 alerts when merging 412d075 into 3521171 - view on LGTM.com new alerts:
No description provided.