HF: switch conditional checks to self.backend from AUTO_MODEL_CLASS #2353
base: main
Conversation
Thanks a bunch for working on this!!
Broadly this is the right approach, but I had a different thought on how we should handle things while maintaining backward compatibility for users, described in my PR comments. That should also ensure that we still respect novel subclass-overridden `self.AUTO_MODEL_CLASS` values.
```python
elif (
    getattr(self.config, "model_type") in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
):
    self.AUTO_MODEL_CLASS = transformers.AutoModelForCausalLM
    self.backend = "causal"
```
Let's make the below warning message explicitly state that we set `backend=causal` in cases like this?
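For concreteness, a minimal sketch of how such an explicit warning might read (the helper name and exact wording here are assumptions for illustration, not the PR's actual code):

```python
import logging

eval_logger = logging.getLogger("lm-eval")


def default_to_causal_with_warning(model_type: str) -> str:
    # Hypothetical helper: when we fall through to the causal branch,
    # state explicitly that backend="causal" is being set, and how to override.
    eval_logger.warning(
        f"Detected model_type='{model_type}' in the causal-LM registry: "
        "setting `backend='causal'` (AutoModelForCausalLM). "
        "Pass `backend='seq2seq'` explicitly to override."
    )
    return "causal"
```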
lm_eval/models/huggingface.py
Outdated
```python
sets `self.AUTO_MODEL_CLASS` appropriately if not already set.
Should only be called if isinstance(pretrained, str)! Otherwise pass `backend` appropriately.
```
This is a change in interface, right?
I think we should still call `_get_backend` so that if someone passes an HF model whose backend we can detect, it is set by us. Let's try not to force people to pass `backend` unless they absolutely have to.
lm_eval/models/huggingface.py
Outdated
```python
@@ -90,7 +90,7 @@ def __init__(
        **kwargs,
    ) -> None:
        super().__init__()

        self.backend = backend
```
I think I'd prefer it if we don't set `self.backend` here. I would want our code to expressly error out if a user subclassing HFLM never calls `super().__init__()`, in which case `HFLM._get_backend()` is never called.
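One hedged way to get that express error (the class and attribute names below are illustrative stand-ins, not the actual HFLM code): assign the backend only inside `__init__`, and raise a clear error from a property when a subclass skipped `super().__init__()`:

```python
class HFLMSketch:
    """Illustrative stand-in for HFLM (not the real class)."""

    def __init__(self) -> None:
        # `_backend` is assigned only here, so a subclass that skips
        # super().__init__() has no `_backend` attribute at all, and any
        # later use of `self.backend` fails loudly instead of silently.
        self._backend = self._get_backend()

    def _get_backend(self) -> str:
        # Placeholder for the real detection logic.
        return "causal"

    @property
    def backend(self) -> str:
        try:
            return self._backend
        except AttributeError:
            raise RuntimeError(
                "self.backend was never set; did your HFLM subclass "
                "forget to call super().__init__()?"
            ) from None


class BadSubclass(HFLMSketch):
    def __init__(self) -> None:
        pass  # deliberately skips super().__init__()
```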
lm_eval/models/huggingface.py
Outdated
```python
@@ -101,6 +101,8 @@ def __init__(
    self._device = self._model.device
    self._config = self._model.config
    gpus = 0
    # default backend to causal if not specified
    self.backend = self.backend if self.backend != "default" else "causal"
```
Can we handle this case in `_get_backend()` and remove this line?
It should be sufficient to simply settle for the existing behavior we had in that function: it defaults to causal if we can't detect both that our model is an HF model and that it is among the registered causal or seq2seq HF model types.
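A rough sketch of that consolidated `_get_backend` logic as a standalone function, under the assumption stated above (registry contents are mocked here; the real code consults the transformers model-type mappings):

```python
# Mocked stand-ins for the transformers registries the real code consults.
MODEL_FOR_CAUSAL_LM_MAPPING_NAMES = {"llama", "gpt2", "mistral"}
MODEL_FOR_SEQ_TO_SEQ_LM_MAPPING_NAMES = {"t5", "bart"}


def get_backend(model_type: str, backend: str = "default") -> str:
    # 1. A user-specified backend always wins.
    if backend in ("causal", "seq2seq"):
        return backend
    # 2. Otherwise, try to detect the backend among registered HF model types.
    if model_type in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES:
        return "causal"
    if model_type in MODEL_FOR_SEQ_TO_SEQ_LM_MAPPING_NAMES:
        return "seq2seq"
    # 3. Fall back to causal when nothing can be detected, so no separate
    #    `backend != "default"` fix-up is needed at the call site.
    return "causal"
```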
lm_eval/models/huggingface.py
Outdated
```python
    config=self.config, backend=backend, trust_remote_code=trust_remote_code
)
# determine which of 'causal' and 'seq2seq' backends to use for HF models
self._get_backend(
```
Unindent if following my above comment!
lm_eval/models/huggingface.py
Outdated
```python
# the default _get_backend logic,
# then skip over the method.
# TODO: this seems very much undesirable in some cases--our code in HFLM
# references AutoModelForCausalLM at times to check for equality
if self.AUTO_MODEL_CLASS is not None:
    return
```
As per my above comments on the change in interface: this early exit from the `_get_backend()` function means we will now force users who override `AUTO_MODEL_CLASS` to set `self.backend` themselves in the init.
I think instead we should drop `if self.AUTO_MODEL_CLASS is not None: return` entirely and let our code handle things, but we should only set `AUTO_MODEL_CLASS` in the `_get_backend()` function if `AUTO_MODEL_CLASS` is currently None!
That way, for subclasses that set `AUTO_MODEL_CLASS` themselves, we can use our existing logic here ("use the user-specified backend if provided, else choose causal or seq2seq if we can detect it among HF models, else fall back to assuming causal"), which should be what we want so long as we signpost this behavior with enough logger messages, and yet preserve the desired usage of having a special `AUTO_MODEL_CLASS` that gets used for model init. Sound good?
(as is, this PR I think breaks HF multimodal LMs at the moment. with the changes in my comments I believe it will no longer?)
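The "only set `AUTO_MODEL_CLASS` if it is currently None" guard described above might look roughly like this (a sketch, not the PR's code; sentinel objects stand in for the transformers auto-classes so the example is self-contained):

```python
# Sentinel stand-ins for transformers.AutoModelForCausalLM and
# transformers.AutoModelForSeq2SeqLM, so this sketch runs without transformers.
AutoModelForCausalLM = object()
AutoModelForSeq2SeqLM = object()


def resolve_auto_model_class(current, backend: str):
    # Only assign AUTO_MODEL_CLASS when it is still None; a subclass that set
    # its own value (e.g. a multimodal auto-class) keeps it, while `backend`
    # is still detected/defaulted by our code either way.
    if current is not None:
        return current
    return AutoModelForCausalLM if backend == "causal" else AutoModelForSeq2SeqLM
```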
@haileyschoelkopf ready to review again! Still need to test, but wanted to make sure I understood you correctly.
The conditions in HFLM now check for either `causal` or `seq2seq` rather than checking the `AUTO_MODEL_CLASS`.