
feat: add intervenable_model to forward's function signature #191

Open · wants to merge 4 commits into base: main
Conversation

eggachecat

Description

Added the intervenable_model parameter to the forward function signature, enabling user-defined intervention classes to have direct access to the model instance. This allows for more advanced manipulations such as using intervenable_model.model.lm_head(base) to interact with lower-level model components.
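As a minimal, self-contained sketch of the pattern this PR enables (the `TinyLM`, `TinyWrapper`, and `LmHeadIntervention` names are illustrative stand-ins, not pyvene's actual API):

```python
class TinyLM:
    """Stand-in model exposing an lm_head, so the sketch runs standalone."""
    def lm_head(self, hidden):
        # Pretend projection from hidden states to "logits".
        return [h * 2.0 for h in hidden]

class TinyWrapper:
    """Stand-in for the intervenable-model wrapper holding .model."""
    def __init__(self, model):
        self.model = model

class LmHeadIntervention:
    """Illustrative intervention whose forward receives the model handle."""
    def forward(self, base, source=None, subspaces=None, intervenable_model=None):
        # With the handle available, the intervention can reach lower-level
        # components such as the LM head, as described above.
        self.last_logits = intervenable_model.model.lm_head(base)
        return base  # pass the representation through unchanged

wrapper = TinyWrapper(TinyLM())
iv = LmHeadIntervention()
out = iv.forward([1.0, 2.0], intervenable_model=wrapper)
print(out)             # [1.0, 2.0]
print(iv.last_logits)  # [2.0, 4.0]
```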

Testing Done

Tested the changes locally by defining custom intervention classes that access the model's internal components, including the lm_head. Verified that these interventions function as expected during model execution.

Checklist:

  • My PR title strictly follows the format: [Your Priority] Your Title
  • I have attached the testing log above
  • I provide enough comments to my code
  • I have changed documentations
  • I have added tests for my changes

Enable user-defined intervention classes to access the model.
This allows users to interact with the model more flexibly by using constructs like `model.model.lm_head(base)`.
@frankaging
Collaborator

frankaging commented Oct 9, 2024

Thanks for the change! The use case seems useful.

One general comment: could you turn the intervention signature into a more generic version using **kwargs?

def forward(self, base, source, subspaces=None, **kwargs):
    ...

The callers in these setter functions should also take in **kwargs from the user; if a kwarg is passed, set it accordingly.

The intervenable model forward call thus can take in arguments such as,

pv_model.forward(base=..., sources=[...], intervenable_model=pv_model)

Let me know if this makes sense! If you could make this change, it would be great since it will support many use cases.
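The kwargs-forwarding pattern suggested above can be sketched as follows (the `GenericIntervention` class and `intervention_setter` function are hypothetical illustrations, not the library's actual internals):

```python
class GenericIntervention:
    """Generic signature: extra user fields arrive via **kwargs
    rather than a hard-coded parameter."""
    def forward(self, base, source=None, subspaces=None, **kwargs):
        # Present only if the caller chose to pass it.
        self.model_handle = kwargs.get("intervenable_model")
        return base

def intervention_setter(base, intervention, subspaces=None, **kwargs):
    """Caller that forwards user-supplied kwargs to the intervention unchanged."""
    return intervention.forward(base, None, subspaces, **kwargs)

iv = GenericIntervention()
# A caller-supplied field flows through the setter into the intervention:
result = intervention_setter([1, 2, 3], iv, intervenable_model="model-handle")
print(result)           # [1, 2, 3]
print(iv.model_handle)  # model-handle
```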

@eggachecat
Author

Hi @frankaging, I think it totally makes sense; I've updated the code accordingly. Please take a look when you have a chance. Thanks!

Collaborator

@frankaging left a comment


Thanks!! Otherwise, the change LGTM!

@@ -839,6 +839,7 @@ def _intervention_setter(
                 None,
                 intervention,
                 subspaces[key_i] if subspaces is not None else None,
+                intervenable_model=self

Could you also change this to passing **kwargs?

@@ -431,7 +431,7 @@ def scatter_neurons(


 def do_intervention(
-    base_representation, source_representation, intervention, subspaces
+    base_representation, source_representation, intervention, subspaces, intervenable_model=None

Similarly, could you also change this to passing **kwargs?

} for layer in [1, 3]], model=self.llama)
intervened_outputs = pv_llama(
base=self.tokenizer("The capital of Spain is", return_tensors="pt").to(that.device),
unit_locations={"base": 3}

With these changes, at this line you can now pass your customized fields through to the intervention, such as a self-reference to the model.

@eggachecat
Author

Hi @frankaging, I just made another PR.

Now the function signature for users looks like:

    def test_with_llm_head(self):
        that = self
        _lm_head_collection = {}
        class AccessIntervenableModelIntervention:
            is_source_constant = True
            keep_last_dim = True
            intervention_types = 'access_intervenable_model_intervention'
            def __init__(self, layer_index, *args, **kwargs):
                super().__init__()
                self.layer_index = layer_index
            def __call__(self, base, source=None, subspaces=None, model=None, **kwargs):
                intervenable_model = kwargs.get('intervenable_model', None)
                assert intervenable_model is not None
                _lm_head_collection[self.layer_index] = intervenable_model.model.lm_head(base.to(that.device))
                return base
        # run with new intervention type
        pv_llama = IntervenableModel([{
            "intervention": AccessIntervenableModelIntervention(layer_index=layer),
            "component": f"model.layers.{layer}.input"
        } for layer in [1, 3]], model=self.llama)
        intervened_outputs = pv_llama(
            base=self.tokenizer("The capital of Spain is", return_tensors="pt").to(that.device), 
            unit_locations={"base": 3},
            # anything passed here will be forwarded to the __call__
            intervenable_model=pv_llama 
        )

@eggachecat
Author

👀👀👀👀
