Bug in AlphaBeta rule? #107
Comments
Hey Niklas, the problem lies in how you register the rule. Here are two ways to do it, one with only a single layer:

Single Linear Code

```python
import torch
from torch.nn import Linear

from zennit.rules import AlphaBeta

layer = Linear(2, 1)
layer.weight.data = torch.tensor([[1., -1.]])
layer.bias.data = torch.tensor([-1.])

input = torch.tensor([[1., 1.]], requires_grad=True)

print('Layer-only:')
for alpha, beta in [(1., 0.), (2., 1.)]:
    print(f'  alpha={alpha}, beta={beta}')
    # create hook and immediately register it to the layer
    handle = AlphaBeta(alpha=alpha, beta=beta).register(layer)
    output = layer(input)
    relevance, = torch.autograd.grad(output, input, grad_outputs=output)
    # remove the hook from the layer
    handle.remove()
    print(f'  {relevance}')
```

which prints:

Single Linear Output
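The collapsed output block was not preserved in this thread; based on the values this reply confirms were computed by hand, it should read approximately:

```
Layer-only:
  alpha=1.0, beta=0.0
  tensor([[-1., 0.]])
  alpha=2.0, beta=1.0
  tensor([[-2.0000,  0.5000]])
```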
and one with the full Sequential, using a custom Composite and Attributor:

Sequential Code

```python
import torch
from torch.nn import Linear, Sequential

from zennit.rules import AlphaBeta
from zennit.attribution import Gradient
from zennit.composites import LayerMapComposite

layer = Linear(2, 1)
layer.weight.data = torch.tensor([[1., -1.]])
layer.bias.data = torch.tensor([-1.])
# create a simple Sequential model with a single layer
model = Sequential(layer)

input = torch.tensor([[1., 1.]], requires_grad=True)

print('Custom Composite:')
for alpha, beta in [(1., 0.), (2., 1.)]:
    print(f'  alpha={alpha}, beta={beta}')
    # create a custom composite, which maps Linear layers to AlphaBeta
    composite = LayerMapComposite([((Linear,), AlphaBeta(alpha=alpha, beta=beta))])
    # use the Gradient attributor on the model with our custom composite
    with Gradient(model, composite) as attributor:
        out, relevance = attributor(input)
    print(f'  {relevance}')
```

which prints:

Sequential Output
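Again, the collapsed output block was not preserved; it should read approximately:

```
Custom Composite:
  alpha=1.0, beta=0.0
  tensor([[-1., 0.]])
  alpha=2.0, beta=1.0
  tensor([[-2.0000,  0.5000]])
```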
Both of these match what you computed by hand. Have a look at the documentation, where there is also an example with only a single layer.
Awesome! Thank you for the quick reply.
Hi,
I've been working very extensively with the LRP method lately, and I have also tried to implement the method with the most commonly used rules myself. To check the correctness of my implementation, I compared some results with existing implementations (like yours ;) ). In doing so, I always get different relevances with the AlphaBeta rule (I already opened an issue in innvestigate with the same example). Maybe you can explain the following behavior, or confirm that this is really a bug on your side:
Let's suppose we have only one layer with two inputs, one output, and no activation function. The layer has the following weights and bias: W = (1, -1) and b = -1. For the input x = (1, 1), the output is r_out = Wx + b = -1, and the formula of the AlphaBeta rule (Eq. (60) in Bach et al.) reduces to

R_i = ( alpha * z_i^+ / sum_k z_k^+  -  beta * z_i^- / sum_k z_k^- ) * r_out,   with z_i = w_i * x_i,

where the bias counts towards the positive or negative denominator according to its sign (here, the negative one).
This yields a relevance of (-1, 0) for the Alpha1_Beta0 rule and (-2, 0.5) for the Alpha2_Beta1 rule. But with your implementation I get (0.5, -0.5) both times, and for other choices of alpha and beta I always get the same result. Here is my code snippet for the Alpha1_Beta0 rule:
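(The snippet itself was not preserved in this thread. As a stand-in, here is a minimal sketch of the hand computation described above in plain PyTorch; the helper `alpha_beta_by_hand` is illustrative only and is not part of zennit or the original snippet:)

```python
import torch

def alpha_beta_by_hand(weight, bias, x, r_out, alpha, beta):
    # per-input contributions z_i = w_i * x_i
    z = weight * x
    # split contributions into positive and negative parts
    z_pos = z.clamp(min=0)
    z_neg = z.clamp(max=0)
    # the bias counts towards the positive or negative denominator according to its sign
    pos_sum = z_pos.sum() + bias.clamp(min=0).sum()
    neg_sum = z_neg.sum() + bias.clamp(max=0).sum()
    # AlphaBeta rule: R_i = (alpha * z_i^+ / Z^+ - beta * z_i^- / Z^-) * r_out
    return (alpha * z_pos / pos_sum - beta * z_neg / neg_sum) * r_out

weight = torch.tensor([1., -1.])
bias = torch.tensor([-1.])
x = torch.tensor([1., 1.])
r_out = (weight * x).sum() + bias.sum()  # = -1

print(alpha_beta_by_hand(weight, bias, x, r_out, alpha=1., beta=0.))  # expect (-1, 0)
print(alpha_beta_by_hand(weight, bias, x, r_out, alpha=2., beta=1.))  # expect (-2, 0.5)
```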
I hope you can help me clarify this behavior.
Best Niklas