Derivative of the activation function. #2

rmarchesini · 2017-11-23T16:12:35Z

Hi, my name is Ramiro. I was checking the code and I have a question.
When you update the parameters between the input layer and the hidden layer (W1, b1), you calculate the derivative of the activation function. I think that is done in this line (ann.py file):
dZ = pY_T.dot(self.W2.T) * (1 - Z*Z) # tanh
For tanh, I think that (1 - Z*Z) is the derivative. If that is correct, why do we evaluate it with Z? Recall what is stored in Z:
Z = np.tanh(X.dot(self.W1) + self.b1)
I think that we should use only X.dot(self.W1) + self.b1 to evaluate the derivative, which is the same as using np.arctanh(Z). So the result should be (1 - np.arctanh(Z)*np.arctanh(Z)).
I'm probably wrong; I just want to know why.
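
One way to check this numerically is to compare both candidate expressions against a finite-difference derivative of tanh taken at the pre-activation. This is only a minimal sketch, not the code from ann.py: the array shapes, the random data, and the names A, tanh_grad_numeric, grad_from_Z and grad_proposed are made up for illustration.

```python
import numpy as np

def tanh_grad_numeric(A, eps=1e-6):
    # Central finite difference of tanh, evaluated at the pre-activation A.
    return (np.tanh(A + eps) - np.tanh(A - eps)) / (2 * eps)

# Hypothetical small layer: 5 samples, 3 inputs, 4 hidden units.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
W1 = 0.5 * rng.normal(size=(3, 4))
b1 = 0.5 * rng.normal(size=4)

A = X.dot(W1) + b1   # pre-activation, the argument of tanh
Z = np.tanh(A)       # hidden activation, as stored in Z in the line above

grad_from_Z = 1 - Z * Z                               # expression used in the dZ line
grad_proposed = 1 - np.arctanh(Z) * np.arctanh(Z)     # expression proposed in this issue
grad_numeric = tanh_grad_numeric(A)                   # reference derivative at A

matches_from_Z = np.allclose(grad_from_Z, grad_numeric, atol=1e-6)
matches_proposed = np.allclose(grad_proposed, grad_numeric, atol=1e-6)

print("1 - Z*Z matches the numeric derivative:        ", matches_from_Z)
print("1 - arctanh(Z)**2 matches the numeric derivative:", matches_proposed)
```

Since d/da tanh(a) = 1 - tanh(a)^2, the expression (1 - Z*Z) is already the derivative evaluated at X.dot(self.W1) + self.b1, just written in terms of the stored output Z. The arctanh version computes 1 - a^2 instead, which is a different function.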

Thanks!
R.
