@ClarenceTee93 I have the same question; hope @eriklindernoren could shed some light on it. I've long been following Hands-On Machine Learning with ... by Aurelien Geron, and the equation used for Batch Gradient Descent in that book is:

`2 / training_size * ( X_b.T.dot( X_b.dot(theta) - y ) )`

which can be rewritten as `2/m * ( X_b.T.dot( y_pred - y ) )`.
Even assuming that the X used in @eriklindernoren's equation already includes a bias term for each sample, and that the switch from `y_pred - y` to `-(y - y_pred)` makes sense, the multiplicative factor should still be included in the equation. To the best of my knowledge the math checks out if we carefully differentiate the mean of `(y_pred - y)^2` with respect to each parameter.
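To make that differentiation concrete, here is a minimal numeric check (with made-up random data, not the repo's) comparing the analytic gradient of the mean squared error, `(2/m) * X.T.dot(y_pred - y)`, against central finite differences:

```python
import numpy as np

# Hypothetical small problem; X may or may not include a bias column,
# the gradient formula is the same either way.
rng = np.random.default_rng(0)
m, n = 50, 3
X = rng.normal(size=(m, n))
y = rng.normal(size=m)
theta = rng.normal(size=n)

def mse(theta):
    return np.mean((X.dot(theta) - y) ** 2)

# Analytic gradient of mean((X.theta - y)^2): note the 2/m factor.
y_pred = X.dot(theta)
grad_analytic = (2.0 / m) * X.T.dot(y_pred - y)

# Central finite differences, one coordinate at a time.
eps = 1e-6
grad_num = np.array([
    (mse(theta + eps * e) - mse(theta - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.allclose(grad_analytic, grad_num, atol=1e-6))
```

Dropping the `2/m` factor makes the two disagree by exactly that constant, which is the point of the question.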
In regression.py the gradient is computed as

`grad_w = -(y - y_pred).dot(X) + self.regularization.grad(self.w)`

Should it be

`grad_w = -(y - y_pred).dot(X) * (1/training_size) + self.regularization.grad(self.w)` ?
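For what it's worth, scaling the gradient by a constant like `1/training_size` doesn't change which minimum gradient descent finds, only the effective step size, so the unscaled version can still converge with a suitably smaller learning rate. A minimal sketch of the scaled update on hypothetical noiseless data (regularization omitted; whether the constant is `1/m` or `2/m` depends on whether the loss carries a 1/2 factor):

```python
import numpy as np

# Hypothetical data: y is an exact linear function of X, so gradient
# descent should recover true_w.
rng = np.random.default_rng(1)
m, n = 100, 2
X = rng.normal(size=(m, n))
true_w = np.array([2.0, -1.0])
y = X.dot(true_w)

w = np.zeros(n)
lr = 0.1
for _ in range(500):
    y_pred = X.dot(w)
    # Gradient of mean((y_pred - y)^2), written in the repo's style
    # but scaled by 2/m as the derivative requires:
    grad_w = -(y - y_pred).dot(X) * (2.0 / m)
    w -= lr * grad_w

print(np.allclose(w, true_w, atol=1e-4))
```

Without the `1/m`-style scaling, the same `lr` effectively becomes `m` times larger, which is why the unscaled code can diverge on larger datasets unless the learning rate is shrunk to compensate.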