Ahh, looks like it does a transform. But I think this is a very non-standard way of doing things, which matters since the goal is to upstream this and have it merged into Spark's MLlib. I believe they use the 1/0 representation internally, and we shouldn't change that.
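val data = task match {
  case 0 =>
    // Labels are passed through unchanged.
    input.map(l => (l.label, l.features)).persist()
  case 1 =>
    // Labels are remapped: positive labels become +1.0, everything else -1.0.
    input.map(l => (if (l.label > 0) 1.0 else -1.0, l.features)).persist()
}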
benmccann changed the title from "Incorrect loss function for binary classification" to "Use 1/0 labels for binary classification instead of 1/-1" on May 16, 2016
The loss function used in this library for binary classification is a hinge-loss function assuming labels +1 or -1:
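For reference, the textbook form of hinge loss under that convention is shown here. It is the standard definition, not a copy of this library's code; the symbols $y$ (label) and $f(x)$ (raw model score) are introduced only for illustration.

$$
L_{\text{hinge}}(y, f(x)) = \max\bigl(0,\; 1 - y \, f(x)\bigr), \qquad y \in \{+1, -1\}
$$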
However, the predictions being made are in the range 0-1:
The 1/0 convention used in predictions should be preferred over the 1/-1 convention expected by the loss function: spark.mllib represents the negative label as 0 rather than −1, to be consistent with multiclass labeling.
The loss function should be changed to follow Spark's convention and operate on 1/0 labels directly.
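A minimal sketch of what that change could look like, assuming the loss stays a hinge loss and only the label convention changes. The object and method names here (`HingeLossSketch`, `hingeLossSigned`, `hingeLossZeroOne`) are hypothetical and do not come from this library. The idea is to rescale a 0/1 label to -1/+1 inside the loss, which is, as far as I know, how spark.mllib's hinge gradient handles 0/1 labels as well.

```scala
object HingeLossSketch {

  // Hinge loss under the +1 / -1 label convention: max(0, 1 - y * score).
  def hingeLossSigned(label: Double, score: Double): Double =
    math.max(0.0, 1.0 - label * score)

  // Hinge loss under the 1 / 0 label convention.
  // The label is rescaled to {-1, +1} internally, so callers never
  // have to transform their data before training.
  def hingeLossZeroOne(label: Double, score: Double): Double = {
    val scaled = 2.0 * label - 1.0 // 0.0 -> -1.0, 1.0 -> +1.0
    math.max(0.0, 1.0 - scaled * score)
  }

  def main(args: Array[String]): Unit = {
    // Both conventions give the same loss for the same underlying class.
    for (score <- Seq(-2.0, -0.5, 0.0, 0.5, 2.0)) {
      val positive = hingeLossZeroOne(1.0, score)
      val negative = hingeLossZeroOne(0.0, score)
      println(s"score=$score positive=$positive negative=$negative")
    }
  }
}
```

With this shape the input labels can stay in the 1/0 form that the predictions already use, and the `case 1` remapping of labels to +1/-1 would no longer be needed.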