
Use 1/0 labels for binary classification instead of 1/-1 #9

Open
benmccann opened this issue May 14, 2016 · 3 comments

Comments

@benmccann
Contributor

The loss function used in this library for binary classification is a hinge-loss function assuming labels +1 or -1:

case 1 =>
  1 - Math.signum(pred * label)

However, the predictions being made lie in the range (0, 1):

case 1 =>
  1.0 / (1.0 + Math.exp(-pred))

The 1/0 convention used in predictions should be preferred over the 1/-1 convention expected by the loss function: in spark.mllib the negative label is represented by 0 rather than -1, for consistency with multiclass labeling.

The loss function should be changed to be more like the way Spark does it.
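For comparison, a logistic (cross-entropy) loss works directly with 0/1 labels and matches the sigmoid-based predictions above. This is a minimal standalone sketch of that convention (the object and method names here are illustrative, not from this library or from Spark):

```scala
object LogLossSketch {
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // Log loss for a 0/1 label, where `pred` is the raw margin and
  // sigmoid(pred) is the predicted probability of the positive class.
  def logLoss(pred: Double, label: Double): Double = {
    val p = sigmoid(pred)
    -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
  }

  // Gradient of the log loss with respect to the margin: sigmoid(pred) - label.
  def logLossGradient(pred: Double, label: Double): Double =
    sigmoid(pred) - label

  def main(args: Array[String]): Unit = {
    // A confident correct prediction has low loss; a confident wrong one, high loss.
    println(logLoss(4.0, 1.0))
    println(logLoss(4.0, 0.0))
  }
}
```

With this formulation, both the loss and the gradient agree on the same 0/1 label encoding, avoiding any implicit transform.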

@benmccann
Contributor Author

Ahh, looks like it does a transform. But I think this is a very non-standard way of doing things, since the goal is to upstream this and have it merged into Spark's mllib. I believe they use the 1/0 representation internally and we shouldn't change that.

val data = task match {
  case 0 => // regression: use labels as-is
    input.map(l => (l.label, l.features)).persist()
  case 1 => // classification: remap 0/1 labels to -1/+1
    input.map(l => (if (l.label > 0) 1.0 else -1.0, l.features)).persist()
}

@benmccann benmccann changed the title Incorrect loss function for binary classification Use 1/0 labels for binary classification instead of 1/-1 May 16, 2016
@zdx

zdx commented Feb 10, 2017

conclusion?

@willysys

willysys commented Oct 11, 2018

In the classification case, why does the code compute the gradient using logistic loss but report the loss using hinge loss?
The gradient is computed in the code as follows:

val mult = task match {
  case 0 =>
    pred - label
  case 1 =>
    -label * (1.0 - 1.0 / (1.0 + Math.exp(-label * pred)))
}

The loss is computed in the code as follows:

task match {
  case 0 =>
    (pred - label) * (pred - label)
  case 1 =>
    1 - Math.signum(pred * label) // hinge loss
}
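Note that the gradient in `case 1` above is in fact the logistic-loss gradient written for -1/+1 labels, and it agrees exactly with the 0/1 formulation `sigmoid(pred) - label` once labels are remapped. A small standalone check (names here are illustrative, not from the library):

```scala
object GradientCheck {
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // Gradient as written in the library, with labels in {-1, +1}.
  def gradPm1(pred: Double, label: Double): Double =
    -label * (1.0 - 1.0 / (1.0 + math.exp(-label * pred)))

  // Equivalent log-loss gradient with labels in {0, 1}.
  def grad01(pred: Double, label01: Double): Double =
    sigmoid(pred) - label01

  def main(args: Array[String]): Unit = {
    for (pred <- Seq(-2.0, 0.0, 3.5); label01 <- Seq(0.0, 1.0)) {
      val labelPm1 = if (label01 > 0) 1.0 else -1.0
      // The two formulations give identical gradients after the label remap.
      assert(math.abs(gradPm1(pred, labelPm1) - grad01(pred, label01)) < 1e-12)
    }
    println("gradients match")
  }
}
```

So the inconsistency is only in the reported loss value (hinge-style 0/2 loss), not in the optimization itself, which follows the logistic-loss gradient.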
