计算embeddings的相似度，为什么不直接用L2距离或者cosine距离，而是？ #19

qxzhou1010 · 2020-07-29T06:47:12Z

 private float evaluate(float[][] embeddings) {
        float[] embeddings1 = embeddings[0];
        float[] embeddings2 = embeddings[1];
        float dist = 0;
        for (int i = 0; i < 192; i++) {
            dist += Math.pow(embeddings1[i] - embeddings2[i], 2);
        }
        float same = 0;
        for (int i = 0; i < 400; i++) {
            float threshold = 0.01f * (i + 1);
            if (dist < threshold) {
                same += 1.0 / 400;
            }
        }
        return same;
    }```
- 有点没看懂为什么要这样计算最后两个embedding的相似度？

The text was updated successfully, but these errors were encountered:

h3clikejava · 2020-08-14T05:04:16Z

这不就是余弦公式么，关键是400是啥意思，哪里来的？

syaringan357 · 2020-08-17T06:10:33Z

def evaluate(embeddings):
    # Calculate evaluation metrics
    thresholds = np.arange(0, 4, 0.01)
    thresholds = thresholds + 0.01
    embeddings1 = embeddings[0]
    embeddings2 = embeddings[1]
    assert (embeddings1.shape[0] == embeddings2.shape[0])

    diff = np.subtract(embeddings1, embeddings2)
    dist = np.sum(np.square(diff))
    predict_issame = np.less(dist, thresholds)
    return np.mean(predict_issame)

这是源码

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

计算embeddings的相似度，为什么不直接用L2距离或者cosine距离，而是？ #19

计算embeddings的相似度，为什么不直接用L2距离或者cosine距离，而是？ #19

qxzhou1010 commented Jul 29, 2020 •

edited

Loading

h3clikejava commented Aug 14, 2020

syaringan357 commented Aug 17, 2020 •

edited

Loading

计算embeddings的相似度，为什么不直接用L2距离或者cosine距离，而是？ #19

计算embeddings的相似度，为什么不直接用L2距离或者cosine距离，而是？ #19

Comments

qxzhou1010 commented Jul 29, 2020 • edited Loading

h3clikejava commented Aug 14, 2020

syaringan357 commented Aug 17, 2020 • edited Loading

qxzhou1010 commented Jul 29, 2020 •

edited

Loading

syaringan357 commented Aug 17, 2020 •

edited

Loading