Duplicated negative samples for a user exist #72

swyo · 2021-05-12T08:59:50Z

First of all, thank you for sharing your code.

I reviewed the code and explored the preprocessed data in the Data folder.

I found something strange: duplicated negative samples exist for a user.

In the paper Section 4.1 Evaluation Protocols, there is a sentence as follows.

we followed the common strategy [6, 21] that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items.

Although you mentioned replacement for negative sampling, I think it is more reasonable to draw the negative samples without replacement for each user.

This is because, with duplicated negatives, the NDCG on the test set would be overestimated.
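Sampling without replacement could look roughly like the sketch below. It is only an illustration: `sample_negatives`, `num_items`, and `interacted` are hypothetical names rather than anything from the repository, and the default of 99 negatives is an assumption based on the "one positive and 99 negative samples" setup described in this issue.

```python
import random

def sample_negatives(num_items, interacted, num_negatives=99, seed=None):
    """Draw distinct negative item ids that the user has not interacted with."""
    rng = random.Random(seed)
    # Candidate pool: every item id except the ones the user interacted with.
    candidates = [i for i in range(num_items) if i not in interacted]
    # random.sample draws without replacement, so no id can appear twice.
    # It raises ValueError if the pool is smaller than num_negatives.
    return rng.sample(candidates, num_negatives)

# Hypothetical user who interacted with items 3, 11 and 42 out of 1000 items.
negatives = sample_negatives(num_items=1000, interacted={3, 11, 42}, seed=0)
```

With this approach every sampled negative is unique per user, so the duplicate check further down would pass by construction.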

For example, the following scenario can happen.

If the given negative samples contain duplicated items, the recommended list can also contain duplicated items.

# Suppose this is the top-10 recommended list for one positive and 99 negative samples drawn with replacement.
recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
gt = [11]
ndcg(recs, gt)

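The NDCG above returns 1 / log2(1 + 2), because the first hit for item 11 is at index 1 (rank 2).

This NDCG is not reasonable because item 11 was sampled 3 times and occupies three slots in the list. It means other items lose their chance to be recommended.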
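A minimal sketch of this computation is given below, assuming the common single-ground-truth definition of NDCG (1 / log2(index + 2) for the first hit, 0 if the item is absent). The `ndcg` and `dedupe` functions here are illustrative stand-ins, not the repository's implementation, and the ground truth is passed as a single item id rather than a list.

```python
import math

def ndcg(recs, gt_item):
    """NDCG for one ground-truth item: 1 / log2(index + 2), or 0 if absent."""
    if gt_item in recs:
        index = recs.index(gt_item)  # 0-based position of the first hit
        return 1.0 / math.log2(index + 2)
    return 0.0

def dedupe(items):
    """Drop repeated items while keeping the original order."""
    return list(dict.fromkeys(items))

recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
score = ndcg(recs, 11)      # first hit at index 1, so 1 / log2(3)
distinct = dedupe(recs)     # item 11 now occupies a single slot
freed_slots = len(recs) - len(distinct)
```

In this example the duplicates take up two extra positions in the top 10, positions that distinct negatives would otherwise have competed for.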

Summary

Generally, the items in a recommended list are distinct.
However, your test negative samples contain duplicated items for a user.

Please check as follows (this reproduces the unreasonable behavior):

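for uid, iid, label in test_loader:
    # Assumes each batch holds one user's items; tolist() gives plain ints so set() compares by value.
    items = iid.tolist() if hasattr(iid, "tolist") else list(iid)
    assert len(set(items)) == len(items)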
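To see which items are duplicated, rather than only that the assertion fails, a small helper along these lines may be useful. `find_duplicates` is a hypothetical name, not something from the repository, and it assumes the item ids have already been converted to a plain Python list.

```python
from collections import Counter

def find_duplicates(item_ids):
    """Return {item_id: count} for every item that appears more than once."""
    counts = Counter(item_ids)
    return {item: n for item, n in counts.items() if n > 1}

# Hypothetical candidate list for one user, with item 11 sampled three times.
duplicates = find_duplicates([10, 11, 11, 11, 9, 29, 102, 204, 23, 2])
```

An empty dictionary means the user's candidate list is free of duplicates.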


swyo changed the title ~~Duplicated negative samples for a user exists~~ Duplicated negative samples for a user exist May 12, 2021
