Hello, @hexiangnan
First of all, thank you for sharing your code.
I reviewed your code and explored the preprocessed data in the Data folder.
I found something strange: duplicated negative samples exist for a user.
In Section 4.1 (Evaluation Protocols) of the paper, there is the following sentence:
"we followed the common strategy [6, 21] that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items."
Although you mentioned replacement for negative sampling, I think it is more reasonable to draw the negative samples without replacement for each user.
Otherwise, the NDCG on the test set could be overestimated.
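A minimal sketch of per-user sampling without replacement, to illustrate the suggestion. The function name `sample_negatives`, its parameters, and the toy item IDs are assumptions made for this example and are not taken from the repository.

```python
import random


def sample_negatives(all_items, interacted, n, seed=None):
    """Draw n distinct items that the user has not interacted with.

    Hypothetical helper: names and signature are illustrative only.
    """
    rng = random.Random(seed)
    # Candidate pool: every item the user has NOT interacted with.
    candidates = sorted(set(all_items) - set(interacted))
    if len(candidates) < n:
        raise ValueError("not enough non-interacted items to sample from")
    # random.sample draws without replacement, so no item repeats.
    return rng.sample(candidates, n)


# Toy data (assumed): 1000 items, a user who interacted with three of them.
all_items = range(1000)
interacted = {3, 7, 42}
negatives = sample_negatives(all_items, interacted, n=99, seed=0)
```

With this approach every negative item appears at most once per user, so each slot in the ranked list is occupied by a different candidate.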
For example, the following scenario can happen.
If the negative samples contain duplicated items, the recommended list can also contain duplicated items.
# Suppose this is the top-10 recommended list for one positive and 99 negative samples drawn with replacement.
recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
gt = [11]
ndcg(recs, gt)
The call above returns 1 / log2(1 + 2).
This NDCG is not reasonable because item 11 was sampled 3 times, which means other items lose their chance to be recommended.
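The `ndcg` function is not defined in the snippet above. A minimal sketch, assuming binary relevance and a single held-out test item (so the ideal DCG is 1), reproduces the stated value and shows how many distinct items the list actually contains. The name `ndcg_at_k` and its signature are assumptions for illustration.

```python
import math


def ndcg_at_k(recs, gt, k=10):
    """Binary-relevance NDCG@k for one held-out item (IDCG is 1).

    Hypothetical helper; only the first occurrence of the item counts.
    """
    target = gt[0]
    for rank, item in enumerate(recs[:k]):  # rank is zero-based
        if item == target:
            return 1.0 / math.log2(rank + 2)
    return 0.0


recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
gt = [11]

score = ndcg_at_k(recs, gt)   # equals 1 / log2(1 + 2)
distinct = len(set(recs))     # number of distinct items in the top 10
```

Here `distinct` is 8 rather than 10: the repeated item takes up two slots that other candidates could have filled.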
Summary
Generally, a recommended list contains distinct items.
However, your test negative samples contain duplicated items for a user.
Please check as follows (to reproduce the unreasonable behavior).
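One way such a check could look is sketched below. It is only an illustration: the function name `duplicated_negatives` and the toy dictionary are hypothetical, and loading the actual files from the Data folder is left out because their format is not described here.

```python
from collections import Counter


def duplicated_negatives(negatives_by_user):
    """Return {user: {item: count}} for negatives sampled more than once.

    Hypothetical helper operating on already-parsed data.
    """
    report = {}
    for user, items in negatives_by_user.items():
        dupes = {item: c for item, c in Counter(items).items() if c > 1}
        if dupes:
            report[user] = dupes
    return report


# Toy data (assumed): user 0 has item 8 twice, user 1 has no duplicates.
toy = {0: [5, 8, 8, 13], 1: [2, 4, 6]}
report = duplicated_negatives(toy)
```

An empty `report` would mean that every user's negative samples are distinct.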