I have the following issue, which is really odd and affects the evaluation of the neural models.
I build my data using the auto preparer, and I came to realize that when I try to make predictions on the test set, some document-query pairs are duplicated.
I am not sure why this is happening; my first guess was that it is done to fill up the missing examples up to the batch size, but this does not seem to be the case.
Here's most of my code:
Now, it seems that the duplicates are created through the dataset builder, but I don't understand why.
Even more odd is the fact that those predictions have different scores for the same document-query pairs, and the scores are not even always close to each other, so this can't be some rounding error. This is very weird: how is it possible that, without re-training the model, I can get such different predictions for the same query-document pairs at inference time?
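To make the problem concrete, here is a minimal pandas sketch with purely hypothetical ids and scores that shows the kind of duplication I am seeing (the column names id_left and id_right just mirror how MatchZoo's DataPack relation labels queries and documents):

```python
import pandas as pd

# Hypothetical illustration, not the actual data: one row per predicted
# instance, with the query id, document id, and the model's score.
preds = pd.DataFrame({
    'id_left':  ['Q1', 'Q1', 'Q2', 'Q2', 'Q2'],
    'id_right': ['D1', 'D1', 'D7', 'D7', 'D9'],
    'score':    [0.91, 0.34, 0.12, 0.80, 0.55],
})

# How often does each query-document pair appear, and how far apart are its scores?
dup_stats = (
    preds.groupby(['id_left', 'id_right'])['score']
         .agg(['count', 'min', 'max'])
         .query('count > 1')
)
print(dup_stats)
```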
In this replicated example the model used was KNRM, but I think this happens with other models too.

Hi @littlewine, there are indeed three kinds of datapack organization, i.e., point-wise, pair-wise, and list-wise. For training, you can choose whichever one matches the loss function; in testing, however, you should not organize the datapack pair-wise, since that adds duplicate instances to fill the batch size.
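A minimal sketch of that suggestion, assuming the MatchZoo-py tutorial-style API (the names preprocessor and test_raw are placeholders for whatever mz.auto.prepare and the data loading step produced, not code from this issue): build the evaluation dataset explicitly in point-wise mode instead of reusing the pair-wise dataset builder.

```python
import matchzoo as mz

# Sketch only: `preprocessor` is assumed to come from mz.auto.prepare,
# and `test_raw` is the raw test DataPack.
test_processed = preprocessor.transform(test_raw)

# Point-wise mode keeps exactly one instance per query-document pair,
# so nothing is duplicated in order to build training pairs.
testset = mz.dataloader.Dataset(
    data_pack=test_processed,
    mode='point',
)
padding_callback = mz.models.KNRM.get_default_padding_callback()
testloader = mz.dataloader.DataLoader(
    testset,
    stage='dev',              # evaluation stage: no shuffling or resampling
    callback=padding_callback,
)
```

Running predictions or evaluation over this testloader should then yield exactly one score per query-document pair, and the duplicated rows with diverging scores should disappear.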