I did some analysis on the WikiQA dataset:
training set:
Left num: 2118; Right num: 18841; Relation num: 20360; positive examples (label 1): 1040 (5.1%)
dev set:
Left num: 296; Right num: 2708; Relation num: 2733; positive examples (label 1): 140 (5.12%)
test set:
Left num: 633; Right num: 5961; Relation num: 6165; positive examples (label 1): 293 (4.75%)
I wonder whether this is the official way to pair questions with answers. Positive examples make up only about 5% of each of the three sets, which means a model that always outputs 0 would reach about 95% accuracy. The best performance of BERT on this dataset is also just 95%. Is the ratio of positive to negative examples too imbalanced?
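To make the arithmetic concrete, here is a minimal sketch that computes the accuracy of an always-0 predictor from the counts listed above. The names `SPLITS`, `positive_rate`, and `always_zero_accuracy` are made up for illustration and are not part of any library or of the dataset tooling.

```python
# Counts copied from the statistics in this issue: (relation pairs, positive pairs).
SPLITS = {
    "train": (20360, 1040),
    "dev": (2733, 140),
    "test": (6165, 293),
}


def positive_rate(total: int, positives: int) -> float:
    """Fraction of question-answer pairs labeled 1."""
    return positives / total


def always_zero_accuracy(total: int, positives: int) -> float:
    """Accuracy of a hypothetical model that predicts label 0 for every pair."""
    return (total - positives) / total


if __name__ == "__main__":
    for name, (total, positives) in SPLITS.items():
        rate = positive_rate(total, positives)
        acc = always_zero_accuracy(total, positives)
        print(f"{name}: positive rate {rate:.2%}, always-0 accuracy {acc:.2%}")
```

With these counts the always-0 accuracy is simply one minus the positive rate, so plain accuracy says very little about how well a model separates correct answers from incorrect ones.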