Wow, good catch! The sign does seem to be incorrect. Thank you for reporting it.
If I understand correctly, tau in the paper corresponds to 1 - tau in ChainerRL's IQN, because |tau - I_{delta<0}| = |tau - (1 - I_{delta>=0})| = |(1 - tau) - I_{delta>=0}| (and if delta = 0 the loss is 0 anyway, so it does not matter whether the inequality is strict). As long as the quantile thresholds are sampled from U([0,1]), this should not affect its behavior as an RL algorithm, but the meaning of tau is reversed. It should be fixed.
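The identity above can be checked numerically. The following is a minimal sketch in plain Python, not ChainerRL's actual code: the function names are hypothetical, and the loss form |tau - I| * L_kappa(delta) / kappa is assumed from the quantile Huber loss as I understand it from the paper. It compares the paper's convention at tau with the flipped convention at 1 - tau.

```python
import random


def huber(delta, kappa=1.0):
    """Huber loss L_kappa(delta): quadratic near zero, linear in the tails."""
    abs_delta = abs(delta)
    if abs_delta <= kappa:
        return 0.5 * delta * delta
    return kappa * (abs_delta - 0.5 * kappa)


def quantile_huber_paper(delta, tau, kappa=1.0):
    """Assumed paper convention: weight is |tau - I{delta < 0}|."""
    indicator = 1.0 if delta < 0 else 0.0
    return abs(tau - indicator) * huber(delta, kappa) / kappa


def quantile_huber_flipped(delta, tau, kappa=1.0):
    """Flipped convention discussed in this issue: weight is |tau - I{delta > 0}|."""
    indicator = 1.0 if delta > 0 else 0.0
    return abs(tau - indicator) * huber(delta, kappa) / kappa


# Compare paper(delta, tau) against flipped(delta, 1 - tau) on random inputs.
rng = random.Random(0)
max_gap = 0.0
for _ in range(10000):
    delta = rng.uniform(-3.0, 3.0)
    tau = rng.random()
    gap = abs(quantile_huber_paper(delta, tau)
              - quantile_huber_flipped(delta, 1.0 - tau))
    max_gap = max(max_gap, gap)

# Any remaining gap is floating-point rounding from computing 1 - tau.
print(max_gap)
```

If the two conventions really are mirror images, `max_gap` stays at the level of floating-point rounding, which is why sampling tau from U([0,1]) leaves training behavior unchanged while the reported tau has the opposite meaning.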
Hello, I have one question.
In the IQN paper, the quantile Huber loss uses the condition delta_{ij} < 0.
But the ChainerRL IQN code uses delta_{ij} > 0.
I think this inequality sign is incorrect.
I’m sorry for my poor English.