Feature Request: Provide a confidence measure associated with action #60

Open
aidancc opened this issue Aug 3, 2016 · 1 comment

Comments

@aidancc

aidancc commented Aug 3, 2016

For both policy and ranking usage, it would be helpful to know the confidence associated with each suggested action. For example, it would be helpful to distinguish between the following two cases:
a) the probability of achieving maximal reward from the first-ranked action >> the probability of achieving maximal reward from the second-ranked action
b) the probability of achieving maximal reward from the first-ranked action =~ the probability of achieving maximal reward from the second-ranked action
Furthermore, it would be useful to know whether a particular suggested action is skewed toward exploration, and/or to be able to prevent exploration on a per-request basis.
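
To make the request concrete, here is a minimal sketch of what a caller could do if per-action estimates of "probability of achieving maximal reward" were exposed. The function names `top_two_gap` and `classify_case`, the input list `p_best`, and the `threshold` value are all hypothetical and are not part of any existing API; this only illustrates the distinction between cases (a) and (b).

```python
from typing import Sequence, Tuple


def top_two_gap(p_best: Sequence[float]) -> Tuple[int, float]:
    """Return (index of the first-ranked action, gap to the second-ranked one).

    p_best[i] is an assumed estimate of the probability that action i
    achieves maximal reward. How to obtain it is the open question here.
    """
    if len(p_best) < 2:
        raise ValueError("need at least two actions to compare")
    ranked = sorted(range(len(p_best)), key=lambda i: p_best[i], reverse=True)
    first, second = ranked[0], ranked[1]
    return first, p_best[first] - p_best[second]


def classify_case(p_best: Sequence[float], threshold: float = 0.2) -> str:
    """Label a ranking as case "a" (clear winner) or case "b" (near tie).

    The threshold is an arbitrary illustrative cutoff, not a recommended value.
    """
    _, gap = top_two_gap(p_best)
    return "a" if gap >= threshold else "b"


if __name__ == "__main__":
    print(classify_case([0.90, 0.05, 0.05]))  # clear winner
    print(classify_case([0.40, 0.38, 0.22]))  # near tie
```

A difference is used here rather than a ratio so that the result stays well defined when the second-ranked estimate is zero.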

@JohnLangford
Contributor

This is tricky, because you are asking for high precision at the
frontier of what can be estimated. There are some exploration strategies
(e.g. epsilon greedy) where we could estimate this reasonably well. But
if you care, then you should already be using one of the advanced
exploration strategies (e.g. bagging or cover). When you are using
these advanced strategies, there generally isn't any (significant)
probability of exploring obviously suboptimal actions (as determined by
the algorithm), so everything is already in case (b).

-John
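
For readers following the reply above, the contrast between the two kinds of strategy can be sketched as follows. This is a simplified illustration under textbook assumptions, not the project's actual implementation: `epsilon_greedy_probs` uses the standard epsilon-greedy distribution, and `bag_vote_probs` assumes a bagging-style explorer that samples actions in proportion to the votes of its bootstrapped policies. Both function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def epsilon_greedy_probs(greedy_action: int, num_actions: int,
                         epsilon: float) -> List[float]:
    """Standard epsilon-greedy: spread epsilon uniformly over all actions,
    and give the remaining 1 - epsilon to the greedy action."""
    probs = [epsilon / num_actions] * num_actions
    probs[greedy_action] += 1.0 - epsilon
    return probs


def bag_vote_probs(votes: Sequence[int], num_actions: int) -> List[float]:
    """Assumed bagging-style distribution: each bootstrapped policy votes
    for one action, and an action's probability is its share of the votes."""
    if not votes:
        raise ValueError("need at least one policy vote")
    counts = Counter(votes)
    return [counts[a] / len(votes) for a in range(num_actions)]


if __name__ == "__main__":
    # Every action keeps probability epsilon / K, even clearly bad ones.
    print(epsilon_greedy_probs(greedy_action=0, num_actions=4, epsilon=0.2))
    # Actions that no policy votes for get probability zero.
    print(bag_vote_probs(votes=[0, 0, 1, 0, 1], num_actions=3))
```

Under epsilon greedy, a sampled action that differs from the greedy one is known to be exploratory. Under the vote-share sketch, only actions that some policy considers best can be chosen, which is why the remaining candidates tend to look like case (b).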
