Returned tags should have confidence values #40

Open · ayberkozgur opened this issue Apr 29, 2014 · 9 comments

@ayberkozgur (Member)

Returned tags should have confidence values (e.g. between 0 and 1), just like the skeleton joint confidences returned by the OpenNI skeleton tracker for the Kinect. This would be very useful on the user side, where we could, e.g., decide to trust one tag more than another in a multi-tag setting.

It should be based on some metric or a combination of multiple metrics; some ideas are (see the sketch after this list):

  • Pixel colors on top of the tag bits indicate confidence, i.e. black/white pixels indicate high confidence while gray tones indicate low confidence
  • Give tags with a smaller visual area lower confidence
  • Give lower confidence to tags whose visual area is below a certain threshold
  • Give flatter tags, i.e. tags that are seen from a narrower angle, lower confidence
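
A minimal sketch of how the area and viewing-angle ideas could be turned into [0, 1] scores from a tag's detected corners. Everything here is illustrative: `fullArea` is an arbitrary knob, and chilitags exposes no such functions:

```cpp
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Confidence from apparent size: a tag covering fullArea pixels or more
// gets 1, smaller tags proportionally less.
double areaConfidence(const std::vector<cv::Point2f> &corners,
                      double fullArea = 10000.0) {
    double area = 0.0; // shoelace formula over the quadrilateral
    for (std::size_t i = 0; i < corners.size(); ++i) {
        const cv::Point2f &a = corners[i];
        const cv::Point2f &b = corners[(i + 1) % corners.size()];
        area += a.x * b.y - b.x * a.y;
    }
    return std::min(1.0, std::abs(area) / 2.0 / fullArea);
}

// Confidence from "flatness": a head-on tag projects to a near-square
// quadrilateral while a grazing view flattens it, so the ratio of the
// shortest to the longest side is a crude proxy for the viewing angle.
double angleConfidence(const std::vector<cv::Point2f> &corners) {
    double shortest = std::numeric_limits<double>::max();
    double longest = 0.0;
    for (std::size_t i = 0; i < corners.size(); ++i) {
        cv::Point2f d = corners[(i + 1) % corners.size()] - corners[i];
        double side = std::hypot(d.x, d.y);
        shortest = std::min(shortest, side);
        longest = std::max(longest, side);
    }
    return longest > 0.0 ? shortest / longest : 0.0;
}
```

The two scores could then be combined, e.g. multiplicatively: `areaConfidence(c) * angleConfidence(c)`.
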
@qbonnard (Member)

Makes sense. Another metric would be the gradient value used for the corner detection.

What is your use case?

@ayberkozgur (Member, Author)

I have multiple tags that represent landmark objects; they lie on the table in a predefined configuration and do not move. When the camera sees multiple tags, it will set the virtual GL camera's transform to the inverse of the most confident tag's transform (or a weighted average of all seen tags, etc.). This way, I will be able to build a scene model and implement all sorts of things, e.g. augmented 3D models, physics, etc.
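
A minimal sketch of that idea, assuming the 3D estimation yields a name → 4x4 pose map of `cv::Matx44d` as in Chilitags3D. The `confidenceOf` function is hypothetical, since confidences do not exist yet:

```cpp
#include <opencv2/core.hpp>
#include <map>
#include <string>

double confidenceOf(const std::string &tagName); // hypothetical, not in chilitags

// Pick the most confident visible tag and invert its pose to obtain the
// transform to apply to the virtual GL camera.
cv::Matx44d glCameraTransform(const std::map<std::string, cv::Matx44d> &tags) {
    const cv::Matx44d *best = nullptr;
    double bestConfidence = -1.0;
    for (const auto &tag : tags) {
        double c = confidenceOf(tag.first);
        if (c > bestConfidence) {
            bestConfidence = c;
            best = &tag.second;
        }
    }
    // The detector gives the tag pose in camera coordinates; the GL
    // camera needs the inverse of that transform.
    return best ? best->inv() : cv::Matx44d::eye();
}
```
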

@qbonnard (Member)

In this case, I think it would be better to use Chilitags3D::readTagConfiguration to get a scene object whose transform is estimated from multiple tags. The confidence would be nice for weighting the multiple tags in this estimation (internally).
Have you tried Chilitags3D::readTagConfiguration already?
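
For reference, a usage sketch based on the chilitags API at the time of this thread (the file name is made up; the configuration file associates tags with named objects and their 3D placements on them):

```cpp
#include <chilitags/chilitags.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>

int main() {
    cv::VideoCapture capture(0);
    cv::Mat frame;
    capture.read(frame);

    chilitags::Chilitags3D chilitags3D;
    chilitags3D.readTagConfiguration("landmarks.yml"); // file name made up

    // Each configured object gets one pose, estimated from all of its
    // visible tags at once, rather than one pose per individual tag.
    for (const auto &object : chilitags3D.estimate(frame)) {
        std::cout << object.first << ":\n"
                  << cv::Mat(object.second) << std::endl;
    }
    return 0;
}
```
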

@ayberkozgur (Member, Author)

I know about tag configurations, but I will also have tags that move, and I have to be able to dynamically add/remove these landmarks (to simulate the destruction of augmented objects). This is why I can't treat them as a single object. Also, in the (not very near) future, I am planning to estimate the landmark configuration by solving the corresponding SLAM problem, eliminating the predefined landmark configuration altogether.

In addition, I think that once the confidence values are calculated, there is no reason why we shouldn't export them to the user along with the tag transforms instead of keeping them internal.

@qbonnard (Member)

I was just checking that the readTagConfiguration method was not too buried ;)
Sure, the confidence values make sense.
For the dynamic modification of the landmarks, would it help to have a setTagConfiguration method that would allow modifying the tag configuration without having to go through a file? I think that's missing anyway...
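
A hypothetical shape for such a method, just to make the proposal concrete (none of these types exist in chilitags):

```cpp
#include <opencv2/core.hpp>
#include <map>
#include <string>
#include <vector>

// Hypothetical: one entry per tag on an object, giving the tag id and
// the 3D positions of its four corners in object coordinates.
struct TagOnObject {
    int id;
    std::vector<cv::Point3f> corners;
};

// Hypothetical: replace the in-memory tag configuration, so landmarks
// can be added and removed at runtime instead of being read from a file:
// void Chilitags3D::setTagConfiguration(
//     const std::map<std::string, std::vector<TagOnObject>> &configuration);
```
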

@ayberkozgur (Member, Author)

Yes, in fact, now that I think about it, setting the tag configuration dynamically makes sense for the landmark job.

@qbonnard (Member)

Noted: #42

@severin-lemaignan (Member)

Returning confidences seems interesting for advanced applications, indeed, but we also want to keep one "simple" API (aka detect tags in 2 obvious lines). So, either we add a findWithConfidence, or we do some template magic on the return type to provide the confidence only when needed.
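
The two alternatives, sketched as declarations next to the existing find (only find and Quad reflect the actual chilitags API; the rest is hypothetical):

```cpp
#include <opencv2/core.hpp>
#include <map>
#include <utility>

typedef cv::Matx<float, 4, 2> Quad; // four tag corners, as in chilitags

class ChilitagsSketch {
public:
    // The existing simple API: tag id -> corner positions.
    std::map<int, Quad> find(const cv::Mat &inputImage);

    // Alternative 1 (hypothetical): a separate method, leaving the
    // simple API untouched.
    std::map<int, std::pair<Quad, float>> findWithConfidence(
            const cv::Mat &inputImage);

    // Alternative 2 (hypothetical): template on the result type, so the
    // confidence is only computed when the caller asks for it, e.g.
    //   template<typename Result> std::map<int, Result> find(const cv::Mat &);
};
```
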

@ayberkozgur (Member, Author)

This is important: tags whose transforms change a lot over time (e.g. a tag that flips every couple of frames) should have drastically lower confidence values. In any case, I will be implementing this functionality in my application. Currently, I calculate the "spread" of a batch of transform samples, which is a weighted sum of the traces of the translation sample covariance matrix (3x3) and the rotation sample covariance matrix (4x4, since rotations are quaternions). The sample batch is, e.g., the last 30 values of the tag transform.

Of course, there are other metrics that use the whole covariance matrices as well, but I used this method and found it satisfactory. It is not expensive to calculate, since only the diagonal values of the covariance matrices must be computed. See the code I'm currently using at https://github.com/chili-epfl/cellulo/blob/master/core/src/ch/epfl/chili/cellulo/math/util/TransformSampleBatch.java.
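
A compact sketch of that metric in C++ (the original is the linked Java class; the default weights here are placeholders). Since the trace of a covariance matrix is the sum of the per-dimension variances, only the diagonal ever needs to be computed:

```cpp
#include <opencv2/core.hpp>
#include <vector>

// Trace of the sample covariance over the first `dims` components of
// each sample: the sum of per-dimension variances, no full matrix needed.
double covarianceTrace(const std::vector<cv::Vec4d> &samples, int dims) {
    if (samples.size() < 2) return 0.0;
    cv::Vec4d mean(0, 0, 0, 0);
    for (const auto &s : samples) mean += s;
    mean *= 1.0 / samples.size();

    double trace = 0.0;
    for (const auto &s : samples)
        for (int d = 0; d < dims; ++d)
            trace += (s[d] - mean[d]) * (s[d] - mean[d]);
    return trace / (samples.size() - 1);
}

// "Spread" of a sample batch (e.g. the last 30 transforms): translations
// use the first 3 components (the 4th is unused), rotations are unit
// quaternions (x, y, z, w).
double spread(const std::vector<cv::Vec4d> &translations,
              const std::vector<cv::Vec4d> &rotations,
              double translationWeight = 1.0,
              double rotationWeight = 1.0) {
    return translationWeight * covarianceTrace(translations, 3)
         + rotationWeight * covarianceTrace(rotations, 4);
}
```

A high spread would then map to a low confidence, e.g. confidence = 1 / (1 + spread).
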

By the way, I don't see any harm in returning as much information as possible from a 3D detection in a struct (as long as the extra computations can be turned off with flags, at object creation or at runtime, for performance reasons). The "detect tags in 2 obvious lines" code won't even change by one character, and the detection result will be tag.second.transform instead of tag.second, which is more explicit and more readable if you ask me. And if these extra calculations (e.g. confidence) are turned off by default, neither the API complexity nor the default performance will change for the simple user.
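
A sketch of such a result struct (all names hypothetical):

```cpp
#include <opencv2/core.hpp>

// Hypothetical: the 3D estimation would return
// std::map<std::string, TagDetection> instead of a bare pose map.
struct TagDetection {
    cv::Matx44d transform; // always computed, as today
    float confidence;      // extra, computed only when enabled by a flag
};

// The two-line usage only changes the member access:
//   for (const auto &tag : chilitags3D.estimate(frame))
//       use(tag.second.transform);
```
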
