Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better Algorithm Mk. II #37

Open
william-silversmith opened this issue May 6, 2020 · 7 comments
Open

Better Algorithm Mk. II #37

william-silversmith opened this issue May 6, 2020 · 7 comments
Labels
performance Lower memory or faster computation.

Comments

@william-silversmith
Copy link
Contributor

Discussed this some in #6

There are several faster algorithms than Wu et al's 2005 decision tree.

  • He et al 2007 describes a way of using a simpler data structure than union-find, but it requires 3x as much memory (3 arrays) for what is probably a 10-20% improvement. They don't compare with Wu et al in their paper.
  • Chang's contour tracing algorithm is very fast and probably shouldn't be ignored. It might be the best for our number of labels (see figure below from Grana).
  • Grana's 2009 paper on 2x2 block based evaluation is promising. It's complex in 2D, and extending it to 3D requires a very complicated tree, though if the chip can handle it efficiently, there's a lot of efficiencies to be gained, maybe even more than in 2D.

image

Figure from Grana, Borghesani, Cucchiara. "FAST BLOCK BASED CONNECTED COMPONENTS LABELING". IEEE 2009. doi: 10.1109/ICIP.2009.5413731
@william-silversmith william-silversmith added the performance Lower memory or faster computation. label May 6, 2020
@william-silversmith
Copy link
Contributor Author

It's also worth noting that Wu's algorithm is good enough that we are in a regime where the best known algorithm probably won't gain us more than 2x performance, while parallel could gain us several times more than that.

@william-silversmith
Copy link
Contributor Author

The block based approach is undermined by multi-label. It depends on being able to summarize entire regions using the knowledge that all foreground pixels will match.

@william-silversmith
Copy link
Contributor Author

It might still be fun to implement block-based for binary images, which are an important class of operation.

@william-silversmith
Copy link
Contributor Author

OpenCV implements Grana et al's Optimized Block Based Decision Tree for binary images (apparently by that team themselves) and I tested it out on a medical YACCLAB dataset. Their algorithm was almost twice as fast, getting 470-493 MVx/sec vs about 250 MVx/sec from mine. Pretty impressive. I had assumed that the ceiling for a good algorithm was more like 20% rather than nearly 100%.

@william-silversmith
Copy link
Contributor Author

Was able to provide a better algorithm for 6-connected which makes me happy. I think the real secret of BBDT is the ability to make 1/4 fewer writes in the first pass. Will have to try that out.

@william-silversmith
Copy link
Contributor Author

william-silversmith commented Jan 16, 2021

BBDT and approaches derived from it are not useful for multilabel, however I suspect that the pixel prediction approach is readily compatible. Create a forest of trees for each situation and use goto to select the correct tree for the next voxel. This can allow the elimination of a lot of duplicate work. 4 and 6 connected are prime candidates for this. 6-connected especially is about 2-3x worse than a null pass, so there's lots of room for improvement.

This technique is described in some detail in this paper: https://prittt.github.io/pub_files/2016acivs.pdf

For multi-label, we would use knowledge of the previous differently labeled pixel to remove pieces from the full decision tree rather than positively select information that is already known.

@william-silversmith
Copy link
Contributor Author

It would be cool if I integrated some of the crazy binary algorithms from YACCLAB. That group is on a tear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Lower memory or faster computation.
Projects
None yet
Development

No branches or pull requests

1 participant