
Question on Heuristic Routing Strategy #14

Open
Jianqiuer opened this issue Mar 14, 2024 · 1 comment

Comments

@Jianqiuer

I've been exploring the implementation of the heuristic routing strategy in your project and came across an operation that piqued my curiosity. I noticed that the strategy does not directly use the first bounding-box prediction (IoU score index 0) for routing decisions. Instead, there is an operation that modifies the initial mask prediction score by subtracting 1000 from it.

Could you please clarify the underlying principle behind this approach? I'm particularly interested in understanding:

  • The rationale for not using the first BBox prediction result as-is for heuristic routing.

  • The significance and expected impact of subtracting 1000 from the initial mask prediction score.

I believe understanding this could greatly enhance my comprehension of the heuristic routing strategy's design and its implications on the system's overall performance.

Looking forward to your insights.

Thank you for your time and consideration.

@PhyscalX (Collaborator)

PhyscalX commented Apr 10, 2024

@Jianqiuer

  1. We follow SAM's conclusion that box prediction is not ambiguous.
    As a result, we always select the first mask token for box prompts.

  2. Score subtraction is a simple vectorized implementation of both SAM's and our routing strategy.
    We refer to the ONNX wrapper code for SAM. [Code].

  3. We use a routing strategy slightly different from SAM's implementation.
    We rethink the ambiguity issue for K-points prompts.
    Typically, estimating an accurate "K" is non-trivial in both the training and evaluation phases.
    For simplicity, we always select the top-ranked mask token for points.
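To illustrate the score-subtraction trick described above, here is a minimal NumPy sketch (a hypothetical illustration, not the project's actual code: the function name, tensor shapes, and the `is_box_prompt` flag are my assumptions). Subtracting a large constant such as 1000 from the tokens that must not be chosen makes the selection a single branch-free `argmax`:

```python
import numpy as np

def route_mask_token(iou_scores: np.ndarray, is_box_prompt: np.ndarray) -> np.ndarray:
    """Select one mask token per prompt without Python-level branching.

    iou_scores:    (N, num_tokens) predicted IoU score per mask token.
    is_box_prompt: (N,) boolean, True where the prompt is a box.
    """
    penalty = np.zeros_like(iou_scores)
    # Point prompts: push the first token's score far below the rest,
    # so argmax can only pick among the remaining (multimask) tokens.
    penalty[~is_box_prompt, 0] = -1000.0
    # Box prompts: push every token except the first far down,
    # so argmax always returns index 0 (box prediction is unambiguous).
    penalty[is_box_prompt, 1:] = -1000.0
    return np.argmax(iou_scores + penalty, axis=1)

scores = np.array([[0.9, 0.5, 0.6, 0.7],    # box prompt  -> token 0
                   [0.9, 0.5, 0.6, 0.7]])   # point prompt -> token 3
print(route_mask_token(scores, np.array([True, False])))  # [0 3]
```

Because the penalty dwarfs any IoU score in [0, 1], the same `argmax` call implements both routing rules at once, which is why the wrapper can stay fully vectorized.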
