Refactor onnx-ir to remove static shape inference and rely on rank inference only #2478
Comments
CCing: @laggui, @skewballfox, tiruka, @hexd0t, @nathanielsimard
While this is a bit short of runtime support, I've been thinking about a way to support compile-time dynamic shapes with onnx-ir. It essentially boils down to:

This, again, would fall a bit short of runtime support, but it allows more flexibility for build-time configuration, and I think we'd need the same information if using burn-import for building the models at runtime.

EDIT: from what I remember, the onnx parsing for tract involved some sort of Solver; I assume this is what they meant.

EDIT2: from #2304
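For illustration, here is a minimal sketch of the kind of symbolic-dimension representation the comment above gestures at. The `Dim` and `Shape` types and their names are hypothetical, not actual onnx-ir types:

```rust
// Hypothetical sketch only: `Dim` and `Shape` are illustrative names,
// not actual onnx-ir types.
#[derive(Debug, Clone, PartialEq)]
enum Dim {
    /// Size known when the graph is built.
    Static(usize),
    /// Named placeholder (e.g. "batch") resolved at runtime.
    Symbolic(String),
}

#[derive(Debug, Clone)]
struct Shape(Vec<Dim>);

impl Shape {
    /// The rank is always known, even when individual dims are symbolic.
    fn rank(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    // Rank-3 input with a dynamic batch dimension.
    let shape = Shape(vec![
        Dim::Symbolic("batch".to_string()),
        Dim::Static(3),
        Dim::Static(224),
    ]);
    assert_eq!(shape.rank(), 3);
}
```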
@antimora I'm making a few assumptions here, so feel free to correct me if any of them are off base. The way I see this being used (at runtime) is that the user loads an ONNX file at program start or runtime initialization, the graph is built, and then the sizes don't change: the inputs come through a loop where they are exactly the same size every time. If that doesn't necessarily hold and the inputs can change, then we should probably set the sizes (or capacity) to the expected maximum input size, partly so that, if broadcasting is used, the tensors won't have to be moved around in memory. In either case, it makes sense for the intermediate representation produced by onnx-ir to keep the shape data available.
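A small sketch of the preallocation idea in plain Rust; the size bounds are made up for illustration:

```rust
// Hypothetical sketch: sizing a buffer for the expected maximum input so that
// smaller runtime inputs never force a reallocation. The bounds are made up.
fn main() {
    let max_batch: usize = 64; // assumed worst-case batch size
    let features: usize = 128;

    // Allocate once for the worst case; reuse the storage for every input.
    let mut input: Vec<f32> = Vec::with_capacity(max_batch * features);

    for batch in [8usize, 32, 64] {
        input.clear();
        input.resize(batch * features, 0.0f32);
        // ... run the model on `input`; no reallocation occurs because
        // `batch * features` never exceeds the initial capacity ...
    }
}
```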
@laggui I could open a PR for this. It might have to wait until the weekend, but I know this part of the code well enough that it wouldn't take me too long. I could do this separately from the inferred-shapes discussion above to make it easier to review.
I should have made the requirements clearer. Currently there are two requirements:

I would like to take up the first one behind a feature flag in the onnx-ir crate. I want to disable shape information for burn-import because it is impossible to guarantee that a runtime shape would match the static shape information generated during the build.
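A minimal sketch of what gating static shape information behind a feature flag could look like; the feature name, type, and field layout here are assumptions, not the actual onnx-ir definitions:

```rust
// Hypothetical sketch of gating static shape data behind a Cargo feature.
// Assumes a feature declared in onnx-ir's Cargo.toml, e.g.
//   [features]
//   static-shapes = []
// The type and field names are illustrative, not the actual onnx-ir layout.

pub struct TensorType {
    /// Rank inference is always performed, so the rank is always available.
    pub rank: usize,
    /// Static shape data only exists when the (hypothetical) feature is
    /// enabled; burn-import would build without it, since a runtime shape
    /// isn't guaranteed to match what was recorded at build time.
    #[cfg(feature = "static-shapes")]
    pub static_shape: Option<Vec<usize>>,
}
```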
For 1, could you point me to some relevant examples of what the syntax might look like, or just code out some examples here? I'm mainly trying to get a better idea of what changes would be required for node shape handling.
Even if the shapes can be changed at runtime, it still seems reasonable to want to initialize the tensors with some expected capacity.
This feature can be implemented if a use case warrants it. Shape inference likely won't be needed in most cases, since other libraries using onnx-ir will track shapes at runtime themselves. Currently, onnx-ir tracks shapes statically, mainly from tensor information contained in ONNX files. I didn't want downstream developers to use it assuming it was accurate information.
The goal of this refactor is to remove all static shape inference within the onnx-ir module and focus solely on rank inference. This shift aims to:

- Simplify the shape inference process.
- Align with ONNX and runtime shapes: onnx-ir currently distinguishes between onnx-dynamic-shapes, runtime-shapes, and static-shapes, which often causes confusion. Removing static shape inference emphasizes runtime shapes, which are central to burn-import, and reduces the cognitive load on developers.
- Enable cleaner and more focused code.
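For a sense of what rank-only inference amounts to, here is a minimal sketch for two representative ops; the function names are illustrative, not the actual onnx-ir API:

```rust
// Hypothetical sketch; function names are illustrative, not the onnx-ir API.

/// Elementwise ops with NumPy-style broadcasting: the output rank is the
/// maximum of the input ranks, whatever the concrete dimensions turn out to be.
fn elementwise_output_rank(lhs_rank: usize, rhs_rank: usize) -> usize {
    lhs_rank.max(rhs_rank)
}

/// Unsqueeze inserts one axis per requested position; the axis values
/// themselves don't matter for the rank.
fn unsqueeze_output_rank(input_rank: usize, num_axes: usize) -> usize {
    input_rank + num_axes
}

fn main() {
    // Broadcasting a rank-1 bias against a rank-4 activation: output rank 4.
    assert_eq!(elementwise_output_rank(4, 1), 4);
    // Unsqueezing a rank-2 tensor along one new axis: output rank 3.
    assert_eq!(unsqueeze_output_rank(2, 1), 3);
}
```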
Rationale
Static shape inference at build time should be handled separately, reducing the dependency on static shape information. This proposal emphasizes a cleaner, rank-inference-only approach that aligns with ONNX’s dynamic capabilities and reduces the overhead for burn-import contributors.
Action Items
onnx-ir
.onnx-ir
and tensor APIs to perform rank inference exclusively.onnx-ir
handles a variety of ONNX models with consistent rank inference across all operations.Benefits
This refactor will unburden developers from managing complex static shapes, streamline the review process, and make Burn more adaptable for dynamic ONNX models.