-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrating optimizer to new IR (part 1) #1725
Conversation
# if isinstance(output, list): | ||
# return output | ||
# else: | ||
# # Currently handles single output only | ||
# self.add_count(node.op_type, output.size) | ||
# return self.new_constant(node.output[0], output) |
Check notice
Code scanning / CodeQL
Commented-out code Note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1725 +/- ##
==========================================
+ Coverage 74.84% 74.88% +0.03%
==========================================
Files 242 244 +2
Lines 25882 26344 +462
Branches 4669 4788 +119
==========================================
+ Hits 19371 19727 +356
- Misses 5641 5696 +55
- Partials 870 921 +51 ☔ View full report in Codecov by Sentry. |
Test Results 24 files ± 0 24 suites ±0 2h 4m 45s ⏱️ + 3m 10s For more details on these failures, see this check. Results for commit 61511ef. ± Comparison against base commit f8ee736. This pull request removes 17 and adds 33 tests. Note that renamed tests count towards both.
This pull request removes 1 skipped test and adds 2 skipped tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
def _get_int_attribute(node: ir.Node, name: str, default: int | None = None) -> int | None: | ||
if name in node.attributes: | ||
attr = node.attributes[name] | ||
if isinstance(attr, ir.AttrInt64): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check the attr type because ir.Attr may carry an int64 attr as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm ... so there is no guarantee that the specialized derived classes of ir.Attr will be created? That seems potentially confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially confusing
that's true. Do you have suggestions? We can turn the specialized classes into functions that will produce Attr objects, for example.
for old_value, new_value in zip(old_values, new_values): | ||
# Propagate relevant info from old value to new value | ||
# TODO(Rama): Perhaps we should merge old and new types. As of now, new | ||
# values don't have type information. Note that this could be a problem | ||
# for semantics-altering rewrite-rules: we should allow users to override | ||
# this for such rules. | ||
new_value.type = old_value.type | ||
new_value.shape = old_value.shape | ||
new_value.const_value = old_value.const_value | ||
new_value.name = old_value.name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create a copy() method for nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, at least, we need some common utility we can call here.
|
||
def __init__( | ||
self, | ||
external_data_folder: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would support os.PathLike as well (in general for paths)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
folder -> dir ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this seems unused currently! Previously, it was used when converting an external tensor into a numpy array (using ONNX's numpy_helper, which only supports "str" for the external-dir incidentally). Now, this uses IR to convert to numpy array. What happens in the IR when external-tensors are converted to numpy array? Does the IR remember the external-data-folder? Or, do we need pass it as a parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is currently incomplete, we need to update the deserialize function: #1701 (comment)
val = _get_numpy_value(val) | ||
if val is None: | ||
return None | ||
if isinstance(val, bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious when this is needed? Do we need to update the typing annotation or tighten the invariance?
value.type = type | ||
|
||
|
||
def _get_input_element_type(node: ir.Node, index: int) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _get_input_element_type(node: ir.Node, index: int) -> int: | |
def _get_input_element_type(node: ir.Node, index: int) -> ir.DataType: |
seems appropriate based on usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skipping this for now. (will require dropping the .value
in the implementation in two places as well.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm w/ comments. In particular, I think const_value
should always be ir.TensorProtocol
and we can rely on this invariance.
self, opname: str, domain: str = "", version=None | ||
) -> Callable[[PartialEvaluatorFunction], PartialEvaluatorFunction]: | ||
if (domain, opname) not in self.op_evaluators: | ||
evaluator_list = [] |
Check failure
Code scanning / lintrunner
MYPY/var-annotated Error
def _get_bool_value(val: ir.Value | None) -> bool | None: | ||
if val is None: | ||
return None | ||
val = _get_numpy_value(val) |
Check failure
Code scanning / lintrunner
MYPY/assignment Error
|
||
def _get_int_attribute(node: ir.Node, name: str, default: int | None = None) -> int | None: | ||
if name in node.attributes: | ||
attr = node.attributes[name].value |
Check failure
Code scanning / lintrunner
MYPY/union-attr Error
input = _get_input(node, 0) | ||
output = _get_output(node, 0) | ||
if input is not None and output is not None: | ||
_update_type(output, input.type) |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
@register("Shape") | ||
def shape(node: ir.Node, op, state: OptimizerState) -> ReturnValue: | ||
input = node.inputs[0] | ||
shape = input.shape |
Check failure
Code scanning / lintrunner
MYPY/union-attr Error
|
||
def _init(self) -> None: | ||
self.counts = {} | ||
self.sizes = {} |
Check failure
Code scanning / lintrunner
MYPY/var-annotated Error
node.op_type, self.opset_imports[node.domain], node.domain | ||
) | ||
output_types = onnx.shape_inference.infer_node_outputs( | ||
schema, ir.serde.serialize_node(node), input_types, input_data |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
node.op_type, self.opset_imports[node.domain], node.domain | ||
) | ||
output_types = onnx.shape_inference.infer_node_outputs( | ||
schema, ir.serde.serialize_node(node), input_types, input_data |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
if any((x is not None and x.const_value is None) for x in node.inputs): | ||
return None | ||
|
||
input_values = [x.const_value.numpy() if x is not None else None for x in node.inputs] |
Check failure
Code scanning / lintrunner
MYPY/union-attr Error
if replacement is None: | ||
# No change. Process attributes. | ||
for attr in node.attributes.values(): | ||
self.visit_attribute(attr) |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
optimizer.remove_unused_nodes(ir_model) | ||
return serde.serialize_model(ir_model) | ||
else: | ||
constant_folding.fold_constants(model, onnx_shape_inference=onnx_shape_inference) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we replace constant_folding with _constant_folding eventually? If so, _constant_folding should not import stuffs from constant_folding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes ... they will migrate into this file eventually
node.op_type, self.opset_imports[node.domain], node.domain | ||
) | ||
output_types = onnx.shape_inference.infer_node_outputs( | ||
schema, ir.serde.serialize_node(node), input_types, input_data |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
node.op_type, self.opset_imports[node.domain], node.domain | ||
) | ||
output_types = onnx.shape_inference.infer_node_outputs( | ||
schema, ir.serde.serialize_node(node), input_types, input_data |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
output_types = onnx.shape_inference.infer_node_outputs( | ||
schema, | ||
ir.serde.serialize_node(node), | ||
input_types, |
Check failure
Code scanning / lintrunner
MYPY/arg-type Error
Migrated the core logic.
TO DO: decide how to handle functions. Optimizer currently incorporates function-specialization. Need to choose between function-specialization and function-inlining.