-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid overhead for synthesized nodes lookup #13424
Conversation
After Qiskit#12550 a hash implementation was added to the implementation of DAGOpNode to be able to have identical instances of dag nodes used be usable in a set or dict. This is because after Qiskit#12550 changed the DAGCircuit so the DAGOpNode instances were just a python view of the data contained in the nodes of a dag. While prior to Qiskit#12550 the actual DAGOpNode objects were returned by reference from DAG methods. However, this hash implementation has additional overhead compared to the object identity based version used before. This has caused a regression in some cases for high level synthesis when it's checking for nodes it's already synthesized. This commit addresses this by changing the dict key to be the node id instead of the node object. The integer hashing is significantly faster than the object hashing.
One or more of the following people are relevant to this code:
|
Thank you for this addition! This makes more sense. Is this improvement noticeable when benchmarking runtime? |
I did a quick asv run and it didn't flag anything as an improvement or a regression. But it is noticeable in some benchpress benchmarks it has a noticeable improvement for example running |
Pull Request Test Coverage Report for Build 11800658258Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@@ -382,7 +382,7 @@ def _run( | |||
|
|||
# If the synthesis changed the operation (i.e. it is not None), store the result. | |||
if synthesized is not None: | |||
synthesized_nodes[node] = (synthesized, synthesized_context) | |||
synthesized_nodes[node._node_id] = (synthesized, synthesized_context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine in this case since synthesized_nodes
is only used with one DAG (and not e.g. shared by the recursive control flow block handling).
But this optimization is something we should be careful when applying in other places, since the semantics of uniqueness in keys shifts from being global to local within a specific DAG (which means that keys will clobber each other if node indices from more than a single DAG are used in the same map). This was responsible for a bug we had in the visualization code after the initial DAG port to Rust.
After #12550 a hash implementation was added to the implementation of DAGOpNode to be able to have identical instances of dag nodes used be usable in a set or dict. This is because after #12550 changed the DAGCircuit so the DAGOpNode instances were just a python view of the data contained in the nodes of a dag. While prior to #12550 the actual DAGOpNode objects were returned by reference from DAG methods. However, this hash implementation has additional overhead compared to the object identity based version used before. This has caused a regression in some cases for high level synthesis when it's checking for nodes it's already synthesized. This commit addresses this by changing the dict key to be the node id instead of the node object. The integer hashing is significantly faster than the object hashing. (cherry picked from commit 8c6ad02)
After #12550 a hash implementation was added to the implementation of DAGOpNode to be able to have identical instances of dag nodes used be usable in a set or dict. This is because after #12550 changed the DAGCircuit so the DAGOpNode instances were just a python view of the data contained in the nodes of a dag. While prior to #12550 the actual DAGOpNode objects were returned by reference from DAG methods. However, this hash implementation has additional overhead compared to the object identity based version used before. This has caused a regression in some cases for high level synthesis when it's checking for nodes it's already synthesized. This commit addresses this by changing the dict key to be the node id instead of the node object. The integer hashing is significantly faster than the object hashing. (cherry picked from commit 8c6ad02) Co-authored-by: Matthew Treinish <[email protected]>
Summary
After #12550 a hash implementation was added to the implementation of DAGOpNode to be able to have identical instances of dag nodes used be usable in a set or dict. This is because after #12550 changed the DAGCircuit so the DAGOpNode instances were just a python view of the data contained in the nodes of a dag. While prior to #12550 the actual DAGOpNode objects were returned by reference from DAG methods. However, this hash implementation has additional overhead compared to the object identity based version used before. This has caused a regression in some cases for high level synthesis when it's checking for nodes it's already synthesized. This commit addresses this by changing the dict key to be the node id instead of the node object. The integer hashing is significantly faster than the object hashing.
Details and comments