Use rust gates for ConsolidateBlocks #12704

mtreinish · 2024-07-01T21:54:29Z

Summary

This commit switches the ConsolidateBlocks transpiler pass to use Rust
gates. Instead of generating the unitary matrices for the gates in a 2q
block on the Python side and passing that list to a Rust function, this
commit passes a list of DAGOpNodes to Rust and generates the matrices
inside the Rust function directly. This is similar to what was done in
#12650 for Optimize1qGatesDecomposition. Besides making it faster to get
the matrix for standard gates, this also reduces the eager construction
of Python gate objects, which was a significant source of overhead after
#12459. To that end, this builds on the thread of work in the two PRs
#12692 and #12701, which changed the access patterns for other passes to
minimize eager gate object construction.
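As a rough illustration of the shift described above, the sketch below keeps a block as a list of lightweight gate descriptors and builds each matrix inside the Rust function, instead of receiving precomputed matrices from the caller. `Gate`, `Node`, `gate_matrix`, and `block_matrix_1q` are hypothetical names for this example, not Qiskit's API. It is restricted to single-qubit gates and represents complex numbers as `(re, im)` tuples to stay dependency-free.

```rust
/// Complex number as (re, im); a stand-in for `Complex64`.
type C = (f64, f64);
type Mat2 = [[C; 2]; 2];

/// Hypothetical standard-gate tag; a real pass would cover many more gates.
#[derive(Clone, Copy)]
pub enum Gate {
    X,
    Z,
    H,
}

/// Hypothetical node: only the gate tag is handed over, not a matrix.
pub struct Node {
    pub gate: Gate,
}

fn cmul(a: C, b: C) -> C {
    (a.0 * b.0 - a.1 * b.1, a.0 * b.1 + a.1 * b.0)
}

fn cadd(a: C, b: C) -> C {
    (a.0 + b.0, a.1 + b.1)
}

/// Build the matrix for a gate directly in Rust.
pub fn gate_matrix(gate: Gate) -> Mat2 {
    let s = std::f64::consts::FRAC_1_SQRT_2;
    match gate {
        Gate::X => [[(0., 0.), (1., 0.)], [(1., 0.), (0., 0.)]],
        Gate::Z => [[(1., 0.), (0., 0.)], [(0., 0.), (-1., 0.)]],
        Gate::H => [[(s, 0.), (s, 0.)], [(s, 0.), (-s, 0.)]],
    }
}

fn matmul(a: &Mat2, b: &Mat2) -> Mat2 {
    let mut out: Mat2 = [[(0., 0.); 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                out[i][j] = cadd(out[i][j], cmul(a[i][k], b[k][j]));
            }
        }
    }
    out
}

/// Multiply the gates of a block in circuit order (later gates act on the left).
pub fn block_matrix_1q(nodes: &[Node]) -> Mat2 {
    let identity: Mat2 = [[(1., 0.), (0., 0.)], [(0., 0.), (1., 0.)]];
    nodes
        .iter()
        .fold(identity, |acc, node| matmul(&gate_matrix(node.gate), &acc))
}
```

For example, calling `block_matrix_1q` on the sequence H, Z, H yields the X matrix up to floating-point rounding, since HZH = X.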

Details and comments

TODO:

- Fix last test failures
- Benchmark and profile
- Rebase after #12692 (Enable avoiding Python operation creation in transpiler) and #12701 (Avoid Python op creation in commutative cancellation) merge

This PR is built on top of #12692 and #12701 and will need to be rebased after both merge. To view the contents of just this PR, you can look at the last commit on the branch (I'll be force-pushing updates to that last commit): e449357

qiskit-bot · 2024-07-01T21:54:34Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core
@kevinhartman
@mtreinish

qiskit-bot · 2024-07-02T22:04:37Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core
@kevinhartman
@mtreinish

coveralls · 2024-07-02T22:23:38Z

Pull Request Test Coverage Report for Build 9768512668

Details

119 of 202 (58.91%) changed or added relevant lines in 10 files are covered.
5 unchanged lines in 1 file lost coverage.
Overall coverage decreased (-0.05%) to 89.737%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/passes/optimization/consolidate_blocks.py	9	10	90.0%
crates/circuit/src/bit_data.rs	0	3	0.0%
crates/accelerate/src/convert_2q_block_matrix.rs	45	49	91.84%
crates/circuit/src/operations.rs	15	30	50.0%
crates/circuit/src/circuit_instruction.rs	5	35	14.29%
crates/circuit/src/dag_node.rs	22	52	42.31%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/lex.rs	5	93.38%

Totals
Change from base Build 9767176201:	-0.05%
Covered Lines:	64632
Relevant Lines:	72024

💛 - Coveralls

coveralls · 2024-07-03T17:00:13Z

Pull Request Test Coverage Report for Build 9781648307

Details

57 of 65 (87.69%) changed or added relevant lines in 4 files are covered.
11 unchanged lines in 3 files lost coverage.
Overall coverage decreased (-0.01%) to 89.829%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/passes/optimization/consolidate_blocks.py	9	10	90.0%
crates/circuit/src/bit_data.rs	0	3	0.0%
crates/accelerate/src/convert_2q_block_matrix.rs	45	49	91.84%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/expr.rs	1	94.02%
crates/qasm2/src/lex.rs	4	92.11%
crates/qasm2/src/parse.rs	6	97.61%

Totals
Change from base Build 9780558630:	-0.01%
Covered Lines:	65152
Relevant Lines:	72529

💛 - Coveralls

coveralls · 2024-07-03T18:47:59Z

Pull Request Test Coverage Report for Build 9782984075

Details

87 of 95 (91.58%) changed or added relevant lines in 4 files are covered.
4 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.02%) to 89.841%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/passes/optimization/consolidate_blocks.py	9	10	90.0%
crates/circuit/src/bit_data.rs	0	3	0.0%
crates/accelerate/src/convert_2q_block_matrix.rs	73	77	94.81%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/lex.rs	4	92.62%

Totals
Change from base Build 9782487328:	0.02%
Covered Lines:	65174
Relevant Lines:	72544

💛 - Coveralls

jlapeyre · 2024-07-04T02:39:05Z

crates/accelerate/src/convert_2q_block_matrix.rs

+    let mut matrix: Array2<Complex64> = match bit_map
+        .map_bits(first_node.instruction.qubits.bind(py).iter())?
+        .map(|x| x as u8)
+        .collect::<SmallVec<[u8; 2]>>()
+        .as_slice()


Why not this?

        .map_bits(first_node.instruction.qubits.bind(py).iter())?
        .collect::<Vec<_>>()
        .as_slice()

Storage should not be an issue here, and u8 at best won't be more performant. Maybe the 2 here in SmallVec<[u8; 2]> allows the compiler to optimize?

mtreinish · 2024-07-08

I was thinking it was for typing reasons, but I realized I was confusing this with the 2q decomposer and the Euler 1q decomposer, which use SmallVec<u8> as the output type. I agree that in this case just collecting to a Vec of u32 is fine. I can change it.

jlapeyre · 2024-07-08

That's what I guessed. It was not a straight copy and paste, but something like that, from a situation where it is important that the type be u8.
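The trade-off discussed in this thread can be sketched with the standard library alone. In the snippet below, `mapped_indices` is a hypothetical stand-in for the iterator that `map_bits` yields (assumed here to produce `u32` indices), and `Vec<u8>` stands in for the `SmallVec<[u8; 2]>` in the diff so that no external crate is needed. The two helper functions and their string results are illustrative only.

```rust
/// Hypothetical stand-in for the indices produced by `map_bits`.
fn mapped_indices(qubits: &[u32]) -> impl Iterator<Item = u32> + '_ {
    qubits.iter().copied()
}

/// Variant resembling the diff: narrow each index to `u8` before collecting.
pub fn describe_narrowed(qubits: &[u32]) -> &'static str {
    match mapped_indices(qubits)
        .map(|x| x as u8)
        .collect::<Vec<u8>>()
        .as_slice()
    {
        [0] => "1q gate on qubit 0",
        [1] => "1q gate on qubit 1",
        [0, 1] => "2q gate, natural order",
        [1, 0] => "2q gate, reversed order",
        _ => "unsupported",
    }
}

/// Suggested variant: collect the indices as-is and match on the slice.
pub fn describe_direct(qubits: &[u32]) -> &'static str {
    match mapped_indices(qubits).collect::<Vec<_>>().as_slice() {
        [0] => "1q gate on qubit 0",
        [1] => "1q gate on qubit 1",
        [0, 1] => "2q gate, natural order",
        [1, 0] => "2q gate, reversed order",
        _ => "unsupported",
    }
}
```

Both variants select the same arm for the small indices that occur in a two-qubit block; the direct version simply drops the cast and the explicit element type.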

jlapeyre · 2024-07-04T02:40:18Z

crates/accelerate/src/convert_2q_block_matrix.rs

-    for (op_matrix, q_list) in op_list.into_iter().skip(1) {
-        let op_matrix = op_matrix.as_array();
+    for node in op_list.into_iter().skip(1) {
+        let op_matrix = get_matrix_from_inst(py, &node.instruction)?;


Line 91 would convey more information as [..] => unreachable!()

jlapeyre · 2024-07-04T03:20:50Z

crates/accelerate/src/convert_2q_block_matrix.rs

@@ -71,8 +126,42 @@ pub fn change_basis(matrix: ArrayView2<Complex64>) -> Array2<Complex64> {
    trans_matrix
 }

+#[pyfunction]


Line 50 could be [..] => None

These two suggestions, doing [..] => None, change lines not yet touched by this PR. Whether to do it here depends on whether in general you favor cleaning these things up in a PR or doing them in a separate PR. An argument against is that this is not strictly part of the point of the PR. An argument for is that a PR to do these things would not be a high priority, so it might never be done. So I favor making the change. OTOH, the arguments for leaving these as is are not unreasonable.
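For readers unfamiliar with the idiom, here is a small sketch of the two catch-all arms suggested in these review comments. The functions and their invariant are hypothetical, since the referenced lines 50 and 91 are not shown in this thread. The point is only that a final `[..]` arm makes it explicit that every remaining slice shape is being handled, either as a broken invariant (`unreachable!()`) or as an unsupported input (`None`).

```rust
/// Return the position a 1q gate acts on inside a two-qubit block.
/// Hypothetical invariant: callers only pass a single index that is 0 or 1.
pub fn position_in_block(indices: &[u32]) -> usize {
    match indices {
        [0] => 0,
        [1] => 1,
        // Any other shape violates the invariant stated above.
        [..] => unreachable!("expected exactly one index in 0..=1"),
    }
}

/// Fallible variant: report unsupported shapes instead of panicking.
pub fn try_position_in_block(indices: &[u32]) -> Option<usize> {
    match indices {
        [0] => Some(0),
        [1] => Some(1),
        // Every remaining slice, of any length, is treated as unsupported.
        [..] => None,
    }
}
```

Which arm fits depends on whether the other shapes are genuinely impossible at that point or merely unsupported.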

jlapeyre · 2024-07-05T20:59:56Z

This PR does not touch map_bits itself. But you might want to consider
this change. (jlapeyre@71e92a6) This simplifies how map_bits is implemented and how it is called.

mtreinish · 2024-07-08T14:23:52Z

> This PR does not touch map_bits itself. But you might want to consider this change. (jlapeyre@71e92a6) This simplifies how map_bits is implemented and how it is called.

I would want to check with @kevinhartman and @jakelishman before making this change, since there is a fair amount in flux around using BitData between #12550 and #12730. I was trying to avoid changing the Rust interface for dealing with bit data too much in this PR.

jakelishman · 2024-07-08T14:47:24Z

imo if we were to touch any meaningful parts of BitData (beyond just making it public), then we'd do it in a separate PR - this PR doesn't logically need to change the interfaces, and doing it separately would let us track performance changes.

I haven't currently touched BitData, though I imagine it probably will change a bit in coming PRs, so best to just leave it un-churned right now.

jlapeyre · 2024-07-08T15:14:32Z

I was on the fence about it. But map_bits was called four times in circuit_data, and two calls were added here in convert_2q_block_matrix. There are no other calls.

I think a separate PR is good. Especially because, yeah, there is a possible change in performance.

In fact, I would add a method on PackedInstruction (if that's the right struct) because every call to map_bits starts with an instruction (or higher).
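One way to picture the idea of a method on the instruction type is sketched below. Everything here is hypothetical: `BitTable`, `Instruction`, and `qubit_indices` are made-up names, bits are modelled as strings, and the real `BitData` and `PackedInstruction` types are not reproduced. The sketch only shows how a convenience method can hide the step of unpacking an instruction's qubits before calling `map_bits`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a bit-to-index lookup table.
pub struct BitTable {
    indices: HashMap<String, u32>,
}

impl BitTable {
    /// Assign consecutive indices to the given bit names.
    pub fn new(names: &[&str]) -> Self {
        let indices = names
            .iter()
            .enumerate()
            .map(|(i, n)| (n.to_string(), i as u32))
            .collect();
        BitTable { indices }
    }

    /// Map each bit name to its index, returning `None` on an unknown bit.
    pub fn map_bits<'a>(&self, bits: impl IntoIterator<Item = &'a str>) -> Option<Vec<u32>> {
        bits.into_iter()
            .map(|b| self.indices.get(b).copied())
            .collect()
    }
}

/// Hypothetical instruction that owns the bits it acts on.
pub struct Instruction {
    pub qubits: Vec<String>,
}

impl Instruction {
    /// Convenience method: callers no longer unpack `qubits` themselves.
    pub fn qubit_indices(&self, table: &BitTable) -> Option<Vec<u32>> {
        table.map_bits(self.qubits.iter().map(String::as_str))
    }
}
```

With such a method, each call site reduces to something like `inst.qubit_indices(&table)`. Whether this is the right place for it in the real code base is exactly the open question in this thread.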

jakelishman · 2024-07-08T15:31:51Z

I have a half-done PR that's reworking some of the Interner interfaces to avoid a couple of unnecessary allocations. I think I'm likely to touch BitData as part of that.

coveralls · 2024-07-08T16:39:15Z

Pull Request Test Coverage Report for Build 9844736287

Details

86 of 94 (91.49%) changed or added relevant lines in 4 files are covered.
5 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.03%) to 89.868%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/passes/optimization/consolidate_blocks.py	9	10	90.0%
crates/circuit/src/bit_data.rs	0	3	0.0%
crates/accelerate/src/convert_2q_block_matrix.rs	72	76	94.74%

Files with Coverage Reduction	New Missed Lines	%
qiskit/transpiler/passes/synthesis/unitary_synthesis.py	2	88.35%
crates/qasm2/src/lex.rs	3	92.88%

Totals
Change from base Build 9844511268:	0.03%
Covered Lines:	65665
Relevant Lines:	73068

💛 - Coveralls

* Use rust gates for ConsolidateBlocks

This commit moves to use rust gates for the ConsolidateBlocks transpiler pass. Instead of generating the unitary matrices for the gates in a 2q block Python side and passing that list to a rust function this commit switches to passing a list of DAGOpNodes to the rust and then generating the matrices inside the rust function directly. This is similar to what was done in Qiskit#12650 for Optimize1qGatesDecomposition. Besides being faster to get the matrix for standard gates, it also reduces the eager construction of Python gate objects which was a significant source of overhead after Qiskit#12459. To that end this builds on the thread of work in the two PRs Qiskit#12692 and Qiskit#12701 which changed the access patterns for other passes to minimize eager gate object construction.

* Add rust filter function for DAGCircuit.collect_2q_runs()

* Update crates/accelerate/src/convert_2q_block_matrix.rs

---------

Co-authored-by: John Lapeyre <[email protected]>

mtreinish added the performance, Changelog: None, Rust, and mod: transpiler labels Jul 1, 2024

mtreinish added this to the 1.2.0 milestone Jul 1, 2024

mtreinish requested review from alexanderivrii, ShellyGarion and a team as code owners July 1, 2024 21:54

mtreinish marked this pull request as draft July 1, 2024 21:56

mtreinish added the on hold label Jul 1, 2024

mtreinish force-pushed the py-access-rs-data-consolidate-blocks branch from 0a1cddb to e449357 Compare July 2, 2024 21:59

mtreinish changed the title ~~[WIP] Use rust gates for ConsolidateBlocks~~ Use rust gates for ConsolidateBlocks Jul 2, 2024

mtreinish marked this pull request as ready for review July 2, 2024 22:04

mtreinish force-pushed the py-access-rs-data-consolidate-blocks branch from e449357 to 6567728 Compare July 3, 2024 16:35

mtreinish added 2 commits July 3, 2024 14:21

Add rust filter function for DAGCircuit.collect_2q_runs()

56a21cd

Merge remote-tracking branch 'origin/main' into py-access-rs-data-consolidate-blocks

032bf18

mtreinish removed the on hold label Jul 3, 2024

jlapeyre self-requested a review July 3, 2024 20:33

jlapeyre reviewed Jul 3, 2024

jlapeyre reviewed Jul 4, 2024

jlapeyre reviewed Jul 8, 2024

Update crates/accelerate/src/convert_2q_block_matrix.rs

a3ffebe

Merge branch 'main' into py-access-rs-data-consolidate-blocks

0a50be9

jlapeyre approved these changes Jul 8, 2024

jlapeyre added this pull request to the merge queue Jul 8, 2024

Merged via the queue into Qiskit:main with commit 4fe9dbc Jul 8, 2024
15 checks passed

jakelishman mentioned this pull request Jul 9, 2024

Rebalance CircuitInstruction and PackedInstruction #12730

Merged

mtreinish deleted the py-access-rs-data-consolidate-blocks branch July 11, 2024 17:22

Cryoris mentioned this pull request Sep 10, 2024

Transpiler changes circuit semantics #13118

Closed
