Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recreate full dag instead of inplace substitution in BasisTranslator #12195

Merged
merged 5 commits into from
Jul 30, 2024

Conversation

mtreinish
Copy link
Member

Summary

This commit tweaks the internal logic of the basis translator transpiler pass to do a full dag recreation instead of inplace modification. If only a few operations were to be substituted it would probably be more efficient to do an inplace modification, but in general the basis translator ends up replacing far more operations than not. In such cases just iterating over the dag and rebuilding it is more efficient because the overhead of apply_operation_back() is minimal compared to substitute_node_with_dag() (although it's higher than subtitute_node(.., inplace=True)).

Details and comments

This commit tweaks the internal logic of the basis translator transpiler
pass to do a full dag recreation instead of inplace modification. If
only a few operations were to be substituted it would probably be more
efficient to do an inplace modification, but in general the basis
translator ends up replacing far more operations than not. In such cases
just iterating over the dag and rebuilding it is more efficient because
the overhead of `apply_operation_back()` is minimal compared to
`substitute_node_with_dag()` (although it's higher than
`subtitute_node(.., inplace=True)`).
@mtreinish mtreinish added performance Changelog: None Do not include in changelog labels Apr 17, 2024
@mtreinish mtreinish added this to the 1.1.0 milestone Apr 17, 2024
@mtreinish mtreinish requested a review from a team as a code owner April 17, 2024 01:34
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@mtreinish
Copy link
Member Author

I ran some asv benchmarks for this PR here:

Benchmarks that have improved:

       before           after         ratio
     [166dcccd]       [595f81bf]
     <dag-recreate-bt~1>       <dag-recreate-bt>
-      2.07±0.01s       1.88±0.01s     0.91  utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr')
-      1.32±0.01s       1.15±0.01s     0.87  utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')
-       603±0.6ms          504±2ms     0.84  passes.MultipleBasisPassBenchmarks.time_basis_translator(20, 1024, ['rz', 'x', 'sx', 'cx', 'id'])
-         435±2ms        360±0.7ms     0.83  passes.MultipleBasisPassBenchmarks.time_basis_translator(14, 1024, ['rz', 'x', 'sx', 'cx', 'id'])
-         510±2ms          418±2ms     0.82  passes.MultipleBasisPassBenchmarks.time_basis_translator(20, 1024, ['u', 'cx', 'id'])
-         366±1ms          299±2ms     0.82  passes.MultipleBasisPassBenchmarks.time_basis_translator(14, 1024, ['u', 'cx', 'id'])
-         167±2ms          135±1ms     0.81  passes.MultipleBasisPassBenchmarks.time_basis_translator(5, 1024, ['rz', 'x', 'sx', 'cx', 'id'])
-       142±0.5ms        113±0.8ms     0.80  passes.MultipleBasisPassBenchmarks.time_basis_translator(5, 1024, ['u', 'cx', 'id'])

Benchmarks that have stayed the same:

       before           after         ratio
     [166dcccd]       [595f81bf]
     <dag-recreate-bt~1>       <dag-recreate-bt>
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'synthesis')
              n/a              n/a      n/a  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'synthesis')
       6.77±0.5ms       7.42±0.6ms     1.10  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'translator')
        128±0.2ms        139±0.5ms     1.08  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(3)
        130±0.2ms        139±0.2ms     1.08  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(3)
        114±0.3ms        123±0.4ms     1.08  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(1)
        117±0.2ms        125±0.3ms     1.07  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(1)
       15.9±0.1ms       17.0±0.3ms     1.07  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'translator')
       41.4±0.3ms       43.7±0.2ms     1.06  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'translator')
        243±0.7ms          256±2ms     1.05  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'translator')
      5.19±0.04ms       5.45±0.1ms     1.05  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'translator')
      2.45±0.03ms      2.57±0.03ms     1.05  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'translator')
          432±2ms          450±6ms     1.04  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'translator')
          229±2ms          238±4ms     1.04  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(2)
        121±0.7ms          126±1ms     1.04  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'translator')
        227±0.9ms        235±0.7ms     1.04  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(2)
       85.2±0.3ms       87.6±0.5ms     1.03  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(1)
          105±1ms        107±0.4ms     1.03  transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(1)
       97.2±0.7ms       99.7±0.5ms     1.02  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(0)
          262±2ms          267±2ms     1.02  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(2)
        402±0.7ms          406±1ms     1.01  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(1)
       92.6±0.3ms       93.7±0.3ms     1.01  utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')
       30.0±0.4ms       30.3±0.2ms     1.01  utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')
        122±0.5ms        122±0.4ms     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(0)
       92.7±0.3ms       92.8±0.3ms     1.00  utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')
       30.1±0.3ms       30.2±0.3ms     1.00  utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr')
       20.7±0.03s          20.7±0s     1.00  utility_scale.UtilityScaleBenchmarks.time_qft('ecr')
             2565             2565     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(0)
             1403             1403     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(1)
             1403             1403     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(2)
             1296             1296     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(3)
             2705             2705     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(0)
             2005             2005     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(1)
             2005             2005     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(2)
                7                7     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(3)
             2705             2705     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(0)
             2005             2005     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(1)
             2005             2005     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(2)
                7                7     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(3)
              323              323     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(0)
              336              336     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(1)
              336              336     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(2)
              272              272     1.00  transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(3)
             1908             1908     1.00  utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cx')
             1896             1896     1.00  utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cz')
             1936             1936     1.00  utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('ecr')
             2598             2598     1.00  utility_scale.UtilityScaleBenchmarks.track_qft_depth('cx')
             2598             2598     1.00  utility_scale.UtilityScaleBenchmarks.track_qft_depth('cz')
             2598             2598     1.00  utility_scale.UtilityScaleBenchmarks.track_qft_depth('ecr')
              972              972     1.00  utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cx')
              972              972     1.00  utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cz')
              972              972     1.00  utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('ecr')
          296±3ms          295±1ms     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(3)
       1.86±0.01s       1.86±0.01s     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(3)
          4.93±0s       4.93±0.01s     1.00  utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')
       30.3±0.5ms      30.3±0.07ms     1.00  utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')
       1.53±0.01s          1.52±0s     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(2)
       18.5±0.02s       18.4±0.05s     0.99  utility_scale.UtilityScaleBenchmarks.time_qft('cx')
      8.56±0.04ms      8.50±0.05ms     0.99  utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')
       93.0±0.1ms       92.2±0.2ms     0.99  utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')
          2.62±0s       2.60±0.03s     0.99  utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')
      8.64±0.06ms      8.55±0.04ms     0.99  utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')
      8.64±0.07ms      8.54±0.05ms     0.99  utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')
       23.8±0.04s       23.5±0.09s     0.99  utility_scale.UtilityScaleBenchmarks.time_qft('cz')
          896±6ms          882±7ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(0)
          817±3ms          802±2ms     0.98  passes.MultipleBasisPassBenchmarks.time_basis_translator(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])
          582±2ms          563±5ms     0.97  passes.MultipleBasisPassBenchmarks.time_basis_translator(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])
          810±3ms          777±4ms     0.96  utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')
        208±0.8ms        196±0.6ms     0.94  passes.MultipleBasisPassBenchmarks.time_basis_translator(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])
       2.96±0.02s       2.76±0.03s     0.93  utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz')

Benchmarks that have got worse:

       before           after         ratio
     [166dcccd]       [595f81bf]
     <dag-recreate-bt~1>       <dag-recreate-bt>
+      54.3±0.2ms       63.6±0.4ms     1.17  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(0)
+      57.0±0.2ms       65.7±0.2ms     1.15  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(0)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

The large_qasm benchmarks are the worst case because the circuit is relatively deep and in the target basis already. If this is a concern we can add a logic switch based on the number of substations needed. From #12179 (comment) the crossover point is ~2% of the gates in a circuit will be replaced. Not sure if the performance will be exactly the same for this pass though, it might be worth benchmarking for the crossover point here too.

Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks straightforward, and it looks good to me. About the large_qasm benchmark, I think that it might be worth benchmarking for the 2% crossover point and potentially adding the switch. It seems relatively low effort and avoids a regression in a use case that doesn't seem very useful for practical applications, but looks likely to come up in benchmarking efforts (just thinking of people thinking of edge cases, or lazily constructing circuits). This is just my opinion, of course. Another small comment is that the circuit drawer unit test is acting out.

@mtreinish
Copy link
Member Author

Another small comment is that the circuit drawer unit test is acting out.

Weird the output circuit from the transpiler of an if statement's body is shifting from rz(pi) to rz(-pi) with a global phase of pi. So both outputs are equivalent but the visualization changes and because we don't display global phase inside control flow bodies it looks a bit weird now. I'm not sure what is causing this though because I didn't change the basis translators logic, just how we construct the output I wouldn't have expected any changes.

Comment on lines -267 to -269
if dag_updated:
flow_circ_block = dag_to_circuit(dag_block)
Copy link
Contributor

@ElePT ElePT Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this might be the origin of the test failure. In the original code, flow_circ_block = dag_to_circuit(dag_block) but in the updated one flow_circ_block = dag_to_circuit(updated_dag). Replacing updated_dag with dag_block fixes the test locally for me, however, I think that the correct line should be flow_circ_block = dag_to_circuit(updated_dag) (else, why would you update it and not use it?). Do you think this could have been an oversight in the original code and the test just got dragged with it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, in the original version of this code dag_updated was a boolean that indicated whether the input dag to apply_translation() was transformed at all. If it was we would add the modified dag to the flow_blocks list which then would get updated. Otherwise we would use the original input block unmodified. I guess maybe we want to have some logic in place to replicate the no-substitution path that I kind of clobbered here.

@mtreinish mtreinish modified the milestones: 1.1.0, 1.2.0 May 2, 2024
@mtreinish
Copy link
Member Author

I pushed this out to 1.2.0 since I haven't had a chance to fix the root cause of the test failure. That being said this might be superseded by #12246

@ElePT
Copy link
Contributor

ElePT commented Jul 29, 2024

I tried fixing the latest conflicts and proposing a solution for the failed test. To make the code work after merging the changes from #12705 I ended up applying a few hacky-looking fixes (such as re-writing qargs and cargs in new_node or this change in L381-383), there might be better workarounds. The test fix is nothing fancy, I just return the old boolean flag together with the dag in apply_translation.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 10148007745

Details

  • 32 of 32 (100.0%) changed or added relevant lines in 1 file are covered.
  • 5 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 89.773%

Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/lex.rs 5 92.11%
Totals Coverage Status
Change from base Build 10147820667: 0.02%
Covered Lines: 66970
Relevant Lines: 74599

💛 - Coveralls

@mtreinish
Copy link
Member Author

I ran the benchmarks after @ElePT's fix and rebase:

Benchmarks that have improved:

| Change   | Before [b23c5452]    | After [94d6b9c7] <dag-recreate-bt>   |   Ratio | Benchmark (Parameter)                                              |
|----------|----------------------|--------------------------------------|---------|--------------------------------------------------------------------|
| -        | 1.91±0.02s           | 1.55±0.01s                           |    0.81 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz')  |
| -        | 1.77±0.01s           | 1.38±0.01s                           |    0.78 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr') |
| -        | 1.32±0.01s           | 995±6ms                              |    0.76 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')  |

Benchmarks that have stayed the same:

| Change   | Before [b23c5452]    | After [94d6b9c7] <dag-recreate-bt>   | Ratio   | Benchmark (Parameter)                                                         |
|----------|----------------------|--------------------------------------|---------|-------------------------------------------------------------------------------|
|          | 0                    | 0                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cx')                 |
|          | 0                    | 0                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cz')                 |
|          | 0                    | 0                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('ecr')                |
|          | 107±1ms              | 110±0.3ms                            | 1.03    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cx')                        |
|          | 3.67±0.01s           | 3.73±0.1s                            | 1.02    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cx')                       |
|          | 11.3±0.02s           | 11.5±0.01s                           | 1.02    | utility_scale.UtilityScaleBenchmarks.time_qv('cz')                            |
|          | 108±1ms              | 110±0.6ms                            | 1.01    | utility_scale.UtilityScaleBenchmarks.time_bvlike('ecr')                       |
|          | 3.79±0.01s           | 3.83±0.06s                           | 1.01    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cz')                       |
|          | 3.78±0.01s           | 3.83±0.08s                           | 1.01    | utility_scale.UtilityScaleBenchmarks.time_circSU2('ecr')                      |
|          | 6.72±0.05s           | 6.81±0s                              | 1.01    | utility_scale.UtilityScaleBenchmarks.time_qv('cx')                            |
|          | 8.54±0.07s           | 8.65±0.01s                           | 1.01    | utility_scale.UtilityScaleBenchmarks.time_qv('ecr')                           |
|          | 332±1ms              | 331±2ms                              | 1.00    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cz')                        |
|          | 109±1ms              | 109±0.8ms                            | 1.00    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cz')                        |
|          | 111±1ms              | 111±0.4ms                            | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')               |
|          | 19.8±0.01s           | 19.8±0.04s                           | 1.00    | utility_scale.UtilityScaleBenchmarks.time_qft('cx')                           |
|          | 395                  | 395                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cx')                 |
|          | 397                  | 397                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cz')                 |
|          | 397                  | 397                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('ecr')                |
|          | 300                  | 300                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cx')                |
|          | 300                  | 300                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cz')                |
|          | 300                  | 300                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('ecr')               |
|          | 1483                 | 1483                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cx')                   |
|          | 1488                 | 1488                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cz')                   |
|          | 1488                 | 1488                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('ecr')                  |
|          | 1954                 | 1954                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cx')                    |
|          | 1954                 | 1954                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cz')                    |
|          | 1954                 | 1954                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('ecr')                   |
|          | 2538                 | 2538                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cx')                     |
|          | 2538                 | 2538                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cz')                     |
|          | 2538                 | 2538                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('ecr')                    |
|          | 435                  | 435                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cx')      |
|          | 435                  | 435                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cz')      |
|          | 435                  | 435                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('ecr')     |
|          | 306±2ms              | 304±0.3ms                            | 0.99    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cx')                        |
|          | 333±2ms              | 331±2ms                              | 0.99    | utility_scale.UtilityScaleBenchmarks.time_bv_100('ecr')                       |
|          | 22.4±0.1s            | 22.3±0.05s                           | 0.99    | utility_scale.UtilityScaleBenchmarks.time_qft('cz')                           |
|          | 22.4±0.1s            | 22.2±0s                              | 0.99    | utility_scale.UtilityScaleBenchmarks.time_qft('ecr')                          |
|          | 10.4±0.09ms          | 10.2±0.1ms                           | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')               |
|          | 112±0.8ms            | 110±0.6ms                            | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')                |
|          | 36.3±0.4ms           | 35.5±0.2ms                           | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')  |
|          | 36.2±0.3ms           | 35.5±0.5ms                           | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')  |
|          | 36.5±0.2ms           | 35.6±0.5ms                           | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr') |
|          | 2.84±0.01s           | 2.77±0.02s                           | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')                          |
|          | 10.5±0.04ms          | 10.2±0.05ms                          | 0.97    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')               |
|          | 10.5±0.05ms          | 10.3±0.05ms                          | 0.97    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')              |
|          | 112±0.4ms            | 109±1ms                              | 0.97    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')                |
|          | 2.33±0.02s           | 2.24±0.01s                           | 0.96    | utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')                         |
|          | 1.14±0.01s           | 1.07±0.01s                           | 0.94    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')                          |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mtreinish for running the benchmarks! It's not a huge improvement but seems to be useful for the utility scale benchmarks. I'd go ahead with the PR, what do you think?

@ElePT ElePT added the stable backport potential The bug might be minimal and/or import enough to be port to stable label Jul 30, 2024
@ElePT ElePT added this pull request to the merge queue Jul 30, 2024
Merged via the queue into Qiskit:main with commit 37b334f Jul 30, 2024
15 checks passed
mergify bot pushed a commit that referenced this pull request Jul 30, 2024
…12195)

* Recreate full dag instead of inplace substitution in BasisTranslator

This commit tweaks the internal logic of the basis translator transpiler
pass to do a full dag recreation instead of inplace modification. If
only a few operations were to be substituted it would probably be more
efficient to do an inplace modification, but in general the basis
translator ends up replacing far more operations than not. In such cases
just iterating over the dag and rebuilding it is more efficient because
the overhead of `apply_operation_back()` is minimal compared to
`substitute_node_with_dag()` (although it's higher than
`subtitute_node(.., inplace=True)`).

* Return boolean together with dag in 'apply_translation' to maintain original 'flow_blocks' logic and fix drawer test.

* Remove print

---------

Co-authored-by: Elena Peña Tapia <[email protected]>
(cherry picked from commit 37b334f)
github-merge-queue bot pushed a commit that referenced this pull request Jul 30, 2024
…12195) (#12855)

* Recreate full dag instead of inplace substitution in BasisTranslator

This commit tweaks the internal logic of the basis translator transpiler
pass to do a full dag recreation instead of inplace modification. If
only a few operations were to be substituted it would probably be more
efficient to do an inplace modification, but in general the basis
translator ends up replacing far more operations than not. In such cases
just iterating over the dag and rebuilding it is more efficient because
the overhead of `apply_operation_back()` is minimal compared to
`substitute_node_with_dag()` (although it's higher than
`subtitute_node(.., inplace=True)`).

* Return boolean together with dag in 'apply_translation' to maintain original 'flow_blocks' logic and fix drawer test.

* Remove print

---------

Co-authored-by: Elena Peña Tapia <[email protected]>
(cherry picked from commit 37b334f)

Co-authored-by: Matthew Treinish <[email protected]>
Procatv pushed a commit to Procatv/qiskit-terra-catherines that referenced this pull request Aug 1, 2024
…iskit#12195)

* Recreate full dag instead of inplace substitution in BasisTranslator

This commit tweaks the internal logic of the basis translator transpiler
pass to do a full dag recreation instead of inplace modification. If
only a few operations were to be substituted it would probably be more
efficient to do an inplace modification, but in general the basis
translator ends up replacing far more operations than not. In such cases
just iterating over the dag and rebuilding it is more efficient because
the overhead of `apply_operation_back()` is minimal compared to
`substitute_node_with_dag()` (although it's higher than
`subtitute_node(.., inplace=True)`).

* Return boolean together with dag in 'apply_translation' to maintain original 'flow_blocks' logic and fix drawer test.

* Remove print

---------

Co-authored-by: Elena Peña Tapia <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog performance stable backport potential The bug might be minimal and/or import enough to be port to stable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants