Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move QuantumCircuit.assign_parameters to Rust (backport #12794) #12878

Merged
merged 1 commit into from
Aug 1, 2024

Conversation

mergify[bot]
Copy link
Contributor

@mergify mergify bot commented Aug 1, 2024

Summary

This is (as far as I could tell), the last really major performance regression in our asv suite compared to 1.1.0, so with this commit, we should be at not worse for important utility-scale benchmarks.

This largely rewrites ParamTable (renamed back to ParameterTable because I kept getting confused with Param) to have more Rust-friendly interfaces available, so that assign_parameters can then use them.

This represents a 2-3x speedup in assign_parameters performance over 1.1.0, when binding simple Parameter instances. Approximately 75% of the time is now spent in Python-space Parameter.assign and ParameterExpression.numeric calls; almost all of this could be removed were we to move Parameter and ParameterExpression to have their data exposed directly to Rust space. The percentage of time spent in Python space only increases if the expressions to be bound are actual ParameterExpressions and not just Parameter.

Most changes in the test suite are because of the use of internal-only methods that changed with the new ParameterTable. The only discrepancy is a bug in test_pauli_feature_map, which was trying to assign using a set.

Details and comments

Built on #12730, so will need rebasing over it.

I think this first commit might accidentally have introduced a small (10%) regression to parametric-circuit construction time over its parent. That's a mistake if so - I should be able to fix that later. edit: on retiming, I couldn't reproduce a problem - if anything, this commit is a minor improvement.

Timings for parametric circuit benchmarks compared to 1.1.0 (the different SHA1 is because I hadn't written the commit message when I took the benchmark):

Benchmarks that have improved:

| Change   | Before [7d29dc1b] <1.1.0^0>   | After [f391e6d4]    | Ratio   | Benchmark (Parameter)                                                                                           |
|----------|-------------------------------|---------------------|---------|-----------------------------------------------------------------------------------------------------------------|
| -        | 1.50±0.05ms                   | 589±40μs            | 0.39    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 128, 128)                               |
| -        | 1.19±0.02ms                   | 427±10μs            | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 128, 8)                                 |
| -        | 1.16±0.04s                    | 358±4ms             | 0.31    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 128)                            |
| -        | 1.52±0.02s                    | 634±8ms             | 0.42    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 131072)                         |
| -        | 1.07±0.02s                    | 350±5ms             | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 2048)                           |
| -        | 1.13±0.01s                    | 410±3ms             | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 32768)                          |
| -        | 1.12±0.01s                    | 370±2ms             | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 8)                              |
| -        | 1.06±0.01s                    | 362±2ms             | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 131072, 8192)                           |
| -        | 15.6±0.2ms                    | 5.37±0.2ms          | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 128)                              |
| -        | 22.4±1ms                      | 7.90±0.2ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 2048)                             |
| -        | 15.2±0.6ms                    | 5.26±0.3ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 2048, 8)                                |
| -        | 270±6ms                       | 85.5±2ms            | 0.32    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 128)                             |
| -        | 264±10ms                      | 86.4±1ms            | 0.33    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 2048)                            |
| -        | 370±10ms                      | 147±3ms             | 0.40    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 32768)                           |
| -        | 271±10ms                      | 90.9±2ms            | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 8)                               |
| -        | 274±3ms                       | 98.1±0.8ms          | 0.36    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 32768, 8192)                            |
| -        | 347±5μs                       | 144±8μs             | 0.42    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8, 8)                                   |
| -        | 63.5±2ms                      | 21.6±1ms            | 0.34    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 128)                              |
| -        | 67.1±2ms                      | 23.8±0.6ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 2048)                             |
| -        | 61.6±2ms                      | 21.4±0.6ms          | 0.35    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 8)                                |
| -        | 87.3±2ms                      | 34.3±0.6ms          | 0.39    | circuit_construction.ParameterizedCircuitBindBench.time_bind_params(20, 8192, 8192)                             |
| -        | 3.84±0.02ms                   | 2.02±0.1ms          | 0.53    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 128, 128)       |
| -        | 3.53±0.4ms                    | 1.38±0.1ms          | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 128, 8)         |
| -        | 2.57±0.01s                    | 995±10ms            | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 128)    |
| -        | 3.52±0.06s                    | 1.85±0.02s          | 0.53    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 131072) |
| -        | 2.64±0.06s                    | 1.03±0.02s          | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 2048)   |
| -        | 2.84±0.02s                    | 1.23±0.01s          | 0.43    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 32768)  |
| -        | 2.62±0.05s                    | 998±10ms            | 0.38    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 8)      |
| -        | 2.63±0.03s                    | 1.09±0.02s          | 0.41    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 131072, 8192)   |
| -        | 45.1±1ms                      | 16.5±1ms            | 0.37    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 128)      |
| -        | 50.9±0.7ms                    | 29.4±0.7ms          | 0.58    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 2048)     |
| -        | 38.7±0.3ms                    | 15.4±0.6ms          | 0.40    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 2048, 8)        |
| -        | 640±20ms                      | 247±2ms             | 0.39    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 128)     |
| -        | 650±7ms                       | 273±2ms             | 0.42    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 2048)    |
| -        | 867±20ms                      | 468±9ms             | 0.54    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 32768)   |
| -        | 672±30ms                      | 248±2ms             | 0.37    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 8)       |
| -        | 703±10ms                      | 313±6ms             | 0.45    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 32768, 8192)    |
| -        | 873±10μs                      | 501±40μs            | 0.57    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8, 8)           |
| -        | 159±5ms                       | 66.4±1ms            | 0.42    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 128)      |
| -        | 174±2ms                       | 77.1±0.5ms          | 0.44    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 2048)     |
| -        | 156±4ms                       | 64.1±1ms            | 0.41    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 8)        |
| -        | 210±2ms                       | 113±1ms             | 0.54    | circuit_construction.ParameterizedCircuitConstructionBench.time_build_parameterized_circuit(20, 8192, 8192)     |
```<hr>This is an automatic backport of pull request #12794 done by [Mergify](https://mergify.com).

* Move `QuantumCircuit.assign_parameters` to Rust

This is (as far as I could tell), the last really major performance
regression in our asv suite compared to 1.1.0, so with this commit, we
should be at _not worse_ for important utility-scale benchmarks.

This largely rewrites `ParamTable` (renamed back to `ParameterTable`
because I kept getting confused with `Param`) to have more Rust-friendly
interfaces available, so that `assign_parameters` can then use them.

This represents a 2-3x speedup in `assign_parameters` performance over
1.1.0, when binding simple `Parameter` instances.  Approximately 75% of
the time is now spent in Python-space `Parameter.assign` and
`ParameterExpression.numeric` calls; almost all of this could be removed
were we to move `Parameter` and `ParameterExpression` to have their data
exposed directly to Rust space.  The percentage of time spent in Python
space only increases if the expressions to be bound are actual
`ParameterExpression`s and not just `Parameter`.

Most changes in the test suite are because of the use of internal-only
methods that changed with the new `ParameterTable`.  The only
discrepancy is a bug in `test_pauli_feature_map`, which was trying to
assign using a set.

* Add unit test of parameter insertion

This catches a bug that was present in the parent commit, but this PR
fixes.

* Update crates/circuit/src/imports.rs

* Fix assignment to `AnnotatedOperation`

* Rename `CircuitData::num_params` to match normal terminology

* Fix typos and 🇺🇸

* Fix lint

(cherry picked from commit a68de4f)
@mergify mergify bot requested a review from a team as a code owner August 1, 2024 11:48
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the following people are relevant to this code:

  • @Cryoris
  • @Qiskit/terra-core
  • @ajavadia
  • @kevinhartman
  • @mtreinish

@github-actions github-actions bot added priority: high performance Changelog: None Do not include in changelog Rust This PR or issue is related to Rust code in the repository labels Aug 1, 2024
@github-actions github-actions bot added this to the 1.2.0 milestone Aug 1, 2024
@github-actions github-actions bot added the mod: circuit Related to the core of the `QuantumCircuit` class or the circuit library label Aug 1, 2024
@coveralls
Copy link

Pull Request Test Coverage Report for Build 10197698473

Details

  • 485 of 513 (94.54%) changed or added relevant lines in 5 files are covered.
  • 10 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.03%) to 89.954%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/circuit/quantumcircuit.py 15 16 93.75%
qiskit/qasm3/exporter.py 2 3 66.67%
crates/circuit/src/parameter_table.rs 186 198 93.94%
crates/circuit/src/circuit_data.rs 255 269 94.8%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/lex.rs 3 92.48%
qiskit/circuit/library/standard_gates/u.py 3 93.07%
qiskit/circuit/quantumcircuit.py 4 93.54%
Totals Coverage Status
Change from base Build 10197591774: 0.03%
Covered Lines: 66683
Relevant Lines: 74130

💛 - Coveralls

@jakelishman jakelishman added this pull request to the merge queue Aug 1, 2024
Merged via the queue into stable/1.2 with commit 9120b8d Aug 1, 2024
18 checks passed
@mergify mergify bot deleted the mergify/bp/stable/1.2/pr-12794 branch August 1, 2024 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog mod: circuit Related to the core of the `QuantumCircuit` class or the circuit library performance priority: high Rust This PR or issue is related to Rust code in the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants