Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the theory page of LRE user guide #2522

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion docs/source/guide/guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,11 +8,12 @@ core-concepts.md
zne.md
pec.md
cdr.md
shadows.md
ddd.md
lre.md
rem.md
qse.md
pt.md
shadows.md
error-mitigation.md
glossary.md
```
87 changes: 87 additions & 0 deletions docs/source/guide/lre-5-theory.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
---
jupytext:
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.11.4
kernelspec:
display_name: Python 3
language: python
name: python3
---


```{warning}
The user guide for LRE in Mitiq is currently under construction.
```

# What is the theory behind LRE?

Layerwise Richardson Extrapolation (LRE), an error mitigation technique, introduced in
{cite}`Russo_2024_LRE` extends the ideas found in ZNE by allowing users to create multiple noise-scaled variations of the input
circuit such that the noiseless expectation value is extrapolated from the execution of each
noisy circuit.

Similar to [ZNE](zne.md), this process works in two steps:

- **Step 1:** Intentionally create multiple noise-scaled but logically equivalent circuits by scaling each layer or chunk of the input circuit through unitary folding.

- **Step 2:** Extrapolate to the noiseless limit using multivariate richardson extrapolation.

LRE leverages the flexible configuration space of layerwise unitary folding,
allowing for a more nuanced mitigation of errors by treating the noise level of each layer of
the quantum circuit as an independent variable.

## Step 1: Intentionally create multiple noise-scaled but logically equivalent circuits

The goal is to create noise-scaled circuits of different depths where the layers in each circuit are scaled in
a specific pattern as a result of unitary folding. This pattern is often described by the vector of scale factor vectors
generated by the fold multiplier and the chosen degree for multivariate Richardson extrapolation polynomial. For more information
on unitary folding, go to [What is the theory behind ZNE?](zne-5-theory.md).

Suppose we're interested in the value of some observable in an $n$-qubit circuit with $l$ layers.

Each layer can have a different scale factor and we can create $M$ such variations of the scaled circuit. Let $\{λ_1, λ_2, λ_3, \ldots, λ_M\}$ be the scale factors vectors used to create multiple variations of the noise-scaled circuits $\{C_{λ_1}, C_{λ_2}, C_{λ_3}, \ldots, C_{λ_M}\}$ such that each vector $λ_i$ defines the scale factors for the different layers in the input circuit $\{{λ^1}_i, {λ^2}_i, {λ^3}_i, \ldots, {λ^l}_i\}^T$.

If $d$ is the chosen degree of our multivariate polynomial, $M_j(λ_i, d)$ corresponds to the terms in the polynomial arranged in increasing order. In general, the monomial terms for a variable $l$ up to degree $d$ can be determined through the [stars and bars method](https://en.wikipedia.org/wiki/Stars_and_bars_%28combinatorics%29).

$$
\text{total number of terms in the monomial basis with max degree } d = \binom{d + l}{d}
$$

$$
\text{number of terms in the monomial basis with total degree } d = \binom{d + l - 1}{d}
$$

These monomial terms define the rows of the square sample matrix as shown below:

$$
\mathbf{A}(\Lambda, d) =
\begin{bmatrix}
M_1(λ_1, d) & M_2(λ_1, d) & \cdots & M_N(λ_1, d) \\
M_1(λ_2, d) & M_2(λ_2, d) & \cdots & M_N(λ_2, d) \\
\vdots & \vdots & \ddots & \vdots \\
M_1(λ_N, d) & M_2(λ_N, d) & \cdots & M_N(λ_N, d)
\end{bmatrix}
$$

Each monomial term in the sample matrix $\mathbf{A}$ is evaluated using the values in the scale factor vectors. In Step 2, we aim to define $O_{\mathrm{LRE}}$ as a linear combination of the noisy expectation values.

Finding the coefficients in the linear combination becomes a problem solvable through a system of linear equations $\mathbf{A} c = z$ where $c$ is the coefficients vector $(\eta_1, \eta_2, \ldots, \eta_N)^T$, $z$ is the vector of the noisy expectation values and $\mathbf{A}$ is the sample matrix evaluated using the values in the scale factor vectors.

## Step 2: Extrapolate to the noiseless limit

Each noise scaled circuit $C_{λ_i}$ has an expectation value $\langle O(λ_i) \rangle$ associated with it such that we can define a vector of the noisy expectation values $z = (\langle O(λ_1) \rangle, \langle O(λ_2) \rangle, \ldots, \langle O(λ_M)\rangle)^T$. These have a coefficient of linear combination associated with them as shown below:

$$
O_{\mathrm{LRE}} = \sum_{i=1}^{M} \eta_i \langle O(λ_i) \rangle.
$$

The system of linear equations is used to find the numerous $\eta_i$ in vector $c$. As we only need to find the noiseless expectation value, we can skip calculating the full vector of linear combination coefficients if we use the [Lagrange interpolation formula](https://files.eric.ed.gov/fulltext/EJ1231189.pdf) evaluated at $λ = 0$.

$$
O_{\rm LRE} = \sum_{i=1}^M \langle O (\boldsymbol{\lambda}_i)\rangle \frac{\det \left(\mathbf{M}_i (\boldsymbol{0}) \right)}{\det \left(\mathbf{A}\right)}.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid using $\mathbf{M}$ here since the matrix above was defined with $M$.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to be one of the terms in the monomial basis evaluated at $\lambda=0$

$$

To get the matrix $\mathbf{M}_i(\mathbf{0})$, replace the $i$-th row of the sample matrix $\mathbf{A}$ by $\mathbf{e}_1=(1, 0, \ldots, 0)^T$ where except $M_1(0, d) = 1$ all the other monomial terms are zero.
23 changes: 23 additions & 0 deletions docs/source/guide/lre.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@

```{warning}
The user guide for LRE in Mitiq is currently under construction.
```
purva-thakre marked this conversation as resolved.
Show resolved Hide resolved

# Layerwise Richardson Extrapolation

Layerwise Richardson Extrapolation (LRE), an error mitigation technique, introduced in
{cite}`Russo_2024_LRE` works by creating multiple noise-scaled variations of the input
circuit such that the noiseless expectation value is extrapolated from the execution of each
noisy circuit (see the section [What is the theory behind LRE?](lre-5-theory.md)). Compared to
Zero-Noise Extrapolation, this technique treats the noise in each layer of the circuit
as an independent variable to be scaled and then extrapolated independently.


You can get started with LRE in Mitiq with the following sections of the user guide:

```{toctree}
---
maxdepth: 1
---
lre-5-theory.md
```