| Status | Accepted |
|---|---|
| RFC # | 0015 |
| Authors | Samantha Barron ([email protected]), Lev Bishop ([email protected]), Ikko Hamamura ([email protected]), Ian Hincks ([email protected]), Takashi Imamichi ([email protected]), Blake Johnson ([email protected]), Chris Wood ([email protected]), Jessie Yu ([email protected]) |
| Submitted | 2023-10-03 |
| Updated | 2023-12-18 |
The current `Estimator.run()` method requires that a user provide one circuit for every observable and set of parameter values that they wish to run.
(This was not always the case; an earlier interface allowed referring to circuits by index, but that history will not be discussed here.)
It is common, if not typical, for a user to want to estimate many observables corresponding to a single circuit.
Likewise, it is common, if not typical, for a user to want to supply multiple parameter value sets for the same circuit.
This RFC proposes to fix these issues by changing the `Estimator.run()` signature in the following ways:

1. Take the transpose of the current signature; rather than accepting `circuits`, `parameter_values`, and `observables` as three different (iterable) arguments which are to be zipped together, localize them into an iterable of triples `(circuit, parameter_values, observables)`.
2. In combination with 1, extend `parameter_values` and `observables` to be array-valued, that is, add the ability to explicitly and conveniently specify multiple parameter value sets and observables for a single circuit.
Here is a summary of pain points with the current `Estimator.run()` interface:

- **Ergonomics.** It feels unnatural to invoke the estimator as `Estimator.run([circuit, circuit], observables=[obs1, obs2])` because of the redundancy of entering the circuit twice.
- **Trust.** Cases like `Estimator.run([circuit, circuit], observables=[obs1, obs2])` make a user question whether the primitive implementation is going to do any optimizations to check if the same circuit appears twice, and if so, whether it will be done by memory location, circuit equality, etc.; the interface itself creates a gap in trust.
- **Clarity.** Without reading the documentation in detail, it's not obvious that the invocation `Estimator.run([circuit], observables=[obs1, obs2])` wouldn't cause `circuit` to be run with both supplied observables. In other words, that zipping is always what's done to the `run()` arguments is a common source of user confusion. Conversely, it's not clear that the way to collect all combinations of two observables and two parameter value sets is to invoke `Estimator.run([circuit] * 4, observables=[obs1, obs2] * 2, parameter_values=[params1] * 2 + [params2] * 2)`.
- **Performance.** Given that `QuantumCircuit` hashing should be avoided and circuit equality checks can be expensive, we should move away from an interface that forces performant primitive implementations to go down this path at all. As one example, qubit-wise commuting observables like `"IZZ"` and `"XZI"` can both be estimated using the same simulation/execution, but only if the estimator understands that they share a base circuit. As a second example, when the circuits need to be serialized before they are simulated/executed (e.g. runtime or pickling for multiprocessing), the onus is on the primitive implementation to detect circuit duplications.
Here is why the Detailed Design section suggests "transposing" the signature and using array-based arguments with broadcasting:

- Transposing the signature will make it obvious what the primitive intends to do with each circuit.
- Transposing the signature will let us introduce and reason about the notion of a "primitive unit of work", for which this RFC proposes the name "Primitive Unified Bloc" (PUB).
- Array-based arguments will let users assign operational meaning to each axis (this axis is for twirling, that axis is for basis changes, etc.).
- Broadcasting rules will let users choose how to combine parameter value sets with observables in different ways (see the shape sketch after this list), such as:
  - use one observable for all of N parameter value sets
  - use one parameter value set for all of N observables
  - zip N parameter value sets against N observables
  - take all N×M combinations of N parameter value sets with M observables
  - etc.
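As a concrete illustration of these patterns, and assuming standard NumPy broadcasting semantics (the detailed `Estimator` shape examples appear later in this RFC), the shapes combine as follows:

```python
# Shape arithmetic behind the patterns listed above, using plain NumPy broadcasting.
import numpy as np

np.broadcast_shapes((), (5,))        # one value set against 5 observables -> (5,)
np.broadcast_shapes((5,), ())        # one observable against 5 value sets -> (5,)
np.broadcast_shapes((5,), (5,))      # zip 5 value sets against 5 observables -> (5,)
np.broadcast_shapes((1, 6), (4, 1))  # all 4x6 combinations -> (4, 6)
```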
All users of the primitives stand to benefit from this proposal. Immediately, it will enable sophisticated and convenient workflows for power users through arrays and broadcasting. However, standard broadcasting rules are such that the 0D and 1D cases will feel natural to existing users: 0D is essentially what we already have, and 1D will be perceived as a simple yet welcome bonus.
Moreover, the interface changes in this proposal will enable specific primitive implementations to enhance performance through reduced bandwidth on the runtime, compatibility with fast parameter binding, and multiplexing qubit-wise commuting observables.
We will lead the design proposal with an example:
# construct two circuits, the first with 2039 Parameters, the second with no Parameters
circuit1 = QuantumCircuit(9)
...
circuit2 = QuantumCircuit(7)
...
# specify 128 different parameter value sets for circuit1, in a 32x4 shape
parameter_values1 = np.random.uniform(size=(32, 4, 2039))
# specify 4 observables to measure for circuit 1
observables1 = ["ZZIIIIIII", "IZXIIIIII", "IIIIIIYZI", "IZXIIYIII"]
# specify 18 observables to measure for circuit2
observables2 = [Pauli("XYZIIII"), ..., SparsePauliOp({"YYIIIII": 0.2})]
# invoke an estimator
estimator = Estimator()
job = estimator.run([
(circuit1, observables1, parameter_values1),
(circuit2, observables2)
])
# wait for the result
job.result()
# get a bundle of results for every PUB
>> EstimatorResult(
>> PubResult<{evs: ndarray<32, 4>, stds: ndarray<32, 4>}, local_metadata>,
>> PubResult<{evs: ndarray<18>, stds: ndarray<18>}, local_metadata>,
>> global_metadata
>> )
The above example shows various features of the design, including combining large numbers of observables with large numbers of parameter value sets for a single circuit.
The result types in the example above are illustrative of what we propose, but do not necessarily correspond, to the letter, to what will be introduced; we feel it is better to make certain detailed design choices through code review, and to use this RFC to focus on concepts and structures. However, we do want to make three points about the result type clear:
- We propose that the errors appear in the data portion of the result, not the metadata, to be treated on equal footing to the estimates themselves.
- We propose that the error portion of the data should represent error bars on the expectation values, somehow. That is, standard error instead of standard deviation or variance, or possibly, a two-sided confidence interval.
- The previous bullet might make it seem like we are proposing to leave it up to an implementation to choose which quantity to represent. We are not. We propose a homogeneous choice for all implementations, documented in the base class. This can obviously not be enforced through typing, so it will have to be enforced through documentation, vigilance, and, to some extent, testing.
The detailed targeted signature is as follows:
T = TypeVar("T", bound=Job)
BindingsArrayLike = Union[
BindingsArray,
NDArray,
Dict[Parameter | Tuple[Parameter], NDArray]
]
ObservablesArrayLike = Union[
ObservablesArray,
Iterable[str | Pauli | SparsePauliOp]
]
EstimatorPubLike = Union[
EstimatorPub,
Tuple[QuantumCircuit, ObservablesArrayLike, BindingsArrayLike]
]
class EstimatorBase(ABC, Generic[T]):
...
def run(self, pubs: EstimatorPubLike | Iterable[EstimatorPubLike], **options) -> T:
pass
Technical details of migration are discussed in the Migration Path section, but it is perhaps worthwhile to show some examples that demonstrate that, despite the new array and broadcasting abilities, the interface is not overly complicated for simple examples.
One circuit, many observables:
# old API
estimator.run([circuit] * 4, observables=["ZIII", "IZII", "IIZI", "IIIZ"]).result()
>> EstimatorResult(np.array([0.1, 0.2, 0.3, 0.4]), metadata={...})
# new API (we can use a single PUB)
estimator.run((circuit, ["ZIII", "IZII", "IIZI", "IIIZ"])).result()
>> [PubResult({"evs": np.array([0.1, 0.2, 0.3, 0.4]), "stds": <...>}, metadata={...})]
One circuit, one observable, many parameter value sets:
# old API
estimator.run([circuit] * 3, observables=["XYZI"] * 3, parameter_values=[[1,2,3], [3,4,5], [8,7,6]]).result()
>> EstimatorResult(np.array([0.1, 0.2, 0.3]), metadata={...})
# new API (we can use a single PUB)
estimator.run((circuit, ["XYZI"], [[1,2,3], [3,4,5], [8,7,6]])).result()
>> [PubResult({"evs": np.array([0.1, 0.2, 0.3]), "stds": <...>}, metadata={...})]
Three different circuits, three different observables:
# old API
estimator.run([circuit1, circuit2, circuit3], observables=["ZIII", "IZII", "IIZI"]).result()
>> EstimatorResult(np.array([0.1, 0.2, 0.3]), metadata={...})
# new API (unique circuits always need their own PUBs)
estimator.run([(circuit1, "ZIII"), (circuit2, "IZII"), (circuit3, "IIZI")]).result()
>> [
>> PubResult({"evs": np.array(0.1), "stds": <...>}, metadata={...}),
>> PubResult({"evs": np.array(0.2), "stds": <...>}, metadata={...}),
>> PubResult({"evs": np.array(0.3), "stds": <...>}, metadata={...})
>> ]
We propose the concept of a PUB (Primitive Unified Bloc), which we define as a single circuit along with the auxiliary data required to execute the circuit relative to the primitive in question. This concept is general enough that it can be used for all primitive types, current and future, where we stress that what the "auxiliary data" is can vary between primitive types.
For example, a circuit with unbound parameters (or, in OpenQASM 3 terms, a circuit with `input`s) alone could never qualify as a PUB for any primitive because there is not enough information to execute it, namely, numeric parameter binding values. On the other hand, conceptually, a circuit with no unbound parameters (i.e. a circuit with no `input`s) alone could form a PUB for a hypothetical primitive that just runs circuits and returns counts. This suggests a natural base for all PUBs:
from dataclasses import dataclass

@dataclass
class BasePub:
    circuit: QuantumCircuit
This base class can be added to Qiskit if and when needed as a non-breaking change, and has been included here mainly to make the concepts clear, rather than to indicate that we need such a base class immediately.
Most relevantly, in order to satisfy the definition stated above for the `Estimator` primitive, we propose the PUB structure:

@dataclass
class EstimatorPub(BasePub):
    observables: ObservablesArray = None
    parameter_values: BindingsArray = None
We expect the formal primitive API and primitive implementations to have a strong sense of PUBs, but we will not demand that users construct them manually in Python as they are little more than simple data containers (incidentally implemented as dataclasses above), and we do not wish to overburden them with types. This is discussed further in Type Coercion Strategy.
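To make the relationship concrete, here is a non-normative sketch of two equivalent ways of specifying the same unit of work, reusing the names from the example in the Design Proposal section; the bare tuple is what most users would write, and it is coerced into an `EstimatorPub` inside `run()` using the pattern described later in the Type Coercion Strategy section:

```python
# What a user writes: a plain (circuit, observables, parameter_values) triple.
pub_like = (circuit1, observables1, parameter_values1)

# What an implementation works with after coercion (sketch; the exact coercion
# helpers are discussed in the Type Coercion Strategy section).
pub = EstimatorPub(
    circuit=circuit1,
    observables=ObservablesArray.coerce(observables1),
    parameter_values=BindingsArray.coerce(parameter_values1),
)
```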
It is common for a user to want to do a sweep over parameter values, that is, to execute the same parametric circuit with many different parameter binding sets.
We propose a new class `BindingsArray` that stores multiple sets of parameter values, which can be bound to a circuit.
For example, if a circuit has 200 parameters, and a user wishes to execute the circuit for 50 different sets of values, then a single instance of `BindingsArray` could represent 50 sets of 200 parameter values.
Moreover, we propose that it be array-like, so that, in this example, the `BindingsArray` instance would be one-dimensional and have shape equal to `(50,)`.
The proposed `BindingsArray` object would support, at a minimum, the following constructor examples for a circuit with three input parameters `a`, `b`, and `c`, where we will use `<>` to denote some `array_like` of the specified shape:
# a single value for each a, b, and c; 0-dimensional, and compatible with current interface
BindingsArray([0.0, 1.0, 2.0])
# a single value for each a, b, and c; 0-dimensional, and compatible with current interface
# satisfies user wish to be able to specify values by name
BindingsArray(kwargs={Parameter("a"): 0.0, Parameter("b"): 1.0, Parameter("c"): 2.0})
# specify all 50 binding parameter sets in one big array
BindingsArray(<50, 3>)
# specify bindings separately for each parameter, required if they have different types
BindingsArray([<50>, <50>, <50>])
# include parameter names with the arrays, where parameters can be grouped together in tuples, or supplied separately
BindingsArray(kwargs={(Parameter("a"), Parameter("c")): <50, 2>, Parameter("b"): <50>})
# "args" and "kwargs" can be mixed
BindingsArray(<50, 2>, {Parameter("c"): <50>})
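For intuition only, the following sketch spells out the semantics that a shape-`(50,)` bindings array is meant to capture for this three-parameter circuit, using nothing but NumPy and `QuantumCircuit.assign_parameters`; actual implementations are free to bind values far more efficiently (e.g. with fast parameter binding), and the variable names here are illustrative rather than part of the proposed API:

```python
import numpy as np
from qiskit.circuit import Parameter, QuantumCircuit

a, b, c = Parameter("a"), Parameter("b"), Parameter("c")
circuit = QuantumCircuit(1)
circuit.rz(a, 0)
circuit.ry(b, 0)
circuit.rz(c, 0)

values = np.random.uniform(size=(50, 3))  # 50 binding sets, one column per parameter

# A BindingsArray of shape (50,) stands in for this list of 50 bound circuits,
# with one bound circuit per element of the shape.
bound = [circuit.assign_parameters(dict(zip([a, b, c], row))) for row in values]
```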
We expect the formal primitive API and primitive implementations to have a strong sense of BindingsArray
, but we will not demand that users construct them manually because we do not wish to overburden them with types, and we need to remain backwards compatible. This is discussed further in the Type Coercion Strategy and Migration Path sections.
With the Estimator
, it is common for a user to want to estimate many observables of a single circuit.
For example, all weight-1 and weight-2 Paulis that are adjacent on the connectivity graph.
For a one-hundred qubit device, this corresponds to hundreds of unique estimates to be made for a single circuit, noting that for this particular example, on the heavy-hex graph, in the absence of mitigation, only 9 circuits need to be physically run.
We propose a new `ObservablesArray` object, where each element corresponds to an observable for which the user wants an expectation value estimate.
We propose that the internal data model of this object have the element format

List[Tuple[str, List[int], float]]

where, for example, the observable `0.5 * IIZY + 0.2 * XIII` would be stored as

[("ZY", [2, 3], 0.5), ("X", [0], 0.2)]

This is proposed instead of `{"IIZY": 0.5, "XIII": 0.2}` in anticipation of the overhead of storing and working with so many `"I"`s for large devices.
Note that the above is simply the data model; the `ObservablesArray` object, either in its constructor or in a special static coercion method, will accept the `quantum_info` objects that are convenient to users.
However, only types such as `Pauli` and `SparsePauliOp` that are known to have sparse Pauli representations will be allowed; we will drop general support for `BaseOperator`.
Further, through type coercion in `Estimator.run()` itself, a user won't need to manually construct an `ObservablesArray`, as shown in the example in the Design Proposal section.
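For illustration, the following non-normative sketch shows how a dense Pauli-string dictionary could be rewritten into the sparse element format described above; it indexes Pauli terms by string position purely to match the example given, and the final index convention is an implementation detail:

```python
from typing import Dict, List, Tuple

def to_sparse_element(observable: Dict[str, float]) -> List[Tuple[str, List[int], float]]:
    """Rewrite {label: coefficient} into [(sub_label, indices, coefficient), ...],
    dropping identity factors (illustrative helper, not part of the proposed API)."""
    element = []
    for label, coeff in observable.items():
        indices = [i for i, p in enumerate(label) if p != "I"]
        sub_label = "".join(label[i] for i in indices)
        element.append((sub_label, indices, coeff))
    return element

to_sparse_element({"IIZY": 0.5, "XIII": 0.2})
# [("ZY", [2, 3], 0.5), ("X", [0], 0.2)]
```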
It is up to each `Estimator` implementation, if it sees fit, to solve a graph coloring problem that decides how to produce a sufficient set of physical circuits capable of producing data for each of these estimates.
An `nd-array` is an object whose elements are indexed by a tuple of integers, where each integer must be bounded by the dimension of the corresponding axis.
The tuple of dimensions, one for each axis, is called the shape of the `nd-array`.
For example, a 1D list of 10 objects (typically numbers) is a vector and has shape `(10,)`; a matrix with 5 rows and 2 columns is 2D and has shape `(5, 2)`; a single number is 0D and has an empty shape tuple `()`; an `nd-array` with shape `(20, 10, 5)` can be interpreted as a length-20 list of 10×5 matrices.
We propose the constraint that a `BindingsArray` instance and an `ObservablesArray` instance that live in the same PUB must have broadcastable shapes, in which case the `Estimator` promises to return one expectation value estimate for each element of the broadcast shape.
Here are some examples of common patterns expressed in terms of array broadcasting, and their accompanying visual representation in the figure that follows:
# Broadcast single observable
parameter_values = np.array(1.0)
parameter_values.shape == ()
observables = ObservablesArray(["III", "XXX", Pauli("YYY"), "ZZZ", Pauli("XYZ")])
observables.shape == (5,)
>> pub_result.shape == (5,)
# Zip
parameter_values = BindingsArray(np.random.uniform(size=(5,)))
parameter_values.shape == (5,)
observables = ObservablesArray([Pauli("III"), Pauli("XXX"), Pauli("YYY"), Pauli("ZZZ"), Pauli("XYZ")])
observables.shape == (5,)
>> pub_result.shape == (5,)
# Outer/Product
parameter_values = BindingsArray(np.random.uniform(size=(1, 6)))
parameter_values.shape == (1, 6)
observables = ObservablesArray([[Pauli("III")], [Pauli("XXX")], [Pauli("YYY")], [Pauli("ZZZ")]])
observables.shape == (4, 1)
>> pub_result.shape == (4, 6)
# Standard nd generalization
parameter_values = BindingsArray(np.random.uniform(size=(3, 6)))
parameter_values.shape == (3, 6)
observables = ObservablesArray([
    [[Pauli(...)], [Pauli(...)], [Pauli(...)]],
    [[Pauli(...)], [Pauli(...)], [Pauli(...)]],
])
observables.shape == (2, 3, 1)
>> pub_result.shape == (2, 3, 6)
To minimize the number of container types that an everyday user will need to interact with, and to make the transition more seamless, we propose that several container types be associated with a `TypeLike` pattern and a static method

def coerce(argument: TypeLike) -> Type

that is invoked inside of the `run()` method.
For example,
BindingsArrayLike = Union[BindingsArray, ArrayLike, Mapping[Parameter, float]]
and
@staticmethod
def coerce(bindings_array: BindingsArrayLike) -> BindingsArray:
    if isinstance(bindings_array, (list, np.ndarray)):
        bindings_array = BindingsArray(bindings_array)
    elif isinstance(bindings_array, Mapping):
        bindings_array = BindingsArray(kwargs=bindings_array)
    return bindings_array
In particular, we propose this kind of coercion for the following types (a sketch of the `EstimatorPub` case follows the list):

- `EstimatorPub`
- `BindingsArray`
- `ObservablesArray`
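As a sketch (assumed, not final) of the `EstimatorPub` case, the coercion would accept either an existing `EstimatorPub` or the tuple form from `EstimatorPubLike` and delegate to the other two coercions:

```python
@staticmethod
def coerce(pub: EstimatorPubLike) -> EstimatorPub:
    if isinstance(pub, EstimatorPub):
        return pub
    # Tuple form: (circuit, observables) or (circuit, observables, parameter_values).
    circuit, observables, *rest = pub
    parameter_values = rest[0] if rest else None
    return EstimatorPub(
        circuit=circuit,
        observables=ObservablesArray.coerce(observables),
        parameter_values=None if parameter_values is None else BindingsArray.coerce(parameter_values),
    )
```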
The results from each `EstimatorPub` will be array-valued, and each `EstimatorPub` in the same job may have a different shape.
Consider an `EstimatorPub` with shape `<20, 30>`, where the shape has come from the broadcasting rules discussed elsewhere.
This will result in a 20×30 array of real estimates.
Moreover, we will want to return an array of standard deviations of the same shape.
This would result in a bundle of broadcastable arrays:

PubResult({"evs": <20, 30>, "stds": <20, 30>}, metadata)
The reason we are proposing a generic container for the return type, instead of, e.g., an `Estimator`-specific container, is that it:

- Provides a unified experience across primitives for users.
- Enables code reuse.
- Provides a certain amount of flexibility in what can be returned without modifying the container object. Here are some examples:
  - Suppose that we want to give users the option of additionally returning the covariances between estimates that arise because of circuit multiplexing; then we could update the result with the field `{"cov": <20, 30, 20, 30>}`.
  - Suppose we want to return some indication of which estimates came from the same physical circuit; that, too, could be added as an extra data field.
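To make the result structure concrete, here is a hypothetical usage sketch for the 20×30 example above; the names `circuit`, `observables`, and `parameter_values` are placeholders, and the access pattern (`data`, `"evs"`, `"stds"`, results indexable in submission order) follows the illustrations in this RFC but is not final:

```python
job = estimator.run([(circuit, observables, parameter_values)])  # one PUB of shape (20, 30)
pub_result = job.result()[0]      # one PubResult per PUB, in submission order
evs = pub_result.data["evs"]      # ndarray of shape (20, 30): expectation value estimates
errs = pub_result.data["stds"]    # same shape: error bars on those estimates
```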
We need to remain backwards compatible with the existing interface to adhere to the Qiskit Deprecation Policy.
In summary, we propose the following strategy: explicitly version the Estimator
interface (i.e. add a BaseEstimatorV2
as described in this proposal) and document both the differences and the end user migration path for consumers of estimators. The user can determine which version of the estimator to use, and existing implementations can be left intact.
Similar to Backend
, we will have a BaseEstimator
, BaseEstimatorV1
equal to the existing BaseEstimator
, and BaseEstimatorV2
, as described in this proposal.
We will maintain backward compatibility by defaulting the import of Estimator
to EstimatorV1
(i.e. existing) for the duration of the deprecation period.
An alternative is to consider letting the run()
method accept, effectively, only a single EstimatorPub
:
Estimator.run(circuit, parameter_values_array, observables_array)
This has the advantage of a simpler interface, where multiple PUBs could be run by invoking the estimator multiple times. The disadvantages, which we feel are significant enough to forgo this simplification, are that:
- For real backends, the user would lose the ability to cause multiple types of circuits to be loaded into the control hardware at one time. For example, if using an estimator to perform randomized benchmarking, each circuit depth would need to be a separate job.
- It would be difficult for implementations that include mitigation to share resources between PUBs. For example, if different PUBs represent different trotter step counts, there would need to be a complicated mechanism to share learning resources---that are specific to the application circuits---between multiple jobs.
Open questions for discussion and an opening for feedback.
In this proposal we have typed circuits as `QuantumCircuit`. It would be possible to extend this to a `CircuitLike` type, which could be as simple as `Union[QuantumCircuit, str]`, to explicitly allow OpenQASM 3 circuits as first-class inputs.
A consequence of switching to the concept of PUBs (mentioned in PUBs) is that it will allow us to introduce the `.run()` method into `BasePrimitive`:

class BasePrimitive(ABC, Generic[T]):
    ...

    def run(self, pubs: T | Iterable[T], **options) -> List[PubResult]:
        ...
`BindingsArray` is somewhat constrained by how `Parameter`s currently work in Qiskit; namely, there is no support for array-valued inputs in the same way that there is in OpenQASM 3, so `BindingsArray` assumes that every parameter represents a single number like a `float` or an `int`.

One solution could be to extend the class to allow different sub-arrays to have extra dimensions.
For example, for an angle-array input declared as `input array[angle, 15] foo;`, and for a bindings array with shape `(30, 20)`, the corresponding array could have shape `(30, 20, 1, 15)`, where the singleton dimension exists to express that it is targeting a single `input`.
Another generalization is to allow broadcastable shapes inside of a single `BindingsArray`.
For example, one could have one array with shape `(1, 4, 5)` for five parameters, and another with shape `(3, 1, 2)` for two parameters, resulting in a bindings array with shape `(3, 4)`.
The advantage of this is allowing sparser representations of bindings arrays, and also specifying "fast" and "slow" parameters.
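A quick NumPy check of the shape arithmetic in this last example, under the assumption that the trailing axis of each sub-array indexes its parameters and only the leading axes participate in broadcasting:

```python
import numpy as np

five_params = np.empty((1, 4, 5))  # values for five parameters
two_params = np.empty((3, 1, 2))   # values for two parameters

# Strip the per-parameter axis and broadcast what remains.
shape = np.broadcast_shapes(five_params.shape[:-1], two_params.shape[:-1])
assert shape == (3, 4)             # shape of the resulting bindings array
```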