feat: ToCupy operator

mrava87 committed Nov 18, 2024
1 parent ac68e3e commit 5412f0d

Showing 9 changed files with 284 additions and 24 deletions.
2 changes: 2 additions & 0 deletions docs/source/api/index.rst
@@ -62,6 +62,8 @@ Basic operators
Real
Imag
Conj
ToCupy


Smoothing and derivatives
~~~~~~~~~~~~~~~~~~~~~~~~~
109 changes: 100 additions & 9 deletions docs/source/gpu.rst
@@ -409,23 +409,36 @@ Geophysical subsurface characterization:
when ``explicit=False``.


Examples
--------

Finally, let's briefly look at some examples.

End-to-end GPU powered inverse problems
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, we consider the most common scenario, in which both the model and data
vectors fit into GPU memory. We can therefore simply replace all our
``numpy`` arrays with ``cupy`` arrays and solve the inverse problem of
interest end-to-end on the GPU.
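
The snippets below assume imports along these lines (a minimal sketch; all
operators used here are exposed at the ``pylops`` package level):

.. code-block:: python

    import numpy as np
    import cupy as cp
    import jax.numpy as jnp

    from pylops import JaxOperator, MatrixMult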

Let's first write a code snippet using ``numpy`` arrays, which PyLops
will run on your CPU:

.. code-block:: python

    ny, nx = 400, 400
    G = np.random.normal(0, 1, (ny, nx)).astype(np.float32)
    x = np.ones(nx, dtype=np.float32)

    # Create operator
    Gop = MatrixMult(G, dtype='float32')

    # Create data and invert
    y = Gop @ x
    xest = Gop / y

Now we write a code snippet using ``cupy`` arrays, which PyLops will run on
your GPU:

.. code-block:: python

    ny, nx = 400, 400
    G = cp.random.normal(0, 1, (ny, nx)).astype(np.float32)
    x = cp.ones(nx, dtype=np.float32)

    # Create operator
    Gop = MatrixMult(G, dtype='float32')

    # Create data and invert
    y = Gop @ x
    xest = Gop / y

The code is almost unchanged apart from the fact that we now use ``cupy`` arrays.

Similarly, we can write a code snippet using ``jax`` arrays, which PyLops will run on
your GPU/TPU:

.. code-block:: python

    ny, nx = 400, 400
    G = jnp.array(np.random.normal(0, 1, (ny, nx)).astype(np.float32))
    x = jnp.ones(nx, dtype=np.float32)

    # Create operator
    Gop = JaxOperator(MatrixMult(G, dtype='float32'))

    # Create data and invert
    y = Gop @ x
    xest = Gop / y

    # Adjoint via AD
    xadj = Gop.rmatvecad(x, y)

Again, the code is almost unchanged apart from the fact that we now use ``jax`` arrays.
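
If you then need the solution back in host memory for NumPy-based
post-processing, it can be copied explicitly. A minimal sketch, assuming the
``cupy`` snippet above has just been run:

.. code-block:: python

    # xest is a cupy array living in GPU memory
    print(type(xest))

    # copy it back to the host as a numpy array
    xest_cpu = cp.asnumpy(xest)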


Mixed CPU-GPU powered inverse problems
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let us now consider a more intricate scenario, in which we have access to
a GPU-powered operator, but the model and/or data vectors are too large
to fit into GPU memory.

For the sake of clarity, we consider a problem where
the operator can be written as a :class:`pylops.basicoperators.BlockDiag` of
PyLops operators. Note how, by simply sandwiching any of the GPU-powered
operators between two :class:`pylops.basicoperators.ToCupy` operators, we are
able to tell PyLops to transfer to the GPU only the part of the model vector
required by a given operator, and to transfer the output back to the CPU before
forming the combined output vector (i.e., the output vector of the
:class:`pylops.basicoperators.BlockDiag`):

.. code-block:: python

    dtype = "float32"  # assumed here; any float dtype works
    nops, n = 5, 4
    Ms = [np.diag((i + 1) * np.ones(n, dtype=dtype))
          for i in range(nops)]
    Ms = [M.T @ M for M in Ms]

    # Create operator
    Mops = []
    for iop in range(nops):
        Mop = MatrixMult(cp.asarray(Ms[iop], dtype=dtype))
        Top = ToCupy(Mop.dims, dtype=dtype)
        Top1 = ToCupy(Mop.dimsd, dtype=dtype)
        Mop = Top1.H @ Mop @ Top
        Mops.append(Mop)
    Mops = BlockDiag(Mops, forceflat=True)

    # Create data and invert
    x = np.ones(n * nops, dtype=dtype)
    y = Mops @ x.ravel()
    xest = Mops / y

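Note that in this scenario the model, data, and solution all live in host
memory, while each block's matrix-vector product runs on the GPU. A quick
check, assuming the snippet above has been run:

.. code-block:: python

    # all three are host-side numpy arrays; only the inner
    # MatrixMult operators acted on cupy arrays
    print(type(x), type(y), type(xest))
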
Finally, let us consider a problem where
the operator can be written as a :class:`pylops.basicoperators.VStack` of
PyLops operators and the model vector can be fully transferred to the GPU.
We can again use the :class:`pylops.basicoperators.ToCupy` operator; however, this
time we will only use it to move the output of each operator back to the CPU.
Since we are now in a special scenario, where the input of the overall
operator sits on the GPU and the output on the
CPU, we need to inform the :class:`pylops.basicoperators.VStack` operator about this.
This can be easily done using the additional ``inoutengine`` parameter. Let's
see this with an example:

.. code-block:: python

    dtype = "float64"  # assumed here, matching np.random.normal's default
    nops, n, m = 3, 4, 5
    Ms = [np.random.normal(0, 1, (n, m)) for _ in range(nops)]

    # Create operator
    Mops = []
    for iop in range(nops):
        Mop = MatrixMult(cp.asarray(Ms[iop]), dtype=dtype)
        Top1 = ToCupy(Mop.dimsd, dtype=dtype)
        Mop = Top1.H @ Mop
        Mops.append(Mop)
    Mops = VStack(Mops, inoutengine=("numpy", "cupy"))

    # Create data and invert
    x = cp.ones(m, dtype=dtype)
    y = Mops @ x.ravel()
    xest = pylops_cgls(Mops, y, x0=cp.zeros_like(x))[0]

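Here the input and output of ``Mops`` live on different devices, which we can
verify (assuming the snippet above has been run):

.. code-block:: python

    print(type(x))     # cupy array: model on the GPU
    print(type(y))     # numpy array: data on the CPU
    print(type(xest))  # cupy array: solution on the GPU
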
**Note**: this feature is currently not available for ``jax`` arrays.


.. note::

4 changes: 4 additions & 0 deletions pylops/basicoperators/__init__.py
@@ -38,6 +38,7 @@
Gradient Gradient.
FirstDirectionalDerivative First Directional derivative.
SecondDirectionalDerivative Second Directional derivative.
ToCupy Convert to CuPy.
"""

from .functionoperator import *
@@ -72,6 +73,8 @@
from .laplacian import *
from .gradient import *
from .directionalderivative import *
from .tocupy import *


__all__ = [
"FunctionOperator",
@@ -107,4 +110,5 @@
"Gradient",
"FirstDirectionalDerivative",
"SecondDirectionalDerivative",
"ToCupy",
]
22 changes: 19 additions & 3 deletions pylops/basicoperators/blockdiag.py
@@ -21,7 +21,7 @@

from pylops import LinearOperator
from pylops.basicoperators import MatrixMult
from pylops.utils.backend import get_array_module, get_module, inplace_set
from pylops.utils.typing import DTypeLike, NDArray


@@ -48,6 +48,12 @@ class BlockDiag(LinearOperator):
        .. versionadded:: 2.2.0

        Force an array to be flattened after matvec and rmatvec.
    inoutengine : :obj:`tuple`, optional
        .. versionadded:: 2.4.0

        Type of output vectors of ``matvec`` and ``rmatvec``. If ``None``, this is
        inferred directly from the input vectors. Note that this is ignored
        if ``nproc>1``.
    dtype : :obj:`str`, optional
        Type of elements in input array.
@@ -113,6 +119,7 @@ def __init__(
        ops: Sequence[LinearOperator],
        nproc: int = 1,
        forceflat: bool = None,
        inoutengine: Optional[tuple] = None,
        dtype: Optional[DTypeLike] = None,
    ) -> None:
        self.ops = ops
@@ -149,6 +156,7 @@ def __init__(
        if self.nproc > 1:
            self.pool = mp.Pool(processes=nproc)

        self.inoutengine = inoutengine
        dtype = _get_dtype(ops) if dtype is None else np.dtype(dtype)
        clinear = all([getattr(oper, "clinear", True) for oper in self.ops])
        super().__init__(
@@ -172,7 +180,11 @@ def nproc(self, nprocnew: int) -> None:
        self._nproc = nprocnew

    def _matvec_serial(self, x: NDArray) -> NDArray:
        ncp = (
            get_array_module(x)
            if self.inoutengine is None
            else get_module(self.inoutengine[0])
        )
        y = ncp.zeros(self.nops, dtype=self.dtype)
        for iop, oper in enumerate(self.ops):
            y = inplace_set(
@@ -183,7 +195,11 @@ def _matvec_serial(self, x: NDArray) -> NDArray:
        return y

    def _rmatvec_serial(self, x: NDArray) -> NDArray:
        ncp = (
            get_array_module(x)
            if self.inoutengine is None
            else get_module(self.inoutengine[1])
        )
        y = ncp.zeros(self.mops, dtype=self.dtype)
        for iop, oper in enumerate(self.ops):
            y = inplace_set(
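
As a quick illustration of the new parameter, here is a hedged sketch (sizes
and matrices are made up for illustration) in which each block of a
``BlockDiag`` runs on the GPU, the forward output is moved back to the CPU via
``ToCupy``, and the output engines are declared explicitly:

.. code-block:: python

    import numpy as np
    import cupy as cp

    from pylops import BlockDiag, MatrixMult, ToCupy

    n = 4
    ops = []
    for i in range(2):
        # block applied on the GPU ...
        Mop = MatrixMult(cp.asarray((i + 1.0) * np.eye(n)))
        # ... with its forward output moved back to the CPU
        ops.append(ToCupy(Mop.dimsd, dtype="float64").H @ Mop)

    # model on the GPU, data on the CPU: matvec returns numpy, rmatvec cupy
    Dop = BlockDiag(ops, inoutengine=("numpy", "cupy"))
    y = Dop @ cp.ones(2 * n)  # cupy in -> numpy out
    xadj = Dop.H @ y          # numpy in -> cupy out
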
23 changes: 20 additions & 3 deletions pylops/basicoperators/hstack.py
@@ -21,7 +21,7 @@

from pylops import LinearOperator
from pylops.basicoperators import MatrixMult
from pylops.utils.backend import get_array_module, get_module, inplace_add, inplace_set
from pylops.utils.typing import NDArray


@@ -48,6 +48,12 @@ class HStack(LinearOperator):
        .. versionadded:: 2.2.0

        Force an array to be flattened after matvec.
    inoutengine : :obj:`tuple`, optional
        .. versionadded:: 2.4.0

        Type of output vectors of ``matvec`` and ``rmatvec``. If ``None``, this is
        inferred directly from the input vectors. Note that this is ignored
        if ``nproc>1``.
    dtype : :obj:`str`, optional
        Type of elements in input array.
@@ -112,6 +118,7 @@ def __init__(
        ops: Sequence[LinearOperator],
        nproc: int = 1,
        forceflat: bool = None,
        inoutengine: Optional[tuple] = None,
        dtype: Optional[str] = None,
    ) -> None:
        self.ops = ops
@@ -139,6 +146,8 @@
        self.pool = None
        if self.nproc > 1:
            self.pool = mp.Pool(processes=nproc)

        self.inoutengine = inoutengine
        dtype = _get_dtype(self.ops) if dtype is None else np.dtype(dtype)
        clinear = all([getattr(oper, "clinear", True) for oper in self.ops])
        super().__init__(
@@ -162,7 +171,11 @@ def nproc(self, nprocnew: int):
        self._nproc = nprocnew

    def _matvec_serial(self, x: NDArray) -> NDArray:
        ncp = (
            get_array_module(x)
            if self.inoutengine is None
            else get_module(self.inoutengine[0])
        )
        y = ncp.zeros(self.nops, dtype=self.dtype)
        for iop, oper in enumerate(self.ops):
            y = inplace_add(
@@ -173,7 +186,11 @@ def _matvec_serial(self, x: NDArray) -> NDArray:
        return y

    def _rmatvec_serial(self, x: NDArray) -> NDArray:
        ncp = (
            get_array_module(x)
            if self.inoutengine is None
            else get_module(self.inoutengine[1])
        )
        y = ncp.zeros(self.mops, dtype=self.dtype)
        for iop, oper in enumerate(self.ops):
            y = inplace_set(
60 changes: 60 additions & 0 deletions pylops/basicoperators/tocupy.py
@@ -0,0 +1,60 @@
__all__ = ["ToCupy"]

from typing import Union

import numpy as np

from pylops import LinearOperator
from pylops.utils._internal import _value_or_sized_to_tuple
from pylops.utils.backend import to_cupy, to_numpy
from pylops.utils.typing import DTypeLike, InputDimsLike, NDArray


class ToCupy(LinearOperator):
    r"""Convert to CuPy.

    Convert an input array to CuPy in forward mode,
    and convert back to NumPy in adjoint mode.

    Parameters
    ----------
    dims : :obj:`list` or :obj:`int`
        Number of samples for each dimension
    dtype : :obj:`str`, optional
        Type of elements in input array.
    name : :obj:`str`, optional
        Name of operator (to be used by :func:`pylops.utils.describe.describe`)

    Attributes
    ----------
    shape : :obj:`tuple`
        Operator shape
    explicit : :obj:`bool`
        Operator contains a matrix that can be solved explicitly
        (``True``) or not (``False``)

    Notes
    -----
    The ToCupy operator is a special operator that does not perform
    any transformation on the input arrays other than converting
    them from NumPy to CuPy. This operator can be used when one
    is interested in creating a chain of operators where only one
    (or some) of them act on CuPy arrays, whilst the others operate
    on NumPy arrays.

    """

    def __init__(
        self,
        dims: Union[int, InputDimsLike],
        dtype: DTypeLike = "float64",
        name: str = "C",
    ) -> None:
        dims = _value_or_sized_to_tuple(dims)
        super().__init__(dtype=np.dtype(dtype), dims=dims, dimsd=dims, name=name)

    def _matvec(self, x: NDArray) -> NDArray:
        return to_cupy(x)

    def _rmatvec(self, x: NDArray) -> NDArray:
        return to_numpy(x)
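
For completeness, a minimal usage sketch of the new operator, assuming a
CUDA-enabled environment with ``cupy`` installed and ``ToCupy`` exported at
the package level:

.. code-block:: python

    import numpy as np
    import cupy as cp

    from pylops import ToCupy

    Top = ToCupy(5, dtype="float32")

    # forward: host numpy array -> device cupy array
    xgpu = Top @ np.ones(5, dtype="float32")
    print(isinstance(xgpu, cp.ndarray))  # True

    # adjoint: device cupy array -> host numpy array
    xcpu = Top.H @ xgpu
    print(isinstance(xcpu, np.ndarray))  # True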