(I will give a talk on this PoC at the OpenZFS Developer Summit 2022.)
The ARC dynamically shares DRAM capacity among all currently imported zpools.
However, the L2ARC does not do the same for block capacity: the L2ARC vdevs of
one zpool only cache buffers of that zpool. This is undesirable on systems
that host multiple zpools because it prevents dynamic sharing of the cache
device capacity. Such sharing is useful if the need for L2ARC varies among
zpools over time, or if the set of zpools that are imported in the system
varies over time.
Shared L2ARC addresses this need by decoupling the L2ARC vdevs from the
zpools that store actual data. The mechanism that we use is to place the L2ARC
vdevs into a special zpool, and to adjust the L2ARC feed thread logic to use
that special zpool's L2ARC vdevs for all zpools' buffers.
High-level changes:
* Reserve "NTNX-fsvm-local-l2arc" as a magic zpool name.
We call this "the l2arc pool".
All other pools are called "primary pools".
* Make the l2arc feed thread feed ARC buffers from any zpool to the l2arc zpool.
(Before this patch, the l2arc feed thread would only feed an ARC buffer to
l2arc devices that belong to the same spa_t as the buffer.)
* Change the locking to ensure that the l2arc zpool cannot be removed while
there are ongoing reads initiated by arc_read on one of the primary pools.
This is sufficient and retains correctness of the ARC because nothing
about the fundamental operation of L2ARC changes. The only thing that changes
is that the L2ARC data is stored on vdevs outside the primary pool.
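The change in feed eligibility can be sketched as follows. This is a
simplified model, not the code in this patch: the sketch_* types and
functions are hypothetical stand-ins for spa_t and the l2arc device
structure, and the "after" rule assumes that only the l2arc pool's cache
vdevs are fed, as described above.

```c
#include <stdbool.h>
#include <string.h>

/* Magic pool name reserved by this PoC. */
#define	SKETCH_L2ARC_POOL_NAME	"NTNX-fsvm-local-l2arc"

/* Hypothetical stand-in for spa_t; the real structure is far larger. */
typedef struct sketch_spa {
	const char *name;
} sketch_spa_t;

/* Hypothetical stand-in for an l2arc device. */
typedef struct sketch_l2arc_dev {
	const sketch_spa_t *owner;	/* pool that the cache vdev belongs to */
} sketch_l2arc_dev_t;

bool
sketch_spa_is_l2arc_pool(const sketch_spa_t *spa)
{
	return (strcmp(spa->name, SKETCH_L2ARC_POOL_NAME) == 0);
}

/* Before: a buffer is only fed to a cache device of its own pool. */
bool
sketch_eligible_before(const sketch_spa_t *buf_spa,
    const sketch_l2arc_dev_t *dev)
{
	return (dev->owner == buf_spa);
}

/* After: cache devices of the l2arc pool take buffers from any pool. */
bool
sketch_eligible_after(const sketch_spa_t *buf_spa,
    const sketch_l2arc_dev_t *dev)
{
	(void) buf_spa;	/* the buffer's pool no longer matters */
	return (sketch_spa_is_l2arc_pool(dev->owner));
}
```

In this model, a buffer of a primary pool "tank" is rejected by the old rule
for a device owned by the l2arc pool and accepted by the new rule.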
Proof Of Concept => Production
==============================
This commit is a proof-of-concept.
It works, it results in the desired performance improvement, and it's stable.
But to make it production ready, more work needs to be done.
(1) The design is based on a version of ZFS that supports neither
encryption nor Persistent L2ARC. I'm no expert in either of these features.
Encryption might work just fine as long as the l2arc feed thread can access
the encryption keys for l2arc_apply_transforms.
But Persistent L2ARC definitely needs more design work
(multiple L2ARC headers?).
(2) Remove hard-coded magic name; use a property instead.
Make it opt-in so that existing setups are not affected.
Example:
zpool create -o share_l2arc_vdevs=on my-l2arc-pool
(3) Coexistence with non-shared L2ARC; also via property.
Make it opt-in so that existing setups are not affected.
Example:
zpool set use_shared_l2arc=on my-data-pool
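One possible shape of the eligibility check once (2) and (3) are in place is
sketched below. Neither property exists yet; the field names simply mirror
the examples above, and the sketch_* names are hypothetical. The sketch also
assumes that a pool keeps feeding its own cache vdevs regardless of the
opt-in, which is a design decision that is still open.

```c
#include <stdbool.h>

/* Hypothetical per-pool settings mirroring the proposed properties. */
typedef struct sketch_pool {
	bool share_l2arc_vdevs;	/* pool offers its cache vdevs to others */
	bool use_shared_l2arc;	/* data pool opts in to shared cache vdevs */
} sketch_pool_t;

/*
 * May a buffer of buf_pool be fed to a cache vdev of dev_pool?
 * Both properties default to off, so existing setups keep today's behavior.
 */
bool
sketch_may_feed(const sketch_pool_t *buf_pool, const sketch_pool_t *dev_pool)
{
	if (dev_pool == buf_pool)
		return (true);	/* non-shared L2ARC works as before */
	return (dev_pool->share_l2arc_vdevs && buf_pool->use_shared_l2arc);
}
```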
Signed-off-by: Christian Schwarz <[email protected]>