diff --git a/README.md b/README.md
index a3207114..04881cd0 100644
--- a/README.md
+++ b/README.md
@@ -2,9 +2,9 @@
 
 AthenaPK: a performance portable version based on [Athena++](https://github.com/PrincetonUniversity/athena),  [Parthenon](https://github.com/parthenon-hpc-lab/parthenon) and [Kokkos](https://github.com/kokkos/kokkos).
 
-## Current state of the code
+## Overview
 
-For this reason, it is highly recommended to only use AthenaPK with the Kokkos and Parthenon versions that are provided by the submodules (see [building](#building)) and to build everything (AthenaPK, Parthenon, and Kokkos) together from source.
+It is highly recommended to only use AthenaPK with the Kokkos and Parthenon versions that are provided by the submodules (see [building](#building)) and to build everything (AthenaPK, Parthenon, and Kokkos) together from source.
 Neither other versions or nor using preinstalled Parthenon/Kokkos libraries have been tested.
 
 Current features include
@@ -76,19 +76,22 @@ Obtain all (AthenaPK, Parthenon, and Kokkos) sources
 Most of the general build instructions and options for Parthenon (see [here](https://parthenon-hpc-lab.github.io/parthenon/develop/src/building.html)) also apply to AthenaPK.
 The following examples are a few standard cases.
 
-Most simple configuration (only CPU, no MPI, no HDF5).
+Most simple configuration (only CPU, no MPI).
 The `Kokkos_ARCH_...` parameter should be adjusted to match the target machine where AthenaPK will be executed.
 A full list of architecture keywords is available on the [Kokkos wiki](https://kokkos.github.io/kokkos-core-wiki/keywords.html#architecture-keywords).
 
-
-    # configure with enabling Broadwell architecture (AVX2) instructions
-    cmake -S. -Bbuild-host -DKokkos_ARCH_BDW=ON -DPARTHENON_DISABLE_MPI=ON -DPARTHENON_DISABLE_HDF5=ON
+    # configure with enabling Intel Broadwell or similar architecture (AVX2) instructions
+    cmake -S. -Bbuild-host -DKokkos_ARCH_BDW=ON -DPARTHENON_DISABLE_MPI=ON
     # now build with
     cd build-host && make
     # or alternatively
     cmake --build build-host
 
-An Intel Skylake system (AVX512 instructions) with NVidia Volta V100 GPUs and with MPI and HDF5 enabled (the latter is the default option, so they don't need to be specified)
+If `cmake` has troubling finding the HDF5 library (which is required for writing analysis outputs or
+restartings simulation) an additional hint to the location of the library can be provided via
+`-DHDF5_ROOT=/path/to/local/hdf5` on the first `cmake` command for configuration.
+
+An Intel Skylake system (AVX512 instructions) with NVidia Volta V100 GPUs and with MPI enabled (the latter is the default option, so they don't need to be specified)
 
     cmake -S. -Bbuild-gpu -DKokkos_ARCH_SKX=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON
     # now build with
diff --git a/docs/cluster.md b/docs/cluster.md
index 9fb6f20a..6787d651 100644
--- a/docs/cluster.md
+++ b/docs/cluster.md
@@ -162,6 +162,39 @@ r_sampling = 4.0
 Specifically, the resolution of the 1D profile for each meshblock is either
 `min(dx,dy,dz)/r_sampling` or `r_k/r_sampling`, whichever is smaller.
 
+## Initial perturbations
+
+Initial perturbations for both the velocity field and the magnetic field are
+supported.
+
+*Note* that these intial perturbation are currently incompatible with
+other initial conditions that modify the velocity or magnetic field, e.g.,
+an initial magnetic dipole or a uniform field.
+This restriction could be lifted if required/desired but the normalization
+of the fields would need to be adjusted.
+
+In general, the perturbations will be seeded by an inverse parabolic
+spectral profile centered on a peak wavenumber (in normalized units, i.e.,
+`k_peak = 2` mean half the box size) with a finite number of modes (default 40)
+randomly chosen between `k_peak/2` and `2*k_peak`.
+
+Pertubations are controlled by the following parameters:
+```
+<problem/cluster/init_perturb>
+# for the velocity field
+sigma_v = 0.0      # volume weighted RMS |v| in code velocity; default: 0.0 (disabled)
+k_peak_v = ???     # wavenumber in normalized units where the velocity spectrum peaks. No default value.
+num_modes_v = 40   # (optional) number of wavemodes in spectral space; default: 40
+sol_weight_v = 1.0 # (optional) power in solenoidal (rotational) modes of the perturbation. Range between 0 (fully compressive) and 1.0 (default, fully solenoidal).
+rseed_v = 1        # (optional) integer seed for RNG for wavenumbers and amplitudes
+
+# for the magnetic field
+sigma_b = 0.0      # volume weighted RMS |B| in code magnetic; default: 0.0 (disabled)
+k_peak_b = ???     # wavenumber in normalized units where the magnetic field spectrum peaks. No default value.
+num_modes_b = 40   # (optional) number of wavemodes in spectral space; default: 40
+rseed_b = 2        # (optional) integer seed for RNG for wavenumbers and amplitudes
+```
+
 ## AGN Triggering
 
 If AGN triggering is enabled, at the end of each time step, a mass accretion
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 2066812d..076fbb9f 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -17,6 +17,7 @@ add_executable(
         hydro/srcterms/tabular_cooling.cpp
         refinement/gradient.cpp
         refinement/other.cpp
+        utils/few_modes_ft.cpp
 )
 
 add_subdirectory(pgen)
diff --git a/src/eos/adiabatic_glmmhd.hpp b/src/eos/adiabatic_glmmhd.hpp
index 9fa80b76..baebc681 100644
--- a/src/eos/adiabatic_glmmhd.hpp
+++ b/src/eos/adiabatic_glmmhd.hpp
@@ -24,8 +24,10 @@ using parthenon::Real;
 class AdiabaticGLMMHDEOS : public EquationOfState {
  public:
   AdiabaticGLMMHDEOS(Real pressure_floor, Real density_floor, Real internal_e_floor,
-                     Real gamma)
-      : EquationOfState(pressure_floor, density_floor, internal_e_floor), gamma_{gamma} {}
+                     Real velocity_ceiling, Real internal_e_ceiling, Real gamma)
+      : EquationOfState(pressure_floor, density_floor, internal_e_floor, velocity_ceiling,
+                        internal_e_ceiling),
+        gamma_{gamma} {}
 
   void ConservedToPrimitive(MeshData<Real> *md) const override;
 
@@ -66,6 +68,9 @@ class AdiabaticGLMMHDEOS : public EquationOfState {
     auto pressure_floor_ = GetPressureFloor();
     auto e_floor_ = GetInternalEFloor();
 
+    auto velocity_ceiling_ = GetVelocityCeiling();
+    auto e_ceiling_ = GetInternalECeiling();
+
     Real &u_d = cons(IDN, k, j, i);
     Real &u_m1 = cons(IM1, k, j, i);
     Real &u_m2 = cons(IM2, k, j, i);
@@ -109,6 +114,23 @@ class AdiabaticGLMMHDEOS : public EquationOfState {
     Real e_B = 0.5 * (SQR(u_b1) + SQR(u_b2) + SQR(u_b3));
     w_p = gm1 * (u_e - e_k - e_B);
 
+    // apply velocity ceiling. By default ceiling is std::numeric_limits<Real>::infinity()
+    const Real w_v2 = SQR(w_vx) + SQR(w_vy) + SQR(w_vz);
+    if (w_v2 > SQR(velocity_ceiling_)) {
+      const Real w_v = sqrt(w_v2);
+      w_vx *= velocity_ceiling_ / w_v;
+      w_vy *= velocity_ceiling_ / w_v;
+      w_vz *= velocity_ceiling_ / w_v;
+
+      u_m1 *= velocity_ceiling_ / w_v;
+      u_m2 *= velocity_ceiling_ / w_v;
+      u_m3 *= velocity_ceiling_ / w_v;
+
+      Real e_k_new = 0.5 * u_d * SQR(velocity_ceiling_);
+      u_e -= e_k - e_k_new;
+      e_k = e_k_new;
+    }
+
     // Let's apply floors explicitly, i.e., by default floor will be disabled (<=0)
     // and the code will fail if a negative pressure is encountered.
     PARTHENON_REQUIRE(w_p > 0.0 || pressure_floor_ > 0.0 || e_floor_ > 0.0,
@@ -130,6 +152,14 @@ class AdiabaticGLMMHDEOS : public EquationOfState {
       w_p = eff_pressure_floor;
     }
 
+    // temperature (internal energy) based pressure ceiling
+    const Real eff_pressure_ceiling = gm1 * u_d * e_ceiling_;
+    if (w_p > eff_pressure_ceiling) {
+      // apply temperature ceiling, correct total energy
+      u_e = (u_d * e_ceiling_) + e_k + e_B;
+      w_p = eff_pressure_ceiling;
+    }
+
     // Convert passive scalars
     for (auto n = nhydro; n < nhydro + nscalars; ++n) {
       prim(n, k, j, i) = cons(n, k, j, i) * di;
diff --git a/src/eos/adiabatic_hydro.hpp b/src/eos/adiabatic_hydro.hpp
index 4dca9ce2..bf317923 100644
--- a/src/eos/adiabatic_hydro.hpp
+++ b/src/eos/adiabatic_hydro.hpp
@@ -25,8 +25,10 @@ using parthenon::Real;
 class AdiabaticHydroEOS : public EquationOfState {
  public:
   AdiabaticHydroEOS(Real pressure_floor, Real density_floor, Real internal_e_floor,
-                    Real gamma)
-      : EquationOfState(pressure_floor, density_floor, internal_e_floor), gamma_{gamma} {}
+                    Real velocity_ceiling, Real internal_e_ceiling, Real gamma)
+      : EquationOfState(pressure_floor, density_floor, internal_e_floor, velocity_ceiling,
+                        internal_e_ceiling),
+        gamma_{gamma} {}
 
   void ConservedToPrimitive(MeshData<Real> *md) const override;
 
@@ -55,6 +57,9 @@ class AdiabaticHydroEOS : public EquationOfState {
     auto pressure_floor_ = GetPressureFloor();
     auto e_floor_ = GetInternalEFloor();
 
+    auto velocity_ceiling_ = GetVelocityCeiling();
+    auto e_ceiling_ = GetInternalECeiling();
+
     Real &u_d = cons(IDN, k, j, i);
     Real &u_m1 = cons(IM1, k, j, i);
     Real &u_m2 = cons(IM2, k, j, i);
@@ -84,6 +89,23 @@ class AdiabaticHydroEOS : public EquationOfState {
     Real e_k = 0.5 * di * (SQR(u_m1) + SQR(u_m2) + SQR(u_m3));
     w_p = gm1 * (u_e - e_k);
 
+    // apply velocity ceiling. By default ceiling is std::numeric_limits<Real>::infinity()
+    const Real w_v2 = SQR(w_vx) + SQR(w_vy) + SQR(w_vz);
+    if (w_v2 > SQR(velocity_ceiling_)) {
+      const Real w_v = sqrt(w_v2);
+      w_vx *= velocity_ceiling_ / w_v;
+      w_vy *= velocity_ceiling_ / w_v;
+      w_vz *= velocity_ceiling_ / w_v;
+
+      u_m1 *= velocity_ceiling_ / w_v;
+      u_m2 *= velocity_ceiling_ / w_v;
+      u_m3 *= velocity_ceiling_ / w_v;
+
+      Real e_k_new = 0.5 * u_d * SQR(velocity_ceiling_);
+      u_e -= e_k - e_k_new;
+      e_k = e_k_new;
+    }
+
     // Let's apply floors explicitly, i.e., by default floor will be disabled (<=0)
     // and the code will fail if a negative pressure is encountered.
     PARTHENON_REQUIRE(w_p > 0.0 || pressure_floor_ > 0.0 || e_floor_ > 0.0,
@@ -105,6 +127,14 @@ class AdiabaticHydroEOS : public EquationOfState {
       w_p = eff_pressure_floor;
     }
 
+    // temperature (internal energy) based pressure ceiling
+    const Real eff_pressure_ceiling = gm1 * u_d * e_ceiling_;
+    if (w_p > eff_pressure_ceiling) {
+      // apply temperature ceiling, correct total energy
+      u_e = (u_d * e_ceiling_) + e_k;
+      w_p = eff_pressure_ceiling;
+    }
+
     // Convert passive scalars
     for (auto n = nhydro; n < nhydro + nscalars; ++n) {
       prim(n, k, j, i) = cons(n, k, j, i) * di;
diff --git a/src/eos/eos.hpp b/src/eos/eos.hpp
index b6986d7e..b75eedce 100644
--- a/src/eos/eos.hpp
+++ b/src/eos/eos.hpp
@@ -32,9 +32,11 @@ using parthenon::Real;
 
 class EquationOfState {
  public:
-  EquationOfState(Real pressure_floor, Real density_floor, Real internal_e_floor)
+  EquationOfState(Real pressure_floor, Real density_floor, Real internal_e_floor,
+                  Real velocity_ceiling, Real internal_e_ceiling)
       : pressure_floor_(pressure_floor), density_floor_(density_floor),
-        internal_e_floor_(internal_e_floor) {}
+        internal_e_floor_(internal_e_floor), velocity_ceiling_(velocity_ceiling),
+        internal_e_ceiling_(internal_e_ceiling) {}
   virtual void ConservedToPrimitive(MeshData<Real> *md) const = 0;
 
   KOKKOS_INLINE_FUNCTION
@@ -47,8 +49,15 @@ class EquationOfState {
   KOKKOS_INLINE_FUNCTION
   Real GetInternalEFloor() const { return internal_e_floor_; }
 
+  KOKKOS_INLINE_FUNCTION
+  Real GetVelocityCeiling() const { return velocity_ceiling_; }
+
+  KOKKOS_INLINE_FUNCTION
+  Real GetInternalECeiling() const { return internal_e_ceiling_; }
+
  private:
   Real pressure_floor_, density_floor_, internal_e_floor_;
+  Real velocity_ceiling_, internal_e_ceiling_;
 };
 
 #endif // EOS_EOS_HPP_
diff --git a/src/hydro/hydro.cpp b/src/hydro/hydro.cpp
index ba903089..425f6365 100644
--- a/src/hydro/hydro.cpp
+++ b/src/hydro/hydro.cpp
@@ -385,10 +385,6 @@ std::shared_ptr<StateDescriptor> Initialize(ParameterInput *pin) {
 
   auto first_order_flux_correct =
       pin->GetOrAddBoolean("hydro", "first_order_flux_correct", false);
-  if (first_order_flux_correct && integrator != Integrator::vl2) {
-    PARTHENON_FAIL("Please use 'vl2' integrator with first order flux correction. Other "
-                   "integrators have not been tested.")
-  }
   pkg->AddParam<>("first_order_flux_correct", first_order_flux_correct);
   if (first_order_flux_correct) {
     if (fluid == Fluid::euler) {
@@ -439,6 +435,23 @@ std::shared_ptr<StateDescriptor> Initialize(ParameterInput *pin) {
       efloor = Tfloor / mbar_over_kb / (gamma - 1.0);
     }
 
+    // By default disable ceilings by setting to infinity
+    Real vceil =
+        pin->GetOrAddReal("hydro", "vceil", std::numeric_limits<Real>::infinity());
+    Real Tceil =
+        pin->GetOrAddReal("hydro", "Tceil", std::numeric_limits<Real>::infinity());
+    Real eceil = Tceil;
+    if (eceil < std::numeric_limits<Real>::infinity()) {
+      if (!pkg->AllParams().hasKey("mbar_over_kb")) {
+        PARTHENON_FAIL("Temperature ceiling requires units and gas composition. "
+                       "Either set a 'units' block and the 'hydro/He_mass_fraction' in "
+                       "input file or use a pressure floor "
+                       "(defined code units) instead.");
+      }
+      auto mbar_over_kb = pkg->Param<Real>("mbar_over_kb");
+      eceil = Tceil / mbar_over_kb / (gamma - 1.0);
+    }
+
     auto conduction = Conduction::none;
     auto conduction_str = pin->GetOrAddString("diffusion", "conduction", "none");
     if (conduction_str == "spitzer") {
@@ -472,12 +485,12 @@ std::shared_ptr<StateDescriptor> Initialize(ParameterInput *pin) {
     pkg->AddParam<>("conduction", conduction);
 
     if (fluid == Fluid::euler) {
-      AdiabaticHydroEOS eos(pfloor, dfloor, efloor, gamma);
+      AdiabaticHydroEOS eos(pfloor, dfloor, efloor, vceil, eceil, gamma);
       pkg->AddParam<>("eos", eos);
       pkg->FillDerivedMesh = ConsToPrim<AdiabaticHydroEOS>;
       pkg->EstimateTimestepMesh = EstimateTimestep<Fluid::euler>;
     } else if (fluid == Fluid::glmmhd) {
-      AdiabaticGLMMHDEOS eos(pfloor, dfloor, efloor, gamma);
+      AdiabaticGLMMHDEOS eos(pfloor, dfloor, efloor, vceil, eceil, gamma);
       pkg->AddParam<>("eos", eos);
       pkg->FillDerivedMesh = ConsToPrim<AdiabaticGLMMHDEOS>;
       pkg->EstimateTimestepMesh = EstimateTimestep<Fluid::glmmhd>;
@@ -975,6 +988,7 @@ TaskStatus FirstOrderFluxCorrect(MeshData<Real> *u0_data, MeshData<Real> *u1_dat
 
   std::vector<parthenon::MetadataFlag> flags_ind({Metadata::Independent});
   auto u0_cons_pack = u0_data->PackVariablesAndFluxes(flags_ind);
+  auto const &u0_prim_pack = u0_data->PackVariables(std::vector<std::string>{"prim"});
   auto u1_cons_pack = u1_data->PackVariablesAndFluxes(flags_ind);
   auto pkg = pmb->packages.Get("Hydro");
 
@@ -987,14 +1001,6 @@ TaskStatus FirstOrderFluxCorrect(MeshData<Real> *u0_data, MeshData<Real> *u1_dat
   if (fluid == Fluid::glmmhd) {
     c_h = pkg->Param<Real>("c_h");
   }
-  // Using "u1_prim" as "u0_prim" here because all current integrators start with copying
-  // the initial state to the "u0" register, see conditional for `stage == 1` in the
-  // hydro_driver where normally only "cons" is copied but in case for flux correction
-  // "prim", too. This means both during stage 1 and during stage 2 `u1` holds the
-  // original data at the beginning of the timestep. For flux correction we want to make a
-  // full (dt) low order update using the original data and thus use the "prim" data from
-  // u1 here.
-  auto const &u0_prim_pack = u1_data->PackVariables(std::vector<std::string>{"prim"});
 
   const int ndim = pmb->pmy_mesh->ndim;
 
diff --git a/src/main.cpp b/src/main.cpp
index 1d9ef467..8b796610 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -85,7 +85,7 @@ int main(int argc, char *argv[]) {
     Hydro::ProblemInitPackageData = rand_blast::ProblemInitPackageData;
     Hydro::ProblemSourceFirstOrder = rand_blast::RandomBlasts;
   } else if (problem == "cluster") {
-    pman.app_input->ProblemGenerator = cluster::ProblemGenerator;
+    pman.app_input->MeshProblemGenerator = cluster::ProblemGenerator;
     pman.app_input->MeshBlockUserWorkBeforeOutput = cluster::UserWorkBeforeOutput;
     Hydro::ProblemInitPackageData = cluster::ProblemInitPackageData;
     Hydro::ProblemSourceUnsplit = cluster::ClusterSrcTerm;
@@ -119,10 +119,6 @@ int main(int argc, char *argv[]) {
     // This line actually runs the simulation
     driver.Execute();
   }
-  // very ugly cleanup...
-  if (problem == "turbulence") {
-    turbulence::Cleanup();
-  }
 
   // call MPI_Finalize and Kokkos::finalize if necessary
   pman.ParthenonFinalize();
diff --git a/src/pgen/cluster.cpp b/src/pgen/cluster.cpp
index 69385ea0..1e21f534 100644
--- a/src/pgen/cluster.cpp
+++ b/src/pgen/cluster.cpp
@@ -25,16 +25,20 @@
 #include <string>    // c_str()
 
 // Parthenon headers
+#include "kokkos_abstraction.hpp"
 #include "mesh/domain.hpp"
 #include "mesh/mesh.hpp"
 #include <parthenon/driver.hpp>
 #include <parthenon/package.hpp>
 
 // AthenaPK headers
+#include "../eos/adiabatic_glmmhd.hpp"
+#include "../eos/adiabatic_hydro.hpp"
 #include "../hydro/hydro.hpp"
 #include "../hydro/srcterms/gravitational_field.hpp"
 #include "../hydro/srcterms/tabular_cooling.hpp"
 #include "../main.hpp"
+#include "../utils/few_modes_ft.hpp"
 
 // Cluster headers
 #include "cluster/agn_feedback.hpp"
@@ -44,10 +48,129 @@
 #include "cluster/hydrostatic_equilibrium_sphere.hpp"
 #include "cluster/magnetic_tower.hpp"
 #include "cluster/snia_feedback.hpp"
+#include "parthenon_array_generic.hpp"
+#include "utils/error_checking.hpp"
 
 namespace cluster {
 using namespace parthenon::driver::prelude;
 using namespace parthenon::package::prelude;
+using utils::few_modes_ft::FewModesFT;
+
+template <class EOS>
+void ApplyClusterClips(MeshData<Real> *md, const parthenon::SimTime &tm,
+                       const Real beta_dt, const EOS eos) {
+
+  auto hydro_pkg = md->GetBlockData(0)->GetBlockPointer()->packages.Get("Hydro");
+
+  // Apply clips -- ceilings on temperature, velocity, alfven velocity, and
+  // density floor -- within a radius of the AGN
+  const auto &dfloor = hydro_pkg->Param<Real>("cluster_dfloor");
+  const auto &eceil = hydro_pkg->Param<Real>("cluster_eceil");
+  const auto &vceil = hydro_pkg->Param<Real>("cluster_vceil");
+  const auto &vAceil = hydro_pkg->Param<Real>("cluster_vAceil");
+  const auto &clip_r = hydro_pkg->Param<Real>("cluster_clip_r");
+
+  if (clip_r > 0 && (dfloor > 0 || eceil < std::numeric_limits<Real>::infinity() ||
+                     vceil < std::numeric_limits<Real>::infinity() ||
+                     vAceil < std::numeric_limits<Real>::infinity())) {
+    // Grab some necessary variables
+    const auto &prim_pack = md->PackVariables(std::vector<std::string>{"prim"});
+    const auto &cons_pack = md->PackVariables(std::vector<std::string>{"cons"});
+    IndexRange ib = md->GetBlockData(0)->GetBoundsI(IndexDomain::interior);
+    IndexRange jb = md->GetBlockData(0)->GetBoundsJ(IndexDomain::interior);
+    IndexRange kb = md->GetBlockData(0)->GetBoundsK(IndexDomain::interior);
+    const auto nhydro = hydro_pkg->Param<int>("nhydro");
+    const auto nscalars = hydro_pkg->Param<int>("nscalars");
+
+    const Real clip_r2 = SQR(clip_r);
+    const Real vceil2 = SQR(vceil);
+    const Real vAceil2 = SQR(vAceil);
+    const Real gm1 = (hydro_pkg->Param<Real>("AdiabaticIndex") - 1.0);
+
+    parthenon::par_for(
+        DEFAULT_LOOP_PATTERN, "Cluster::ApplyClusterClips", parthenon::DevExecSpace(), 0,
+        cons_pack.GetDim(5) - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+        KOKKOS_LAMBDA(const int &b, const int &k, const int &j, const int &i) {
+          auto &cons = cons_pack(b);
+          auto &prim = prim_pack(b);
+          const auto &coords = cons_pack.GetCoords(b);
+
+          const Real r2 =
+              SQR(coords.Xc<1>(i)) + SQR(coords.Xc<2>(j)) + SQR(coords.Xc<3>(k));
+
+          if (r2 < clip_r2) {
+            // Cell falls within clipping radius
+            eos.ConsToPrim(cons, prim, nhydro, nscalars, k, j, i);
+
+            if (dfloor > 0) {
+              const Real rho = prim(IDN, k, j, i);
+              if (rho < dfloor) {
+                cons(IDN, k, j, i) = dfloor;
+                prim(IDN, k, j, i) = dfloor;
+              }
+            }
+
+            if (vceil < std::numeric_limits<Real>::infinity()) {
+              // Apply velocity ceiling
+              const Real v2 = SQR(prim(IV1, k, j, i)) + SQR(prim(IV2, k, j, i)) +
+                              SQR(prim(IV3, k, j, i));
+              if (v2 > vceil2) {
+                // Fix the velocity to the velocity ceiling
+                const Real v = sqrt(v2);
+                cons(IM1, k, j, i) *= vceil / v;
+                cons(IM2, k, j, i) *= vceil / v;
+                cons(IM3, k, j, i) *= vceil / v;
+                prim(IV1, k, j, i) *= vceil / v;
+                prim(IV2, k, j, i) *= vceil / v;
+                prim(IV3, k, j, i) *= vceil / v;
+
+                // Remove kinetic energy
+                cons(IEN, k, j, i) -= 0.5 * prim(IDN, k, j, i) * (v2 - vceil2);
+              }
+            }
+
+            if (vAceil2 < std::numeric_limits<Real>::infinity()) {
+              // Apply Alfven velocity ceiling by raising density
+              const Real rho = prim(IDN, k, j, i);
+              const Real B2 = (SQR(prim(IB1, k, j, i)) + SQR(prim(IB2, k, j, i)) +
+                               SQR(prim(IB3, k, j, i)));
+
+              // compute Alfven mach number
+              const Real va2 = (B2 / rho);
+
+              if (va2 > vAceil2) {
+                // Increase the density to match the alfven velocity ceiling
+                const Real rho_new = std::sqrt(B2 / vAceil2);
+                cons(IDN, k, j, i) = rho_new;
+                prim(IDN, k, j, i) = rho_new;
+              }
+            }
+
+            if (eceil < std::numeric_limits<Real>::infinity()) {
+              // Apply  internal energy ceiling as a pressure ceiling
+              const Real internal_e = prim(IPR, k, j, i) / (gm1 * prim(IDN, k, j, i));
+              if (internal_e > eceil) {
+                cons(IEN, k, j, i) -= prim(IDN, k, j, i) * (internal_e - eceil);
+                prim(IPR, k, j, i) = gm1 * prim(IDN, k, j, i) * eceil;
+              }
+            }
+          }
+        });
+  }
+}
+
+void ApplyClusterClips(MeshData<Real> *md, const parthenon::SimTime &tm,
+                       const Real beta_dt) {
+  auto hydro_pkg = md->GetBlockData(0)->GetBlockPointer()->packages.Get("Hydro");
+  auto fluid = hydro_pkg->Param<Fluid>("fluid");
+  if (fluid == Fluid::euler) {
+    ApplyClusterClips(md, tm, beta_dt, hydro_pkg->Param<AdiabaticHydroEOS>("eos"));
+  } else if (fluid == Fluid::glmmhd) {
+    ApplyClusterClips(md, tm, beta_dt, hydro_pkg->Param<AdiabaticGLMMHDEOS>("eos"));
+  } else {
+    PARTHENON_FAIL("Cluster::ApplyClusterClips: Unknown EOS");
+  }
+}
 
 void ClusterSrcTerm(MeshData<Real> *md, const parthenon::SimTime &tm,
                     const Real beta_dt) {
@@ -70,7 +193,9 @@ void ClusterSrcTerm(MeshData<Real> *md, const parthenon::SimTime &tm,
 
   const auto &snia_feedback = hydro_pkg->Param<SNIAFeedback>("snia_feedback");
   snia_feedback.FeedbackSrcTerm(md, beta_dt, tm);
-}
+
+  ApplyClusterClips(md, tm, beta_dt);
+};
 
 Real ClusterEstimateTimestep(MeshData<Real> *md) {
   Real min_dt = std::numeric_limits<Real>::max();
@@ -216,6 +341,40 @@ void ProblemInitPackageData(ParameterInput *pin, parthenon::StateDescriptor *hyd
 
   SNIAFeedback snia_feedback(pin, hydro_pkg);
 
+  /************************************************************
+   * Read Clips  (ceilings and floors)
+   ************************************************************/
+
+  // Disable all clips by default with a negative radius clip
+  Real clip_r = pin->GetOrAddReal("problem/cluster/clips", "clip_r", -1.0);
+
+  // By default disable floors by setting a negative value
+  Real dfloor = pin->GetOrAddReal("problem/cluster/clips", "dfloor", -1.0);
+
+  // By default disable ceilings by setting to infinity
+  Real vceil = pin->GetOrAddReal("problem/cluster/clips", "vceil",
+                                 std::numeric_limits<Real>::infinity());
+  Real vAceil = pin->GetOrAddReal("problem/cluster/clips", "vAceil",
+                                  std::numeric_limits<Real>::infinity());
+  Real Tceil = pin->GetOrAddReal("problem/cluster/clips", "Tceil",
+                                 std::numeric_limits<Real>::infinity());
+  Real eceil = Tceil;
+  if (eceil < std::numeric_limits<Real>::infinity()) {
+    if (!hydro_pkg->AllParams().hasKey("mbar_over_kb")) {
+      PARTHENON_FAIL("Temperature ceiling requires units and gas composition. "
+                     "Either set a 'units' block and the 'hydro/He_mass_fraction' in "
+                     "input file or use a pressure floor "
+                     "(defined code units) instead.");
+    }
+    auto mbar_over_kb = hydro_pkg->Param<Real>("mbar_over_kb");
+    eceil = Tceil / mbar_over_kb / (hydro_pkg->Param<Real>("AdiabaticIndex") - 1.0);
+  }
+  hydro_pkg->AddParam("cluster_dfloor", dfloor);
+  hydro_pkg->AddParam("cluster_eceil", eceil);
+  hydro_pkg->AddParam("cluster_vceil", vceil);
+  hydro_pkg->AddParam("cluster_vAceil", vAceil);
+  hydro_pkg->AddParam("cluster_clip_r", clip_r);
+
   /************************************************************
    * Add derived fields
    * NOTE: these must be filled in UserWorkBeforeOutput
@@ -244,195 +403,422 @@ void ProblemInitPackageData(ParameterInput *pin, parthenon::StateDescriptor *hyd
     // plasma beta
     hydro_pkg->AddField("plasma_beta", m);
   }
+
+  /************************************************************
+   * Add infrastructure for initial pertubations
+   ************************************************************/
+
+  const auto sigma_v = pin->GetOrAddReal("problem/cluster/init_perturb", "sigma_v", 0.0);
+  if (sigma_v != 0.0) {
+    // peak of init vel perturb
+    auto k_peak_v = pin->GetReal("problem/cluster/init_perturb", "k_peak_v");
+    auto num_modes_v =
+        pin->GetOrAddInteger("problem/cluster/init_perturb", "num_modes_v", 40);
+    auto sol_weight_v =
+        pin->GetOrAddReal("problem/cluster/init_perturb", "sol_weight_v", 1.0);
+    uint32_t rseed_v = pin->GetOrAddInteger("problem/cluster/init_perturb", "rseed_v", 1);
+    // In principle arbitrary because the inital v_hat is 0 and the v_hat_new will contain
+    // the perturbation (and is normalized in the following to get the desired sigma_v)
+    const auto t_corr = 1e-10;
+
+    auto k_vec_v = utils::few_modes_ft::MakeRandomModes(num_modes_v, k_peak_v, rseed_v);
+
+    auto few_modes_ft = FewModesFT(pin, hydro_pkg, "cluster_perturb_v", num_modes_v,
+                                   k_vec_v, k_peak_v, sol_weight_v, t_corr, rseed_v);
+    hydro_pkg->AddParam<>("cluster/few_modes_ft_v", few_modes_ft);
+
+    // Add field for initial perturation (must not need to be consistent but defining it
+    // this way is easier for now)
+    Metadata m({Metadata::Cell, Metadata::Derived, Metadata::OneCopy},
+               std::vector<int>({3}));
+    hydro_pkg->AddField("tmp_perturb", m);
+  }
+  const auto sigma_b = pin->GetOrAddReal("problem/cluster/init_perturb", "sigma_b", 0.0);
+  if (sigma_b != 0.0) {
+    PARTHENON_REQUIRE_THROWS(hydro_pkg->Param<Fluid>("fluid") == Fluid::glmmhd,
+                             "Requested initial magnetic field perturbations but not "
+                             "solving the MHD equations.")
+    // peak of init vel perturb
+    auto k_peak_b = pin->GetReal("problem/cluster/init_perturb", "k_peak_b");
+    auto num_modes_b =
+        pin->GetOrAddInteger("problem/cluster/init_perturb", "num_modes_b", 40);
+    uint32_t rseed_b = pin->GetOrAddInteger("problem/cluster/init_perturb", "rseed_b", 2);
+    // In principle arbitrary because the inital A_hat is 0 and the A_hat_new will contain
+    // the perturbation (and is normalized in the following to get the desired sigma_b)
+    const auto t_corr = 1e-10;
+    // This field should by construction have no compressive modes, so we fix the number.
+    const auto sol_weight_b = 1.0;
+
+    auto k_vec_b = utils::few_modes_ft::MakeRandomModes(num_modes_b, k_peak_b, rseed_b);
+
+    const bool fill_ghosts = true; // as we fill a vector potential to calc B
+    auto few_modes_ft =
+        FewModesFT(pin, hydro_pkg, "cluster_perturb_b", num_modes_b, k_vec_b, k_peak_b,
+                   sol_weight_b, t_corr, rseed_b, fill_ghosts);
+    hydro_pkg->AddParam<>("cluster/few_modes_ft_b", few_modes_ft);
+
+    // Add field for initial perturation (must not need to be consistent but defining it
+    // this way is easier for now). Only add if not already done for the velocity.
+    if (sigma_v == 0.0) {
+      Metadata m({Metadata::Cell, Metadata::Derived, Metadata::OneCopy},
+                 std::vector<int>({3}));
+      hydro_pkg->AddField("tmp_perturb", m);
+    }
+  }
 }
 
 //========================================================================================
-//! \fn void MeshBlock::ProblemGenerator(ParameterInput *pin)
-//! \brief Generate problem data on each meshblock
+//! \fn void ProblemGenerator(Mesh *pmesh, ParameterInput *pin, MeshData<Real> *md)
+//! \brief Generate problem data for all blocks on rank
+//
+// Note, this requires that parthenon/mesh/pack_size=-1 during initialization so that
+// reductions work
 //========================================================================================
 
-void ProblemGenerator(MeshBlock *pmb, parthenon::ParameterInput *pin) {
-  auto hydro_pkg = pmb->packages.Get("Hydro");
+void ProblemGenerator(Mesh *pmesh, ParameterInput *pin, MeshData<Real> *md) {
+  // This could be more optimized, but require a refactor of init routines being called.
+  // However, given that it's just called during initial setup, this should not be a
+  // performance concern.
+  for (int b = 0; b < md->NumBlocks(); b++) {
+    auto pmb = md->GetBlockData(b)->GetBlockPointer();
+    auto hydro_pkg = pmb->packages.Get("Hydro");
 
-  IndexRange ib = pmb->cellbounds.GetBoundsI(IndexDomain::interior);
-  IndexRange jb = pmb->cellbounds.GetBoundsJ(IndexDomain::interior);
-  IndexRange kb = pmb->cellbounds.GetBoundsK(IndexDomain::interior);
+    IndexRange ib = pmb->cellbounds.GetBoundsI(IndexDomain::interior);
+    IndexRange jb = pmb->cellbounds.GetBoundsJ(IndexDomain::interior);
+    IndexRange kb = pmb->cellbounds.GetBoundsK(IndexDomain::interior);
 
-  // Initialize the conserved variables
-  auto &u = pmb->meshblock_data.Get()->Get("cons").data;
+    // Initialize the conserved variables
+    auto &u = pmb->meshblock_data.Get()->Get("cons").data;
 
-  auto &coords = pmb->coords;
+    auto &coords = pmb->coords;
 
-  // Get Adiabatic Index
-  const Real gam = pin->GetReal("hydro", "gamma");
-  const Real gm1 = (gam - 1.0);
+    // Get Adiabatic Index
+    const Real gam = pin->GetReal("hydro", "gamma");
+    const Real gm1 = (gam - 1.0);
 
-  /************************************************************
-   * Initialize the initial hydro state
-   ************************************************************/
-  const auto &init_uniform_gas = hydro_pkg->Param<bool>("init_uniform_gas");
-  if (init_uniform_gas) {
-    const Real rho = hydro_pkg->Param<Real>("uniform_gas_rho");
-    const Real ux = hydro_pkg->Param<Real>("uniform_gas_ux");
-    const Real uy = hydro_pkg->Param<Real>("uniform_gas_uy");
-    const Real uz = hydro_pkg->Param<Real>("uniform_gas_uz");
-    const Real pres = hydro_pkg->Param<Real>("uniform_gas_pres");
-
-    const Real Mx = rho * ux;
-    const Real My = rho * uy;
-    const Real Mz = rho * uz;
-    const Real E = rho * (0.5 * (ux * ux + uy * uy + uz * uz) + pres / (gm1 * rho));
-
-    parthenon::par_for(
-        DEFAULT_LOOP_PATTERN, "Cluster::ProblemGenerator::UniformGas",
-        parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
-        KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
-          u(IDN, k, j, i) = rho;
-          u(IM1, k, j, i) = Mx;
-          u(IM2, k, j, i) = My;
-          u(IM3, k, j, i) = Mz;
-          u(IEN, k, j, i) = E;
-        });
-
-    // end if(init_uniform_gas)
-  } else {
     /************************************************************
-     * Initialize a HydrostaticEquilibriumSphere
+     * Initialize the initial hydro state
      ************************************************************/
-    const auto &he_sphere =
-        hydro_pkg
-            ->Param<HydrostaticEquilibriumSphere<ClusterGravity, ACCEPTEntropyProfile>>(
-                "hydrostatic_equilibirum_sphere");
-
-    const auto P_rho_profile = he_sphere.generate_P_rho_profile(ib, jb, kb, coords);
+    const auto &init_uniform_gas = hydro_pkg->Param<bool>("init_uniform_gas");
+    if (init_uniform_gas) {
+      const Real rho = hydro_pkg->Param<Real>("uniform_gas_rho");
+      const Real ux = hydro_pkg->Param<Real>("uniform_gas_ux");
+      const Real uy = hydro_pkg->Param<Real>("uniform_gas_uy");
+      const Real uz = hydro_pkg->Param<Real>("uniform_gas_uz");
+      const Real pres = hydro_pkg->Param<Real>("uniform_gas_pres");
+
+      const Real Mx = rho * ux;
+      const Real My = rho * uy;
+      const Real Mz = rho * uz;
+      const Real E = rho * (0.5 * (ux * ux + uy * uy + uz * uz) + pres / (gm1 * rho));
 
-    // initialize conserved variables
-    parthenon::par_for(
-        DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::UniformGas",
-        parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
-        KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
-          // Calculate radius
-          const Real r =
-              sqrt(coords.Xc<1>(i) * coords.Xc<1>(i) + coords.Xc<2>(j) * coords.Xc<2>(j) +
-                   coords.Xc<3>(k) * coords.Xc<3>(k));
-
-          // Get pressure and density from generated profile
-          const Real P_r = P_rho_profile.P_from_r(r);
-          const Real rho_r = P_rho_profile.rho_from_r(r);
-
-          // Fill conserved states, 0 initial velocity
-          u(IDN, k, j, i) = rho_r;
-          u(IM1, k, j, i) = 0.0;
-          u(IM2, k, j, i) = 0.0;
-          u(IM3, k, j, i) = 0.0;
-          u(IEN, k, j, i) = P_r / gm1;
-        });
-  }
+      parthenon::par_for(
+          DEFAULT_LOOP_PATTERN, "Cluster::ProblemGenerator::UniformGas",
+          parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+          KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
+            u(IDN, k, j, i) = rho;
+            u(IM1, k, j, i) = Mx;
+            u(IM2, k, j, i) = My;
+            u(IM3, k, j, i) = Mz;
+            u(IEN, k, j, i) = E;
+          });
 
-  if (hydro_pkg->Param<Fluid>("fluid") == Fluid::glmmhd) {
-    /************************************************************
-     * Initialize the initial magnetic field state via a vector potential
-     ************************************************************/
-    parthenon::ParArray4D<Real> A("A", 3, pmb->cellbounds.ncellsk(IndexDomain::entire),
-                                  pmb->cellbounds.ncellsj(IndexDomain::entire),
-                                  pmb->cellbounds.ncellsi(IndexDomain::entire));
-
-    IndexRange a_ib = ib;
-    a_ib.s -= 1;
-    a_ib.e += 1;
-    IndexRange a_jb = jb;
-    a_jb.s -= 1;
-    a_jb.e += 1;
-    IndexRange a_kb = kb;
-    a_kb.s -= 1;
-    a_kb.e += 1;
+      // end if(init_uniform_gas)
+    } else {
+      /************************************************************
+       * Initialize a HydrostaticEquilibriumSphere
+       ************************************************************/
+      const auto &he_sphere =
+          hydro_pkg
+              ->Param<HydrostaticEquilibriumSphere<ClusterGravity, ACCEPTEntropyProfile>>(
+                  "hydrostatic_equilibirum_sphere");
 
-    /************************************************************
-     * Initialize an initial magnetic tower
-     ************************************************************/
-    const auto &magnetic_tower = hydro_pkg->Param<MagneticTower>("magnetic_tower");
+      const auto P_rho_profile = he_sphere.generate_P_rho_profile(ib, jb, kb, coords);
 
-    magnetic_tower.AddInitialFieldToPotential(pmb, a_kb, a_jb, a_ib, A);
+      // initialize conserved variables
+      parthenon::par_for(
+          DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::UniformGas",
+          parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+          KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
+            // Calculate radius
+            const Real r = sqrt(coords.Xc<1>(i) * coords.Xc<1>(i) +
+                                coords.Xc<2>(j) * coords.Xc<2>(j) +
+                                coords.Xc<3>(k) * coords.Xc<3>(k));
+
+            // Get pressure and density from generated profile
+            const Real P_r = P_rho_profile.P_from_r(r);
+            const Real rho_r = P_rho_profile.rho_from_r(r);
+
+            // Fill conserved states, 0 initial velocity
+            u(IDN, k, j, i) = rho_r;
+            u(IM1, k, j, i) = 0.0;
+            u(IM2, k, j, i) = 0.0;
+            u(IM3, k, j, i) = 0.0;
+            u(IEN, k, j, i) = P_r / gm1;
+          });
+    }
 
-    /************************************************************
-     * Add dipole magnetic field to the magnetic potential
-     ************************************************************/
-    const auto &init_dipole_b_field = hydro_pkg->Param<bool>("init_dipole_b_field");
-    if (init_dipole_b_field) {
-      const Real mx = hydro_pkg->Param<Real>("dipole_b_field_mx");
-      const Real my = hydro_pkg->Param<Real>("dipole_b_field_my");
-      const Real mz = hydro_pkg->Param<Real>("dipole_b_field_mz");
+    if (hydro_pkg->Param<Fluid>("fluid") == Fluid::glmmhd) {
+      /************************************************************
+       * Initialize the initial magnetic field state via a vector potential
+       ************************************************************/
+      parthenon::ParArray4D<Real> A("A", 3, pmb->cellbounds.ncellsk(IndexDomain::entire),
+                                    pmb->cellbounds.ncellsj(IndexDomain::entire),
+                                    pmb->cellbounds.ncellsi(IndexDomain::entire));
+
+      IndexRange a_ib = ib;
+      a_ib.s -= 1;
+      a_ib.e += 1;
+      IndexRange a_jb = jb;
+      a_jb.s -= 1;
+      a_jb.e += 1;
+      IndexRange a_kb = kb;
+      a_kb.s -= 1;
+      a_kb.e += 1;
+
+      /************************************************************
+       * Initialize an initial magnetic tower
+       ************************************************************/
+      const auto &magnetic_tower = hydro_pkg->Param<MagneticTower>("magnetic_tower");
+
+      magnetic_tower.AddInitialFieldToPotential(pmb.get(), a_kb, a_jb, a_ib, A);
+
+      /************************************************************
+       * Add dipole magnetic field to the magnetic potential
+       ************************************************************/
+      const auto &init_dipole_b_field = hydro_pkg->Param<bool>("init_dipole_b_field");
+      if (init_dipole_b_field) {
+        const Real mx = hydro_pkg->Param<Real>("dipole_b_field_mx");
+        const Real my = hydro_pkg->Param<Real>("dipole_b_field_my");
+        const Real mz = hydro_pkg->Param<Real>("dipole_b_field_mz");
+        parthenon::par_for(
+            DEFAULT_LOOP_PATTERN, "MagneticTower::AddInitialFieldToPotential",
+            parthenon::DevExecSpace(), a_kb.s, a_kb.e, a_jb.s, a_jb.e, a_ib.s, a_ib.e,
+            KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
+              // Compute and apply potential
+              const Real x = coords.Xc<1>(i);
+              const Real y = coords.Xc<2>(j);
+              const Real z = coords.Xc<3>(k);
+
+              const Real r3 = pow(SQR(x) + SQR(y) + SQR(z), 3. / 2);
+
+              const Real m_cross_r_x = my * z - mz * y;
+              const Real m_cross_r_y = mz * x - mx * z;
+              const Real m_cross_r_z = mx * y - mx * y;
+
+              A(0, k, j, i) += m_cross_r_x / (4 * M_PI * r3);
+              A(1, k, j, i) += m_cross_r_y / (4 * M_PI * r3);
+              A(2, k, j, i) += m_cross_r_z / (4 * M_PI * r3);
+            });
+      }
+
+      /************************************************************
+       * Apply the potential to the conserved variables
+       ************************************************************/
       parthenon::par_for(
-          DEFAULT_LOOP_PATTERN, "MagneticTower::AddInitialFieldToPotential",
-          parthenon::DevExecSpace(), a_kb.s, a_kb.e, a_jb.s, a_jb.e, a_ib.s, a_ib.e,
+          DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::ApplyMagneticPotential",
+          parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
           KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
-            // Compute and apply potential
-            const Real x = coords.Xc<1>(i);
-            const Real y = coords.Xc<2>(j);
-            const Real z = coords.Xc<3>(k);
+            u(IB1, k, j, i) =
+                (A(2, k, j + 1, i) - A(2, k, j - 1, i)) / coords.Dxc<2>(j) / 2.0 -
+                (A(1, k + 1, j, i) - A(1, k - 1, j, i)) / coords.Dxc<3>(k) / 2.0;
+            u(IB2, k, j, i) =
+                (A(0, k + 1, j, i) - A(0, k - 1, j, i)) / coords.Dxc<3>(k) / 2.0 -
+                (A(2, k, j, i + 1) - A(2, k, j, i - 1)) / coords.Dxc<1>(i) / 2.0;
+            u(IB3, k, j, i) =
+                (A(1, k, j, i + 1) - A(1, k, j, i - 1)) / coords.Dxc<1>(i) / 2.0 -
+                (A(0, k, j + 1, i) - A(0, k, j - 1, i)) / coords.Dxc<2>(j) / 2.0;
+
+            u(IEN, k, j, i) += 0.5 * (SQR(u(IB1, k, j, i)) + SQR(u(IB2, k, j, i)) +
+                                      SQR(u(IB3, k, j, i)));
+          });
 
-            const Real r3 = pow(SQR(x) + SQR(y) + SQR(z), 3. / 2);
+      /************************************************************
+       * Add uniform magnetic field to the conserved variables
+       ************************************************************/
+      const auto &init_uniform_b_field = hydro_pkg->Param<bool>("init_uniform_b_field");
+      if (init_uniform_b_field) {
+        const Real bx = hydro_pkg->Param<Real>("uniform_b_field_bx");
+        const Real by = hydro_pkg->Param<Real>("uniform_b_field_by");
+        const Real bz = hydro_pkg->Param<Real>("uniform_b_field_bz");
+        parthenon::par_for(
+            DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::ApplyUniformBField",
+            parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+            KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
+              const Real bx_i = u(IB1, k, j, i);
+              const Real by_i = u(IB2, k, j, i);
+              const Real bz_i = u(IB3, k, j, i);
+
+              u(IB1, k, j, i) += bx;
+              u(IB2, k, j, i) += by;
+              u(IB3, k, j, i) += bz;
+
+              // Old magnetic energy is b_i^2, new Magnetic energy should be 0.5*(b_i +
+              // b)^2, add b_i*b + 0.5b^2  to old energy to accomplish that
+              u(IEN, k, j, i) +=
+                  bx_i * bx + by_i * by + bz_i * bz + 0.5 * (SQR(bx) + SQR(by) + SQR(bz));
+            });
+        // end if(init_uniform_b_field)
+      }
+
+    } // END if(hydro_pkg->Param<Fluid>("fluid") == Fluid::glmmhd)
+  }
 
-            const Real m_cross_r_x = my * z - mz * y;
-            const Real m_cross_r_y = mz * x - mx * z;
-            const Real m_cross_r_z = mx * y - mx * y;
+  /************************************************************
+   * Set initial velocity perturbations (requires no other velocities for now)
+   ************************************************************/
 
-            A(0, k, j, i) += m_cross_r_x / (4 * M_PI * r3);
-            A(1, k, j, i) += m_cross_r_y / (4 * M_PI * r3);
-            A(2, k, j, i) += m_cross_r_z / (4 * M_PI * r3);
-          });
+  auto pmb = md->GetBlockData(0)->GetBlockPointer();
+  IndexRange ib = pmb->cellbounds.GetBoundsI(IndexDomain::interior);
+  IndexRange jb = pmb->cellbounds.GetBoundsJ(IndexDomain::interior);
+  IndexRange kb = pmb->cellbounds.GetBoundsK(IndexDomain::interior);
+  auto hydro_pkg = pmb->packages.Get("Hydro");
+  const auto fluid = hydro_pkg->Param<Fluid>("fluid");
+  auto const &cons = md->PackVariables(std::vector<std::string>{"cons"});
+  const auto num_blocks = md->NumBlocks();
+
+  const auto sigma_v = pin->GetOrAddReal("problem/cluster/init_perturb", "sigma_v", 0.0);
+
+  if (sigma_v != 0.0) {
+    auto few_modes_ft = hydro_pkg->Param<FewModesFT>("cluster/few_modes_ft_v");
+    // Init phases on all blocks
+    for (int b = 0; b < md->NumBlocks(); b++) {
+      auto pmb = md->GetBlockData(b)->GetBlockPointer();
+      few_modes_ft.SetPhases(pmb.get(), pin);
     }
+    // As for t_corr in few_modes_ft, the choice for dt is
+    // in principle arbitrary because the inital v_hat is 0 and the v_hat_new will contain
+    // the perturbation (and is normalized in the following to get the desired sigma_v)
+    const Real dt = 1.0;
+    few_modes_ft.Generate(md, dt, "tmp_perturb");
+
+    Real v2_sum = 0.0; // used for normalization
+
+    auto perturb_pack = md->PackVariables(std::vector<std::string>{"tmp_perturb"});
+
+    pmb->par_reduce(
+        "Init sigma_v", 0, num_blocks - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+        KOKKOS_LAMBDA(const int b, const int k, const int j, const int i, Real &lsum) {
+          const auto &coords = cons.GetCoords(b);
+          const auto &u = cons(b);
+          auto rho = u(IDN, k, j, i);
+          // The following restriction could be lifted, but requires refactoring of the
+          // logic for the normalization/reduction below
+          PARTHENON_REQUIRE(
+              u(IM1, k, j, i) == 0.0 && u(IM2, k, j, i) == 0.0 && u(IM3, k, j, i) == 0.0,
+              "Found existing non-zero velocity when setting velocity perturbations.");
+
+          u(IM1, k, j, i) = rho * perturb_pack(b, 0, k, j, i);
+          u(IM2, k, j, i) = rho * perturb_pack(b, 1, k, j, i);
+          u(IM3, k, j, i) = rho * perturb_pack(b, 2, k, j, i);
+          // No need to touch the energy yet as we'll normalize later
+
+          lsum += (SQR(u(IM1, k, j, i)) + SQR(u(IM2, k, j, i)) + SQR(u(IM3, k, j, i))) *
+                  coords.CellVolume(k, j, i) / SQR(rho);
+        },
+        v2_sum);
+
+#ifdef MPI_PARALLEL
+    PARTHENON_MPI_CHECK(MPI_Allreduce(MPI_IN_PLACE, &v2_sum, 1, MPI_PARTHENON_REAL,
+                                      MPI_SUM, MPI_COMM_WORLD));
+#endif // MPI_PARALLEL
+
+    const auto Lx = pmesh->mesh_size.x1max - pmesh->mesh_size.x1min;
+    const auto Ly = pmesh->mesh_size.x2max - pmesh->mesh_size.x2min;
+    const auto Lz = pmesh->mesh_size.x3max - pmesh->mesh_size.x3min;
+    auto v_norm = std::sqrt(v2_sum / (Lx * Ly * Lz) / (SQR(sigma_v)));
 
-    /************************************************************
-     * Apply the potential to the conserved variables
-     ************************************************************/
-    parthenon::par_for(
-        DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::ApplyMagneticPotential",
-        parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
-        KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
+    pmb->par_for(
+        "Norm sigma_v", 0, num_blocks - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+        KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) {
+          const auto &u = cons(b);
+
+          u(IM1, k, j, i) /= v_norm;
+          u(IM2, k, j, i) /= v_norm;
+          u(IM3, k, j, i) /= v_norm;
+
+          u(IEN, k, j, i) +=
+              0.5 * (SQR(u(IM1, k, j, i)) + SQR(u(IM2, k, j, i)) + SQR(u(IM3, k, j, i))) /
+              u(IDN, k, j, i);
+        });
+  }
+
+  /************************************************************
+   * Set initial magnetic field perturbations (resets magnetic field field)
+   ************************************************************/
+  const auto sigma_b = pin->GetOrAddReal("problem/cluster/init_perturb", "sigma_b", 0.0);
+  if (sigma_b != 0.0) {
+    auto few_modes_ft = hydro_pkg->Param<FewModesFT>("cluster/few_modes_ft_b");
+    // Init phases on all blocks
+    for (int b = 0; b < md->NumBlocks(); b++) {
+      auto pmb = md->GetBlockData(b)->GetBlockPointer();
+      few_modes_ft.SetPhases(pmb.get(), pin);
+    }
+    // As for t_corr in few_modes_ft, the choice for dt is
+    // in principle arbitrary because the inital b_hat is 0 and the b_hat_new will contain
+    // the perturbation (and is normalized in the following to get the desired sigma_b)
+    const Real dt = 1.0;
+    few_modes_ft.Generate(md, dt, "tmp_perturb");
+
+    Real b2_sum = 0.0; // used for normalization
+
+    auto perturb_pack = md->PackVariables(std::vector<std::string>{"tmp_perturb"});
+
+    pmb->par_reduce(
+        "Init sigma_b", 0, num_blocks - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+        KOKKOS_LAMBDA(const int b, const int k, const int j, const int i, Real &lsum) {
+          const auto &coords = cons.GetCoords(b);
+          const auto &u = cons(b);
+          // The following restriction could be lifted, but requires refactoring of the
+          // logic for the normalization/reduction below
+          PARTHENON_REQUIRE(
+              u(IB1, k, j, i) == 0.0 && u(IB2, k, j, i) == 0.0 && u(IB3, k, j, i) == 0.0,
+              "Found existing non-zero B when setting magnetic field perturbations.");
           u(IB1, k, j, i) =
-              (A(2, k, j + 1, i) - A(2, k, j - 1, i)) / coords.Dxc<2>(j) / 2.0 -
-              (A(1, k + 1, j, i) - A(1, k - 1, j, i)) / coords.Dxc<3>(k) / 2.0;
+              (perturb_pack(b, 2, k, j + 1, i) - perturb_pack(b, 2, k, j - 1, i)) /
+                  coords.Dxc<2>(j) / 2.0 -
+              (perturb_pack(b, 1, k + 1, j, i) - perturb_pack(b, 1, k - 1, j, i)) /
+                  coords.Dxc<3>(k) / 2.0;
           u(IB2, k, j, i) =
-              (A(0, k + 1, j, i) - A(0, k - 1, j, i)) / coords.Dxc<3>(k) / 2.0 -
-              (A(2, k, j, i + 1) - A(2, k, j, i - 1)) / coords.Dxc<1>(i) / 2.0;
+              (perturb_pack(b, 0, k + 1, j, i) - perturb_pack(b, 0, k - 1, j, i)) /
+                  coords.Dxc<3>(k) / 2.0 -
+              (perturb_pack(b, 2, k, j, i + 1) - perturb_pack(b, 2, k, j, i - 1)) /
+                  coords.Dxc<1>(i) / 2.0;
           u(IB3, k, j, i) =
-              (A(1, k, j, i + 1) - A(1, k, j, i - 1)) / coords.Dxc<1>(i) / 2.0 -
-              (A(0, k, j + 1, i) - A(0, k, j - 1, i)) / coords.Dxc<2>(j) / 2.0;
+              (perturb_pack(b, 1, k, j, i + 1) - perturb_pack(b, 1, k, j, i - 1)) /
+                  coords.Dxc<1>(i) / 2.0 -
+              (perturb_pack(b, 0, k, j + 1, i) - perturb_pack(b, 0, k, j - 1, i)) /
+                  coords.Dxc<2>(j) / 2.0;
+
+          // No need to touch the energy yet as we'll normalize later
+          lsum += (SQR(u(IB1, k, j, i)) + SQR(u(IB2, k, j, i)) + SQR(u(IB3, k, j, i))) *
+                  coords.CellVolume(k, j, i);
+        },
+        b2_sum);
+
+#ifdef MPI_PARALLEL
+    PARTHENON_MPI_CHECK(MPI_Allreduce(MPI_IN_PLACE, &b2_sum, 1, MPI_PARTHENON_REAL,
+                                      MPI_SUM, MPI_COMM_WORLD));
+#endif // MPI_PARALLEL
+
+    const auto Lx = pmesh->mesh_size.x1max - pmesh->mesh_size.x1min;
+    const auto Ly = pmesh->mesh_size.x2max - pmesh->mesh_size.x2min;
+    const auto Lz = pmesh->mesh_size.x3max - pmesh->mesh_size.x3min;
+    auto b_norm = std::sqrt(b2_sum / (Lx * Ly * Lz) / (SQR(sigma_b)));
+
+    pmb->par_for(
+        "Norm sigma_b", 0, num_blocks - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+        KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) {
+          const auto &u = cons(b);
+
+          u(IB1, k, j, i) /= b_norm;
+          u(IB2, k, j, i) /= b_norm;
+          u(IB3, k, j, i) /= b_norm;
 
           u(IEN, k, j, i) +=
               0.5 * (SQR(u(IB1, k, j, i)) + SQR(u(IB2, k, j, i)) + SQR(u(IB3, k, j, i)));
         });
-
-    /************************************************************
-     * Add uniform magnetic field to the conserved variables
-     ************************************************************/
-    const auto &init_uniform_b_field = hydro_pkg->Param<bool>("init_uniform_b_field");
-    if (init_uniform_b_field) {
-      const Real bx = hydro_pkg->Param<Real>("uniform_b_field_bx");
-      const Real by = hydro_pkg->Param<Real>("uniform_b_field_by");
-      const Real bz = hydro_pkg->Param<Real>("uniform_b_field_bz");
-      parthenon::par_for(
-          DEFAULT_LOOP_PATTERN, "cluster::ProblemGenerator::ApplyUniformBField",
-          parthenon::DevExecSpace(), kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
-          KOKKOS_LAMBDA(const int &k, const int &j, const int &i) {
-            const Real bx_i = u(IB1, k, j, i);
-            const Real by_i = u(IB2, k, j, i);
-            const Real bz_i = u(IB3, k, j, i);
-
-            u(IB1, k, j, i) += bx;
-            u(IB2, k, j, i) += by;
-            u(IB3, k, j, i) += bz;
-
-            // Old magnetic energy is b_i^2, new Magnetic energy should be 0.5*(b_i +
-            // b)^2, add b_i*b + 0.5b^2  to old energy to accomplish that
-            u(IEN, k, j, i) +=
-                bx_i * bx + by_i * by + bz_i * bz + 0.5 * (SQR(bx) + SQR(by) + SQR(bz));
-          });
-      // end if(init_uniform_b_field)
-    }
-
-  } // END if(hydro_pkg->Param<Fluid>("fluid") == Fluid::glmmhd)
+  }
 }
 
 void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin) {
@@ -452,7 +838,9 @@ void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin) {
   auto &temperature = data->Get("temperature").data;
 
   // for computing temperature from primitives
+  auto units = pkg->Param<Units>("units");
   auto mbar_over_kb = pkg->Param<Real>("mbar_over_kb");
+  auto mbar = mbar_over_kb * units.k_boltzmann();
 
   // fill derived vars (*including ghost cells*)
   auto &coords = pmb->coords;
@@ -478,7 +866,7 @@ void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin) {
         log10_radius(k, j, i) = 0.5 * std::log10(r2);
 
         // compute entropy
-        const Real K = P / std::pow(rho, gam);
+        const Real K = P / std::pow(rho / mbar, gam);
         entropy(k, j, i) = K;
 
         const Real v_mag = std::sqrt(SQR(v1) + SQR(v2) + SQR(v3));
diff --git a/src/pgen/cluster/agn_feedback.cpp b/src/pgen/cluster/agn_feedback.cpp
index cd367150..1c76e913 100644
--- a/src/pgen/cluster/agn_feedback.cpp
+++ b/src/pgen/cluster/agn_feedback.cpp
@@ -17,7 +17,7 @@
 #include <parameter_input.hpp>
 #include <parthenon/package.hpp>
 
-// Athena headers
+// AthenaPK headers
 #include "../../eos/adiabatic_glmmhd.hpp"
 #include "../../eos/adiabatic_hydro.hpp"
 #include "../../main.hpp"
@@ -33,6 +33,8 @@ using namespace parthenon;
 AGNFeedback::AGNFeedback(parthenon::ParameterInput *pin,
                          parthenon::StateDescriptor *hydro_pkg)
     : fixed_power_(pin->GetOrAddReal("problem/cluster/agn_feedback", "fixed_power", 0.0)),
+      vceil_(pin->GetOrAddReal("problem/cluster/agn_feedback", "vceil",
+                               std::numeric_limits<Real>::infinity())),
       efficiency_(pin->GetOrAddReal("problem/cluster/agn_feedback", "efficiency", 1e-3)),
       thermal_fraction_(
           pin->GetOrAddReal("problem/cluster/agn_feedback", "thermal_fraction", 0.0)),
@@ -119,7 +121,8 @@ AGNFeedback::AGNFeedback(parthenon::ParameterInput *pin,
                                              (1 - efficiency_) * kinetic_jet_e_))) <
           10 * std::numeric_limits<Real>::epsilon(),
       "Specified kinetic jet velocity and temperature are incompatible with mass to "
-      "energy conversion efficiency. Choose either velocity or temperature.");
+      "energy conversion efficiency. Either the specified velocity, temperature, or "
+      "efficiency are incompatible");
 
   PARTHENON_REQUIRE(kinetic_jet_velocity_ <=
                         units.speed_of_light() * sqrt(2 * efficiency_),
@@ -134,6 +137,11 @@ AGNFeedback::AGNFeedback(parthenon::ParameterInput *pin,
   PARTHENON_REQUIRE(kinetic_jet_temperature_ >= 0,
                     "Kinetic jet temperature must be non-negative");
 
+  // Compute the internal energy ceiling from the temperature ceiling
+  const Real tceil = pin->GetOrAddReal("problem/cluster/agn_feedback", "Tceil",
+                                       std::numeric_limits<Real>::infinity());
+  eceil_ = tceil / mbar_gm1_over_kb;
+
   // Add user history output variable for AGN power
   auto hst_vars = hydro_pkg->Param<parthenon::HstVar_list>(parthenon::hist_param_key);
   if (!disabled_) {
@@ -261,15 +269,18 @@ void AGNFeedback::FeedbackSrcTerm(parthenon::MeshData<parthenon::Real> *md,
 
   // Velocity of added gas
   const Real jet_velocity = kinetic_jet_velocity_;
-#ifndef NDEBUG
   const Real jet_specific_internal_e = kinetic_jet_e_;
-#endif
 
   // Amount of momentum density ( density * velocity) to dump in each cell
   const Real jet_momentum = jet_density * jet_velocity;
 
   // Amount of total energy to dump in each cell
   const Real jet_feedback = kinetic_fraction_ * power * kinetic_scaling_factor * beta_dt;
+
+  const Real vceil = vceil_;
+  const Real vceil2 = SQR(vceil);
+  const Real eceil = eceil_;
+  const Real gm1 = (hydro_pkg->Param<Real>("AdiabaticIndex") - 1.0);
   ////////////////////////////////////////////////////////////////////////////////
 
   const parthenon::Real time = tm.time;
@@ -278,7 +289,7 @@ void AGNFeedback::FeedbackSrcTerm(parthenon::MeshData<parthenon::Real> *md,
       hydro_pkg->Param<JetCoordsFactory>("jet_coords_factory");
   const JetCoords jet_coords = jet_coords_factory.CreateJetCoords(time);
 
-  // Constant volumetric heating
+  // Appy kinietic jet and thermal feedback
   parthenon::par_for(
       DEFAULT_LOOP_PATTERN, "HydroAGNFeedback::FeedbackSrcTerm",
       parthenon::DevExecSpace(), 0, cons_pack.GetDim(5) - 1, kb.s, kb.e, jb.s, jb.e, ib.s,
@@ -321,17 +332,15 @@ void AGNFeedback::FeedbackSrcTerm(parthenon::MeshData<parthenon::Real> *md,
 
             const Real sign_jet = (h > 0) ? 1 : -1; // Above or below jet-disk
 
-        ///////////////////////////////////////////////////////////////////
-        //  We add the kinetic jet with a fixed jet velocity and specific
-        //  internal energy/temperature of the added gas. The density,
-        //  momentum, and total energy added depend on the triggered power.
-        ///////////////////////////////////////////////////////////////////
+            ///////////////////////////////////////////////////////////////////
+            //  We add the kinetic jet with a fixed jet velocity and specific
+            //  internal energy/temperature of the added gas. The density,
+            //  momentum, and total energy added depend on the triggered power.
+            ///////////////////////////////////////////////////////////////////
 
-#ifndef NDEBUG
             eos.ConsToPrim(cons, prim, nhydro, nscalars, k, j, i);
             const Real old_specific_internal_e =
                 prim(IPR, k, j, i) / (prim(IDN, k, j, i) * (eos.GetGamma() - 1.));
-#endif
 
             cons(IDN, k, j, i) += jet_density;
             cons(IM1, k, j, i) += jet_momentum * sign_jet * jet_axis_x;
@@ -339,20 +348,43 @@ void AGNFeedback::FeedbackSrcTerm(parthenon::MeshData<parthenon::Real> *md,
             cons(IM3, k, j, i) += jet_momentum * sign_jet * jet_axis_z;
             cons(IEN, k, j, i) += jet_feedback;
 
-#ifndef NDEBUG
             eos.ConsToPrim(cons, prim, nhydro, nscalars, k, j, i);
             const Real new_specific_internal_e =
                 prim(IPR, k, j, i) / (prim(IDN, k, j, i) * (eos.GetGamma() - 1.));
-            PARTHENON_DEBUG_REQUIRE(
+            PARTHENON_REQUIRE(
                 new_specific_internal_e > jet_specific_internal_e ||
                     new_specific_internal_e > old_specific_internal_e,
                 "Kinetic injection leads to temperature below jet and existing gas");
-#endif
+          }
+
+          // Apply velocity ceiling
+          const Real v2 =
+              SQR(prim(IV1, k, j, i)) + SQR(prim(IV2, k, j, i)) + SQR(prim(IV3, k, j, i));
+          if (v2 > vceil2) {
+            // Fix the velocity to the velocity ceiling
+            const Real v = sqrt(v2);
+            cons(IM1, k, j, i) *= vceil / v;
+            cons(IM2, k, j, i) *= vceil / v;
+            cons(IM3, k, j, i) *= vceil / v;
+            prim(IV1, k, j, i) *= vceil / v;
+            prim(IV2, k, j, i) *= vceil / v;
+            prim(IV3, k, j, i) *= vceil / v;
+
+            // Remove kinetic energy
+            cons(IEN, k, j, i) -= 0.5 * prim(IDN, k, j, i) * (v2 - vceil2);
+          }
+
+          // Apply  internal energy ceiling as a pressure ceiling
+          const Real internal_e = prim(IPR, k, j, i) / (gm1 * prim(IDN, k, j, i));
+          if (internal_e > eceil) {
+            cons(IEN, k, j, i) -= prim(IDN, k, j, i) * (internal_e - eceil);
+            prim(IPR, k, j, i) = gm1 * prim(IDN, k, j, i) * eceil;
           }
         }
+
         eos.ConsToPrim(cons, prim, nhydro, nscalars, k, j, i);
-        PARTHENON_DEBUG_REQUIRE(prim(IPR, k, j, i) > 0,
-                                "Kinetic injection leads to negative pressure");
+        PARTHENON_REQUIRE(prim(IPR, k, j, i) > 0,
+                          "Kinetic injection leads to negative pressure");
       });
 
   // Apply magnetic tower feedback
diff --git a/src/pgen/cluster/agn_feedback.hpp b/src/pgen/cluster/agn_feedback.hpp
index 1c88a61e..de516949 100644
--- a/src/pgen/cluster/agn_feedback.hpp
+++ b/src/pgen/cluster/agn_feedback.hpp
@@ -31,6 +31,9 @@ class AGNFeedback {
   // Efficiency converting mass to energy
   const parthenon::Real efficiency_;
 
+  // Velocity and temperature ceilings
+  parthenon::Real vceil_, eceil_;
+
   // Thermal Heating Parameters
   const parthenon::Real thermal_radius_;
 
diff --git a/src/pgen/cluster/agn_triggering.cpp b/src/pgen/cluster/agn_triggering.cpp
index 1f241d88..242de4fa 100644
--- a/src/pgen/cluster/agn_triggering.cpp
+++ b/src/pgen/cluster/agn_triggering.cpp
@@ -7,6 +7,7 @@
 //  \brief  Class for computing AGN triggering from Bondi-like and cold gas accretion
 
 #include <cmath>
+#include <fstream> // for ofstream
 #include <limits>
 
 // Parthenon headers
diff --git a/src/pgen/pgen.hpp b/src/pgen/pgen.hpp
index 84e36b00..3050cbb5 100644
--- a/src/pgen/pgen.hpp
+++ b/src/pgen/pgen.hpp
@@ -98,7 +98,7 @@ using namespace parthenon::driver::prelude;
 
 void ProblemInitPackageData(ParameterInput *pin, parthenon::StateDescriptor *pkg);
 void InitUserMeshData(ParameterInput *pin);
-void ProblemGenerator(MeshBlock *pmb, parthenon::ParameterInput *pin);
+void ProblemGenerator(Mesh *pmesh, ParameterInput *pin, MeshData<Real> *md);
 void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin);
 void ClusterSrcTerm(MeshData<Real> *md, const parthenon::SimTime &tm, const Real beta_dt);
 parthenon::Real ClusterEstimateTimestep(MeshData<Real> *md);
diff --git a/src/pgen/turbulence.cpp b/src/pgen/turbulence.cpp
index 91bb935a..58882f52 100644
--- a/src/pgen/turbulence.cpp
+++ b/src/pgen/turbulence.cpp
@@ -1,6 +1,6 @@
 //========================================================================================
 // AthenaPK - a performance portable block structured AMR astrophysical MHD code.
-// Copyright (c) 20212-2022, Athena-Parthenon Collaboration. All rights reserved.
+// Copyright (c) 2021-2023, Athena-Parthenon Collaboration. All rights reserved.
 // Licensed under the 3-clause BSD License, see LICENSE file for details
 //========================================================================================
 //! \file turbulence.cpp
@@ -27,21 +27,14 @@
 // AthenaPK headers
 #include "../main.hpp"
 #include "../units.hpp"
+#include "../utils/few_modes_ft.hpp"
 
 namespace turbulence {
 using namespace parthenon::package::prelude;
-
-typedef Kokkos::complex<Real> Complex;
 using parthenon::DevMemSpace;
 using parthenon::ParArray2D;
-
-// Defining these "globally" as they are fixed across all blocks
-ParArray2D<Complex> accel_hat_, accel_hat_new_;
-ParArray2D<Real> k_vec_;
-Kokkos::View<Real ***, Kokkos::LayoutRight, DevMemSpace> random_num_;
-Kokkos::View<Real ***, Kokkos::LayoutRight, parthenon::HostMemSpace> random_num_host;
-std::mt19937 rng;
-std::uniform_real_distribution<> dist(-1.0, 1.0);
+using utils::few_modes_ft::Complex;
+using utils::few_modes_ft::FewModesFT;
 
 // TODO(?) until we are able to process multiple variables in a single hst function call
 // we'll use this enum to identify the various vars.
@@ -131,69 +124,50 @@ void ProblemInitPackageData(ParameterInput *pin, parthenon::StateDescriptor *pkg
 
   auto num_modes =
       pin->GetInteger("problem/turbulence", "num_modes"); // number of wavemodes
-  if ((num_modes > 100) && (parthenon::Globals::my_rank == 0)) {
-    std::cout << "### WARNING using more than 100 explicit modes will significantly "
-              << "increase the runtime." << std::endl
-              << "If many modes are required in the acceleration field consider using "
-              << "the driving mechanism based on full FFTs." << std::endl;
-  }
-  pkg->AddParam<>("turbulence/num_modes", num_modes);
-
-  const auto nx1 = pin->GetInteger("parthenon/meshblock", "nx1");
-  const auto nx2 = pin->GetInteger("parthenon/meshblock", "nx2");
-  const auto nx3 = pin->GetInteger("parthenon/meshblock", "nx3");
-  m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
-               std::vector<int>({2, num_modes, nx1}), "phases_i");
-  pkg->AddField("phases_i", m);
-  m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
-               std::vector<int>({2, num_modes, nx2}), "phases_j");
-  pkg->AddField("phases_j", m);
-  m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
-               std::vector<int>({2, num_modes, nx3}), "phases_k");
-  pkg->AddField("phases_k", m);
 
   uint32_t rseed =
       pin->GetOrAddInteger("problem/turbulence", "rseed", -1); // seed for random number.
   pkg->AddParam<>("turbulence/rseed", rseed);
 
-  auto kpeak =
+  auto k_peak =
       pin->GetOrAddReal("problem/turbulence", "kpeak", 0.0); // peak of the forcing spec
-  pkg->AddParam<>("turbulence/kpeak", kpeak);
+  pkg->AddParam<>("turbulence/kpeak", k_peak);
 
   auto accel_rms =
       pin->GetReal("problem/turbulence", "accel_rms"); // turbulence amplitude
   pkg->AddParam<>("turbulence/accel_rms", accel_rms);
 
-  auto tcorr =
+  auto t_corr =
       pin->GetReal("problem/turbulence", "corr_time"); // forcing autocorrelation time
-  pkg->AddParam<>("turbulence/tcorr", tcorr);
+  pkg->AddParam<>("turbulence/t_corr", t_corr);
 
   Real sol_weight = pin->GetReal("problem/turbulence", "sol_weight"); // solenoidal weight
   pkg->AddParam<>("turbulence/sol_weight", sol_weight);
 
-  // Acceleration field in Fourier space using complex to real transform.
-  accel_hat_ = ParArray2D<Complex>("accel_hat", 3, num_modes);
-  accel_hat_new_ = ParArray2D<Complex>("accel_hat_new", 3, num_modes);
-
   // list of wavenumber vectors
-  k_vec_ = ParArray2D<Real>("k_vec", 3, num_modes);
-  auto k_vec_host = Kokkos::create_mirror_view(k_vec_);
+  auto k_vec = ParArray2D<Real>("k_vec", 3, num_modes);
+  auto k_vec_host = Kokkos::create_mirror_view(k_vec);
   for (int j = 0; j < 3; j++) {
     for (int i = 1; i <= num_modes; i++) {
       k_vec_host(j, i - 1) =
           pin->GetInteger("modes", "k_" + std::to_string(i) + "_" + std::to_string(j));
     }
   }
-  Kokkos::deep_copy(k_vec_, k_vec_host);
+  Kokkos::deep_copy(k_vec, k_vec_host);
 
-  random_num_ = Kokkos::View<Real ***, Kokkos::LayoutRight, DevMemSpace>("random_num", 3,
-                                                                         num_modes, 2);
-  random_num_host = Kokkos::create_mirror_view(random_num_);
+  auto few_modes_ft = FewModesFT(pin, pkg, "turbulence", num_modes, k_vec, k_peak,
+                                 sol_weight, t_corr, rseed);
+  // object must be mutable to update the internal state of the RNG
+  pkg->AddParam<>("turbulence/few_modes_ft", few_modes_ft, true);
 
   // Check if this is is a restart and restore previous state
   if (pin->DoesParameterExist("problem/turbulence", "accel_hat_0_0_r")) {
+    // Need to extract mutable object from Params here as the original few_modes_ft above
+    // and the one in Params are different instances
+    auto *pfew_modes_ft = pkg->MutableParam<FewModesFT>("turbulence/few_modes_ft");
     // Restore (common) acceleration field in spectral space
-    auto accel_hat_host = Kokkos::create_mirror_view(accel_hat_);
+    auto accel_hat = pfew_modes_ft->GetVarHat();
+    auto accel_hat_host = Kokkos::create_mirror_view(accel_hat);
     for (int i = 0; i < 3; i++) {
       for (int m = 0; m < num_modes; m++) {
         auto real =
@@ -205,133 +179,26 @@ void ProblemInitPackageData(ParameterInput *pin, parthenon::StateDescriptor *pkg
         accel_hat_host(i, m) = Complex(real, imag);
       }
     }
-    Kokkos::deep_copy(accel_hat_, accel_hat_host);
+    Kokkos::deep_copy(accel_hat, accel_hat_host);
 
     // Restore state of random number gen
     {
       std::istringstream iss(pin->GetString("problem/turbulence", "state_rng"));
-      iss >> rng;
+      pfew_modes_ft->RestoreRNG(iss);
     }
     // Restore state of dist
     {
       std::istringstream iss(pin->GetString("problem/turbulence", "state_dist"));
-      iss >> dist;
+      pfew_modes_ft->RestoreDist(iss);
     }
-
-  } else {
-    // init RNG
-    rng.seed(rseed);
   }
 }
 
+// SetPhases is used as InitMeshBlockUserData because phases need to be reset on remeshing
 void SetPhases(MeshBlock *pmb, ParameterInput *pin) {
-  auto pm = pmb->pmy_mesh;
   auto hydro_pkg = pmb->packages.Get("Hydro");
-
-  // The following restriction could technically be lifted if the turbulence driver is
-  // directly embedded in the hydro driver rather than a user defined source as well as
-  // fixing the pack_size=-1 when using the Mesh- (not MeshBlock-)based problem generator.
-  // The restriction stems from requiring a collective MPI comm to normalize the
-  // acceleration and magnetic field, respectively. Note, that the restriction does not
-  // apply here, but for the ProblemGenerator() and Driving() function below. The check is
-  // just added here for convenience as this function is called during problem
-  // initializtion. From my (pgrete) point of view, it's currently cleaner to keep things
-  // separate and not touch the main driver at the expense of using one pack per rank --
-  // which is typically fastest on devices anyway.
-  const auto pack_size = pin->GetInteger("parthenon/mesh", "pack_size");
-  PARTHENON_REQUIRE_THROWS(pack_size == -1,
-                           "Turbulence pgen currently needs parthenon/mesh/pack_size=-1 "
-                           "to work because of global reductions.")
-
-  auto Lx = pm->mesh_size.x1max - pm->mesh_size.x1min;
-  auto Ly = pm->mesh_size.x2max - pm->mesh_size.x2min;
-  auto Lz = pm->mesh_size.x3max - pm->mesh_size.x3min;
-  // should also be easily fixed, just need to double check transforms and volume
-  // weighting everywhere
-  if ((Lx != 1.0) || (Ly != 1.0) || (Lz != 1.0)) {
-    std::stringstream msg;
-    msg << "### FATAL ERROR in turbulence driver" << std::endl
-        << "Only domain sizes with edge lengths of 1 are supported." << std::endl;
-    throw std::runtime_error(msg.str().c_str());
-  }
-
-  auto gnx1 = pm->mesh_size.nx1;
-  auto gnx2 = pm->mesh_size.nx2;
-  auto gnx3 = pm->mesh_size.nx3;
-  // as above, this restriction should/could be easily lifted
-  if ((gnx1 != gnx2) || (gnx2 != gnx3)) {
-    std::stringstream msg;
-    msg << "### FATAL ERROR in turbulence driver" << std::endl
-        << "Only cubic mesh sizes are supported." << std::endl;
-    throw std::runtime_error(msg.str().c_str());
-  }
-
-  const auto nx1 = pmb->block_size.nx1;
-  const auto nx2 = pmb->block_size.nx2;
-  const auto nx3 = pmb->block_size.nx3;
-
-  const auto gis = pmb->loc.lx1 * pmb->block_size.nx1;
-  const auto gjs = pmb->loc.lx2 * pmb->block_size.nx2;
-  const auto gks = pmb->loc.lx3 * pmb->block_size.nx3;
-
-  const auto num_modes = hydro_pkg->Param<int>("turbulence/num_modes");
-
-  // make local ref to capure in lambda
-  auto &k_vec = k_vec_;
-
-  Complex I(0.0, 1.0);
-
-  auto &base = pmb->meshblock_data.Get();
-  auto &phases_i = base->Get("phases_i").data;
-  auto &phases_j = base->Get("phases_j").data;
-  auto &phases_k = base->Get("phases_k").data;
-
-  pmb->par_for(
-      "forcing: calc phases_i", 0, nx1 - 1, KOKKOS_LAMBDA(int i) {
-        Real gi = static_cast<Real>(i + gis);
-        Real w_kx;
-        Complex phase;
-
-        for (int m = 0; m < num_modes; m++) {
-          w_kx = k_vec(0, m) * 2. * M_PI / static_cast<Real>(gnx1);
-          // adjust phase factor to Complex->Real IFT: u_hat*(k) = u_hat(-k)
-          if (k_vec(0, m) == 0.0) {
-            phase = 0.5 * Kokkos::exp(I * w_kx * gi);
-          } else {
-            phase = Kokkos::exp(I * w_kx * gi);
-          }
-          phases_i(i, m, 0) = phase.real();
-          phases_i(i, m, 1) = phase.imag();
-        }
-      });
-
-  pmb->par_for(
-      "forcing: calc phases_j", 0, nx2 - 1, KOKKOS_LAMBDA(int j) {
-        Real gj = static_cast<Real>(j + gjs);
-        Real w_ky;
-        Complex phase;
-
-        for (int m = 0; m < num_modes; m++) {
-          w_ky = k_vec(1, m) * 2. * M_PI / static_cast<Real>(gnx2);
-          phase = Kokkos::exp(I * w_ky * gj);
-          phases_j(j, m, 0) = phase.real();
-          phases_j(j, m, 1) = phase.imag();
-        }
-      });
-
-  pmb->par_for(
-      "forcing: calc phases_k", 0, nx3 - 1, KOKKOS_LAMBDA(int k) {
-        Real gk = static_cast<Real>(k + gks);
-        Real w_kz;
-        Complex phase;
-
-        for (int m = 0; m < num_modes; m++) {
-          w_kz = k_vec(2, m) * 2. * M_PI / static_cast<Real>(gnx3);
-          phase = Kokkos::exp(I * w_kz * gk);
-          phases_k(k, m, 0) = phase.real();
-          phases_k(k, m, 1) = phase.imag();
-        }
-      });
+  auto few_modes_ft = hydro_pkg->Param<FewModesFT>("turbulence/few_modes_ft");
+  few_modes_ft.SetPhases(pmb, pin);
 }
 
 //========================================================================================
@@ -475,150 +342,9 @@ void ProblemGenerator(Mesh *pmesh, ParameterInput *pin, MeshData<Real> *md) {
 void Generate(MeshData<Real> *md, Real dt) {
   auto pmb = md->GetBlockData(0)->GetBlockPointer();
   auto hydro_pkg = pmb->packages.Get("Hydro");
-
-  const auto num_modes = hydro_pkg->Param<int>("turbulence/num_modes");
-
-  Complex I(0.0, 1.0);
-  auto &random_num = random_num_;
-
-  // get a set of random numbers from the CPU so that they are deterministic
-  // when run on GPUs
-  Real v1, v2, v_sqr;
-  for (int n = 0; n < 3; n++)
-    for (int m = 0; m < num_modes; m++) {
-      do {
-        v1 = dist(rng);
-        v2 = dist(rng);
-        v_sqr = v1 * v1 + v2 * v2;
-      } while (v_sqr >= 1.0 || v_sqr == 0.0);
-
-      random_num_host(n, m, 0) = v1;
-      random_num_host(n, m, 1) = v2;
-    }
-  Kokkos::deep_copy(random_num, random_num_host);
-
-  // make local ref to capure in lambda
-  auto &k_vec = k_vec_;
-  auto &accel_hat = accel_hat_;
-  auto &accel_hat_new = accel_hat_new_;
-
-  const auto kpeak = hydro_pkg->Param<Real>("turbulence/kpeak");
-  // generate new power spectrum (injection)
-  pmb->par_for(
-      "forcing: new power spec", 0, 2, 0, num_modes - 1,
-      KOKKOS_LAMBDA(const int n, const int m) {
-        Real kmag, tmp, norm, v_sqr;
-
-        Real kx = k_vec(0, m);
-        Real ky = k_vec(1, m);
-        Real kz = k_vec(2, m);
-
-        kmag = std::sqrt(kx * kx + ky * ky + kz * kz);
-
-        accel_hat_new(n, m) = Complex(0., 0.);
-
-        tmp = std::pow(kmag / kpeak, 2.) * (2. - std::pow(kmag / kpeak, 2.));
-        if (tmp < 0.) tmp = 0.;
-        v_sqr = SQR(random_num(n, m, 0)) + SQR(random_num(n, m, 1));
-        norm = std::sqrt(-2.0 * std::log(v_sqr) / v_sqr);
-
-        accel_hat_new(n, m) =
-            Complex(tmp * norm * random_num(n, m, 0), tmp * norm * random_num(n, m, 1));
-      });
-
-  // enforce symmetry of complex to real transform
-  pmb->par_for(
-      "forcing: enforce symmetry", 0, 2, 0, num_modes - 1,
-      KOKKOS_LAMBDA(const int n, const int m) {
-        if (k_vec(0, m) == 0.) {
-          for (int m2 = 0; m2 < m; m2++) {
-            if (k_vec(1, m) == -k_vec(1, m2) && k_vec(2, m) == -k_vec(2, m2))
-              accel_hat_new(n, m) =
-                  Complex(accel_hat_new(n, m2).real(), -accel_hat_new(n, m2).imag());
-          }
-        }
-      });
-
-  const auto sol_weight = hydro_pkg->Param<Real>("turbulence/sol_weight");
-  // project
-  pmb->par_for(
-      "forcing: projection", 0, num_modes - 1, KOKKOS_LAMBDA(const int m) {
-        Real kmag;
-
-        Real kx = k_vec(0, m);
-        Real ky = k_vec(1, m);
-        Real kz = k_vec(2, m);
-
-        kmag = std::sqrt(kx * kx + ky * ky + kz * kz);
-
-        // setting kmag to 1 as a "continue" doesn't work within the parallel_for
-        // construct and it doesn't affect anything (there should never be power in the
-        // k=0 mode)
-        if (kmag == 0.) kmag = 1.;
-
-        // make it a unit vector
-        kx /= kmag;
-        ky /= kmag;
-        kz /= kmag;
-
-        Complex dot(accel_hat_new(0, m).real() * kx + accel_hat_new(1, m).real() * ky +
-                        accel_hat_new(2, m).real() * kz,
-                    accel_hat_new(0, m).imag() * kx + accel_hat_new(1, m).imag() * ky +
-                        accel_hat_new(2, m).imag() * kz);
-
-        accel_hat_new(0, m) = Complex(accel_hat_new(0, m).real() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.real() * kx,
-                                      accel_hat_new(0, m).imag() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.imag() * kx);
-        accel_hat_new(1, m) = Complex(accel_hat_new(1, m).real() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.real() * ky,
-                                      accel_hat_new(1, m).imag() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.imag() * ky);
-        accel_hat_new(2, m) = Complex(accel_hat_new(2, m).real() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.real() * kz,
-                                      accel_hat_new(2, m).imag() * sol_weight +
-                                          (1. - 2. * sol_weight) * dot.imag() * kz);
-      });
-
-  // evolve
-  const auto tcorr = hydro_pkg->Param<Real>("turbulence/tcorr");
-  Real c_drift = std::exp(-dt / tcorr);
-  Real c_diff = std::sqrt(1.0 - c_drift * c_drift);
-
-  pmb->par_for(
-      "forcing: evolve spec", 0, 2, 0, num_modes - 1,
-      KOKKOS_LAMBDA(const int n, const int m) {
-        accel_hat(n, m) = Complex(
-            accel_hat(n, m).real() * c_drift + accel_hat_new(n, m).real() * c_diff,
-            accel_hat(n, m).imag() * c_drift + accel_hat_new(n, m).imag() * c_diff);
-      });
-
-  IndexRange ib = md->GetBlockData(0)->GetBoundsI(IndexDomain::interior);
-  IndexRange jb = md->GetBlockData(0)->GetBoundsJ(IndexDomain::interior);
-  IndexRange kb = md->GetBlockData(0)->GetBoundsK(IndexDomain::interior);
-  auto acc_pack = md->PackVariables(std::vector<std::string>{"acc"});
-  auto phases_i = md->PackVariables(std::vector<std::string>{"phases_i"});
-  auto phases_j = md->PackVariables(std::vector<std::string>{"phases_j"});
-  auto phases_k = md->PackVariables(std::vector<std::string>{"phases_k"});
-  // implictly assuming cubic box of size L=1
-  pmb->par_for(
-      "Inverse FT", 0, acc_pack.GetDim(5) - 1, 0, 2, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
-      KOKKOS_LAMBDA(const int b, const int n, const int k, const int j, const int i) {
-        Complex phase, phase_i, phase_j, phase_k;
-        acc_pack(b, n, k, j, i) = 0.0;
-
-        for (int m = 0; m < num_modes; m++) {
-          phase_i =
-              Complex(phases_i(b, 0, i - ib.s, m, 0), phases_i(b, 0, i - ib.s, m, 1));
-          phase_j =
-              Complex(phases_j(b, 0, j - jb.s, m, 0), phases_j(b, 0, j - jb.s, m, 1));
-          phase_k =
-              Complex(phases_k(b, 0, k - kb.s, m, 0), phases_k(b, 0, k - kb.s, m, 1));
-          phase = phase_i * phase_j * phase_k;
-          acc_pack(b, n, k, j, i) += 2. * (accel_hat(n, m).real() * phase.real() -
-                                           accel_hat(n, m).imag() * phase.imag());
-        }
-      });
+  // Must be mutable so the internal RNG state is updated
+  auto *few_modes_ft = hydro_pkg->MutableParam<FewModesFT>("turbulence/few_modes_ft");
+  few_modes_ft->Generate(md, dt, "acc");
 }
 
 //----------------------------------------------------------------------------------------
@@ -721,24 +447,16 @@ void Driving(MeshData<Real> *md, const parthenon::SimTime &tm, const Real dt) {
   Perturb(md, dt);
 }
 
-void Cleanup() {
-  // Ensure the Kokkos views are gargabe collected before finalized is called
-  k_vec_ = {};
-  accel_hat_ = {};
-  accel_hat_new_ = {};
-  random_num_ = {};
-  random_num_host = {};
-}
-
 void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin) {
   auto hydro_pkg = pmb->packages.Get("Hydro");
 
-  const auto num_modes = hydro_pkg->Param<int>("turbulence/num_modes");
-
   // Store (common) acceleration field in spectral space
+  auto few_modes_ft = hydro_pkg->Param<FewModesFT>("turbulence/few_modes_ft");
+  auto var_hat = few_modes_ft.GetVarHat();
   auto accel_hat_host =
-      Kokkos::create_mirror_view_and_copy(parthenon::HostMemSpace(), accel_hat_);
+      Kokkos::create_mirror_view_and_copy(parthenon::HostMemSpace(), var_hat);
 
+  const auto num_modes = few_modes_ft.GetNumModes();
   for (int i = 0; i < 3; i++) {
     for (int m = 0; m < num_modes; m++) {
       pin->SetReal("problem/turbulence",
@@ -750,17 +468,11 @@ void UserWorkBeforeOutput(MeshBlock *pmb, ParameterInput *pin) {
     }
   }
   // store state of random number gen
-  {
-    std::ostringstream oss;
-    oss << rng;
-    pin->SetString("problem/turbulence", "state_rng", oss.str());
-  }
+  auto state_rng = few_modes_ft.GetRNGState();
+  pin->SetString("problem/turbulence", "state_rng", state_rng);
   // store state of distribution
-  {
-    std::ostringstream oss;
-    oss << dist;
-    pin->SetString("problem/turbulence", "state_dist", oss.str());
-  }
+  auto state_dist = few_modes_ft.GetDistState();
+  pin->SetString("problem/turbulence", "state_dist", state_dist);
 }
 
 } // namespace turbulence
diff --git a/src/utils/few_modes_ft.cpp b/src/utils/few_modes_ft.cpp
new file mode 100644
index 00000000..f80fd103
--- /dev/null
+++ b/src/utils/few_modes_ft.cpp
@@ -0,0 +1,406 @@
+//========================================================================================
+// AthenaPK - a performance portable block structured AMR astrophysical MHD code.
+// Copyright (c) 2023, Athena-Parthenon Collaboration. All rights reserved.
+// Licensed under the 3-clause BSD License, see LICENSE file for details
+//========================================================================================
+//========================================================================================
+//! \file few_modes_ft.cpp
+//  \brief Helper functions for an inverse (explicit complex to real) FT
+
+// C++ headers
+#include <random>
+#include <utility>
+
+// Parthenon headers
+#include "basic_types.hpp"
+#include "config.hpp"
+#include "globals.hpp"
+#include "kokkos_abstraction.hpp"
+#include "mesh/domain.hpp"
+#include "mesh/meshblock_pack.hpp"
+
+// AthenaPK headers
+#include "../main.hpp"
+#include "few_modes_ft.hpp"
+#include "utils/error_checking.hpp"
+
+namespace utils::few_modes_ft {
+using Complex = Kokkos::complex<parthenon::Real>;
+using parthenon::IndexRange;
+using parthenon::Metadata;
+
+FewModesFT::FewModesFT(parthenon::ParameterInput *pin, parthenon::StateDescriptor *pkg,
+                       std::string prefix, int num_modes, ParArray2D<Real> k_vec,
+                       Real k_peak, Real sol_weight, Real t_corr, uint32_t rseed,
+                       bool fill_ghosts)
+    : prefix_(prefix), num_modes_(num_modes), k_vec_(k_vec), k_peak_(k_peak),
+      t_corr_(t_corr), fill_ghosts_(fill_ghosts) {
+
+  if ((num_modes > 100) && (parthenon::Globals::my_rank == 0)) {
+    std::cout << "### WARNING using more than 100 explicit modes will significantly "
+              << "increase the runtime." << std::endl
+              << "If many modes are required in the transform field consider using "
+              << "the driving mechanism based on full FFTs." << std::endl;
+  }
+  // Ensure that all all wavevectors can be represented on the root grid
+  const auto gnx1 = pin->GetInteger("parthenon/mesh", "nx1");
+  const auto gnx2 = pin->GetInteger("parthenon/mesh", "nx2");
+  const auto gnx3 = pin->GetInteger("parthenon/mesh", "nx3");
+  // Need to make this comparison on the host as (for some reason) an extended cuda device
+  // lambda cannot live in the constructor of an object.
+  auto k_vec_host = k_vec.GetHostMirrorAndCopy();
+  for (int i = 0; i < num_modes; i++) {
+    PARTHENON_REQUIRE(std::abs(k_vec_host(0, i)) <= gnx1 / 2, "k_vec x1 mode too large");
+    PARTHENON_REQUIRE(std::abs(k_vec_host(1, i)) <= gnx2 / 2, "k_vec x2 mode too large");
+    PARTHENON_REQUIRE(std::abs(k_vec_host(2, i)) <= gnx3 / 2, "k_vec x3 mode too large");
+  }
+
+  const auto nx1 = pin->GetInteger("parthenon/meshblock", "nx1");
+  const auto nx2 = pin->GetInteger("parthenon/meshblock", "nx2");
+  const auto nx3 = pin->GetInteger("parthenon/meshblock", "nx3");
+  const auto ng_tot = fill_ghosts_ ? 2 * parthenon::Globals::nghost : 0;
+  auto m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
+                    std::vector<int>({2, num_modes, nx1 + ng_tot}), prefix + "_phases_i");
+  pkg->AddField(prefix + "_phases_i", m);
+  m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
+               std::vector<int>({2, num_modes, nx2 + ng_tot}), prefix + "_phases_j");
+  pkg->AddField(prefix + "_phases_j", m);
+  m = Metadata({Metadata::None, Metadata::Derived, Metadata::OneCopy},
+               std::vector<int>({2, num_modes, nx3 + ng_tot}), prefix + "_phases_k");
+  pkg->AddField(prefix + "_phases_k", m);
+
+  // Variable (e.g., acceleration field for turbulence driver) in Fourier space using
+  // complex to real transform.
+  var_hat_ = ParArray2D<Complex>(prefix + "_var_hat", 3, num_modes);
+  var_hat_new_ = ParArray2D<Complex>(prefix + "_var_hat_new", 3, num_modes);
+
+  PARTHENON_REQUIRE((sol_weight == -1.0) || (sol_weight >= 0.0 && sol_weight <= 1.0),
+                    "sol_weight for projection in few modes fft module needs to be "
+                    "between 0.0 and 1.0 or set to -1.0 (to disable projection).")
+  sol_weight_ = sol_weight;
+
+  random_num_ = Kokkos::View<Real ***, Kokkos::LayoutRight, parthenon::DevMemSpace>(
+      "random_num", 3, num_modes, 2);
+  random_num_host_ = Kokkos::create_mirror_view(random_num_);
+
+  rng_.seed(rseed);
+  dist_ = std::uniform_real_distribution<>(-1.0, 1.0);
+}
+
+void FewModesFT::SetPhases(MeshBlock *pmb, ParameterInput *pin) {
+  auto pm = pmb->pmy_mesh;
+  auto hydro_pkg = pmb->packages.Get("Hydro");
+
+  // The following restriction could technically be lifted if the turbulence driver is
+  // directly embedded in the hydro driver rather than a user defined source as well as
+  // fixing the pack_size=-1 when using the Mesh- (not MeshBlock-)based problem generator.
+  // The restriction stems from requiring a collective MPI comm to normalize the
+  // acceleration and magnetic field, respectively. Note, that the restriction does not
+  // apply here, but for the ProblemGenerator() and Driving() function below. The check is
+  // just added here for convenience as this function is called during problem
+  // initializtion. From my (pgrete) point of view, it's currently cleaner to keep things
+  // separate and not touch the main driver at the expense of using one pack per rank --
+  // which is typically fastest on devices anyway.
+  const auto pack_size = pin->GetInteger("parthenon/mesh", "pack_size");
+  PARTHENON_REQUIRE_THROWS(pack_size == -1,
+                           "Few modes FT currently needs parthenon/mesh/pack_size=-1 "
+                           "to work because of global reductions.")
+
+  auto Lx1 = pm->mesh_size.x1max - pm->mesh_size.x1min;
+  auto Lx2 = pm->mesh_size.x2max - pm->mesh_size.x2min;
+  auto Lx3 = pm->mesh_size.x3max - pm->mesh_size.x3min;
+
+  // Adjust (logical) grid size at levels other than the root level.
+  // This is required for simulation with mesh refinement so that the phases calculated
+  // below take the logical grid size into account. For example, the local phases at level
+  // 1 should be calculated assuming a grid that is twice as large as the root grid.
+  const auto root_level = pm->GetRootLevel();
+  auto gnx1 = pm->mesh_size.nx1 * std::pow(2, pmb->loc.level - root_level);
+  auto gnx2 = pm->mesh_size.nx2 * std::pow(2, pmb->loc.level - root_level);
+  auto gnx3 = pm->mesh_size.nx3 * std::pow(2, pmb->loc.level - root_level);
+
+  // Restriction should also be easily fixed, just need to double check transforms and
+  // volume weighting everywhere
+  PARTHENON_REQUIRE_THROWS(((gnx1 == gnx2) && (gnx2 == gnx3)) &&
+                               ((Lx1 == Lx2) && (Lx2 == Lx3)),
+                           "FMFT has only been tested with cubic meshes and constant "
+                           "dx/dy/dz. Remove this warning at your own risk.")
+
+  const auto nx1 = pmb->block_size.nx1;
+  const auto nx2 = pmb->block_size.nx2;
+  const auto nx3 = pmb->block_size.nx3;
+
+  const auto gis = pmb->loc.lx1 * pmb->block_size.nx1;
+  const auto gjs = pmb->loc.lx2 * pmb->block_size.nx2;
+  const auto gks = pmb->loc.lx3 * pmb->block_size.nx3;
+
+  // make local ref to capure in lambda
+  const auto num_modes = num_modes_;
+  auto &k_vec = k_vec_;
+
+  Complex I(0.0, 1.0);
+
+  auto &base = pmb->meshblock_data.Get();
+  auto &phases_i = base->Get(prefix_ + "_phases_i").data;
+  auto &phases_j = base->Get(prefix_ + "_phases_j").data;
+  auto &phases_k = base->Get(prefix_ + "_phases_k").data;
+
+  const auto ng = fill_ghosts_ ? parthenon::Globals::nghost : 0;
+  pmb->par_for(
+      "FMFT: calc phases_i", 0, nx1 - 1 + 2 * ng, KOKKOS_LAMBDA(int i) {
+        Real gi = static_cast<Real>((i + gis - ng) % static_cast<int>(gnx1));
+        Real w_kx;
+        Complex phase;
+
+        for (int m = 0; m < num_modes; m++) {
+          w_kx = k_vec(0, m) * 2. * M_PI / static_cast<Real>(gnx1);
+          // adjust phase factor to Complex->Real IFT: u_hat*(k) = u_hat(-k)
+          if (k_vec(0, m) == 0.0) {
+            phase = 0.5 * Kokkos::exp(I * w_kx * gi);
+          } else {
+            phase = Kokkos::exp(I * w_kx * gi);
+          }
+          phases_i(i, m, 0) = phase.real();
+          phases_i(i, m, 1) = phase.imag();
+        }
+      });
+
+  pmb->par_for(
+      "FMFT: calc phases_j", 0, nx2 - 1 + 2 * ng, KOKKOS_LAMBDA(int j) {
+        Real gj = static_cast<Real>((j + gjs - ng) % static_cast<int>(gnx2));
+        Real w_ky;
+        Complex phase;
+
+        for (int m = 0; m < num_modes; m++) {
+          w_ky = k_vec(1, m) * 2. * M_PI / static_cast<Real>(gnx2);
+          phase = Kokkos::exp(I * w_ky * gj);
+          phases_j(j, m, 0) = phase.real();
+          phases_j(j, m, 1) = phase.imag();
+        }
+      });
+
+  pmb->par_for(
+      "FMFT: calc phases_k", 0, nx3 - 1 + 2 * ng, KOKKOS_LAMBDA(int k) {
+        Real gk = static_cast<Real>((k + gks - ng) % static_cast<int>(gnx3));
+        Real w_kz;
+        Complex phase;
+
+        for (int m = 0; m < num_modes; m++) {
+          w_kz = k_vec(2, m) * 2. * M_PI / static_cast<Real>(gnx3);
+          phase = Kokkos::exp(I * w_kz * gk);
+          phases_k(k, m, 0) = phase.real();
+          phases_k(k, m, 1) = phase.imag();
+        }
+      });
+}
+
+void FewModesFT::Generate(MeshData<Real> *md, const Real dt,
+                          const std::string &var_name) {
+  auto pmb = md->GetBlockData(0)->GetBlockPointer();
+
+  const auto num_modes = num_modes_;
+
+  Complex I(0.0, 1.0);
+  auto &random_num = random_num_;
+
+  // get a set of random numbers from the CPU so that they are deterministic
+  // when run on GPUs
+  Real v1, v2, v_sqr;
+  for (int n = 0; n < 3; n++)
+    for (int m = 0; m < num_modes; m++) {
+      do {
+        v1 = dist_(rng_);
+        v2 = dist_(rng_);
+        v_sqr = v1 * v1 + v2 * v2;
+      } while (v_sqr >= 1.0 || v_sqr == 0.0);
+
+      random_num_host_(n, m, 0) = v1;
+      random_num_host_(n, m, 1) = v2;
+    }
+  Kokkos::deep_copy(random_num, random_num_host_);
+
+  // make local ref to capure in lambda
+  auto &k_vec = k_vec_;
+  auto &var_hat = var_hat_;
+  auto &var_hat_new = var_hat_new_;
+
+  const auto kpeak = k_peak_;
+
+  // generate new power spectrum (injection)
+  pmb->par_for(
+      "FMFT: new power spec", 0, 2, 0, num_modes - 1,
+      KOKKOS_LAMBDA(const int n, const int m) {
+        Real kmag, tmp, norm, v_sqr;
+
+        Real kx = k_vec(0, m);
+        Real ky = k_vec(1, m);
+        Real kz = k_vec(2, m);
+
+        kmag = std::sqrt(kx * kx + ky * ky + kz * kz);
+
+        var_hat_new(n, m) = Complex(0., 0.);
+
+        tmp = std::pow(kmag / kpeak, 2.) * (2. - std::pow(kmag / kpeak, 2.));
+        if (tmp < 0.) tmp = 0.;
+        v_sqr = SQR(random_num(n, m, 0)) + SQR(random_num(n, m, 1));
+        norm = std::sqrt(-2.0 * std::log(v_sqr) / v_sqr);
+
+        var_hat_new(n, m) =
+            Complex(tmp * norm * random_num(n, m, 0), tmp * norm * random_num(n, m, 1));
+      });
+
+  // enforce symmetry of complex to real transform
+  pmb->par_for(
+      "forcing: enforce symmetry", 0, 2, 0, num_modes - 1,
+      KOKKOS_LAMBDA(const int n, const int m) {
+        if (k_vec(0, m) == 0.) {
+          for (int m2 = 0; m2 < m; m2++) {
+            if (k_vec(1, m) == -k_vec(1, m2) && k_vec(2, m) == -k_vec(2, m2))
+              var_hat_new(n, m) =
+                  Complex(var_hat_new(n, m2).real(), -var_hat_new(n, m2).imag());
+          }
+        }
+      });
+
+  const auto sol_weight = sol_weight_;
+  if (sol_weight_ >= 0.0) {
+    // project
+    pmb->par_for(
+        "forcing: projection", 0, num_modes - 1, KOKKOS_LAMBDA(const int m) {
+          Real kmag;
+
+          Real kx = k_vec(0, m);
+          Real ky = k_vec(1, m);
+          Real kz = k_vec(2, m);
+
+          kmag = std::sqrt(kx * kx + ky * ky + kz * kz);
+
+          // setting kmag to 1 as a "continue" doesn't work within the parallel_for
+          // construct and it doesn't affect anything (there should never be power in the
+          // k=0 mode)
+          if (kmag == 0.) kmag = 1.;
+
+          // make it a unit vector
+          kx /= kmag;
+          ky /= kmag;
+          kz /= kmag;
+
+          Complex dot(var_hat_new(0, m).real() * kx + var_hat_new(1, m).real() * ky +
+                          var_hat_new(2, m).real() * kz,
+                      var_hat_new(0, m).imag() * kx + var_hat_new(1, m).imag() * ky +
+                          var_hat_new(2, m).imag() * kz);
+
+          var_hat_new(0, m) = Complex(var_hat_new(0, m).real() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.real() * kx,
+                                      var_hat_new(0, m).imag() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.imag() * kx);
+          var_hat_new(1, m) = Complex(var_hat_new(1, m).real() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.real() * ky,
+                                      var_hat_new(1, m).imag() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.imag() * ky);
+          var_hat_new(2, m) = Complex(var_hat_new(2, m).real() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.real() * kz,
+                                      var_hat_new(2, m).imag() * sol_weight +
+                                          (1. - 2. * sol_weight) * dot.imag() * kz);
+        });
+  }
+
+  // evolve
+  const auto c_drift = std::exp(-dt / t_corr_);
+  const auto c_diff = std::sqrt(1.0 - c_drift * c_drift);
+
+  pmb->par_for(
+      "FMFT: evolve spec", 0, 2, 0, num_modes - 1,
+      KOKKOS_LAMBDA(const int n, const int m) {
+        var_hat(n, m) =
+            Complex(var_hat(n, m).real() * c_drift + var_hat_new(n, m).real() * c_diff,
+                    var_hat(n, m).imag() * c_drift + var_hat_new(n, m).imag() * c_diff);
+      });
+
+  auto domain = fill_ghosts_ ? IndexDomain::entire : IndexDomain::interior;
+  IndexRange ib = md->GetBlockData(0)->GetBoundsI(domain);
+  IndexRange jb = md->GetBlockData(0)->GetBoundsJ(domain);
+  IndexRange kb = md->GetBlockData(0)->GetBoundsK(domain);
+  auto var_pack = md->PackVariables(std::vector<std::string>{var_name});
+  auto phases_i = md->PackVariables(std::vector<std::string>{prefix_ + "_phases_i"});
+  auto phases_j = md->PackVariables(std::vector<std::string>{prefix_ + "_phases_j"});
+  auto phases_k = md->PackVariables(std::vector<std::string>{prefix_ + "_phases_k"});
+
+  // implictly assuming cubic box of size L=1
+  parthenon::par_for(
+      DEFAULT_LOOP_PATTERN, "FMFT: Inverse FT", parthenon::DevExecSpace(), 0,
+      md->NumBlocks() - 1, 0, 2, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
+      KOKKOS_LAMBDA(const int b, const int n, const int k, const int j, const int i) {
+        Complex phase, phase_i, phase_j, phase_k;
+        var_pack(b, n, k, j, i) = 0.0;
+
+        for (int m = 0; m < num_modes; m++) {
+          phase_i =
+              Complex(phases_i(b, 0, i - ib.s, m, 0), phases_i(b, 0, i - ib.s, m, 1));
+          phase_j =
+              Complex(phases_j(b, 0, j - jb.s, m, 0), phases_j(b, 0, j - jb.s, m, 1));
+          phase_k =
+              Complex(phases_k(b, 0, k - kb.s, m, 0), phases_k(b, 0, k - kb.s, m, 1));
+          phase = phase_i * phase_j * phase_k;
+          var_pack(b, n, k, j, i) += 2. * (var_hat(n, m).real() * phase.real() -
+                                           var_hat(n, m).imag() * phase.imag());
+        }
+      });
+}
+
+// Creates a random set of wave vectors with k_mag within k_peak/2 and 2*k_peak
+ParArray2D<Real> MakeRandomModes(const int num_modes, const Real k_peak,
+                                 uint32_t rseed = 31224) {
+  auto k_vec = parthenon::ParArray2D<Real>("k_vec", 3, num_modes);
+  auto k_vec_h = Kokkos::create_mirror_view_and_copy(parthenon::HostMemSpace(), k_vec);
+
+  const int k_low = std::floor(k_peak / 2);
+  const int k_high = std::ceil(2 * k_peak);
+
+  std::mt19937 rng;
+  rng.seed(rseed);
+  std::uniform_int_distribution<> dist(-k_high, k_high);
+
+  int n_mode = 0;
+  int n_attempt = 0;
+  constexpr int max_attempts = 1000000;
+  Real kx1, kx2, kx3, k_mag, ampl;
+  bool mode_exists = false;
+  while (n_mode < num_modes && n_attempt < max_attempts) {
+    n_attempt += 1;
+
+    kx1 = dist(rng);
+    kx2 = dist(rng);
+    kx3 = dist(rng);
+    k_mag = std::sqrt(SQR(kx1) + SQR(kx2) + SQR(kx3));
+
+    // Expected amplitude of the spectral function. If this is changed, it also needs to
+    // be changed in the FMFT class (or abstracted).
+    ampl = SQR(k_mag / k_peak) * (2.0 - SQR(k_mag / k_peak));
+
+    // Check is mode was already picked by chance
+    mode_exists = false;
+    for (int n_mode_exsist = 0; n_mode_exsist < n_mode; n_mode_exsist++) {
+      if (k_vec_h(0, n_mode_exsist) == kx1 && k_vec_h(1, n_mode_exsist) == kx2 &&
+          k_vec_h(2, n_mode_exsist) == kx3) {
+        mode_exists = true;
+      }
+    }
+
+    // kx1 < 0.0 because we use a explicit symmetric Complex to Real transform
+    if (ampl < 0 || k_mag < k_low || k_mag > k_high || mode_exists || kx1 < 0.0) {
+      continue;
+    }
+    k_vec_h(0, n_mode) = kx1;
+    k_vec_h(1, n_mode) = kx2;
+    k_vec_h(2, n_mode) = kx3;
+    n_mode++;
+  }
+  PARTHENON_REQUIRE_THROWS(
+      n_attempt < max_attempts,
+      "Cluster init did not succeed in calculating perturbation modes.")
+  Kokkos::deep_copy(k_vec, k_vec_h);
+
+  return k_vec;
+}
+} // namespace utils::few_modes_ft
\ No newline at end of file
diff --git a/src/utils/few_modes_ft.hpp b/src/utils/few_modes_ft.hpp
new file mode 100644
index 00000000..ce17401c
--- /dev/null
+++ b/src/utils/few_modes_ft.hpp
@@ -0,0 +1,71 @@
+
+//========================================================================================
+// AthenaPK - a performance portable block structured AMR astrophysical MHD code.
+// Copyright (c) 2023, Athena-Parthenon Collaboration. All rights reserved.
+// Licensed under the 3-clause BSD License, see LICENSE file for details
+//========================================================================================
+//========================================================================================
+//! \file few_modes_ft.hpp
+//  \brief Helper functions for an inverse (explicit complex to real) FT
+
+// Parthenon headers
+#include "basic_types.hpp"
+#include "config.hpp"
+#include <parthenon/package.hpp>
+#include <random>
+#include <sstream>
+
+// AthenaPK headers
+#include "../main.hpp"
+#include "mesh/domain.hpp"
+
+namespace utils::few_modes_ft {
+using namespace parthenon::package::prelude;
+using parthenon::Real;
+using Complex = Kokkos::complex<Real>;
+using parthenon::IndexRange;
+using parthenon::ParArray2D;
+
+class FewModesFT {
+ private:
+  int num_modes_;
+  std::string prefix_;
+  ParArray2D<Complex> var_hat_, var_hat_new_;
+  ParArray2D<Real> k_vec_;
+  Real k_peak_; // peak of the power spectrum
+  Kokkos::View<Real ***, Kokkos::LayoutRight, parthenon::DevMemSpace> random_num_;
+  Kokkos::View<Real ***, Kokkos::LayoutRight, parthenon::HostMemSpace> random_num_host_;
+  std::mt19937 rng_;
+  std::uniform_real_distribution<> dist_;
+  Real sol_weight_;  // power in solenoidal modes for projection. Set to negative to
+                     // disable projection
+  Real t_corr_;      // correlation time for evolution of Ornstein-Uhlenbeck process
+  bool fill_ghosts_; // if the inverse transform should also fill ghost zones
+
+ public:
+  FewModesFT(parthenon::ParameterInput *pin, parthenon::StateDescriptor *pkg,
+             std::string prefix, int num_modes, ParArray2D<Real> k_vec, Real k_peak,
+             Real sol_weight, Real t_corr, uint32_t rseed, bool fill_ghosts = false);
+
+  ParArray2D<Complex> GetVarHat() { return var_hat_; }
+  int GetNumModes() { return num_modes_; }
+  void SetPhases(MeshBlock *pmb, ParameterInput *pin);
+  void Generate(MeshData<Real> *md, const Real dt, const std::string &var_name);
+  void RestoreRNG(std::istringstream &iss) { iss >> rng_; }
+  void RestoreDist(std::istringstream &iss) { iss >> dist_; }
+  std::string GetRNGState() {
+    std::ostringstream oss;
+    oss << rng_;
+    return oss.str();
+  }
+  std::string GetDistState() {
+    std::ostringstream oss;
+    oss << dist_;
+    return oss.str();
+  }
+};
+
+// Creates a random set of wave vectors with k_mag within k_peak/2 and 2*k_peak
+ParArray2D<Real> MakeRandomModes(const int num_modes, const Real k_peak, uint32_t rseed);
+
+} // namespace utils::few_modes_ft
\ No newline at end of file
diff --git a/tst/regression/CMakeLists.txt b/tst/regression/CMakeLists.txt
index 31d2876f..3461a817 100644
--- a/tst/regression/CMakeLists.txt
+++ b/tst/regression/CMakeLists.txt
@@ -35,7 +35,7 @@ setup_test_serial("performance" "--driver ${PROJECT_BINARY_DIR}/bin/athenaPK \
   --driver_input ${PROJECT_SOURCE_DIR}/inputs/linear_wave3d.in --num_steps 21" "performance")
 
 setup_test_both("cluster_hse" "--driver ${PROJECT_BINARY_DIR}/bin/athenaPK \
-  --driver_input ${PROJECT_SOURCE_DIR}/inputs/cluster/hse.in --num_steps 1" "convergence")
+  --driver_input ${PROJECT_SOURCE_DIR}/inputs/cluster/hse.in --num_steps 2" "convergence")
 
 setup_test_serial("cluster_tabular_cooling" "--driver ${PROJECT_BINARY_DIR}/bin/athenaPK \
   --driver_input ${PROJECT_SOURCE_DIR}/inputs/cluster/cooling.in --num_steps 11" "convergence")
diff --git a/tst/regression/test_suites/cluster_hse/cluster_hse.py b/tst/regression/test_suites/cluster_hse/cluster_hse.py
index 8cd1d1cb..0f567e4f 100644
--- a/tst/regression/test_suites/cluster_hse/cluster_hse.py
+++ b/tst/regression/test_suites/cluster_hse/cluster_hse.py
@@ -33,12 +33,16 @@
 
 class TestCase(utils.test_case.TestCaseAbs):
     def __init__(self):
-
         # Define cluster parameters
         # Setup units
         unyt.define_unit("code_length", (1, "Mpc"))
         unyt.define_unit("code_mass", (1e14, "Msun"))
         unyt.define_unit("code_time", (1, "Gyr"))
+        unyt.define_unit("code_velocity", (1, "code_length/code_time"))
+        unyt.define_unit(
+            "code_magnetic",
+            (np.sqrt(4 * np.pi), "(code_mass/code_length)**0.5/code_time"),
+        )
         self.code_length = unyt.unyt_quantity(1, "code_length")
         self.code_mass = unyt.unyt_quantity(1, "code_mass")
         self.code_time = unyt.unyt_quantity(1, "code_time")
@@ -92,6 +96,18 @@ def __init__(self):
 
         self.norm_tol = 1e-3
 
+        self.sigma_v = unyt.unyt_quantity(75.0, "km/s")
+        char_v_lengthscale = unyt.unyt_quantity(100.0, "kpc")
+        # Note that the box scale is set in the input file directly (-0.1 to 0.1),
+        # so if the input file changes, the following line should change, too.
+        l_box = 0.2 * self.code_length
+        self.k_peak_v = l_box / char_v_lengthscale
+
+        self.sigma_b = unyt.unyt_quantity(1e-8, "G")
+        # using a different one than for the velocity
+        char_b_lengthscale = unyt.unyt_quantity(50.0, "kpc")
+        self.k_peak_b = l_box / char_b_lengthscale
+
     def Prepare(self, parameters, step):
         """
         Any preprocessing that is needed before the drive is run can be done in
@@ -140,29 +156,116 @@ def Prepare(self, parameters, step):
             f"problem/cluster/hydrostatic_equilibrium/r_fix={self.R_fix.in_units('code_length').v}",
             f"problem/cluster/hydrostatic_equilibrium/rho_fix={self.rho_fix.in_units('code_mass/code_length**3').v}",
             f"problem/cluster/hydrostatic_equilibrium/r_sampling={self.R_sampling}",
+            f"hydro/fluid={'euler' if step == 2 else 'glmmhd'}",
+            f"problem/cluster/init_perturb/sigma_v={0.0 if step == 2 else self.sigma_v.in_units('code_velocity').v}",
+            f"problem/cluster/init_perturb/k_peak_v={0.0 if step == 2 else self.k_peak_v.v}",
+            f"problem/cluster/init_perturb/sigma_b={0.0 if step == 2 else self.sigma_b.in_units('code_magnetic').v}",
+            f"problem/cluster/init_perturb/k_peak_b={0.0 if step == 2 else self.k_peak_b.v}",
+            f"parthenon/output2/id={'prim' if step == 2 else 'prim_perturb'}",
+            f"parthenon/time/nlim={-1 if step == 2 else 1}",
         ]
 
         return parameters
 
     def Analyse(self, parameters):
-        """
-        Analyze the output and determine if the test passes.
+        analyze_status = self.AnalyseHSE(parameters)
+        analyze_status &= self.AnalyseInitPert(parameters)
+        return analyze_status
 
-        This function is called after the driver has been executed. It is
-        responsible for reading whatever data it needs and making a judgment
-        about whether or not the test passes. It takes no inputs. Output should
-        be True (test passes) or False (test fails).
+    def AnalyseInitPert(self, parameters):
+        analyze_status = True
+        sys.path.insert(
+            1,
+            parameters.parthenon_path
+            + "/scripts/python/packages/parthenon_tools/parthenon_tools",
+        )
 
-        The parameters that are passed in provide the paths to relevant
-        locations and commands. Of particular importance is the path to the
-        output folder. All files from a drivers run should appear in and output
-        folder located in
-        parthenon/tst/regression/test_suites/test_name/output.
+        try:
+            import phdf
+        except ModuleNotFoundError:
+            print("Couldn't find module to load Parthenon hdf5 files.")
+            return False
 
-        It is possible in this function to read any of the output files such as
-        hdf5 output and compare them to expected quantities.
+        data_file = phdf.phdf(
+            f"{parameters.output_path}/parthenon.prim_perturb.00000.phdf"
+        )
+        dx = data_file.xf[:, 1:] - data_file.xf[:, :-1]
+        dy = data_file.yf[:, 1:] - data_file.yf[:, :-1]
+        dz = data_file.zf[:, 1:] - data_file.zf[:, :-1]
+
+        # create array of volume with (block, k, j, i) indices
+        cell_vol = np.empty(
+            (
+                data_file.x.shape[0],
+                data_file.z.shape[1],
+                data_file.y.shape[1],
+                data_file.x.shape[1],
+            )
+        )
+        for block in range(dx.shape[0]):
+            dz3d, dy3d, dx3d = np.meshgrid(
+                dz[block], dy[block], dx[block], indexing="ij"
+            )
+            cell_vol[block, :, :, :] = dx3d * dy3d * dz3d
 
-        """
+        # flatten array as prim var are also flattended
+        cell_vol = cell_vol.ravel()
+
+        prim = data_file.Get("prim")
+
+        # FIXME: For now this is hard coded - a component mapping should be done by phdf
+        prim_col_dict = {
+            "velocity_1": 1,
+            "velocity_2": 2,
+            "velocity_3": 3,
+            "magnetic_field_1": 5,
+            "magnetic_field_2": 6,
+            "magnetic_field_3": 7,
+        }
+
+        vx = prim[prim_col_dict["velocity_1"]]
+        vy = prim[prim_col_dict["velocity_2"]]
+        vz = prim[prim_col_dict["velocity_3"]]
+
+        # volume weighted rms velocity
+        rms_v = np.sqrt(
+            np.sum((vx**2 + vy**2 + vz**2) * cell_vol) / np.sum(cell_vol)
+        )
+
+        sigma_v_match = np.isclose(
+            rms_v, self.sigma_v.in_units("code_velocity").v, rtol=1e-14, atol=1e-14
+        )
+
+        if not sigma_v_match:
+            analyze_status = False
+            print(
+                f"ERROR: velocity perturbations don't match\n"
+                f"Expected {self.sigma_v.in_units('code_velocity')} but got {rms_v}\n"
+            )
+
+        bx = prim[prim_col_dict["magnetic_field_1"]]
+        by = prim[prim_col_dict["magnetic_field_2"]]
+        bz = prim[prim_col_dict["magnetic_field_3"]]
+
+        # volume weighted rms magnetic field
+        rms_b = np.sqrt(
+            np.sum((bx**2 + by**2 + bz**2) * cell_vol) / np.sum(cell_vol)
+        )
+
+        sigma_b_match = np.isclose(
+            rms_b, self.sigma_b.in_units("code_magnetic").v, rtol=1e-14, atol=1e-14
+        )
+
+        if not sigma_b_match:
+            analyze_status = False
+            print(
+                f"ERROR: magnetic field perturbations don't match\n"
+                f"Expected {self.sigma_b.in_units('code_magnetic')} but got {rms_b}\n"
+            )
+
+        return analyze_status
+
+    def AnalyseHSE(self, parameters):
         analyze_status = True
 
         self.Yp = self.He_mass_fraction
diff --git a/tst/regression/test_suites/cluster_tabular_cooling/cluster_tabular_cooling.py b/tst/regression/test_suites/cluster_tabular_cooling/cluster_tabular_cooling.py
index c5b6d9ed..e04cdc60 100644
--- a/tst/regression/test_suites/cluster_tabular_cooling/cluster_tabular_cooling.py
+++ b/tst/regression/test_suites/cluster_tabular_cooling/cluster_tabular_cooling.py
@@ -36,7 +36,6 @@
 
 class TestCase(utils.test_case.TestCaseAbs):
     def __init__(self):
-
         # Define cluster parameters
         # Setup units
         unyt.define_unit("code_length", (1, "Mpc"))
@@ -322,10 +321,10 @@ def zero_corrected_linf_err(gold, test):
             }
 
             rho = unyt.unyt_array(
-                prim[:, prim_col_dict["density"]], "code_mass*code_length**-3"
+                prim[prim_col_dict["density"], :], "code_mass*code_length**-3"
             )
             pres = unyt.unyt_array(
-                prim[:, prim_col_dict["pressure"]],
+                prim[prim_col_dict["pressure"], :],
                 "code_mass*code_length**-1*code_time**-2",
             )