Output data formats
Vlasiator produces three kinds of output files during a simulation run, the contents of which vary based on simulation parameters:
- `logfile.txt`, the simulation run log. This is a timestamped ASCII file providing basic diagnostic output of the run, including memory usage, time steps etc.
- `diagnostic.txt`. The contents of this file can be configured by the `diagnostic =` options in the run config file. In general, this ASCII file contains one line every 1, 10, or so simulation timesteps, with the columns determined by the selected data reducers. These include, for example, simple scalar values like overall plasma mass, number of velocity space blocks in the simulation, charge balance, divergence of the magnetic field etc.
- VLSV files are the main output data products. These files come in multiple varieties:
  - Restart files. These contain the whole simulation state, including the full phase space density, all relevant electromagnetic fields and metadata. Simulations can be restarted from them (hence the name), but they tend to be very heavy, easily multiple terabytes in size for production runs. They do not contain the output of data reducer operators (detailed below).
  - Bulk files. In these, reduced spatial simulation data is written for further scientific analysis. Usually, this includes moments of the distribution functions and electromagnetic fields, but can also contain much more complex data reducer operators, as listed below. It is also possible (and common) to configure a subset (e.g. every 25th cell) of the velocity distribution functions to be written for further analysis.
The VLSV library is used to write this versatile container format. Analysator can be used to load and handle these files in Python.
The file format is optimized for parallel write performance: Data is dumped to disk in the same memory structure as it is in the Vlasiator simulation, as binary blobs. Once all data is written, an XML footer that describes the data gets added to the end.
An example XML footer might look like this:
<VLSV>
<MESH arraysize="208101" datasize="8" datatype="uint" max_refinement_level="1" name="SpatialGrid" type="amr_ucd" vectorsize="1" xperiodic="no" yperiodic="no" zperiodic="no">989580</MESH>
<MESH arraysize="652800" datasize="8" datatype="uint" name="fsgrid" type="multi_ucd" vectorsize="1" xperiodic="no" yperiodic="no" zperiodic="no">4011008</MESH>
<PARAMETER arraysize="1" datasize="8" datatype="float" name="time" vectorsize="1">989488</PARAMETER>
<PARAMETER arraysize="1" datasize="8" datatype="float" name="dt" vectorsize="1">989496</PARAMETER>
<VARIABLE arraysize="123544" datasize="8" datatype="uint" mesh="SpatialGrid" name="CellID" vectorsize="1">1136</VARIABLE>
<VARIABLE arraysize="652800" datasize="8" datatype="float" mesh="fsgrid" name="fg_b" unit="T" unitConversion="1.0" unitLaTeX="$\mathrm{T}$" variableLaTeX="$B$" vectorsize="3">9558184</VARIABLE>
</VLSV>
Each XML tag describes one dataset in the file, with `arraysize`, `datatype`, `datasize` and `vectorsize` describing the array. The XML tag's content is the byte offset in the file at which this dataset's raw binary data begins.
The two most important tag types are `PARAMETER`, for single numbers describing the file as a whole, such as resolutions, timesteps etc., and `VARIABLE`, for spatially varying data reducer data maps.
Additional metadata is often added to the datasets, such as their physical units, LaTeX formatted plotting hints, etc.
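As an illustration of how the tag attributes describe each array, the following minimal sketch parses a shortened copy of the example footer above with Python's standard XML parser and derives the byte offset, NumPy dtype, shape and byte length of each dataset. The helper name `describe_datasets` and the `KIND` mapping from `datatype` names to NumPy type codes are assumptions made for this example (the `"int"` entry in particular does not appear in the footer shown here). Locating the footer inside a real VLSV file is not covered.

```python
import xml.etree.ElementTree as ET
import numpy as np

# Shortened copy of the example footer shown above.
FOOTER = r"""<VLSV>
<PARAMETER arraysize="1" datasize="8" datatype="float" name="time" vectorsize="1">989488</PARAMETER>
<VARIABLE arraysize="123544" datasize="8" datatype="uint" mesh="SpatialGrid" name="CellID" vectorsize="1">1136</VARIABLE>
<VARIABLE arraysize="652800" datasize="8" datatype="float" mesh="fsgrid" name="fg_b" unit="T" unitConversion="1.0" unitLaTeX="$\mathrm{T}$" variableLaTeX="$B$" vectorsize="3">9558184</VARIABLE>
</VLSV>"""

# Assumed mapping from footer datatype names to NumPy type codes.
KIND = {"float": "f", "uint": "u", "int": "i"}


def describe_datasets(footer_xml):
    """Return {(tag, name): layout} for every dataset listed in the footer."""
    datasets = {}
    for elem in ET.fromstring(footer_xml):
        arraysize = int(elem.get("arraysize"))
        vectorsize = int(elem.get("vectorsize"))
        datasize = int(elem.get("datasize"))
        datasets[(elem.tag, elem.get("name"))] = {
            "offset": int(elem.text),  # tag content: byte offset of the raw data
            "dtype": np.dtype(KIND[elem.get("datatype")] + str(datasize)),
            "shape": (arraysize, vectorsize),
            "nbytes": arraysize * vectorsize * datasize,
        }
    return datasets


datasets = describe_datasets(FOOTER)
fg_b = datasets[("VARIABLE", "fg_b")]
print(fg_b["offset"], fg_b["dtype"], fg_b["shape"], fg_b["nbytes"])
```

With such a layout in hand, the raw array could be read with something like `numpy.fromfile(path, dtype=..., count=..., offset=...)`. For everyday work, analysator already handles this.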
Note that the XML tags in the file do not yet give sufficient information to describe the spatial structure of the variable arrays. The construction differs depending on the grid they are linked to (denoted by the `mesh=` attribute):
- Vlasov grid variables, typically marked with a `vg_` in their name, are stored as cell parameters in the DCCRG grid underlying the Vlasov solver. As the simulation is dynamically load balanced, their memory order changes unpredictably, so the data must be presumed completely unordered in the file.

  Fortunately, the `CellID` variable gets written into the file first. It contains the flattened spatial index of each simulation cell, in the same order as all further Vlasov grid variables. In the simplest, non-mesh-refined version, the CellID is defined as

  CellID = x_index + x_size * y_index + x_size * y_size * z_index + 1

  By reading both the target variable and the CellID, the data can be brought into flattened spatial order by sorting both arrays in the same order. In analysator, with `f` being an open VLSV reader object, this is typically achieved by running

  import numpy
  c = f.read_variable("CellID")    # cell IDs, in file order
  b = f.read_variable("rho")       # target variable, in the same file order
  b = b[numpy.argsort(c)]          # reorder into ascending CellID order
  b = b.reshape(f.get_spatial_mesh_size(), order="F")    # x_index varies fastest, so the result is indexed [x, y, z]
- FSGrid variables are stored on the simulation's field solver grid, which is partitioned quite differently for performance reasons. The spatial domain is subdivided into equally sized rectangular domains, which are written for each compute rank in parallel. If written from a simulation with a single MPI rank, the resulting array is directly in spatial order, as given by the CellID definition above. For simulations on multiple ranks, every rank writes its data in this structure, end-to-end. The `num_writing_ranks` attribute in the XML tag allows the spatial partition to be reconstructed at load time. Code that does this reconstruction is available here (C++ version) and here (Python version).
- Velocity space variables (at the moment, this is only the phase space density f for every species) follow yet another structure due to the sparse velocity grid structure on which they are stored...
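The CellID-based reordering described for Vlasov grid variables above can be sketched without any file access. The example below uses only NumPy and a small synthetic, shuffled mesh as a stand-in for file contents. The helper names `cellid_to_indices` and `to_spatial_order` are made up for this illustration and are not part of analysator. It assumes the simple, non-refined CellID definition given above.

```python
import numpy as np


def cellid_to_indices(cellid, x_size, y_size):
    """Invert CellID = x + x_size*y + x_size*y_size*z + 1 (unrefined mesh only)."""
    flat = np.asarray(cellid, dtype=np.int64) - 1
    x = flat % x_size
    y = (flat // x_size) % y_size
    z = flat // (x_size * y_size)
    return x, y, z


def to_spatial_order(cellids, values, mesh_size):
    """Sort values by CellID and reshape them into an array indexed [x, y, z]."""
    ordered = np.asarray(values)[np.argsort(cellids)]
    # x varies fastest in the CellID definition, hence Fortran ("F") order.
    return ordered.reshape(mesh_size, order="F")


# Synthetic stand-in for file contents: a 4 x 3 x 2 mesh in shuffled order.
mesh_size = (4, 3, 2)
rng = np.random.default_rng(0)
cellids = rng.permutation(np.arange(1, 4 * 3 * 2 + 1))
x, y, z = cellid_to_indices(cellids, mesh_size[0], mesh_size[1])
values = 100 * x + 10 * y + z  # each value encodes its own cell position

grid = to_spatial_order(cellids, values, mesh_size)
print(grid[3, 2, 1])  # 321
```

Because every synthetic value encodes its own cell position, a mistake in the sort or in the reshape order shows up immediately as a mismatch between an index and the value stored there.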
This is a (hopefully) up-to-date list of simulation output options that can be enabled in the config file. Note that older simulations may use slightly different names, as the code is in constant development.
Variable name | config option | unit | meaning | literature ref |
---|---|---|---|---|
CellID | always written | cells | Spatial ordering of Vlasov grid cells | |
fg_b | fg_b | T | Overall magnetic field (vector) | Palmroth et al. 2018 |
fg_b_background | fg_backgroundb | T | Static background magnetic field (e.g. the dipole field in a magnetosphere simulation; vector) | Palmroth et al. 2018 |
fg_b_perturbed | fg_perturbedb | T | Fluctuating component of the magnetic field (vector) | Palmroth et al. 2018 |
fg_e | fg_e | V/m | Electric field, calculated as ∇ × B (vector) | |
vg_rhom | vg_rhom | kg/m³ | Combined mass density of all simulation species | |
fg_rhom | fg_rhom | kg/m³ | " | |
vg_rhoq | vg_rhoq | C/m³ | Combined charge density of all simulation species | |
fg_rhoq | fg_rhoq | C/m³ | " | |
proton_vg_rho | populations_rho | 1/m³ | Number density for each simulated particle population | |
vg_v | vg_v | m/s | Bulk plasma velocity (velocity of the centre-of-mass frame, vector) | |
fg_v | fg_v | m/s | " | |
proton_vg_v | populations_v | m/s | Per-population bulk velocity | |
proton_vg_rho_thermal | populations_moments_thermal | 1/m³ | Number density for the thermal component of every population | |
proton_vg_v_thermal | " | m/s | Velocity (vector) for the thermal component of every population | |
proton_vg_ptensor_diagonal_thermal | " | Pa | Diagonal components of the pressure tensor for the thermal component of every population | |
proton_vg_ptensor_offdiagonal_thermal | " | Pa | Off-diagonal components of the pressure tensor for the thermal component of every population | |
proton_vg_rho_nonthermal | populations_moments_nonthermal | 1/m³ | Number density for the nonthermal component of every population | |
proton_vg_v_nonthermal | " | m/s | Velocity (vector) for the nonthermal component of every population | |
proton_vg_ptensor_diagonal_nonthermal | " | Pa | Diagonal components of the pressure tensor for the nonthermal component of every population | |
proton_vg_ptensor_offdiagonal_nonthermal | " | Pa | Off-diagonal components of the pressure tensor for the nonthermal component of every population | |
proton_minvalue | populations_vg_effectivesparsitythreshold | m⁶/s³ | Effective sparsity threshold for every cell | Yann's PhD Thesis, page 91 |
proton_rholossadjust | populations_vg_rho_loss_adjust | 1/m³ | Tracks how much mass was lost in the sparse velocity space block removal | Yann's PhD Thesis, page 90 |
vg_lbweight | vg_lbweight | arb. unit | Load balance metric, used for dynamic rebalancing of computational load between MPI tasks | |
vg_maxdt_acceleration | vg_maxdt_acceleration | s | Maximum timestep limit of the acceleration solver | |
proton_vg_maxdt_acceleration | populations_vg_maxdt_acceleration | s | ", per-population | |
vg_maxdt_translation | vg_maxdt_translation | s | Maximum timestep limit of the translation solver | |
proton_vg_maxdt_translation | populations_vg_maxdt_translation | s | ", per-population | |