Merge pull request ceph#61462 from zdover23/wip-doc-2025-01-21-cephfs-disaster-recovery-experts

doc/cephfs: edit disaster-recovery-experts (4 of x)

Reviewed-by: Anthony D'Atri <[email protected]>

zdover23 authored Jan 22, 2025 (2 parents 9348a8f + f2529d1, commit 31101e1)
doc/cephfs/disaster-recovery-experts.rst (79 additions, 55 deletions)
Using an alternate metadata pool for recovery
---------------------------------------------

.. warning::

   This procedure has not been extensively tested. It should be undertaken only
   with great care.

If an existing file system is damaged and inoperative, it is possible to create
a fresh metadata pool and to attempt the reconstruction of the damaged file
system's metadata into the new pool, while leaving the old metadata in place.
This could be used to make a safer attempt at recovery, since the existing
metadata pool would not be modified.

.. caution::

   During this process, multiple metadata pools will contain data referring to
   the same data pool. Extreme caution must be exercised to avoid changing the
   contents of the data pool while this is the case. After recovery is
   complete, archive or delete the damaged metadata pool.
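
For reference, when you reach that point, a pool can be deleted only after
nothing references it any longer (for example, after the damaged file system
itself has been removed) and only if the monitors allow pool deletion
(``mon_allow_pool_delete`` set to ``true``). The pool name below is a
placeholder; this is a sketch, not a step in the recovery procedure itself:

.. prompt:: bash #

   ceph osd pool delete <damaged_metadata_pool> <damaged_metadata_pool> --yes-i-really-really-mean-it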

To begin, the existing file system should be taken down to prevent further
modification of the data pool. Unmount all clients and then use the following
command to mark the file system failed:

.. prompt:: bash #

   ceph fs fail <fs_name>
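
To confirm that the file system is down and that no MDS daemons remain active
for it before you continue, one option is to check its status:

.. prompt:: bash #

   ceph fs status <fs_name>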

.. note::

   ``<fs_name>`` here and below refers to the original, damaged file system.

Next, create a recovery file system in which we will populate a new metadata
pool that is backed by the original data pool:

.. prompt:: bash #

   ceph osd pool create cephfs_recovery_meta
   ceph fs new cephfs_recovery cephfs_recovery_meta <data_pool> --recover --allow-dangerous-metadata-overlay

.. note::

   You may rename the recovery metadata pool and file system at a future time.
   The ``--recover`` flag prevents any MDS daemon from joining the new file
   system.
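
As an illustration of such a later rename (not something to run during
recovery), the pool can be renamed with ``ceph osd pool rename``, and recent
releases also provide ``ceph fs rename``; check the prerequisites of the
latter for your release. The target names below are hypothetical:

.. prompt:: bash #

   ceph osd pool rename cephfs_recovery_meta cephfs_meta_new
   ceph fs rename cephfs_recovery cephfs_new --yes-i-really-mean-it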

Next, we will create the initial metadata for the recovery file system:

.. prompt:: bash #

   cephfs-table-tool cephfs_recovery:0 reset session

.. prompt:: bash #

   cephfs-table-tool cephfs_recovery:0 reset snap

.. prompt:: bash #

   cephfs-table-tool cephfs_recovery:0 reset inode

.. prompt:: bash #

   cephfs-journal-tool --rank cephfs_recovery:0 journal reset --force --yes-i-really-really-mean-it
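
If you want to confirm that the freshly reset journal is readable before
moving on, one way is to inspect it:

.. prompt:: bash #

   cephfs-journal-tool --rank cephfs_recovery:0 journal inspect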

Now perform the recovery of the metadata pool from the data pool:

.. prompt:: bash #

   cephfs-data-scan init --force-init --filesystem cephfs_recovery --alternate-pool cephfs_recovery_meta

.. prompt:: bash #

   cephfs-data-scan scan_extents --alternate-pool cephfs_recovery_meta --filesystem <fs_name>

.. prompt:: bash #

   cephfs-data-scan scan_inodes --alternate-pool cephfs_recovery_meta --filesystem <fs_name> --force-corrupt

.. prompt:: bash #

   cephfs-data-scan scan_links --filesystem cephfs_recovery

.. note::

   Each of the scan procedures above scans through the entire data pool. This
   may take a long time. See the previous section on how to distribute this
   task among workers.
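
As a sketch of that approach, a ``scan_extents`` pass split across two workers
(run concurrently, for example on two different hosts) might look like the
following; the worker count of two is only an illustration:

.. prompt:: bash #

   cephfs-data-scan scan_extents --worker_n 0 --worker_m 2 --alternate-pool cephfs_recovery_meta --filesystem <fs_name>

.. prompt:: bash #

   cephfs-data-scan scan_extents --worker_n 1 --worker_m 2 --alternate-pool cephfs_recovery_meta --filesystem <fs_name>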

If the damaged file system contains dirty journal data, it may be recovered
next with a command of the following form:

.. prompt:: bash #

   cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries list --alternate-pool cephfs_recovery_meta

After recovery, some recovered directories will have incorrect statistics.
Ensure that the parameters ``mds_verify_scatter`` and ``mds_debug_scatterstat``
are set to false (the default) to prevent the MDS from checking the statistics:

.. prompt:: bash #

   ceph config rm mds mds_verify_scatter

.. prompt:: bash #

   ceph config rm mds mds_debug_scatterstat

.. note::

   Verify that the config has not been set globally or with a local
   ``ceph.conf`` file.
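
One way to check both the cluster configuration database and the value that a
running MDS daemon is actually using (which reflects anything set in a local
``ceph.conf``) is the following; substitute a real daemon name for ``<name>``:

.. prompt:: bash #

   ceph config get mds mds_verify_scatter

.. prompt:: bash #

   ceph config show mds.<name> mds_verify_scatter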

Now, allow an MDS daemon to join the recovery file system:

.. prompt:: bash #

   ceph fs set cephfs_recovery joinable true
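
Before moving on to the final step, you can confirm that an MDS daemon has
picked up a rank in the recovery file system:

.. prompt:: bash #

   ceph fs status cephfs_recovery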

Finally, run a forward :doc:`scrub </cephfs/scrub>` to repair recursive statistics.
Ensure that you have an MDS daemon running and issue the following command:

.. prompt:: bash #

   ceph tell mds.cephfs_recovery:0 scrub start / recursive,repair,force
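
The scrub runs asynchronously. One way to follow its progress is to ask the
same MDS rank for the scrub status:

.. prompt:: bash #

   ceph tell mds.cephfs_recovery:0 scrub status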

.. note::

   `Symbolic link recovery`_ is supported beginning with the Quincy release.
   Previously, symbolic links were recovered as empty regular files.

It is recommended that you migrate any data from the recovery file system as
soon as possible. Do not restore the old file system while the recovery file
system is operational.
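
A minimal sketch of such a migration, assuming a ``ceph-fuse`` client and a
hypothetical destination path, might look like this:

.. prompt:: bash #

   mkdir -p /mnt/recovery
   ceph-fuse --client_fs cephfs_recovery /mnt/recovery
   rsync -a /mnt/recovery/ /path/to/destination/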

.. note::

   If the data pool is also corrupt, some files may not be restored because
   the backtrace information associated with them is lost. If any data objects
   are missing (due to issues like lost Placement Groups on the data pool),
   the recovered files will contain holes in place of the missing data.

.. _Symbolic link recovery: https://tracker.ceph.com/issues/46166
