
Repository Gets Locked

Emruz Hossain edited this page Feb 19, 2021 · 1 revision

Sometimes the backend repository gets locked and subsequent backups fail.

How do I know that the repository has been locked?

If the repository gets locked, new backups will fail. If you describe the BackupSession, you should see an error message indicating that the repository is already locked by another process.

kubectl describe -n <namespace> backupsession <backupsession name>

You will also see the error message in the backup sidecar/job log.

# For backups that use a sidecar (e.g. Deployment, StatefulSet, etc.)
kubectl logs -n <namespace> <workload pod name> -c stash

# For backups that use a job (e.g. Database, PVC, etc.)
kubectl logs -n <namespace> <backup job's pod name> --all-containers
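When the logs are long, it can help to filter them for the lock error instead of reading them by hand. The following is a minimal sketch: `has_lock_error` is a hypothetical helper name, and the search phrase `already locked` is an assumption based on the error described above, so adjust it if your logs word the error differently.

```shell
# Hypothetical helper: reads log text on stdin and succeeds (exit code 0)
# if any line mentions that the repository is already locked.
# The phrase "already locked" is an assumption about the log wording.
has_lock_error() {
  grep -i -q 'already locked'
}

# Example usage (NAMESPACE and POD are placeholders you must set yourself):
#   kubectl logs -n "$NAMESPACE" "$POD" -c stash | has_lock_error \
#     && echo "repository appears to be locked"
```

The helper only reads from standard input, so it works the same way for sidecar logs and for job logs fetched with `--all-containers`.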

Why does the repository get locked?

A restic process that needs to lock the repository for its operation acquires the lock before starting the operation. When it completes the operation, it removes the lock so that other restic processes can use the repository. However, if the process is killed unexpectedly, it cannot remove the lock. As a result, the repository remains locked and becomes unusable for other processes.

Possible scenarios when a repository can get locked

The repository can get locked in the following scenarios.

1. The backup job's pod or the workload pod containing the sidecar has been terminated.

If the workload pod that has the Stash sidecar, or the backup job's pod, gets terminated while a backup is running, the repository can get locked. In this case, you have to find out why the pod was terminated.

2. The tempDir size limit is set too low

Stash uses an emptyDir as a temporary volume where it stores a cache to improve backup performance. By default, the emptyDir does not have any size limit. However, if you set the limit manually using the spec.tempDir section of the BackupConfiguration, make sure you set it to a reasonable size based on the size of your target data. If the tempDir limit is too low, the cache may grow past the limit, which causes Kubernetes to evict the backup pod. This is a tricky case because you may not notice that the backup pod has been evicted. You can describe the respective workload/job to check whether this was the case.
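As a complement to describing the workload or job, you can scan the pod list for evicted pods. This is a minimal sketch that assumes the default `kubectl get pods` table layout, where the third column is STATUS; `evicted_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the table printed by `kubectl get pods` on stdin
# and prints the names of pods whose STATUS column is "Evicted".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
evicted_pods() {
  awk 'NR > 1 && $3 == "Evicted" { print $1 }'
}

# Example usage (NAMESPACE is a placeholder you must set yourself):
#   kubectl get pods -n "$NAMESPACE" | evicted_pods
```

If a backup pod shows up in the output, describing that pod should tell you the reason for the eviction.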

In this scenario, make sure that you have set the tempDir size to a reasonable amount. You can also disable caching by setting spec.tempDir.disableCaching: true. However, this might reduce backup performance significantly.
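One possible way to raise the limit on an existing BackupConfiguration is a merge patch. The sketch below only builds the patch body. The field name `sizeLimit` under `spec.tempDir` is an assumption that you should verify against the BackupConfiguration API reference for your Stash version, and `tempdir_patch` is a hypothetical helper name.

```shell
# Hypothetical helper: prints a JSON merge patch that sets the tempDir size limit.
# The field name "sizeLimit" is an assumption; verify it for your Stash version.
tempdir_patch() {
  printf '{"spec":{"tempDir":{"sizeLimit":"%s"}}}' "$1"
}

# Example usage (NAMESPACE and NAME are placeholders you must set yourself):
#   kubectl patch backupconfiguration "$NAME" -n "$NAMESPACE" \
#     --type merge -p "$(tempdir_patch 2Gi)"
```

The value `2Gi` is only an illustration. Choose a limit based on the size of the data you are backing up.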

Workaround

If your repository gets locked, you have to unlock it manually. You can use the Stash kubectl plugin to unlock the repository, as described in its documentation. Then, find out why it was locked in the first place and resolve that cause.
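For orientation, the unlock step might look like the sketch below. It assumes that the Stash kubectl plugin is installed and that it provides an `unlock` subcommand taking the Repository name, so confirm the exact syntax in the plugin documentation. `unlock_repo` and the `KUBECTL` override are hypothetical conveniences that let you preview the command before running it.

```shell
# Hypothetical wrapper around the Stash kubectl plugin's unlock command.
# Assumes the form `kubectl stash unlock <repository> --namespace <namespace>`;
# check the plugin documentation before relying on it.
# Set KUBECTL=echo to print the arguments instead of running the command.
unlock_repo() {
  namespace="$1"
  repository="$2"
  "${KUBECTL:-kubectl}" stash unlock "$repository" --namespace "$namespace"
}

# Example usage:
#   KUBECTL=echo unlock_repo demo my-repo   # preview only
#   unlock_repo demo my-repo                # actually unlock
```

Make sure no backup or restore is running against the repository when you unlock it, since the lock exists to protect concurrent operations.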