ES/tools: add script to create data directory backups on nodes. #306
base: master
Conversation
Inline comments in code.
@IATkachenko, it seems everything is fixed now, please take a look.
@IATkachenko, addressed all the comments and fixed one more issue that popped up during tests.
We need it at least on the `bamboo` node (and thus the path to the ES data directory hardcoded in this script is the one used on `bamboo`): since it is a development node and we all have full access to all indices, we can erroneously erase something we don't want to be erased, and I don't want to dump the data from CERN again when that happens. Not sure whether it should be applied to the "production" nodes -- at least because we don't have enough disk space there at the moment :)
Better to make sure that all parent directories exist as well.
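Just to illustrate the point -- a minimal sketch, assuming a hypothetical `BACKUP_DIR` variable (the actual variable name in the script is likely different):

```sh
# -p creates any missing parent directories and is a no-op if they already exist
mkdir -p "${BACKUP_DIR:?BACKUP_DIR is not set}"
```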
Logrotate is the proper tool for file rotation, so it's better to use it instead of reinventing the wheel ;) With logrotate it is better to use a fixed filename (without the current date stamp), so the "current" backup file name is changed to `data.tar.gz`. Due to this change we have to rotate the files _before_ creating a new backup file (otherwise we would either never update it, or rewrite it every time and never rotate). The 1-char suffixes indicating the period in the backup file name (`*{w|m|y}.tar.gz`) are replaced with subdirectories for readability (the suffixes would look especially ugly with the new `data.tar.gz` base name).
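A rough sketch of what such a logrotate entry could look like -- the path, rotation count and options below are assumptions for illustration, not the actual config from this PR:

```conf
# /etc/logrotate.d/es-backup -- illustrative only
/data/backup/data.tar.gz {
    weekly
    rotate 4
    olddir weekly      # keep rotated copies in a subdirectory instead of suffixed files
    nocompress         # the archive is already gzip-compressed
    missingok
}
```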
…ctory. If we work with a "living" instance of ES, it may do something with the data at the very moment we are trying to archive them -- and in this case we will get an inconsistent copy of the data directory, which is useless for restoring the indexed data. The current version is applicable only to a standalone (single-node) ES instance, since in cluster mode every node will create its snapshot in its local directory. To work in that mode, we may need to split the task in two -- snapshot update and snapshot backup -- and make sure that the second part won't start before the first one is completed, or configure a 'remote repository' to store the whole snapshot on a single node.
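For reference, the snapshot-based approach goes through the ES snapshot REST API; the repository name, location and host below are placeholders (and the location has to be whitelisted via `path.repo` in `elasticsearch.yml`):

```sh
# Register a shared-filesystem snapshot repository (one-time setup).
curl -s -X PUT 'http://localhost:9200/_snapshot/backup_repo' \
     -H 'Content-Type: application/json' \
     -d '{"type": "fs", "settings": {"location": "/data/es-snapshots"}}'

# Take a snapshot and wait until it is complete, so that a subsequent
# tar of the repository directory sees a consistent state.
curl -s -X PUT "http://localhost:9200/_snapshot/backup_repo/snap_$(date +%Y%m%d)?wait_for_completion=true"
```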
This way we have a more predictable and expected name for the main (current, latest) backup file (`backup/snapshot.tar.gz`); the weekly rotation goes to `weekly/`, the monthly one to `monthly/`, etc.
We don't want to forget to update the location/filename in some place where it is used, so it is better to define it once and reuse it everywhere else.
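Something along these lines, for example (the variable names are illustrative, not necessarily the ones used in the script):

```sh
# The backup layout is defined in one place; everything else refers to these.
backup_root="/data/backup"
backup_file="${backup_root}/snapshot.tar.gz"
weekly_dir="${backup_root}/weekly"
monthly_dir="${backup_root}/monthly"
```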
The condition "first week of the month" would be TRUE 7 days in a row, meaning that the first day "monthly/$fname" is found missed, it will be created -- and all the other days logrotate would produce error messages like `ln: creating hard link ...: File exists`. It doesn't look really good and makes me wonder: maybe some time monthly rotation might happen *after* this "first week", and we would end up without monthly copy? So now new copy is created next to the rotation of monthly/yearly copies. And it seems to look better now.
force-pushed 93b1ab4 to d53791e
@IATkachenko, thank you for reviewing the code. I have force-pushed a new version in which the "ooops!" commits are squashed into those where the "ooops" happened, and some of the commits were reworded; the code has not changed. Please take a (hopefully final) look at the new version.