ES/tools: add script to create data directory backups on nodes. #306
base: master
Conversation
Inline comments in code.
@IATkachenko, it seems everything is fixed now, please take a look.
@IATkachenko, addressed all the comments and fixed one more issue that popped up during tests.
We need it at least on the `bamboo` node (and thus the path to the ES data directory hardcoded in this script is the one used on `bamboo`): since it is a development node and we all have full access to all indices, we can erroneously erase something we don't want to be erased, and I don't want to dump the data from CERN again when that happens. Not sure whether it should be applied to the "production" nodes -- at least because we don't have enough disk space there at the moment :)
Better to make sure that all parent directories exist as well.
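Just to illustrate the point -- a minimal sketch, assuming a hypothetical `BACKUP_DIR` variable (the actual variable name in the script is likely different):

```sh
# -p creates any missing parent directories and is a no-op if they already exist
mkdir -p "${BACKUP_DIR:?BACKUP_DIR is not set}"
```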
Logrotate is the proper tool for file rotation, so it's better to use it instead of reinventing the wheel ;) With logrotate it is better to use a fixed filename (without the current date stamp), so the "current" backup file name is changed to `data.tar.gz`. Due to this change we have to rotate the files _before_ creating a new backup file (otherwise we would either never update it, or rewrite it every time and never rotate). The 1-char suffixes indicating the period in the backup file name (`*{w|m|y}.tar.gz`) are replaced with subdirectories for readability (the suffixes would look especially ugly with the new `data.tar.gz` base name).
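A rough sketch of what such a logrotate entry could look like -- the path, rotation count and options below are assumptions for illustration, not the actual config from this PR:

```conf
# /etc/logrotate.d/es-backup -- illustrative only
/data/backup/data.tar.gz {
    weekly
    rotate 4
    olddir weekly      # keep rotated copies in a subdirectory instead of suffixed files
    nocompress         # the archive is already gzip-compressed
    missingok
}
```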
…ctory. If we work with a "living" instance of ES, it may do something with the data at the very moment we are trying to archive them -- and in this case we will get an inconsistent copy of the data directory, which is useless for restoring the indexed data. The current version is applicable only to a standalone (single-node) ES instance, since in cluster mode every node will create its snapshot in its local directory. To work in that mode, we may need to split the task in two -- snapshot update and snapshot backup -- and make sure that the second part won't start before the first one is completed, or configure a 'remote repository' to store the whole snapshot on a single node.
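For reference, the snapshot-based approach goes through the ES snapshot REST API; the repository name, location and host below are placeholders (and the location has to be whitelisted via `path.repo` in `elasticsearch.yml`):

```sh
# Register a shared-filesystem snapshot repository (one-time setup).
curl -s -X PUT 'http://localhost:9200/_snapshot/backup_repo' \
     -H 'Content-Type: application/json' \
     -d '{"type": "fs", "settings": {"location": "/data/es-snapshots"}}'

# Take a snapshot and wait until it is complete, so that a subsequent
# tar of the repository directory sees a consistent state.
curl -s -X PUT "http://localhost:9200/_snapshot/backup_repo/snap_$(date +%Y%m%d)?wait_for_completion=true"
```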
This way we have a more predictable and expected name for the main (current, latest) backup file (`backup/snapshot.tar.gz`); the weekly rotation goes to `weekly/`, the monthly one to `monthly/`, etc.
We don't want to forget to update the location/filename in some place where it is used, so it is better to define it once and reuse it everywhere else.
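Something along these lines, for example (the variable names are illustrative, not necessarily the ones used in the script):

```sh
# The backup layout is defined in one place; everything else refers to these.
backup_root="/data/backup"
backup_file="${backup_root}/snapshot.tar.gz"
weekly_dir="${backup_root}/weekly"
monthly_dir="${backup_root}/monthly"
```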
The condition "first week of the month" would be TRUE 7 days in a row, meaning that the first day "monthly/$fname" is found missed, it will be created -- and all the other days logrotate would produce error messages like `ln: creating hard link ...: File exists`. It doesn't look really good and makes me wonder: maybe some time monthly rotation might happen *after* this "first week", and we would end up without monthly copy? So now new copy is created next to the rotation of monthly/yearly copies. And it seems to look better now.
force-pushed 93b1ab4 to d53791e
@IATkachenko, thank you for reviewing the code. I have force-pushed a new version in which the "ooops!" commits are squashed into those where the "ooops" happened, and some of the commits were reworded; the code has not changed. Please take a (hopefully final) look at the new version.