Multi Node Cluster support for move-node #22
Conversation
Cool.
I think move-node should only move one node at a time, not more.
Also, we need to add really strong e2e tests, including one that covers mixed shards (with all nodes having one shard for one db).
@dianabarsan can you have another look?
Nice one! I left a few questions and requests inline.
src/move-node.js (outdated)
if (!removedNodes.includes(oldNode)) {
  removedNodes.push(oldNode);
}
How can this happen? As far as I can tell, the only way this would happen is if the node had no shards allocated. Is that correct?
I think so. I ran into it while doing some tests. I suspect it happened when move-shards had been used separately to move all shards off the node without removing the node itself. This check makes sure the node still gets removed even if its shards were moved previously - something along those lines.
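For illustration, a minimal sketch of how this guard could sit in the node-removal flow; removedNodes, oldNode and the push come from the snippet above, while getShardsForNode and moveShard are assumed helpers, not the actual move-node implementation.

const moveNodeSketch = async (oldNode, newNode, removedNodes) => {
  const shards = await getShardsForNode(oldNode); // assumed helper; may return [] if move-shards already ran
  for (const shard of shards) {
    await moveShard(shard, oldNode, newNode);     // assumed helper
  }
  // Even when the node no longer holds any shards, it is still recorded
  // so the cluster membership cleanup removes it afterwards.
  if (!removedNodes.includes(oldNode)) {
    removedNodes.push(oldNode);
  }
};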
Pushed the changes and unit tests, @dianabarsan. Let me know if things make sense.
Thanks for all the changes!! I left a few comments below. This is coming together quite beautifully :)
docker-compose -f ./scripts/couchdb-cluster.yml down --remove-orphans --volumes
# Let's get the shard mapping - we need this to migrate to a different cluster.
shard_mapping=$(docker compose -f ../docker-compose-test.yml run couch-migration get-shard-mapping)
echo $shard_mapping
I know I've suggested we don't actually pass this as a parameter to move-node. Does this work if we change the actions to first move the nodes under the old install and then start up the new install?
We actually want to avoid running any commands with side-effects on the old installs.
Well, this would actually be on the cloned data of the old install, so it doesn't matter. It's the data we are preparing for the new install; whether we run an action while the old couch is running or the new couch is running, the result should be the same.
test/unit/get-shard-mapping.spec.js (outdated)
expect(utils.getDbs.calledOnce).to.be.true;
expect(utils.getUrl.calledThrice).to.be.true;
expect(utils.request.calledThrice).to.be.true;
});
Can you please add a unit test where a shard for one db is on one node, while the same shard for another db is on a different node? It'd be cool to see what happens in real life when this is the case. I've seen this occur very frequently.
Thanks! This was enlightening. While digging into this, I investigated several clusters and the mappings for different databases are indeed different. Some use 1, 2, 3 while others might use 2, 3, 1 and so forth. I've also confirmed that this is expected behavior for couch; one of the determinants is the timing of the database creation. This necessitated changing the mapping to include database information, as there is no concept of a global shard map for all databases in the couch server. I have added tests to confirm that the new approach works for this scenario. It's actually the only approach that works if you want to preserve shard mapping across migrations.
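To make that concrete, a hypothetical shape for the per-database mapping; the database names, shard ranges and node names below are illustrative, not the tool's actual output format.

const shardMapping = {
  'medic': {
    '00000000-7fffffff': 'couchdb@node-1',
    '80000000-ffffffff': 'couchdb@node-2',
  },
  'medic-sentinel': {
    // the same shard range can live on a different node for another database,
    // e.g. because the databases were created at different times
    '00000000-7fffffff': 'couchdb@node-2',
    '80000000-ffffffff': 'couchdb@node-1',
  },
};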
@dianabarsan welcome back! Can you take another look here please?
Excellent progress! I added some comments about being very intentional with logical choices, instead of reactive ones.
@dianabarsan can you have another look here please?
Super cool!
Just one pending request, apologies I didn't catch this one earlier :(
Changes pushed.
🌟 Excellent!
This is an attempt at adding multi-node support for the move-node command.
shard_mapping=$(get-shard-mapping)
Then pipe in the shard_mapping:
echo "$shard_mapping" | move-node '{"couchdb@old-node-1":"couchdb@new-node-1"}'
The above commands can all be run via docker compose, in line with the other commands in this repo.