Add tasks for retiring and detaching forests #701

Open
khashayar opened this issue Oct 19, 2020 · 9 comments

Comments

@khashayar

Currently there is no way to delete a forest using ml-gradle.

@rjrudin
Contributor

rjrudin commented Oct 19, 2020

What interface do you have in mind here - i.e. what properties would you want to specify on the command line? Just -PforestName, or something else? How about deleting multiple forests at once, or deleting all forests for a database on a specific host or in a specific group?

@rjrudin
Contributor

rjrudin commented Oct 19, 2020

Also, what's the use case? "As an ml-gradle user, I want to delete a forest, so that...."

@khashayar
Author

I mainly need this task in connection with removing a host from a cluster (mlRemoveHost). As you know, a host can only leave a cluster if no forests are assigned to it, which is why I want to remove all the forests of a given host.

So, back to your question, it would be nice to be able to:

  • Remove a single forest
  • Remove all the forests of a given database
  • Remove all the forests of a given host

Thank you.
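
The three selection modes listed above can be pictured as filters over a forest inventory. Below is a minimal Python sketch, assuming a hypothetical in-memory `Forest` record with `name`, `database`, and `host` fields. None of these names come from ml-gradle or the Manage API, and the forest names in `inventory` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Forest:
    """Hypothetical record; not an ml-gradle or Manage API type."""
    name: str
    database: str
    host: str


def select_forests(
    forests: Iterable[Forest],
    forest_name: Optional[str] = None,
    database: Optional[str] = None,
    host: Optional[str] = None,
) -> List[Forest]:
    """Return the forests that match every criterion that was supplied."""
    if forest_name is None and database is None and host is None:
        # Refuse to select every forest by accident.
        raise ValueError("specify at least one of forest_name, database, host")
    return [
        f for f in forests
        if (forest_name is None or f.name == forest_name)
        and (database is None or f.database == database)
        and (host is None or f.host == host)
    ]


# Illustrative inventory; the names are assumptions, not real cluster data.
inventory = [
    Forest("myDatabase-1", "myDatabase", "host1"),
    Forest("myDatabase-2", "myDatabase", "host2"),
    Forest("myDatabase-3", "myDatabase", "host3"),
    Forest("other-1", "otherDatabase", "host3"),
]
```

Calling `select_forests(inventory, host="host3")` would cover the "all the forests of a given host" case, and combining `database` with `host` narrows the selection further.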

@khashayar
Author

By the way, speaking of mlRemoveHost: would it be possible to extend that task so that it removes the assigned forests before the host leaves the cluster, instead of just failing?

@rjrudin
Contributor

rjrudin commented Oct 19, 2020

If the use case is for removing a host, then you'd normally follow a procedure of (let's assume 3 hosts with 2 primary forests per host plus replicas, and data in every primary forest, and host 3 is the one you want to remove):

  1. Retire the forests on host 3
  2. After the rebalancer copies all the data to primary forests on hosts 1 and 2, detach the forests on host 3

After doing the above, you can safely remove host 3.

You've used the verbs "remove" and "delete" - are you really looking to automate the retire/detach process? Because at least for the process of removing a host, there's not a use case for deleting the forests - unless there's no data in them that needs to be rebalanced (you'd still need to detach them though).

Note that if you are looking to automate retire/detach, they'd be two separate Gradle tasks, as the rebalancing process could take hours to finish depending on the amount of data.
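
The retire-then-detach ordering described above, split into two separate steps, might look roughly like the following Python sketch. The `retire`, `detach`, and `rebalanced` callables are hypothetical hooks supplied by the caller; how each would talk to MarkLogic is deliberately left out, since nothing in this thread defines it.

```python
from typing import Callable, Iterable, List


def retire_forests(
    forest_names: Iterable[str],
    retire: Callable[[str], None],
) -> List[str]:
    """Step 1: retire each forest so the rebalancer can move its data away."""
    names = list(forest_names)
    for name in names:
        retire(name)  # hypothetical hook
    return names


def detach_forests(
    forest_names: Iterable[str],
    rebalanced: Callable[[], bool],
    detach: Callable[[str], None],
) -> List[str]:
    """Step 2: detach the forests, but only once rebalancing is reported done."""
    if not rebalanced():  # hypothetical hook
        raise RuntimeError("rebalancer has not finished; refusing to detach")
    names = list(forest_names)
    for name in names:
        detach(name)  # hypothetical hook
    return names
```

Keeping the two steps as separate functions mirrors the point about separate tasks: the second step can be run hours later, once the rebalancer has finished.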

@khashayar
Author

What you described is exactly my use case: retiring and detaching forests so that a host can leave a cluster.

I tested that manually with a forest of around 10 GB, and the rebalancing didn't take more than a few minutes. Since MarkLogic recommends a maximum of 200 GB per forest, I was guessing that it should still be manageable to automate this process, rebalancing included.

If you think that, in reality, it doesn't make sense to automate the whole process, then disregard my request and I will fall back to running multiple Gradle tasks to achieve this.

@rjrudin
Contributor

rjrudin commented Oct 19, 2020

I think there's value in tasks like this:

./gradlew mlRetireForests -PdatabaseName=myDatabase -PhostNames=host3
./gradlew mlDetachForests -PdatabaseName=myDatabase -PhostNames=host3

The catch is - how does ml-gradle, or any client, know when the rebalancer is finished? That isn't an ml-gradle problem, it's a problem for any client of the Manage API. If there's a program that can query the Manage API to know when the rebalancer is finished, that can be made an optional part of mlRetireForests or a new task itself - e.g. mlWaitForRebalancer -PdatabaseName=myDatabase.

You could then do everything like this:

./gradlew mlRetireForests mlWaitForRebalancer mlDetachForests -PdatabaseName=myDatabase -PhostNames=host3

And we could then have an "aggregate" task that does everything:

./gradlew mlRemoveForests -PdatabaseName=myDatabase -PhostNames=host3

Want to take a crack first at writing the program to know when the rebalancer is done?
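
As a possible starting point for such a program, the waiting part could be a plain polling loop with a timeout. The Python sketch below assumes a caller-supplied `is_finished` check. How that check would query the Manage API to decide the rebalancer is done is exactly the open question raised above, so it is left as a hypothetical callable, and the default timeout and poll interval are arbitrary.

```python
import time
from typing import Callable


def wait_for_rebalancer(
    is_finished: Callable[[], bool],
    timeout_seconds: float = 3600.0,
    poll_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_finished until it returns True or the timeout elapses.

    Returns True if the rebalancer was reported finished, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_finished():  # hypothetical check, e.g. backed by the Manage API
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

Returning a boolean rather than raising on timeout lets a calling task decide whether a timeout should fail the build or just print a message and exit.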

@khashayar
Author

I like your idea of waiting for the rebalancer to finish and chaining the tasks. It could also be used for the mlRemoveHost task!

@rjrudin rjrudin changed the title from "Add a task to remove a forest" to "Add tasks for retiring and detaching forests" Dec 10, 2020
@rjrudin rjrudin transferred this issue from marklogic/ml-gradle Jul 21, 2021
@rjrudin
Contributor

rjrudin commented Jul 21, 2021

Moved this to ml-app-deployer as most of the work will need to occur here first.

@rjrudin rjrudin transferred this issue from marklogic/ml-app-deployer Aug 21, 2024