You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
What is the bug?
On 2.18, model auto redeploy feature has been removed for remote models in this PR meaning after b/g or node/cluster restart, remote models wont be auto redeployed to new nodes automatically.
When the same had been back-ported to AOS 2.17, a bug was identified due to which ML Sync-up job would stop running automatically when a cluster is upgraded from 2.15 or any older versions to 2.17 on AOS. This condition returns directly if the model encountered by auto redeployer is a remote model and sync-up job is never getting started
The sync-up is only starting again with manual intervention by changing the sync-up interval setting to a different value than the previous one(Default is 10 seconds)
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.sync_up_job_interval_in_seconds": 5
}
}
How can one reproduce the bug?
Steps to reproduce the behavior:
On AOS, create a cluster on 2.15 version. Verify the sync_up job interval by getting the setting plugins.ml_commons.sync_up_job_interval_in_seconds
Check the logs to see the ML Sync-up logs at the intervals specified by above setting
Upgrade the cluster to 2.17 version
Check the logs again and you can see the sunc-up logs are missing
What is the expected behavior?
Sync-up job should run automatically when cluster is upgraded to 2.17 version
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
Add any other context about the problem.
The text was updated successfully, but these errors were encountered:
What is the bug?
On 2.18, model auto redeploy feature has been removed for remote models in this PR meaning after b/g or node/cluster restart, remote models wont be auto redeployed to new nodes automatically.
When the same had been back-ported to AOS 2.17, a bug was identified due to which ML Sync-up job would stop running automatically when a cluster is upgraded from 2.15 or any older versions to 2.17 on AOS. This condition returns directly if the model encountered by auto redeployer is a
remote
model and sync-up job is never getting startedThe sync-up is only starting again with manual intervention by changing the sync-up interval setting to a different value than the previous one(Default is 10 seconds)
How can one reproduce the bug?
Steps to reproduce the behavior:
plugins.ml_commons.sync_up_job_interval_in_seconds
What is the expected behavior?
Sync-up job should run automatically when cluster is upgraded to 2.17 version
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
Add any other context about the problem.
The text was updated successfully, but these errors were encountered: