[Fleet]: Unable to upgrade agents from 8.10.x to 8.11.0 #169825
Pinging @elastic/fleet (Team:Fleet)
@karanbirsingh-qasource Please review.
Secondary review for this ticket is done.
FYI @kpollich
This is not a bug; we changed the logic so that the latest released agent version is used everywhere (including install commands). It should appear as 8.11.0 as soon as the agent is released.
Closing as expected behavior, thanks @juliaElastic
Hi Team, thank you for confirming the expected behavior; we will keep this noted on our end. Thanks!
We have revalidated this issue on a released production Kibana Cloud environment and observed that it is still reproducible there. (Observations, screenshot, and build details attached.) Hence, we are reopening this issue. Thanks!
I looked into this, and it seems we do indeed have a bug, even after the 8.11 agent is released. Alternatively, we could just add the Kibana version back to the list of agent versions in stateful Kibana (at least when Kibana is on a GA version).
Thanks, @juliaElastic, for the investigation. The original intent of introducing the build-time product version manifest was to avoid network instability when fetching the agent versions. I do think we should introduce some kind of "eventual consistency" logic into Kibana to check with that API at runtime if some condition is met. Not seeing the current stack version in the available versions JSON file is a good condition, as you've mentioned. We could also periodically query that API in a Kibana background task to keep the available versions list on disk up to date. Either way, we need a "stale-while-revalidate" style workflow here.
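For illustration, a minimal sketch of that "stale-while-revalidate" idea, assuming hypothetical helpers (`readBuildTimeVersionsFile`, `parseAgentVersions`); this is not the actual Fleet implementation:

```ts
// Sketch: revalidate the build-time version manifest against the live
// product versions API whenever the running stack version is missing from it.
const PRODUCT_VERSIONS_URL = 'https://www.elastic.co/api/product_versions';

async function getAvailableAgentVersions(
  currentStackVersion: string,
  readBuildTimeVersionsFile: () => Promise<string[]>,
  parseAgentVersions: (apiResponse: unknown) => string[]
): Promise<string[]> {
  // Fast path: the manifest baked in at build time already knows about this release.
  const staticVersions = await readBuildTimeVersionsFile();
  if (staticVersions.includes(currentStackVersion)) {
    return staticVersions;
  }

  // Stale manifest: revalidate against the live API, but keep serving the
  // static list if the network call fails (e.g. airgapped environments).
  try {
    const response = await fetch(PRODUCT_VERSIONS_URL);
    const liveVersions = parseAgentVersions(await response.json());
    return Array.from(new Set([...staticVersions, ...liveVersions]));
  } catch {
    return staticVersions;
  }
}
```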
+1 - this should be an immediate priority for us to fix. We're blocking the simplest path to upgrading agents to 8.11.0 today.
I've created a Known Issue article, and the workaround (suggested by Julia) would be to just type the target version (e.g. 8.11.0) manually into the version field of the upgrade modal.
@kpollich Moreover, with the early agent release coming soon, I don't think we can rely exclusively on a file created at Kibana build time. Agreed on fixing this now.
## Summary

Closes #169825

This PR adds logic to Fleet's `/api/agents/available_versions` endpoint that ensures we periodically try to fetch from the live product versions API at https://www.elastic.co/api/product_versions, so that we have eventual consistency in the list of available agent versions.

Currently, Kibana relies entirely on a static file generated at build time from the above API. If that API isn't up to date with the latest agent version (e.g. Kibana completed its build before Agent), then that build of Kibana will never "see" the corresponding build of Agent.

This API endpoint is cached for two hours to prevent overfetching from this external API, and from constantly going out to disk to read from the agent versions file.

## To do

- [x] Update unit tests
- [x] Consider airgapped environments

## On airgapped environments

In airgapped environments, we're going to try to fetch from the `product_versions` API and that request is going to fail. What we've seen happen in some environments is that these requests do not "fail fast" and instead wait until a network timeout is reached.

I'd love to avoid that timeout case and somehow detect airgapped environments so we can avoid calling this API at all. However, we don't have a great deterministic way to know if someone is in an airgapped environment. The best guess I think we can make is by checking whether `xpack.fleet.registryUrl` is set to something other than `https://epr.elastic.co`. Curious if anyone has thoughts on this.

## Screenshots

![image](https://github.com/elastic/kibana/assets/6766512/0906817c-0098-4b67-8791-d06730f450f6)
![image](https://github.com/elastic/kibana/assets/6766512/59e7c132-f568-470f-b48d-53761ddc2fde)
![image](https://github.com/elastic/kibana/assets/6766512/986372df-a90f-48c3-ae24-c3012e8f7730)

## To test

1. Set up Fleet Server + ES + Kibana
2. Spin up a Fleet Server running Agent v8.11.0
3. Enroll an agent running v8.10.4 (I used multipass)
4. Verify the agent can be upgraded from the UI

---------

Co-authored-by: Kibana Machine <[email protected]>
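As a rough illustration of the two-hour cache and the `xpack.fleet.registryUrl` airgapped heuristic described above (helper names and the exact heuristic are assumptions, not the merged implementation):

```ts
// Sketch: cache the computed version list for two hours and skip the external
// call when the deployment looks airgapped (custom package registry URL).
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;
const DEFAULT_REGISTRY_URL = 'https://epr.elastic.co';

let cachedVersions: string[] | undefined;
let cachedAt = 0;

function isProbablyAirgapped(registryUrl: string | undefined): boolean {
  // Best-effort guess: a non-default registry URL often means no outbound internet.
  return registryUrl !== undefined && registryUrl !== DEFAULT_REGISTRY_URL;
}

async function getVersionsWithCache(
  registryUrl: string | undefined,
  fetchLiveVersions: () => Promise<string[]>,
  readBuildTimeVersionsFile: () => Promise<string[]>
): Promise<string[]> {
  if (cachedVersions && Date.now() - cachedAt < TWO_HOURS_MS) {
    return cachedVersions; // Serve the cached list to avoid overfetching.
  }

  let versions: string[];
  if (isProbablyAirgapped(registryUrl)) {
    versions = await readBuildTimeVersionsFile(); // Skip the external call entirely.
  } else {
    try {
      versions = await fetchLiveVersions();
    } catch {
      versions = await readBuildTimeVersionsFile(); // Fall back on network failure.
    }
  }

  cachedVersions = versions;
  cachedAt = Date.now();
  return versions;
}
```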
[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974) (cherry picked from commit cd909f0). Conflicts: `x-pack/plugins/fleet/server/services/agents/versions.ts`
[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974) (#171039)

# Backport

This will backport the following commits from `main` to `8.11`:
- [[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974)](#170974)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kibana Machine <[email protected]>
Hi folks, I'm following up on all the issues/threads where this was reported and tracked. Due to 8.11.1 being pulled forward for an emergency release, this bugfix missed the 8.11.1 build candidate window. The root cause fix in #170974 will be available in the 8.11.2 release, scheduled for public availability on December 4th, 2023. In the meantime, there is a documented workaround available in our support knowledge base here: https://support.elastic.co/knowledge/86bef2c1 (requires an Elastic Cloud login).
After some manual testing on 8.11.1, it is possible to upgrade agents to 8.11.0 on this patch release. However, upgrading to agent 8.11.1 is still not possible without manually entering 8.11.1 into the agent upgrade box and selecting the "Use 8.11.1 as a custom option (not recommended)" option that appears, because #170974 was not included in the final BC. The 8.10.x -> 8.11.0 upgrade, though, should be unblocked by this release.
Can the workaround steps be shared for those who do not use Elastic Cloud? I use on-prem myself.
Hello @Aqualie, I am sharing it here. @kpollich, I don't know if we can somehow put it in the release notes.

### Summary / Table of Contents

The Fleet UI in Kibana 8.11.0 does not offer 8.11.0 as an available agent version, so agents on 8.10.x cannot be upgraded through the normal flow.

### Environment

The problem affects users on Kibana 8.11.0 on any platform.

### Workaround

**Only when attempting to upgrade Elastic Agents currently on version 8.10.3 or earlier**

From the Fleet UI, the version field in the version picker of the upgrade modal allows you to enter any version. If you are on Elastic Agent 8.10.4, this workaround is not applicable; follow the one in the next section.

**This workaround is applicable to any Elastic Agent version, but is more complex as it has to be done via APIs**

For a single Elastic Agent, you can run the following in Kibana Dev Tools:
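A request along these lines (the agent ID is a placeholder; this is a sketch based on the documented Fleet single-agent upgrade endpoint, not necessarily the exact request from the knowledge base article). In recent Kibana versions, the `kbn:` prefix tells the Dev Tools console to send the request to a Kibana API rather than Elasticsearch.

```
POST kbn:/api/fleet/agents/<AGENT_ID>/upgrade
{
  "version": "8.11.0"
}
```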
Example request for a known group of Elastic Agent IDs:
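Again as a sketch, using the documented Fleet bulk upgrade endpoint (agent IDs are placeholders):

```
POST kbn:/api/fleet/agents/bulk_upgrade
{
  "version": "8.11.0",
  "agents": ["<AGENT_ID_1>", "<AGENT_ID_2>"]
}
```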
Example request for Elastic Agents running a specific policy and below a specific version (e.g. 8.11.0):
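The bulk upgrade endpoint also accepts a KQL query instead of a list of IDs. The query below is only illustrative (the policy ID is a placeholder, and the field name is an assumption): it selects agents on the given policy that are not already on 8.11.0, which approximates "below 8.11.0" when 8.11.0 is the latest release.

```
POST kbn:/api/fleet/agents/bulk_upgrade
{
  "version": "8.11.0",
  "agents": "policy_id:<POLICY_ID> and not local_metadata.elastic.agent.version:8.11.0"
}
```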
A trick to get the Elastic Agent IDs shown in the Fleet UI is to inspect the Network tab of the browser's Developer Tools and look for the request that lists the agents. The Fleet APIs are documented here.

### Fix

Fixed by PR #170974, landing in 8.11.2. Due to 8.11.1 being pulled forward for an emergency release, the root cause fix will not land until 8.11.2 on December 4th, 2023.
I am working on getting this known issue into the 8.11.0 and 8.11.1 release notes within the next day or two.
Maybe we should update this slightly, since it affects both 8.11.0 and 8.11.1, not just 8.11.0.
## Summary

The 8.11.1 release notes included #170974, which didn't actually land in 8.11.1. We shipped BC2 of 8.11.1, which was built from this Kibana commit: https://github.com/elastic/kibana/commits/09feaf416f986b239b8e8ad95ecdda0f9d56ebec. The PR was not merged until after this commit, so the bug is still present (though [mitigated slightly](#169825 (comment))) in 8.11.1.

This PR removes the erroneous release note from the 8.11.1 release notes. How can we make sure the fix _does_ get included in the eventual 8.11.2 release notes?
[Fleet] Remove agent upgrade fix from 8.11.1 release notes (#171200) (cherry picked from commit 480fcef)
[Fleet] Remove agent upgrade fix from 8.11.1 release notes (#171200) (#171249)

# Backport

This will backport the following commits from `main` to `8.11`:
- [[Fleet] Remove agent upgrade fix from 8.11.1 release notes (#171200)](#171200)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kyle Pollich <[email protected]>
Hi Team, we have revalidated this issue on the latest 8.11.3 Kibana Cloud environment and found it fixed now. Observations, screen recordings (Agents.-.Fleet.-.Elastic.-.Google.Chrome.2023-12-13.11-35-42.mp4, Agents.-.Fleet.-.Elastic.-.Google.Chrome.2023-12-13.11-37-12.mp4), and build details are attached. Hence, we are marking this issue as QA:Validated. Thanks!
Kibana Build details:
Host OS: All

Preconditions:

Steps to reproduce:
Agent install commands are available only with version 8.10.3 under the Add agent flyout. Screenshot:

Expected Result:
Agent install commands should be available with version 8.11.0 under the Add agent flyout.