[8.11] [Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974) (#171039)

# Backport

This will backport the following commits from `main` to `8.11`:
- [[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11
(#170974)](#170974)

<!--- Backport version: 8.9.8 -->

### Questions?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

Backport metadata:
- Source commit: `cd909f03b1d71da93041a0b5c184243aa6506dea` on `main`, committed 2023-11-10T16:08:09Z, authored by Kyle Pollich
- Source pull request: https://github.com/elastic/kibana/pull/170974 (state `MERGED`, label `v8.12.0`)
- Labels: `release_note:fix`, `Team:Fleet`, `backport:prev-minor`, `v8.12.0`, `v8.11.1`
- Suggested target branch: `8.11` (label `v8.11.1`, state `NOT_CREATED`)

Source commit message:

[Fleet] Fix inability to upgrade agents from 8.10.4 -> 8.11 (#170974)

## Summary

Closes https://github.com/elastic/kibana/issues/169825

This PR adds logic to Fleet's `/api/agents/available_versions` endpoint
that will ensure we periodically try to fetch from the live product
versions API at https://www.elastic.co/api/product_versions to make sure
we have eventual consistency in the list of available agent versions.

Currently, Kibana relies entirely on a static file generated at build
time from the above API. If the API isn't up-to-date with the latest
agent version (e.g. kibana completed its build before agent), then that
build of Kibana will never "see" the corresponding build of agent.

This API endpoint is cached for two hours to prevent overfetching from
this external API, and from constantly going out to disk to read from
the agent versions file.

## To do
- [x] Update unit tests
- [x] Consider airgapped environments

## On airgapped environments

In airgapped environments, we're going to try and fetch from the
`product_versions` API and that request is going to fail. What we've
seen happen in some environments is that these requests do not "fail
fast" and instead wait until a network timeout is reached.

I'd love to avoid that timeout case and somehow detect airgapped
environments and avoid calling this API at all. However, we don't have a
great deterministic way to know if someone is in an airgapped
environment. The best guess I think we can make is by checking whether
`xpack.fleet.registryUrl` is set to something other than
`https://epr.elastic.co`. Curious if anyone has thoughts on this.

## Screenshots

![image](https://github.com/elastic/kibana/assets/6766512/0906817c-0098-4b67-8791-d06730f450f6)

![image](https://github.com/elastic/kibana/assets/6766512/59e7c132-f568-470f-b48d-53761ddc2fde)

![image](https://github.com/elastic/kibana/assets/6766512/986372df-a90f-48c3-ae24-c3012e8f7730)

## To test

1. Set up Fleet Server + ES + Kibana
2. Spin up a Fleet Server running Agent v8.11.0
3. Enroll an agent running v8.10.4 (I used multipass)
4. Verify the agent can be upgraded from the UI

---------

Co-authored-by: Kibana Machine <[email protected]>
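
The caching behaviour described in the summary can be illustrated with a small time-to-live cell. This is only a sketch, not the Fleet implementation: `createTtlCell`, `TtlCell`, and the injectable `now` clock are hypothetical names introduced here. The one-hour figure follows the `CACHE_DURATION` constant in the `versions.ts` diff of this commit, whereas the description above mentions two hours.

```typescript
// Hypothetical helper (not part of Fleet): a single-value cache whose
// entry expires once ttlMs milliseconds have passed since the last write.
interface TtlCell<T> {
  read(): T | undefined;
  write(value: T): void;
}

function createTtlCell<T>(ttlMs: number, now: () => number = Date.now): TtlCell<T> {
  let value: T | undefined;
  let storedAt: number | undefined;

  return {
    // Returns the stored value, or undefined if nothing is stored or it has expired.
    read() {
      if (value === undefined || storedAt === undefined) {
        return undefined;
      }
      return now() - storedAt < ttlMs ? value : undefined;
    },
    // Stores a value and records when it was stored.
    write(next) {
      value = next;
      storedAt = now();
    },
  };
}

// Usage with a fake clock so that expiry can be observed without waiting.
const ONE_HOUR_MS = 1000 * 60 * 60;
let fakeNow = 0;
const versionsCache = createTtlCell<string[]>(ONE_HOUR_MS, () => fakeNow);

versionsCache.write(['8.11.0', '8.10.4']);
fakeNow = ONE_HOUR_MS - 1;
const stillFresh = versionsCache.read(); // entry is younger than the TTL
fakeNow = ONE_HOUR_MS;
const expired = versionsCache.read(); // entry has reached the TTL
```

Injecting the clock is a design choice of this sketch: it lets a test advance time explicitly, which is one alternative to an `ignoreCache`-style flag.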

Co-authored-by: Kibana Machine <[email protected]>
kpollich and kibanamachine authored Nov 12, 2023
1 parent 8c233cd commit 00995c2
Showing 4 changed files with 203 additions and 42 deletions.
3 changes: 2 additions & 1 deletion x-pack/plugins/fleet/server/routes/agent/handlers.ts
@@ -354,8 +354,9 @@ function isStringArray(arr: unknown | string[]): arr is string[] {

 export const getAvailableVersionsHandler: RequestHandler = async (context, request, response) => {
   try {
-    const availableVersions = await AgentService.getAvailableVersions({});
+    const availableVersions = await AgentService.getAvailableVersions();
     const body: GetAvailableVersionsResponse = { items: availableVersions };
+
     return response.ok({ body });
   } catch (error) {
     return defaultFleetErrorHandler({ error, response });
6 changes: 6 additions & 0 deletions x-pack/plugins/fleet/server/services/agents/crud.test.ts
@@ -9,7 +9,9 @@ import type { ElasticsearchClient } from '@kbn/core/server';
 import { elasticsearchServiceMock, savedObjectsClientMock } from '@kbn/core/server/mocks';

 import { AGENTS_INDEX } from '../../constants';
+import { createAppContextStartContractMock } from '../../mocks';
 import type { Agent } from '../../types';
+import { appContextService } from '../app_context';

 import { auditLoggingService } from '../audit_logging';

@@ -30,6 +32,7 @@ const mockedAuditLoggingService = auditLoggingService as jest.Mocked<typeof audi

 describe('Agents CRUD test', () => {
   const soClientMock = savedObjectsClientMock.create();
+  let mockContract: ReturnType<typeof createAppContextStartContractMock>;
   let esClientMock: ElasticsearchClient;
   let searchMock: jest.Mock;

@@ -41,6 +44,9 @@ describe('Agents CRUD test', () => {
       openPointInTime: jest.fn().mockResolvedValue({ id: '1' }),
       closePointInTime: jest.fn(),
     } as unknown as ElasticsearchClient;
+
+    mockContract = createAppContextStartContractMock();
+    appContextService.start(mockContract);
   });

   function getEsResponse(ids: string[], total: number) {
110 changes: 105 additions & 5 deletions x-pack/plugins/fleet/server/services/agents/versions.test.ts
@@ -7,6 +7,8 @@

 import { readFile } from 'fs/promises';

+import fetch from 'node-fetch';
+
 let mockKibanaVersion = '300.0.0';
 let mockConfig = {};
 jest.mock('../app_context', () => {
@@ -21,25 +23,40 @@ jest.mock('../app_context', () => {
 });

 jest.mock('fs/promises');
+jest.mock('node-fetch');

 const mockedReadFile = readFile as jest.MockedFunction<typeof readFile>;
+const mockedFetch = fetch as jest.MockedFunction<typeof fetch>;
+
+const emptyResponse = {
+  status: 200,
+  text: jest.fn().mockResolvedValue(JSON.stringify({})),
+} as any;
+
 import { getAvailableVersions } from './versions';

 describe('getAvailableVersions', () => {
+  beforeEach(() => {
+    mockedReadFile.mockReset();
+    mockedFetch.mockReset();
+  });
+
   it('should return available version and filter version < 7.17', async () => {
     mockKibanaVersion = '300.0.0';
     mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce(emptyResponse);

-    const res = await getAvailableVersions({ cached: false, includeCurrentVersion: true });
+    const res = await getAvailableVersions({ includeCurrentVersion: true, ignoreCache: true });

     expect(res).toEqual(['300.0.0', '8.1.0', '8.0.0', '7.17.0']);
   });

   it('should not strip -SNAPSHOT from kibana version', async () => {
     mockKibanaVersion = '300.0.0-SNAPSHOT';
     mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce(emptyResponse);

-    const res = await getAvailableVersions({ cached: false, includeCurrentVersion: true });
+    const res = await getAvailableVersions({ includeCurrentVersion: true, ignoreCache: true });
     expect(res).toEqual(['300.0.0-SNAPSHOT', '8.1.0', '8.0.0', '7.17.0']);
   });

@@ -51,17 +68,19 @@ describe('getAvailableVersions', () => {
       },
     };
     mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce(emptyResponse);

-    const res = await getAvailableVersions({ cached: false });
+    const res = await getAvailableVersions({ ignoreCache: true });

     expect(res).toEqual(['8.1.0', '8.0.0', '7.17.0']);
   });

   it('should not include the current version if includeCurrentVersion = false', async () => {
     mockKibanaVersion = '300.0.0-SNAPSHOT';
     mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce(emptyResponse);

-    const res = await getAvailableVersions({ cached: false, includeCurrentVersion: false });
+    const res = await getAvailableVersions({ includeCurrentVersion: false, ignoreCache: true });

     expect(res).toEqual(['8.1.0', '8.0.0', '7.17.0']);
   });
@@ -74,9 +93,90 @@ describe('getAvailableVersions', () => {
       },
     };
     mockedReadFile.mockRejectedValue({ code: 'ENOENT' });
+    mockedFetch.mockResolvedValueOnce(emptyResponse);

-    const res = await getAvailableVersions({ cached: false });
+    const res = await getAvailableVersions({ ignoreCache: true });

     expect(res).toEqual(['300.0.0']);
   });
+
+  it('should include versions returned from product_versions API', async () => {
+    mockKibanaVersion = '300.0.0';
+    mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce({
+      status: 200,
+      text: jest.fn().mockResolvedValue(
+        JSON.stringify([
+          [
+            {
+              title: 'Elastic Agent 8.1.0',
+              version_number: '8.1.0',
+            },
+            {
+              title: 'Elastic Agent 8.10.0',
+              version_number: '8.10.0',
+            },
+            {
+              title: 'Elastic Agent 8.9.2',
+              version_number: '8.9.2',
+            },
+            ,
+          ],
+        ])
+      ),
+    } as any);
+
+    const res = await getAvailableVersions({ ignoreCache: true });
+
+    // Should sort, uniquify and filter out versions < 7.17
+    expect(res).toEqual(['8.10.0', '8.9.2', '8.1.0', '8.0.0', '7.17.0']);
+  });
+
+  it('should cache results', async () => {
+    mockKibanaVersion = '300.0.0';
+    mockedReadFile.mockResolvedValue(`["8.1.0", "8.0.0", "7.17.0", "7.16.0"]`);
+    mockedFetch.mockResolvedValueOnce({
+      status: 200,
+      text: jest.fn().mockResolvedValue(
+        JSON.stringify([
+          [
+            {
+              title: 'Elastic Agent 8.1.0',
+              version_number: '8.1.0',
+            },
+            {
+              title: 'Elastic Agent 8.10.0',
+              version_number: '8.10.0',
+            },
+            {
+              title: 'Elastic Agent 8.9.2',
+              version_number: '8.9.2',
+            },
+            ,
+          ],
+        ])
+      ),
+    } as any);
+
+    await getAvailableVersions();
+
+    mockedFetch.mockResolvedValueOnce({
+      status: 200,
+      text: jest.fn().mockResolvedValue(
+        JSON.stringify([
+          [
+            {
+              title: 'Elastic Agent 300.0.0',
+              version_number: '300.0.0',
+            },
+          ],
+        ])
+      ),
+    } as any);
+
+    const res2 = await getAvailableVersions();
+
+    expect(mockedFetch).toBeCalledTimes(1);
+    expect(res2).not.toContain('300.0.0');
+  });
 });
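
The expectation in the `product_versions` test above (merge the file and API lists, drop versions below 7.17, sort newest first, remove duplicates) can be reproduced with a dependency-free sketch. The commit itself relies on the `semver` package and lodash `uniq`; the `parseVersion`, `compareVersions`, and `mergeVersions` helpers below are hypothetical stand-ins written for illustration and only understand plain `major.minor.patch` triples.

```typescript
type Version = [number, number, number];

// Extracts the first major.minor.patch triple, ignoring suffixes such as -SNAPSHOT.
// Returns undefined when the input contains no such triple.
function parseVersion(raw: string): Version | undefined {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(raw);
  if (!match) {
    return undefined;
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative when a < b, positive when a > b, zero when equal.
function compareVersions(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) {
      return a[i] - b[i];
    }
  }
  return 0;
}

// Merges both sources, keeps versions >= minimum, sorts newest first, de-duplicates.
function mergeVersions(fromFile: string[], fromApi: string[], minimum: string): string[] {
  const min = parseVersion(minimum);
  if (!min) {
    throw new Error(`Invalid minimum version: ${minimum}`);
  }

  const sorted = [...fromFile, ...fromApi]
    .map((raw) => parseVersion(raw))
    .filter((v): v is Version => v !== undefined && compareVersions(v, min) >= 0)
    .sort((a, b) => compareVersions(b, a));

  // De-duplicate after normalizing; a Set keeps first-seen (here: sorted) order.
  return Array.from(new Set(sorted.map((v) => v.join('.'))));
}

const merged = mergeVersions(
  ['8.1.0', '8.0.0', '7.17.0', '7.16.0'],
  ['8.1.0', '8.10.0', '8.9.2'],
  '7.17.0'
);
// merged is ['8.10.0', '8.9.2', '8.1.0', '8.0.0', '7.17.0']
```

The numeric comparison is what places `8.10.0` ahead of `8.9.2`; a plain string sort would order them the other way round. This sketch de-duplicates after normalizing, so two inputs that reduce to the same triple collapse into one entry.
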
126 changes: 90 additions & 36 deletions x-pack/plugins/fleet/server/services/agents/versions.ts
@@ -8,18 +8,27 @@
 import { readFile } from 'fs/promises';
 import Path from 'path';

-import { REPO_ROOT } from '@kbn/repo-info';
+import fetch from 'node-fetch';
+import pRetry from 'p-retry';
 import { uniq } from 'lodash';
 import semverGte from 'semver/functions/gte';
 import semverGt from 'semver/functions/gt';
 import semverCoerce from 'semver/functions/coerce';

+import { REPO_ROOT } from '@kbn/repo-info';
+
 import { appContextService } from '..';

 const MINIMUM_SUPPORTED_VERSION = '7.17.0';
 const AGENT_VERSION_BUILD_FILE = 'x-pack/plugins/fleet/target/agent_versions_list.json';

-let availableVersions: string[] | undefined;
+// Endpoint maintained by the web-team and hosted on the elastic website
+const PRODUCT_VERSIONS_URL = 'https://www.elastic.co/api/product_versions';
+
+// Cache available versions in memory for 1 hour
+const CACHE_DURATION = 1000 * 60 * 60;
+let CACHED_AVAILABLE_VERSIONS: string[] | undefined;
+let LAST_FETCHED: number | undefined;

 export const getLatestAvailableVersion = async (
   includeCurrentVersion?: boolean
@@ -30,55 +39,100 @@ export const getLatestAvailableVersion = async (
 };

 export const getAvailableVersions = async ({
-  cached = true,
   includeCurrentVersion,
+  ignoreCache = false, // This is only here to allow us to ignore the cache in tests
 }: {
-  cached?: boolean;
   includeCurrentVersion?: boolean;
-}): Promise<string[]> => {
-  // Use cached value to avoid reading from disk each time
-  if (cached && availableVersions) {
-    return availableVersions;
+  ignoreCache?: boolean;
+} = {}): Promise<string[]> => {
+  const logger = appContextService.getLogger();
+
+  if (LAST_FETCHED && !ignoreCache) {
+    const msSinceLastFetched = Date.now() - (LAST_FETCHED || 0);
+
+    if (msSinceLastFetched < CACHE_DURATION && CACHED_AVAILABLE_VERSIONS !== undefined) {
+      logger.debug(`Cache is valid, returning cached available versions`);
+
+      return CACHED_AVAILABLE_VERSIONS;
+    }
+
+    logger.debug('Cache has expired, fetching available versions from disk + API');
   }

-  // Read a static file generated at build time
   const config = appContextService.getConfig();
-  let versionsToDisplay: string[] = [];
-
   const kibanaVersion = appContextService.getKibanaVersion();

+  let availableVersions: string[] = [];
+
+  // First, grab available versions from the static file that's placed on disk at build time
   try {
     const file = await readFile(Path.join(REPO_ROOT, AGENT_VERSION_BUILD_FILE), 'utf-8');
-
-    // Exclude versions older than MINIMUM_SUPPORTED_VERSION and pre-release versions (SNAPSHOT, rc..)
-    // De-dup and sort in descending order
     const data: string[] = JSON.parse(file);

-    const versions = data
-      .map((item: any) => semverCoerce(item)?.version || '')
-      .filter((v: any) => semverGte(v, MINIMUM_SUPPORTED_VERSION))
-      .sort((a: any, b: any) => (semverGt(a, b) ? -1 : 1));
-    versionsToDisplay = uniq(versions) as string[];
+    availableVersions = [...availableVersions, ...data];
+  } catch (error) {
+    // If we can't read from the file, the error is non-blocking. We'll try to source data from the
+    // product versions API later.
+    logger.debug(`Error reading file ${AGENT_VERSION_BUILD_FILE}: ${error.message}`);
+  }
+
+  // Next, fetch from the product versions API. This API call is aggressively cached, so we won't
+  // fetch from the live API more than `TIME_BETWEEN_FETCHES` milliseconds.
+  const apiVersions = await fetchAgentVersionsFromApi();
+
+  // Coerce each version to a semver object and compare to our `MINIMUM_SUPPORTED_VERSION` - we
+  // only want support versions in the final result. We'll also sort by newest version first.
+  availableVersions = uniq([...availableVersions, ...apiVersions])
+    .map((item: any) => semverCoerce(item)?.version || '')
+    .filter((v: any) => semverGte(v, MINIMUM_SUPPORTED_VERSION))
+    .sort((a: any, b: any) => (semverGt(a, b) ? -1 : 1));
+
+  // If the current stack version isn't included in the list of available versions, add it
+  // at the front of the array
+  const hasCurrentVersion = availableVersions.some((v) => v === kibanaVersion);
+  if (includeCurrentVersion && !hasCurrentVersion) {
+    availableVersions = [kibanaVersion, ...availableVersions];
+  }

-    const appendCurrentVersion = includeCurrentVersion;
+  // Allow upgrading to the current stack version if this override flag is provided via `kibana.yml`.
+  // This is useful for development purposes.
+  if (availableVersions.length === 0 && !config?.internal?.onlyAllowAgentUpgradeToKnownVersions) {
+    availableVersions = [kibanaVersion];
+  }

-    if (appendCurrentVersion) {
-      // Add current version if not already present
-      const hasCurrentVersion = versionsToDisplay.some((v) => v === kibanaVersion);
+  // Don't prime the cache in tests
+  if (!ignoreCache) {
+    CACHED_AVAILABLE_VERSIONS = availableVersions;
+    LAST_FETCHED = Date.now();
+  }

-      versionsToDisplay = !hasCurrentVersion
-        ? [kibanaVersion].concat(versionsToDisplay)
-        : versionsToDisplay;
-    }
+  return availableVersions;
+};

-    availableVersions = versionsToDisplay;
+async function fetchAgentVersionsFromApi() {
+  const logger = appContextService.getLogger();

-    return availableVersions;
-  } catch (e) {
-    if (e.code === 'ENOENT' && !config?.internal?.onlyAllowAgentUpgradeToKnownVersions) {
-      // If the file does not exist, return the current version
-      return [kibanaVersion];
-    }
-    throw e;
+  const options = {
+    headers: {
+      'Content-Type': 'application/json',
+    },
+  };
+
+  const response = await pRetry(() => fetch(PRODUCT_VERSIONS_URL, options), { retries: 1 });
+  const rawBody = await response.text();
+
+  // We need to handle non-200 responses gracefully here to support airgapped environments where
+  // Kibana doesn't have internet access to query this API
+  if (response.status >= 400) {
+    logger.debug(`Status code ${response.status} received from versions API: ${rawBody}`);
+    return [];
   }
-};
+
+  const jsonBody = JSON.parse(rawBody);
+
+  const versions: string[] = (jsonBody.length ? jsonBody[0] : [])
+    .filter((item: any) => item?.title?.includes('Elastic Agent'))
+    .map((item: any) => item?.version_number);
+
+  return versions;
+}
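
The response handling at the end of `fetchAgentVersionsFromApi` can be sketched in isolation. Judging from the test fixtures in this commit, the body is assumed to be an array whose first element is a list of `{ title, version_number }` objects; the real API may return more fields. `extractAgentVersions`, `isAgentEntry`, and `AgentEntry` are hypothetical names, and the `Kibana 8.1.0` entry in the sample is made up purely to show the title filter.

```typescript
interface AgentEntry {
  title: string;
  version_number: string;
}

// Narrowing guard: accepts only objects whose title mentions "Elastic Agent"
// and whose version_number is a string.
function isAgentEntry(item: unknown): item is AgentEntry {
  if (typeof item !== 'object' || item === null) {
    return false;
  }
  const record = item as Record<string, unknown>;
  const title = record['title'];
  return (
    typeof title === 'string' &&
    title.includes('Elastic Agent') &&
    typeof record['version_number'] === 'string'
  );
}

// Returns the Elastic Agent version numbers found in a raw response body.
// Any body that is not valid JSON, or not shaped as [[...]], yields an empty list.
function extractAgentVersions(rawBody: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return [];
  }

  if (!Array.isArray(parsed) || !Array.isArray(parsed[0])) {
    return [];
  }

  const entries: unknown[] = parsed[0];
  return entries.filter(isAgentEntry).map((entry) => entry.version_number);
}

// Illustrative body; the null entry mimics the hole left by the stray comma in the test fixture.
const sampleBody = JSON.stringify([
  [
    { title: 'Elastic Agent 8.10.0', version_number: '8.10.0' },
    { title: 'Kibana 8.1.0', version_number: '8.1.0' },
    null,
  ],
]);
const agentVersions = extractAgentVersions(sampleBody); // ['8.10.0']
```

Unlike the committed code, this sketch also swallows JSON parse errors. That is a design choice made here so that an unexpected body, such as an HTML error page returned by a proxy, degrades to an empty list; it is not something the commit does.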
