Failing test: Security Solution Cypress.x-pack/plugins/security_solution/public/management/cypress/e2e/response_actions/response_console/execute·cy·ts - Response console Execute operations: "execute --command" - should execute a command "execute --command" - should execute a command #172319

Closed
kibanamachine opened this issue Nov 30, 2023 · 4 comments · Fixed by #172463
Labels: failed-test, Team:Defend Workflows, Team: SecuritySolution

Comments

@kibanamachine

A test failed on a tracked branch

AssertionError: Timed out retrying after 60000ms: Expected to find content: 'test-host-2321' but never did.
    at waitForEndpointListPageToBeLoaded (webpack:///./tasks/response_console.ts:17:32)
    at Context.eval (webpack:///./e2e/response_actions/response_console/execute.cy.ts:67:40)

First failure: CI Build - main

@kibanamachine kibanamachine added the failed-test A test failure on a tracked branch, potentially flaky-test label Nov 30, 2023
@botelastic botelastic bot added the needs-team Issues missing a team label label Nov 30, 2023
@kibanamachine kibanamachine added the Team:Defend Workflows “EDR Workflows” sub-team of Security Solution label Nov 30, 2023
@elasticmachine

Pinging @elastic/security-defend-workflows (Team:Defend Workflows)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Nov 30, 2023
@kevinlog

kevinlog commented Nov 30, 2023

@ashokaditya looks like this failed on Serverless specifically. If it continues to be flaky, we could just mark this as brokenInServerless

cc @szwarckonrad - it looks like it may have been one of the flaky Agent installs? I can see the 2 attempts to install here. I remember you had suggested an additional retry, do you think it would work here?

@kevinlog kevinlog assigned szwarckonrad and unassigned ashokaditya Nov 30, 2023
@szwarckonrad

I don't think that is the case with this particular failure.
Logs for this job:

  1. Endpoint host creation failed - log
  2. The second try does not throw, so execution moves on from the task (retryable endpoint creation) in `before` to the first test case

We can observe here, in a different test failure, that when the task fails it throws errors twice. The first is caught and starts the retry procedure; the second causes the `before` hook to fail altogether.

Given that the test in question failed in a test case and not in `before`, and that only one error was thrown in the task, I believe the task itself succeeded in creating an endpoint host, but the test failed because the host never came up in Kibana.

Response console -- Execute operations -- execute --command - should execute a command (failed) (attempt 2)

@MindyRS MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Dec 14, 2023
@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

szwarckonrad added a commit that referenced this issue Dec 21, 2023
This PR addresses 3 known issues with environment setup on Serverless CI
pipelines.

### Setup task fails on:
### 1. Host recreation after first failed attempt

Test fails on: `AssertionError: Timed out retrying after 60000ms:
Expected to find content: 'test-host-2321' but never did.`

Failed job from before:
[here](https://buildkite.com/elastic/kibana-on-merge/builds/38728#018c219b-c0b2-41a6-9cba-2613fa85382c)

The endpoint creation task succeeds (the host is enrolled with Fleet),
but the host doesn't appear in Kibana. Since we use the /metadata
endpoint to list all agents, I've added a check in the task to see
whether the endpoint makes it into the metadata. If it doesn't, I delete
the endpoint and retry, with an additional search of the .fleet-agents,
metadata-current and metadata-unified indices (thanks @joeypoon).

Successful job:
[here](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4284#018c462c-21ec-44c0-afea-a630253e7717)
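
The check-then-recreate flow described above could look roughly like the sketch below. It is only an illustration: `HostOps`, `waitUntil`, `createVisibleHost` and the default timings are hypothetical names and values, not the helpers or settings used in the Kibana task, and `isVisible` stands in for whatever lookup against the metadata is actually performed.

```typescript
// Hypothetical shapes: these names are illustrative, not the actual Kibana helpers.
interface HostOps<H> {
  create: () => Promise<H>;
  isVisible: (host: H) => Promise<boolean>; // e.g. a lookup against the metadata
  destroy: (host: H) => Promise<void>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll `check` until it returns true or `timeoutMs` has elapsed.
async function waitUntil(
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs: number
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  do {
    if (await check()) return true;
    await sleep(intervalMs);
  } while (Date.now() < deadline);
  return false;
}

// Create a host; if it never becomes visible, delete it and create a new one.
async function createVisibleHost<H>(
  ops: HostOps<H>,
  { attempts = 2, timeoutMs = 1000, intervalMs = 50 } = {}
): Promise<H> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const host = await ops.create();
    if (await waitUntil(() => ops.isVisible(host), timeoutMs, intervalMs)) {
      return host;
    }
    // Enrolled but never surfaced: clean up before trying again.
    await ops.destroy(host);
  }
  throw new Error(`Host did not become visible after ${attempts} attempts`);
}
```

Failing inside the setup task, rather than letting the test case time out while looking for the hostname, keeps the error next to its cause.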

### 2. Fleet server not coming up

Test fails on: `│ERROR Error: Timed out waiting for fleet server
[dev-fleet-server.8284.gns5] to register with Elasticsarch`

Failed job from before:
[here](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4166#018c368c-2434-4a2a-aaab-ae097ef26843)

If the first attempt at creating and enrolling the Fleet server fails,
we retry.

Successful job:
[here](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4285#018c462c-487d-4a1c-8a37-71ca62915878)
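
A bounded retry of this kind can be sketched as follows. `withRetry` is a hypothetical helper written for illustration, not the function used in the PR; the default of two attempts is an assumption that mirrors "first attempt fails, we retry".

```typescript
// Hypothetical helper: run `fn` up to `attempts` times and rethrow the last error.
async function withRetry<T>(
  fn: (attempt: number) => Promise<T>,
  attempts = 2,
  onRetry: (error: unknown, attempt: number) => void = () => {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      // Only report a retry when another attempt will actually follow.
      if (attempt < attempts) onRetry(error, attempt);
    }
  }
  throw lastError;
}
```

With two attempts, the first error is swallowed and the second one propagates to the caller, so a setup hook that wraps its work this way fails only when both attempts fail.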


### 3. Package policy creation fails

Test fails on: `CypressError: cy.task('indexFleetEndpointPolicy') failed
with the following error: Request failed with status code 500`

**Couldn't recreate in CI.**
Failed job from before:
[here](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4204#018c3f17-3a3a-43dd-a31b-6bc5109d4193)

Package installation fails with a `no_shard_available_action_exception`
error. We retry the API call.
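
One way to retry only this kind of failure is sketched below. Everything here is an assumption made for illustration: `retryOnTransient` and `isTransient` are hypothetical names, the attempt count and delay are arbitrary, and matching on the error message is a guess at how the exception might be detected, since the failure above surfaces as a status code 500.

```typescript
// Hypothetical predicate: treats an error as transient when its message names
// the shard-availability exception quoted above.
const isTransient = (error: unknown): boolean =>
  error instanceof Error && error.message.includes("no_shard_available_action_exception");

// Retry `call` while the error looks transient, with a linearly growing delay.
async function retryOnTransient<T>(
  call: () => Promise<T>,
  { attempts = 3, delayMs = 100, shouldRetry = isTransient } = {}
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      // Give up on the last attempt, or immediately for unrelated errors.
      if (attempt >= attempts || !shouldRetry(error)) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}
```

Restricting the retry to a known transient error keeps genuine failures visible on the first attempt.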



closes #170482 (agent creation)
closes #172920 (agent creation)
closes #172319 (agent creation)
closes #172326 (package policy)

1000 test runs on a single test file (the issues were occurring in setup
tasks, not in the test cases themselves):
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4496
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4497
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4498
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4499
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4500
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4501
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4502
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4503
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4504
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4505
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Dec 21, 2023
…c#172463)

(cherry picked from commit ff0351e)
kibanamachine added a commit that referenced this issue Dec 21, 2023
…172463) (#173848)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[EDR Workflows][Serverless] E2E Endpoint creation fine tuning
(#172463)](#172463)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Konrad Szwarc
6 participants