From 132fb11d5b871da520918909ac073cba3bb289e0 Mon Sep 17 00:00:00 2001
From: natasha-moore-elastic
Date: Fri, 1 Mar 2024 16:25:03 +0000
Subject: [PATCH 1/4] =?UTF-8?q?Adds=20Agent=E2=80=93Endpoint=20troubleshoo?=
 =?UTF-8?q?ting=20doc?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../management/ts-management.asciidoc         | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/docs/troubleshooting/management/ts-management.asciidoc b/docs/troubleshooting/management/ts-management.asciidoc
index 46f2405c0a..d9851e1a11 100644
--- a/docs/troubleshooting/management/ts-management.asciidoc
+++ b/docs/troubleshooting/management/ts-management.asciidoc
@@ -78,3 +78,62 @@ image::images/transforms-start.png[Transforms page with Start option selected]
 . On the confirmation message that displays, click *Start* to restart the transform.
 . The transform’s status changes to `started`. If it doesn't change, refresh the page.
 ====
+
+[discrete]
+[[ts-agent-connection]]
+.{agent} and Endpoint connection issues
+[%collapsible]
+====
+
+After {agent} installs Endpoint, Endpoint connects to {agent} over a local relay connection to report its health status and receive policy updates and response action requests. If that connection cannot be established, the {elastic-defend} integration will cause {agent} to be in an `Unhealthy` status, and Endpoint won't operate properly.
+
+[float]
+==== Identify if the issue is happening
+
+You can identify if this issue is happening in the following ways:
+
+* Run {agent}'s status command:
+** `sudo /opt/Elastic/Agent/elastic-agent status` (Linux)
+** `sudo /Library/Elastic/Agent/elastic-agent status` (macOS)
+** `c:\Program Files\Elastic\Agent\elastic-agent.exe status` (Windows)
+
+If the status result for `endpoint-security` says that Endpoint has missed check-ins or `localhost:6788` cannot be bound to, it might indicate this problem is occurring.
+
+* If the problem starts happening right after installing Endpoint, check the value of `fleet.agent.id` in the following file:
+** `/opt/Elastic/Endpoint/elastic-endpoint.yaml` (Linux)
+** `/Library/Elastic/Endpoint/elastic-endpoint.yaml` (macOS)
+** `c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml` (Windows)
+
+If the value of `fleet.agent.id` is `00000000-0000-0000-0000-000000000000`, this indicates this problem is occurring.
+
+NOTE: If this problem starts happening after Endpoint has already been installed and working properly, then this value will have changed even though the problem is happening.
+
+[float]
+==== Examine Endpoint logs
+
+If you've confirmed that the issue is happening, you can look at Endpoint log messages to identify the cause:
+
+* `Failed to find connection to validate. Is {agent} listening on 127.0.0.1:6788?` or `Failed to validate connection. Is {agent} running as root/admin?` means that Endpoint is not able to create an initial connection to {agent} over port `6788`.
+
+* `Unable to make GRPC connection in deadline(60s). Fetching connection info again` means that Endpoint's original connection to {agent} over port `6788` worked, but the connection over port `6789` is failing.
+
+[float]
+==== Resolve the issue
+
+To debug and resolve the issue, follow these steps:
+
+. Since 8.7.0, Endpoint diagnostics contain a file named `analysis.txt` that contains information about what may cause this issue. As of 8.11.2, {agent} diagnostics automatically include Endpoint diagnostics. For previous versions, you can gather Endpoint diagnostics by running:
+** `sudo /opt/Elastic/Endpoint/elastic-endpoint diagnostics` (Linux)
+** `sudo /Library/Elastic/Endpoint/elastic-endpoint diagnostics` (macOS)
+** `c:\Program Files\Elastic\Endpoint\elastic-endpoint.exe diagnostics` (Windows)
+
+. Make sure nothing else on your device is listening on ports `6788` or `6789` by running:
+** `sudo netstat -anp --tcp` (Linux)
+** `sudo netstat -an -f inet` (macOS)
+** `netstat -an` (Windows)
+
+. Make sure `localhost` can be resolved to `127.0.0.1` by running:
+** `ping -4 -c 1 localhost` (Linux)
+** `ping -c 1 localhost` (macOS)
+** `ping -4 localhost` (Windows)
+====
\ No newline at end of file

From 01ede89ee051a3dc956064b4e976aa43323e863c Mon Sep 17 00:00:00 2001
From: natasha-moore-elastic
Date: Mon, 4 Mar 2024 15:45:39 +0000
Subject: [PATCH 2/4] Removes attributes from log message

---
 docs/troubleshooting/management/ts-management.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/troubleshooting/management/ts-management.asciidoc b/docs/troubleshooting/management/ts-management.asciidoc
index d9851e1a11..313c058a65 100644
--- a/docs/troubleshooting/management/ts-management.asciidoc
+++ b/docs/troubleshooting/management/ts-management.asciidoc
@@ -113,7 +113,7 @@ NOTE: If this problem starts happening after Endpoint has already been installed
 
 If you've confirmed that the issue is happening, you can look at Endpoint log messages to identify the cause:
 
-* `Failed to find connection to validate. Is {agent} listening on 127.0.0.1:6788?` or `Failed to validate connection. Is {agent} running as root/admin?` means that Endpoint is not able to create an initial connection to {agent} over port `6788`.
+* `Failed to find connection to validate. Is Agent listening on 127.0.0.1:6788?` or `Failed to validate connection. Is Agent running as root/admin?` means that Endpoint is not able to create an initial connection to {agent} over port `6788`.
 
 * `Unable to make GRPC connection in deadline(60s). Fetching connection info again` means that Endpoint's original connection to {agent} over port `6788` worked, but the connection over port `6789` is failing.
 
From f2c6f607b0c189353b3454f60e2be2121b188c3a Mon Sep 17 00:00:00 2001
From: natasha-moore-elastic <137783811+natasha-moore-elastic@users.noreply.github.com>
Date: Wed, 6 Mar 2024 09:50:26 +0000
Subject: [PATCH 3/4] Update docs/troubleshooting/management/ts-management.asciidoc

Co-authored-by: Joe Peeples
---
 docs/troubleshooting/management/ts-management.asciidoc | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/troubleshooting/management/ts-management.asciidoc b/docs/troubleshooting/management/ts-management.asciidoc
index 313c058a65..c09fe9214a 100644
--- a/docs/troubleshooting/management/ts-management.asciidoc
+++ b/docs/troubleshooting/management/ts-management.asciidoc
@@ -93,17 +93,23 @@ After {agent} installs Endpoint, Endpoint connects to {agent} over a local relay
 You can identify if this issue is happening in the following ways:
 
 * Run {agent}'s status command:
++
+--
 ** `sudo /opt/Elastic/Agent/elastic-agent status` (Linux)
 ** `sudo /Library/Elastic/Agent/elastic-agent status` (macOS)
 ** `c:\Program Files\Elastic\Agent\elastic-agent.exe status` (Windows)
-
+--
++
 If the status result for `endpoint-security` says that Endpoint has missed check-ins or `localhost:6788` cannot be bound to, it might indicate this problem is occurring.
 
 * If the problem starts happening right after installing Endpoint, check the value of `fleet.agent.id` in the following file:
++
+--
 ** `/opt/Elastic/Endpoint/elastic-endpoint.yaml` (Linux)
 ** `/Library/Elastic/Endpoint/elastic-endpoint.yaml` (macOS)
 ** `c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml` (Windows)
-
+--
++
 If the value of `fleet.agent.id` is `00000000-0000-0000-0000-000000000000`, this indicates this problem is occurring.
 
 NOTE: If this problem starts happening after Endpoint has already been installed and working properly, then this value will have changed even though the problem is happening.

From 4626cd0a635e1ba8ff7f85143578e4dd90240b36 Mon Sep 17 00:00:00 2001
From: natasha-moore-elastic
Date: Wed, 6 Mar 2024 14:53:07 +0000
Subject: [PATCH 4/4] Indents note

---
 docs/troubleshooting/management/ts-management.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/troubleshooting/management/ts-management.asciidoc b/docs/troubleshooting/management/ts-management.asciidoc
index c09fe9214a..7a3fd994c6 100644
--- a/docs/troubleshooting/management/ts-management.asciidoc
+++ b/docs/troubleshooting/management/ts-management.asciidoc
@@ -111,7 +111,7 @@ If the status result for `endpoint-security` says that Endpoint has missed check
 --
 +
 If the value of `fleet.agent.id` is `00000000-0000-0000-0000-000000000000`, this indicates this problem is occurring.
-
++
 NOTE: If this problem starts happening after Endpoint has already been installed and working properly, then this value will have changed even though the problem is happening.
 
 [float]
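
Note on the documented checks (not part of the patch): the three manual checks that this series adds to the troubleshooting page (the placeholder `fleet.agent.id`, listeners on ports `6788` and `6789`, and `localhost` resolving to `127.0.0.1`) could be sketched as a small script. The Python below is only an illustration under stated assumptions, not Elastic tooling. All function names are hypothetical, the config check assumes the value is stored on a plain `id: <uuid>` line and uses a naive text search rather than a YAML parser, and a successful connect only shows that something is listening on a port, not which process owns it.

```python
import re
import socket

# Placeholder value quoted in the troubleshooting text.
PLACEHOLDER_AGENT_ID = "00000000-0000-0000-0000-000000000000"
# Local ports named in the troubleshooting text.
RELAY_PORTS = (6788, 6789)


def has_placeholder_agent_id(config_text: str) -> bool:
    """Return True if an `id:` line holds the all-zero placeholder.

    Assumption: the value sits on its own `id: <uuid>` line. This is a
    naive text search and does not verify the `fleet.agent` nesting.
    """
    pattern = r"^\s*id:\s*['\"]?" + re.escape(PLACEHOLDER_AGENT_ID) + r"['\"]?\s*$"
    return re.search(pattern, config_text, flags=re.MULTILINE) is not None


def port_has_listener(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connect to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def localhost_resolves_to_loopback() -> bool:
    """Return True if `localhost` resolves to 127.0.0.1 over IPv4."""
    try:
        infos = socket.getaddrinfo("localhost", None, family=socket.AF_INET)
    except socket.gaierror:
        return False
    return any(info[4][0] == "127.0.0.1" for info in infos)


if __name__ == "__main__":
    for port in RELAY_PORTS:
        print(f"port {port} has a listener: {port_has_listener(port)}")
    print(f"localhost -> 127.0.0.1: {localhost_resolves_to_loopback()}")
```

To use the config check, read the `elastic-endpoint.yaml` file for your platform (paths as listed in the patch) and pass its text to `has_placeholder_agent_id`. A listener on either port is expected while {agent} is running, so the port result still needs the `netstat` output from the documented step to see which process holds the port.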