[8.13] Adds Agent–Endpoint troubleshooting doc (backport #4870) #4895

Merged 1 commit on Mar 11, 2024
65 changes: 65 additions & 0 deletions docs/troubleshooting/management/ts-management.asciidoc
@@ -78,3 +78,68 @@ image::images/transforms-start.png[Transforms page with Start option selected]
. On the confirmation message that displays, click *Start* to restart the transform.
. The transform’s status changes to `started`. If it doesn't change, refresh the page.
====

[discrete]
[[ts-agent-connection]]
.{agent} and Endpoint connection issues
[%collapsible]
====

After {agent} installs Endpoint, Endpoint connects to {agent} over a local relay connection to report its health status and to receive policy updates and response action requests. If that connection can't be established, the {elastic-defend} integration causes {agent} to report an `Unhealthy` status, and Endpoint won't operate properly.

[float]
==== Identify if the issue is happening

You can identify if this issue is happening in the following ways:

* Run {agent}'s status command:
+
--
** `sudo /opt/Elastic/Agent/elastic-agent status` (Linux)
** `sudo /Library/Elastic/Agent/elastic-agent status` (macOS)
** `"c:\Program Files\Elastic\Agent\elastic-agent.exe" status` (Windows)
--
+
If the status output for `endpoint-security` reports that Endpoint has missed check-ins or that `localhost:6788` cannot be bound, this problem might be occurring.

* If the problem starts happening right after installing Endpoint, check the value of `fleet.agent.id` in the following file:
+
--
** `/opt/Elastic/Endpoint/elastic-endpoint.yaml` (Linux)
** `/Library/Elastic/Endpoint/elastic-endpoint.yaml` (macOS)
** `c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml` (Windows)
--
+
If the value of `fleet.agent.id` is `00000000-0000-0000-0000-000000000000`, the problem is occurring.
+
NOTE: If the problem starts after Endpoint was already installed and working properly, this value will already have changed from the all-zero ID, so a nonzero value doesn't rule out the problem.
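
The config-file check above can be scripted. The following is a minimal Python sketch, not a tool shipped with Endpoint or {agent}. It assumes that the all-zero ID appears literally in `elastic-endpoint.yaml` and that no other field uses that value, so it does a plain text search rather than parsing the YAML. The names `NIL_AGENT_ID`, `ENDPOINT_CONFIG_PATHS`, and `has_nil_agent_id` are hypothetical, and reading the file may require root or administrator privileges.

```python
import platform
from pathlib import Path

# The all-zero agent ID described above.
NIL_AGENT_ID = "00000000-0000-0000-0000-000000000000"

# Config file locations listed above, keyed by platform.system() values.
ENDPOINT_CONFIG_PATHS = {
    "Linux": Path("/opt/Elastic/Endpoint/elastic-endpoint.yaml"),
    "Darwin": Path("/Library/Elastic/Endpoint/elastic-endpoint.yaml"),
    "Windows": Path(r"c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml"),
}


def has_nil_agent_id(config_text):
    """Return True if the all-zero agent ID appears anywhere in the text.

    Assumption: a plain substring match is enough, because no other
    field in the file is expected to hold the all-zero ID.
    """
    return NIL_AGENT_ID in config_text


if __name__ == "__main__":
    path = ENDPOINT_CONFIG_PATHS.get(platform.system())
    if path is None:
        print("Unsupported platform")
    else:
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError as exc:
            print(f"Could not read {path}: {exc}")
        else:
            if has_nil_agent_id(text):
                print("fleet.agent.id is all zeros: the problem is occurring")
            else:
                print("fleet.agent.id is not all zeros: inconclusive")
```

A result of "inconclusive" matches the note above: a nonzero ID doesn't rule out the problem if Endpoint was working before.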

[float]
==== Examine Endpoint logs

If you've confirmed that the issue is happening, you can look at Endpoint log messages to identify the cause:

* `Failed to find connection to validate. Is Agent listening on 127.0.0.1:6788?` or `Failed to validate connection. Is Agent running as root/admin?` means that Endpoint is not able to create an initial connection to {agent} over port `6788`.

* `Unable to make GRPC connection in deadline(60s). Fetching connection info again` means that Endpoint's original connection to {agent} over port `6788` worked, but the connection over port `6789` is failing.
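
To scan a log file for these messages, a small script can match each line against the quoted text. This is a minimal Python sketch under stated assumptions: the substrings come from the messages quoted above, the explanations restate this section, and the names `LOG_SIGNATURES` and `diagnose_log_line` are hypothetical. The log file location is not covered here, so the script takes file paths as command-line arguments.

```python
import sys

# Substrings of the Endpoint log messages quoted above, mapped to the
# explanation this section gives for each of them.
LOG_SIGNATURES = {
    "Failed to find connection to validate":
        "Endpoint can't create an initial connection to Agent over port 6788",
    "Failed to validate connection":
        "Endpoint can't create an initial connection to Agent over port 6788",
    "Unable to make GRPC connection in deadline":
        "the connection over port 6788 worked, but port 6789 is failing",
}


def diagnose_log_line(line):
    """Return the explanation for a known message, or None if none matches."""
    for signature, meaning in LOG_SIGNATURES.items():
        if signature in line:
            return meaning
    return None


if __name__ == "__main__":
    # Usage: python diagnose_logs.py <log file> [<log file> ...]
    for log_path in sys.argv[1:]:
        try:
            with open(log_path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    meaning = diagnose_log_line(line)
                    if meaning is not None:
                        print(f"{log_path}:{number}: {meaning}")
        except OSError as exc:
            print(f"Could not read {log_path}: {exc}")
```

Matching on substrings rather than whole lines keeps the sketch independent of timestamps or other prefixes in the log format.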

[float]
==== Resolve the issue

To debug and resolve the issue, follow these steps:

. Since 8.7.0, Endpoint diagnostics include a file named `analysis.txt` that describes possible causes of this issue. As of 8.11.2, {agent} diagnostics automatically include Endpoint diagnostics. For earlier versions, you can gather Endpoint diagnostics by running:
** `sudo /opt/Elastic/Endpoint/elastic-endpoint diagnostics` (Linux)
** `sudo /Library/Elastic/Endpoint/elastic-endpoint diagnostics` (macOS)
** `"c:\Program Files\Elastic\Endpoint\elastic-endpoint.exe" diagnostics` (Windows)

. Make sure nothing else on your device is listening on ports `6788` or `6789` by running:
** `sudo netstat -anp --tcp` (Linux)
** `sudo netstat -an -f inet` (macOS)
** `netstat -an` (Windows)

. Make sure `localhost` can be resolved to `127.0.0.1` by running:
** `ping -4 -c 1 localhost` (Linux)
** `ping -c 1 localhost` (macOS)
** `ping -4 localhost` (Windows)
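
The port and `localhost` checks in the last two steps can also be run from one cross-platform script. The following is a rough Python sketch, not a replacement for `netstat` or `ping`: it only reports whether something accepts TCP connections on `127.0.0.1` for each port and cannot tell you which process owns the listener, so use the `netstat` output to confirm that it is {agent}. The function names `is_port_accepting` and `localhost_ipv4_addresses` are hypothetical.

```python
import socket

# Ports that Endpoint uses to reach Agent, as described above.
AGENT_PORTS = (6788, 6789)


def is_port_accepting(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def localhost_ipv4_addresses():
    """Return the set of IPv4 addresses that `localhost` resolves to."""
    try:
        infos = socket.getaddrinfo("localhost", None, family=socket.AF_INET)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return {info[4][0] for info in infos}


if __name__ == "__main__":
    for port in AGENT_PORTS:
        state = "accepting connections" if is_port_accepting(port) else "not accepting connections"
        print(f"127.0.0.1:{port} is {state}")

    addresses = localhost_ipv4_addresses()
    if "127.0.0.1" in addresses:
        print("localhost resolves to 127.0.0.1")
    else:
        print(f"localhost does not resolve to 127.0.0.1 (got: {sorted(addresses)})")
```

If a port accepts connections while {agent} is stopped, another process is probably using it.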
====