Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ECS metadata allowing cloudwatch-logs to be linked with traces #93

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 68 additions & 5 deletions lib/aws-xray-sdk/plugins/ecs.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,19 +3,82 @@

module XRay
module Plugins
# Due to lack of ECS container metadata service, the only host information
# available is the host name.
Comment on lines -6 to -7
Copy link
Contributor Author

@etiennechabert etiennechabert Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not anymore relevant

module ECS
include Logging

ORIGIN = 'AWS::ECS::Container'.freeze

# Only compatible with v4!
# The v3 metadata url does not contain cloudwatch informations
METADATA_ENV_KEY = 'ECS_CONTAINER_METADATA_URI_V4'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have a strong dependency with Fargate 1.4, otherwise this environment variable will not be defined:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint-v4-fargate.html

Sadly the necessary information are not exposed with the meta-information of the v3:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint-v3-fargate.html


def self.aws
@@aws ||= begin
{ ecs: { container: Socket.gethostname } }
metadata = get_metadata()

begin
metadata[:ecs][:container] = Socket.gethostname
rescue StandardError => e
@@aws = {}
Logging.logger.warn %(cannot get the ecs container hostname due to: #{e.message}.)
metadata[:ecs][:container] = nil
end

@@aws = {
ecs: metadata[:ecs],
cloudwatch_logs: metadata[:cloudwatch_logs]
}
end

private

def self.get_metadata()
begin
metadata_uri = URI(ENV[METADATA_ENV_KEY])
req = Net::HTTP::Get.new(metadata_uri)
metadata_json = do_request(req)
return parse_metadata(metadata_json)
rescue StandardError => e
Logging.logger.warn %(cannot get the ecs instance metadata due to: #{e.message}. Make sure you are using Fargate platform version >=1.4.0)
{ ecs: {}, cloudwatch_logs: {} }
end
end

def self.parse_metadata(json_str)
data = JSON(json_str)

metadata = {
ecs: {
container_arn: data['ContainerARN'],
},
cloudwatch_logs: {
log_group: data["LogOptions"]['awslogs-group'],
log_region: data["LogOptions"]['awslogs-region'],
arn: data['ContainerARN']
}
}
Comment on lines +49 to +57
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, much more could be added to the x-ray segment.

This particular information for example would be valuable to hunt regressions:

com.amazonaws.ecs.task-definition-version
Screenshot 2023-09-11 at 12 52 41

metadata
end

def self.do_request(request)
begin
response = Net::HTTP.start(request.uri.hostname, read_timeout: 1) { |http|
http.request(request)
}

if response.code == '200'
return response.body
else
raise(StandardError.new('Unsuccessful response::' + response.code + '::' + response.message))
end
rescue StandardError => e
# Two attempts in total to complete the request successfully
@retries ||= 0
if @retries < 1
@retries += 1
retry
else
Logging.logger.warn %(Failed to complete request due to: #{e.message}.)
raise e
end
end
end
end
Expand Down
51 changes: 51 additions & 0 deletions test/aws-xray-sdk/tc_plugin.rb
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ def test_get_runtime_context
WebMock.reset!
end

# EC2 Plugin
def test_ec2_metadata_v2_successful
dummy_json = '{\"availabilityZone\" : \"us-east-2a\", \"imageId\" : \"ami-03cca83dd001d4666\",
\"instanceId\" : \"i-07a181803de94c666\", \"instanceType\" : \"t3.xlarge\"}'
Expand All @@ -43,6 +44,7 @@ def test_ec2_metadata_v2_successful
ami_id: 'ami-03cca83dd001d4666'
}
}
# We should probably use `assert_equal` here ? Always true otherwise...
assert expected, XRay::Plugins::EC2.aws
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to my PR, but assert is always returning true here... regardless what is passed

We probably want to use assert_equal if we want to compare something expected with XRay::Plugins::EC2.aws

WebMock.reset!
end
Expand Down Expand Up @@ -80,4 +82,53 @@ def test_ec2_metadata_fail
assert expected, XRay::Plugins::EC2.aws
WebMock.reset!
end

# ECS Plugin
def test_ecs_metadata_successful
dummy_metadata_uri = 'http://169.254.170.2/v4/a_random_id'
dummy_json = {
"ContainerARN"=>"arn:aws:ecs:eu-central-1:an_id:container/a_cluster/a_cluster_id/a_task_id",
"LogOptions"=>{"awslogs-group"=>"/ecs/a_service_name", "awslogs-region"=>"eu-central-1", "awslogs-stream"=>"ecs/a_service_name/a_task_id"},
}

ENV[XRay::Plugins::ECS::METADATA_ENV_KEY] = dummy_metadata_uri
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a few things to mutate the ENV in a better way...

Could not do it without adding extra dependencies to the project

stub_request(:get, dummy_metadata_uri)
.to_return(status: 200, body: dummy_json.to_json, headers: {})

expected = {
ecs: {
container: Socket.gethostname,
container_arn: 'arn:aws:ecs:eu-central-1:an_id:container/a_cluster/a_cluster_id/a_task_id',
},
cloudwatch_logs: {:log_group=>"/ecs/a_service_name", :log_region=>"eu-central-1", :arn=>"arn:aws:ecs:eu-central-1:an_id:container/a_cluster/a_cluster_id/a_task_id"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the key change here, with this information present in the trace, the logs should be linked

}
assert_equal expected, XRay::Plugins::ECS.aws
WebMock.reset!
ENV.delete(XRay::Plugins::ECS::METADATA_ENV_KEY)
end

def test_ecs_metadata_fail
dummy_metadata_uri = 'http://169.254.170.2/v4/a_random_id'
ENV['ECS_CONTAINER_METADATA_URI_V4'] = dummy_metadata_uri

stub_request(:get, dummy_metadata_uri)
.to_raise(StandardError)

expected = {
ecs: {container: Socket.gethostname},
cloudwatch_logs: {}
}
assert_equal expected, XRay::Plugins::ECS.aws
WebMock.reset!
ENV.delete(XRay::Plugins::ECS::METADATA_ENV_KEY)
end

def test_ecs_metadata_not_defined
expected = {
ecs: {container: Socket.gethostname},
cloudwatch_logs: {}
}
assert_equal expected, XRay::Plugins::ECS.aws
WebMock.reset!
end
end