
SQS client async requests leaking HttpWebRequest objects #1808

Closed
mr-giles opened this issue Mar 3, 2021 · 5 comments
Labels: bug, closed-for-staleness, module/sdk-core, needs-reproduction, queued

Comments

@mr-giles

mr-giles commented Mar 3, 2021

Description

I'm using the AmazonSQSClient to send and receive (via long-polling) messages. My (simple) app sends a message once a minute to a queue, and sits long-polling another queue for incoming messages. I am seeing memory usage increase slowly while running, from c.20MB on app startup to c.160MB after 48 hours of operation. All my app is doing is this SQS message queue interaction.

Reproduction Steps

Instantiate the client:

// Use the default cred store against the Windows user
client = new AmazonSQSClient();

Code executed on a timer to send a message:

// Create a new time-limited CancellationTokenSource hooked off our main 'app exit' CancellationToken
var cts = CancellationTokenSource.CreateLinkedTokenSource(ctsTx.Token);
cts.CancelAfter(5000);

// Serialise the message object to JSON text
var messageBody = message.ToPayload();

try
{
    SendMessageResponse responseSendMsg = await client.SendMessageAsync(SqsQueueUrl, messageBody, cts.Token).ConfigureAwait(false);

    log.Info("Send successful! MsgID: {0}, HttpStatusCode: {1}", responseSendMsg.MessageId, responseSendMsg.HttpStatusCode);
}
catch (TaskCanceledException)
{
    log.Info("Sending task canceled");
}
catch (Exception e)
{
    log.Error("Failure to send: {0}", e.Message);
}
finally
{
    cts.Dispose();
}

Method to receive messages from a queue. This is actually called by a timer rather than in a loop so I can back off after errors.

var request = new ReceiveMessageRequest
{
	QueueUrl = SqsQueueUrl,
	WaitTimeSeconds = 20, // Long polling
	MaxNumberOfMessages = 10
};

var cts = CancellationTokenSource.CreateLinkedTokenSource(ctsTx.Token);
ReceiveMessageResponse response = null; // initialise so the later null check compiles

try
{
	response = await client.ReceiveMessageAsync(request, cts.Token).ConfigureAwait(false);
	log.Debug("Received {0} message(s)", response.Messages.Count);
}
catch (Exception e)
{
	log.Error("Failure to receive: {0}", e.Message);
}
finally
{
	cts.Dispose();
}

if (response != null && response.Messages.Count > 0)
{
	// Delete message from queue using client.DeleteMessageBatchAsync
	// (Code omitted here for brevity)
}
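For completeness, the omitted delete step might look something like the sketch below. It uses the SDK's real DeleteMessageBatchRequest/DeleteMessageBatchRequestEntry types and assumes `using System.Linq;` and `using Amazon.SQS.Model;`, plus the same `client`, `SqsQueueUrl` and `log` fields as above — illustrative only, not my exact code:

```csharp
// Build one batch-delete entry per received message.
var deleteRequest = new DeleteMessageBatchRequest
{
    QueueUrl = SqsQueueUrl,
    Entries = response.Messages
        .Select(m => new DeleteMessageBatchRequestEntry(m.MessageId, m.ReceiptHandle))
        .ToList()
};

var deleteResponse = await client.DeleteMessageBatchAsync(deleteRequest).ConfigureAwait(false);

// A batch delete can partially fail; check the Failed collection.
if (deleteResponse.Failed.Count > 0)
{
    log.Error("Failed to delete {0} message(s)", deleteResponse.Failed.Count);
}
```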

Logs

Running in debug (in VS 2019 on Win 10) does not show increasing memory to the same level so this is hard to troubleshoot.

Running Release builds (Server 2012 R2) shows increasing memory. I dumped the memory to PerfView after running my app for 48 hours. During this time my app has been doing nothing but the SQS client calls, effectively it is idling, and it has received no messages from the SQS queue. The dump shows:

[Screenshot: PerfView heap dump summary (sqs_github_issue_0)]

All those HttpWebRequests seem to be referenced from within the AWS SDK:

[Screenshot: HttpWebRequest reference chains leading into the AWS SDK (sqs_github_issue_1)]

Environment

  • Package Version: AWSSDK.Core: v3.5.3.2, AWSSDK.SQS v3.5.1.21
  • OS Info: Windows Server 2012 R2
  • Build Environment: VS 2019 Pro on Windows 10
  • Targeted .NET Platform: v4.8 (Windows/Desktop)

Resolution

I am far from an expert in analyzing memory dumps. To me, this looks like the SDK is not disposing of HttpWebRequest objects after they're finished with. That would explain the increasing memory. But I could be reading the dump incorrectly.

I can't see any .Dispose() method on the ReceiveMessageRequest, ReceiveMessageResponse or SendMessageResponse classes, so I don't think there's anything I can do to clean up after a message request returns.
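As far as I can tell, the only IDisposable in the picture is the client itself (AmazonSQSClient ultimately derives from AmazonServiceClient, which implements IDisposable), so the only cleanup available to the caller is disposing the client at shutdown. A minimal sketch:

```csharp
// The client is designed to be long-lived and reused; dispose it once at
// app shutdown, not per request.
using (var sqsClient = new AmazonSQSClient())
{
    // ... run the send/receive timers against sqsClient for the app's lifetime ...
}   // Dispose() releases the client's underlying HTTP resources here
```

That doesn't help with per-request HttpWebRequest objects accumulating while the app runs, which is what the dump appears to show.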


This is a 🐛 bug-report

@mr-giles mr-giles added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 3, 2021
@mr-giles mr-giles changed the title SQS client requests leaking HttpWebRequest objects SQS client async requests leaking HttpWebRequest objects Mar 3, 2021
@ashishdhingra ashishdhingra added module/sdk-core needs-reproduction This issue needs reproduction. and removed needs-triage This issue or PR still needs to be triaged. labels Mar 3, 2021
@ashishdhingra
Contributor

Might be related to #1602

@bill-poole

@ashishdhingra, it might be a contributing factor but if so, a very minor one. Don’t get me wrong, if this leak exists, it is vital that it is plugged. If anything, it’s further evidence that the .NET SQS client is dodgy and needs re-engineering.

The performance issues identified in #1602 are a result of the underlying, fundamental design of the .NET SQS client, which makes heavy use of string concatenation and is therefore, by its nature, extremely heap-allocation heavy.

i.e., the performance issues identified in #1602 are (from my analysis) primarily a result of a ludicrous amount of heap allocation and memory copying.

You could at least consider using a StringBuilder and then pooling the StringBuilder instances. That would make a dramatic improvement in performance. But that still wouldn’t be the right answer.
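A pooled-StringBuilder approach might look like the following sketch, using the Microsoft.Extensions.ObjectPool package. The `BuildQueryString` helper and its wiring are purely illustrative (not the SDK's actual code); the pool API itself is real:

```csharp
using System.Text;
using Microsoft.Extensions.ObjectPool;

static class MarshallerScratch
{
    // CreateStringBuilderPool() returns a pool whose policy clears each
    // StringBuilder on return and discards ones that grow too large.
    private static readonly ObjectPool<StringBuilder> Pool =
        new DefaultObjectPoolProvider().CreateStringBuilderPool();

    // Hypothetical stand-in for the kind of string assembly a request
    // marshaller does: join key/value pairs without per-step concatenation.
    public static string BuildQueryString(params (string Key, string Value)[] pairs)
    {
        var sb = Pool.Get();
        try
        {
            foreach (var (key, value) in pairs)
            {
                if (sb.Length > 0) sb.Append('&');
                sb.Append(key).Append('=').Append(value);
            }
            return sb.ToString();
        }
        finally
        {
            Pool.Return(sb); // resets and reuses the buffer instead of allocating anew
        }
    }
}
```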

I recommend you look into the high-performance messaging transport work done with System.IO.Pipelines and its application in ASP.NET Core, as well as with the .NET gRPC transport developed by James Newton-King. I’d also suggest reaching out to David Fowler (Microsoft) to take advantage of the work he’s been doing with “project bedrock”, for which new APIs are apparently slated for .NET 6 in November.

@bill-poole

David Fowler has a blog post explaining the work they did with System.IO.Pipelines here.

@bill-poole

bill-poole commented Mar 27, 2021

By the way @ashishdhingra, I’m thrilled that #1602 is still at least visible to AWS, as evidenced by your referencing it above. It would be very much appreciated if someone from AWS could respond to my query for a status update in issue #1602.

@hunanniu hunanniu added the queued label Apr 5, 2021
@github-actions

github-actions bot commented Apr 6, 2022

We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Apr 6, 2022
@github-actions github-actions bot closed this as completed Apr 8, 2022