Create rewriter for x86 strcpy chain #1272

Merged: 8 commits, Jul 26, 2023

@ptomin ptomin (Collaborator) commented Jul 24, 2023

Many x86 binaries contain strcpy(<dst>, <src>) compiled as a
scasb/movsd/movsb sequence:

    	mov	edi,<src>
    	mov	edx,<dst>
    	or	ecx,0FFh
    	xor	eax,eax
    	repne scasb
    	not	ecx
    	sub	edi,ecx
    	mov	esi,edi
    	mov	eax,ecx
    	mov	edi,edx
    	shr	ecx,2h
    	rep movsd
    	mov	ecx,eax
    	and	ecx,3h
    	rep movsb
    	ret
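
For readers less used to the string instructions, the sequence above can be modelled in C roughly as follows. This is only a sketch: the function name `strcpy_chain_model` is hypothetical, and it assumes the `or ecx,0FFh` immediate is sign-extended so that `ecx` starts with all bits set. The scan computes strlen(src) + 1, and the two `rep` copies then move that many bytes, first in 4-byte units and then the remaining 0 to 3 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the scasb/movsd/movsb sequence (illustrative only). */
char *strcpy_chain_model(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* or ecx,0FFh (assumed to mean -1) / xor eax,eax / repne scasb:
       ecx is decremented once per scanned byte, including the NUL. */
    uint32_t ecx = UINT32_MAX;
    const char *edi = src;
    do {
        ecx--;
    } while (*edi++ != '\0');

    uint32_t count = ~ecx;              /* not ecx: strlen(src) + 1 */
    const char *esi = edi - count;      /* sub edi,ecx / mov esi,edi: back to src */

    size_t wide = (size_t)(count >> 2) * 4;  /* shr ecx,2: bytes moved by rep movsd */
    size_t tail = count & 3;                 /* and ecx,3: bytes moved by rep movsb */

    memcpy(dst, esi, wide);                  /* rep movsd */
    memcpy(dst + wide, esi + wide, tail);    /* rep movsb */
    return dst;
}
```

Taken together, the two copies move exactly strlen(src) + 1 bytes from `src` to `dst`, which is why the pair can be treated as a single strcpy.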

@ptomin ptomin self-assigned this Jul 24, 2023
@ptomin ptomin added the enhancement This is a feature request label Jul 24, 2023
@ptomin ptomin requested a review from uxmal July 24, 2023 19:01
    var stms = block.Statements.ToArray();
    for (int i = 0; i < stms.Length - 1; i++)
    {
        if (TryRewriteStrcpy(subject, stms[i], stms[i + 1]))
Owner

What if there is another, unrelated statement between stms[i] and stms[i+1]? Would it be better to "chase" the definitions and uses in the SsaState to get more accurate results?

Collaborator Author

This code executes after expression coalescing, so there are normally no statements between the two memcpy calls. Reordering statements is also unsafe in the general case.

src/Arch/X86/Analysis/StrcpyChainRewiter.cs: 4 outdated review threads (resolved)
Add support for any 2^n alignment instead of using the magic number 4.
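
The arithmetic behind a 2^n generalization can be sketched as below. This is not the code from the pull request; the helper names `is_power_of_two`, `wide_bytes` and `tail_bytes` are hypothetical. The idea is that for any power-of-two unit size, the byte count splits into a part copied in wide units and a remainder copied byte by byte, and the two parts always add up to the original count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when a is 2^n for some n >= 0 (hypothetical helper). */
bool is_power_of_two(size_t a)
{
    return a != 0 && (a & (a - 1)) == 0;
}

/* Bytes handled by the wide copy: count rounded down to a multiple of align. */
size_t wide_bytes(size_t count, size_t align)
{
    assert(is_power_of_two(align));
    return count & ~(align - 1);
}

/* Bytes left for the byte-wise copy: count modulo align. */
size_t tail_bytes(size_t count, size_t align)
{
    assert(is_power_of_two(align));
    return count & (align - 1);
}
```

With a count of 14 and an alignment of 4, `wide_bytes` gives 12 and `tail_bytes` gives 2. The same split holds for 2, 8, 16 and so on, because the mask `align - 1` selects exactly the low bits that the shift discards.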
@uxmal uxmal (Owner) commented Jul 26, 2023

Looks good. Thanks for the contribution!

@uxmal uxmal merged commit bda7be6 into uxmal:master Jul 26, 2023
5 of 7 checks passed
@ptomin ptomin deleted the rewrite-strcpy-chain branch July 27, 2023 07:15
Labels
enhancement This is a feature request
2 participants