Confusing ATT opcode's suffix encoding #186

Open
sdasgup3 opened this issue Jan 19, 2019 · 4 comments

@sdasgup3

Hello Team,
I am confused by the ATT-syntax suffix that XED emits, as shown in the following example:

./xed -64 -A -d F3410F7E00
F3410F7E00
ICLASS: MOVQ   CATEGORY: DATAXFER   EXTENSION: SSE2  IFORM: MOVQ_XMMdq_MEMq_0F7E   ISA_SET: SSE2
SHORT: movqq  (%r8), %xmm0

Is the mnemonic movqq correct? It is not accepted by as, or even by xed itself.
To make sure, I assembled movq (%r8), %xmm0 using 'as' and ran ./xed -A -64 -i <assembled file>, and I get the same mnemonic, movqq.

Please help.

@markcharney
Contributor

markcharney commented Jan 20, 2019

  1. The encoder does not use ATT syntax; it uses its own syntax. The new asmparse syntax is closer to a real assembler syntax but is still a work in progress. XED’s asmparse uses Intel syntax.

  2. Personally, I would be delighted if the ATT SYSV syntax disappeared. I know that is not practical (Linux...), so don’t flame me; I can dream... Why? Because afaik there is no actual specification for that syntax variant. (This is where the internet finds the info pages for binutils/gas, or better, some old spec from the 1980s, and points me to them. Please don’t.)

  3. All that said, the size-based suffix-appending algorithm for ATT SYSV syntax in XED is pretty simple and apparently broken in this situation (see the comment above about not having a spec). I guess I will have to look into it...
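
To illustrate how a purely size-based suffix rule could produce the movqq seen above, here is a minimal Python sketch. It is not XED's source code: the function names, the width-to-suffix table, and the exception set are all hypothetical, and the guard shown is only one possible way to avoid the doubled suffix.

```python
# Hypothetical illustration only; this is NOT taken from XED's sources.
# ATT size suffixes keyed by memory operand width in bits.
SUFFIX_BY_WIDTH = {8: "b", 16: "w", 32: "l", 64: "q"}


def naive_att_mnemonic(iclass: str, mem_bits: int) -> str:
    """Append a size suffix based only on the memory operand width."""
    return iclass.lower() + SUFFIX_BY_WIDTH.get(mem_bits, "")


# Assumed exception list: mnemonics whose name already encodes the size.
ALREADY_SIZED = frozenset({"movq", "movd"})


def guarded_att_mnemonic(iclass: str, mem_bits: int) -> str:
    """Same rule, but skip mnemonics that already carry their size."""
    name = iclass.lower()
    if name in ALREADY_SIZED:
        return name
    return name + SUFFIX_BY_WIDTH.get(mem_bits, "")


if __name__ == "__main__":
    print(naive_att_mnemonic("MOVQ", 64))    # movqq, as in the report
    print(guarded_att_mnemonic("MOVQ", 64))  # movq
    print(guarded_att_mnemonic("ADD", 32))   # addl
```

The naive rule is right for something like ADD with a 32-bit memory operand (addl) but doubles the q for MOVQ, which matches the symptom reported here. Whether XED's actual routine fails for this reason is an assumption.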

@sdasgup3
Author

Thanks @markcharney

@sdasgup3
Author

@markcharney Can you help me answer the following questions? (Sorry if this is not the right thread for this.)

  1. How were ICLASSes in XED assigned?
    1.1) Is an ICLASS exclusive to each instruction variant (memory/register/immediate), and to a specific immediate width for immediate instructions?
  2. Are there any instances where two instructions with the same ICLASS have radically different behaviour?

@markcharney
Contributor

For the most part, iclasses are how most of us think about instructions. The rep/lock forms complicate that simple picture, as do aliased encodings. Did you mean iclass or iform? The XED iform incorporates operand information to try to further disambiguate an encoding; iforms are also defined based on the operand specifications.

If you look at the STTNI instructions, the same iclass can do somewhat different things depending on the bits in the imm8 operand. And clearly a short REP string op is very different from one that traverses gigabytes. Some x86 instructions (like CALL) are very complicated and can do radically different things. So I'm not really sure what you are asking.
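
The iclass/iform distinction can be seen in the xed output quoted at the top of this issue. The sketch below only parses that one printed line in Python; `parse_xed_fields` is a hypothetical helper, not part of XED, and the regular expression assumes the `KEY: VALUE` layout shown in that output.

```python
import re

# The line printed by ./xed in the original report.
XED_LINE = (
    "ICLASS: MOVQ   CATEGORY: DATAXFER   EXTENSION: SSE2  "
    "IFORM: MOVQ_XMMdq_MEMq_0F7E   ISA_SET: SSE2"
)


def parse_xed_fields(line: str) -> dict:
    """Split 'KEY: VALUE' pairs from one line of xed output (assumed layout)."""
    return dict(re.findall(r"(\w+):\s+(\S+)", line))


fields = parse_xed_fields(XED_LINE)
iclass = fields["ICLASS"]
iform = fields["IFORM"]

# Here the iform is the iclass name followed by operand and opcode details.
operand_part = iform[len(iclass) + 1:] if iform.startswith(iclass + "_") else iform

if __name__ == "__main__":
    print(iclass)        # MOVQ
    print(iform)         # MOVQ_XMMdq_MEMq_0F7E
    print(operand_part)  # XMMdq_MEMq_0F7E
```

In this one example the iform name extends the iclass name with the operand shapes (XMMdq, MEMq) and an opcode tag, so several iforms can share the MOVQ iclass.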
