Merge branch 'improve-assembly-hooks'
Sewer56 committed Dec 22, 2023
2 parents 94f9048 + 1e2aa75 commit 8b07ec3
Showing 12 changed files with 1,033 additions and 433 deletions.
92 changes: 21 additions & 71 deletions docs/dev/design/assembly-hooks/overview.md
@@ -119,13 +119,13 @@ The following table below shows common hook lengths, for:
- [Targeted Memory Allocation (TMA)](../../platform/overview.md#recommended-targeted-memory-allocation) (expected best case) when above `Relative Jump` range.
- Worst case scenario.

| Architecture | Relative | TMA | Worst Case |
|----------------|---------------------|--------------|-----------------|
| x86^[1]^ | 5 bytes (+- 2GiB) | 5 bytes | 5 bytes |
| x86_64 | 5 bytes (+- 2GiB) | 6 bytes^[2]^ | 13 bytes^[3]^ |
| x86_64 (macOS) | 5 bytes (+- 2GiB) | 13 bytes^[4]^| 13 bytes^[3]^ |
| ARM64 | 4 bytes (+- 128MiB) | 12 bytes^[6]^| 20 bytes^[5]^ |
| ARM64 (macOS) | 4 bytes (+- 128MiB) | 12 bytes^[6]^| 20 bytes^[5]^ |
| Architecture | Relative | TMA | Worst Case |
| -------------- | ------------------- | ------------- | ------------- |
| x86^[1]^ | 5 bytes (+- 2GiB) | 5 bytes | 5 bytes |
| x86_64 | 5 bytes (+- 2GiB) | 6 bytes^[2]^ | 13 bytes^[3]^ |
| x86_64 (macOS) | 5 bytes (+- 2GiB) | 13 bytes^[4]^ | 13 bytes^[3]^ |
| ARM64 | 4 bytes (+- 128MiB) | 12 bytes^[6]^ | 20 bytes^[5]^ |
| ARM64 (macOS) | 4 bytes (+- 128MiB) | 12 bytes^[6]^ | 20 bytes^[5]^ |

^[1]^: x86 can reach any address from any address with relative branch due to integer overflow/wraparound.
^[2]^: [`jmp [<Address>]`, with &lt;Address&gt; at &lt; 2GiB](../../arch/operations.md#jumpabsoluteindirect).
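
The `Relative` column and footnote 1 can be illustrated with a small sketch of how the branch offsets are computed. This is only an illustration under stated assumptions: the function names are hypothetical and not part of the library, the x86 `jmp rel32` offset is taken relative to the end of the 5-byte instruction, and the ARM64 `b` offset is taken relative to the branch instruction itself.

```rust
/// Length in bytes of an x86 `jmp rel32` (1 opcode byte + 4 offset bytes).
const JMP_REL32_LEN: u64 = 5;

/// x86 (32-bit): the offset wraps around the 4GiB address space,
/// so every target is reachable from every source address.
/// (Hypothetical helper, for illustration only.)
fn x86_rel32(from: u32, to: u32) -> i32 {
    let next_ip = from.wrapping_add(JMP_REL32_LEN as u32);
    to.wrapping_sub(next_ip) as i32
}

/// x86_64: the same encoding only works if the distance fits in a signed
/// 32-bit offset (roughly +-2GiB); otherwise a longer hook is needed.
fn x86_64_rel32(from: u64, to: u64) -> Option<i32> {
    let next_ip = from.wrapping_add(JMP_REL32_LEN);
    let delta = to.wrapping_sub(next_ip) as i64;
    if delta >= i32::MIN as i64 && delta <= i32::MAX as i64 {
        Some(delta as i32)
    } else {
        None
    }
}

/// ARM64 `b`: signed 26-bit word offset, i.e. a 4-byte aligned byte offset
/// in the range -128MiB..+128MiB (upper bound exclusive).
fn arm64_b_offset(from: u64, to: u64) -> Option<i64> {
    const RANGE: i64 = 128 * 1024 * 1024;
    let delta = to.wrapping_sub(from) as i64;
    if delta % 4 == 0 && (-RANGE..RANGE).contains(&delta) {
        Some(delta)
    } else {
        None
    }
}
```

When `x86_64_rel32` or `arm64_b_offset` returns `None`, the target is outside `Relative Jump` range and one of the longer encodings from the `TMA` or `Worst Case` columns applies.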
@@ -136,6 +136,8 @@ The following table below shows common hook lengths, for:

## Thread Safety & Memory Layout

!!! info "Extra: [Thread Safety on x86](../common.md#thread-safety-on-x86)"

To support thread safety while retaining maximum runtime performance, the buffers that contain the
original and hook code have a very specific memory layout (shown below).

@@ -187,69 +189,7 @@ hook: ; Backup (Hook)

!!! info "When transitioning between Enabled/Disabled state, we place a temporary branch at `entry`; this allows us to manipulate the remaining code safely."

```asm
entry: ; Currently Applied (Hook)
b original ; Temp branch to original
mov x0, x2
b back_to_code
original: ; Backup (Original)
mov x0, x1
add x0, x2
b back_to_code
hook: ; Backup (Hook)
add x1, x1
mov x0, x2
b back_to_code
```

!!! note "Don't forget to clear instruction cache on non-x86 architectures which need it."

This ensures we can safely overwrite the remaining code...

Then we overwrite `entry` code with `original` code, except the branch:

```asm
entry: ; Currently Applied (Hook)
b original ; Branch to original
add x0, x2 ; overwritten with 'original' code.
b back_to_code ; overwritten with 'original' code.
original: ; Backup (Original)
mov x0, x1
add x0, x2
b back_to_code
hook: ; Backup (Hook)
add x1, x1
mov x0, x2
b back_to_code
```

And lastly, overwrite the branch.

To do this, read the original `sizeof(nint)` bytes at `entry`, replace branch bytes with original bytes
and do an atomic write. This way, the remaining instruction is safely replaced.

```asm
entry: ; Currently Applied (Hook)
mov x0, x1 ; 'original' code.
add x0, x2 ; 'original' code.
b back_to_code ; 'original' code.
original: ; Backup (Original)
mov x0, x1
add x0, x2
b back_to_code
hook: ; Backup (Hook)
add x1, x1
mov x0, x2
b back_to_code
```
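
The final step (read one pointer-sized word at `entry`, splice the new bytes over the branch, write the word back atomically) can be sketched as follows. This is a minimal illustration, not the library's code: `patch_first_bytes` is a hypothetical name, `sizeof(nint)` is modelled as `usize`, and the word is an ordinary `AtomicUsize` rather than executable memory, so page permissions, alignment of the real `entry` address and instruction cache flushing are not covered.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Replaces the first `new_bytes.len()` bytes of the pointer-sized word
/// at `entry`, keeping the trailing bytes, and publishes the result with
/// a single atomic store. (Hypothetical helper, for illustration only.)
fn patch_first_bytes(entry: &AtomicUsize, new_bytes: &[u8]) {
    // Read the current word so bytes after the branch are preserved.
    let mut word = entry.load(Ordering::Acquire).to_ne_bytes();
    assert!(new_bytes.len() <= word.len(), "patch must fit in one word");

    // Splice the replacement bytes over the temporary branch.
    word[..new_bytes.len()].copy_from_slice(new_bytes);

    // One atomic write: readers see either the old or the new word,
    // never a partially written mix.
    entry.store(usize::from_ne_bytes(word), Ordering::Release);
}
```

Because the whole word is written at once, a thread executing `entry` concurrently observes either the temporary branch or the fully restored instruction.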

This way we achieve zero overhead CPU-wise, at expense of some memory.
!!! info "Read [Thread Safe Enable/Disable of Hooks](../common.md#thread-safe-enabledisable-of-hooks) for more info."

## Legacy Compatibility Considerations

@@ -266,4 +206,14 @@ This means a few functionalities must be supported here:

- Supporting Assembly via FASM.
- As this is only possible in Windows (FASM can't be recompiled on other OSes as library), this feature will be getting dropped.
- The `Reloaded.Hooks` wrapper will continue to ship FASM for backwards compatibility, however mods are expected to migrate to the new library in the future.

## Limits

Assembly hook info is packed by default to save memory. With the default packing, the following limits apply:

| Property | 4 Byte Instruction (e.g. ARM) | x86 | Unknown |
| -------------------- | ----------------------------- | ------ | ------- |
| Max Branch Length | 4 | 5 | 8 |
| Max Orig Code Length | 16KiB | 4KiB | 128MiB |
| Max Hook Code Length | 2MiB | 128KiB | 1GiB |
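
As a rough sketch of how these limits could be checked before creating a hook, the table can be expressed as data. The type and function names here are hypothetical, the branch length is assumed to be in bytes, and the limits are treated as inclusive upper bounds, which the table does not state explicitly.

```rust
const KIB: u64 = 1024;
const MIB: u64 = 1024 * KIB;
const GIB: u64 = 1024 * MIB;

/// Columns of the limits table. (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum Packing {
    FourByteInstruction, // e.g. ARM
    X86,
    Unknown,
}

#[derive(Debug, PartialEq)]
struct Limits {
    max_branch_len: u64,
    max_orig_code_len: u64,
    max_hook_code_len: u64,
}

/// Returns the default limits from the table for a given packing.
fn limits_for(packing: Packing) -> Limits {
    let (branch, orig, hook) = match packing {
        Packing::FourByteInstruction => (4, 16 * KIB, 2 * MIB),
        Packing::X86 => (5, 4 * KIB, 128 * KIB),
        Packing::Unknown => (8, 128 * MIB, GIB),
    };
    Limits {
        max_branch_len: branch,
        max_orig_code_len: orig,
        max_hook_code_len: hook,
    }
}

/// Assumption: limits are inclusive upper bounds, all lengths in bytes.
fn fits(packing: Packing, branch_len: u64, orig_len: u64, hook_len: u64) -> bool {
    let l = limits_for(packing);
    branch_len <= l.max_branch_len
        && orig_len <= l.max_orig_code_len
        && hook_len <= l.max_hook_code_len
}
```

For example, 8KiB of original code exceeds the x86 limit but fits under the 4-byte-instruction and unknown-architecture limits.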
129 changes: 94 additions & 35 deletions docs/dev/design/branch-hooks/overview.md
@@ -45,7 +45,7 @@ Notably it differs in the following ways:
```mermaid
flowchart TD
CF[Caller Function]
RW[ReverseWrapper]
RW[Stub]
HK["&lt;Your Function&gt;"]
OM[Original Method]
@@ -65,7 +65,7 @@ flowchart TD
HK["&lt;Your Function&gt;"]
OM[Original Method]
CF -- "call &lt;<Your Function>&gt; instead of original" --> HK
CF -- "call 'Your Function' instead of original" --> HK
HK -. "Calls &lt;Optionally&gt;" .-> OM
OM -. "Returns" .-> HK
```
@@ -74,6 +74,24 @@ This option allows for a small performance improvement, saving 1 instruction and

This is on by default (can be disabled), and takes effect when no conversion between calling conventions is needed and the target is within 'Relative Jump' range for your CPU architecture.
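
The conditions above can be summarised as a small decision function. This is only a sketch of the rule as described in this section; `HookPath` and `select_path` are hypothetical names, not the library's API.

```rust
/// How the caller reaches the user's function. (Hypothetical type.)
#[derive(Debug, PartialEq)]
enum HookPath {
    /// Caller's `call` is patched to point directly at the user function.
    Fast,
    /// Caller calls a stub that branches to the user function.
    Stub,
    /// Caller calls a stub containing a ReverseWrapper (and a Wrapper).
    StubWithConversion,
}

fn select_path(
    fast_mode_enabled: bool,        // on by default, can be disabled
    needs_conversion: bool,         // calling conventions differ
    target_in_relative_range: bool, // within 'Relative Jump' range
) -> HookPath {
    if needs_conversion {
        HookPath::StubWithConversion
    } else if fast_mode_enabled && target_in_relative_range {
        HookPath::Fast
    } else {
        HookPath::Stub
    }
}
```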

### When Activated (with Calling Convention Conversion)

```mermaid
flowchart TD
CF[Caller Function]
RW[ReverseWrapper]
HK["&lt;Your Function&gt;"]
W[Wrapper]
OM[Original Method]
CF -- "call wrapper" --> RW
RW -- jump to your code --> HK
HK -. "Calls &lt;Optionally&gt;" .-> W
W -- "call original (wrapped)" --> OM
OM -. "Returns" .-> W
W -. "Returns" .-> HK
```

### When Deactivated

```mermaid
@@ -92,39 +110,53 @@ When the hook is deactivated, the stub is replaced with a direct jump back to th
By bypassing your code entirely, it is safe for your dynamic library (`.dll`/`.so`/`.dylib`)
to unload from the process.

## Example
## Thread Safety & Memory Layout

!!! info "Extra: [Thread Safety on x86](../common.md#thread-safety-on-x86)"

Emplacing the jump to the stub and patching within the stub are atomic operations on all supported platforms.

The 'branch hook' stub uses the following memory layout:

```text
- ( [ReverseWrapper] OR [Branch to Custom Function] ) OR [Branch to Original Function]
- Branch to Original Function
- [Wrapper] (If Calling Convention Conversion is needed)
```

!!! tip "The library is optimised to not use redundant memory"

### Before
For example, in x86 (32-bit), a `jmp` instruction can reach any address from any address. In that situation,
provided a `ReverseWrapper` is not needed, we skip writing `Branch to Original Function` to the buffer,
as it is unnecessary.
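
To make the layout rules concrete, the following sketch lists which parts would be written to the stub buffer for an enabled hook. It follows the order of the layout list above; `StubPart` and `stub_layout` are hypothetical names, and the rule for omitting `Branch to Original Function` is taken only from the x86 example in this section.

```rust
/// Components that may appear in a branch hook stub. (Hypothetical type.)
#[derive(Debug, PartialEq)]
enum StubPart {
    ReverseWrapper,
    BranchToCustom,
    BranchToOriginal,
    Wrapper,
}

/// Returns the parts written to the stub buffer, in layout order.
fn stub_layout(needs_conversion: bool, jmp_reaches_any_address: bool) -> Vec<StubPart> {
    let mut parts = Vec::new();

    // First slot: ReverseWrapper when converting, else a plain branch.
    parts.push(if needs_conversion {
        StubPart::ReverseWrapper
    } else {
        StubPart::BranchToCustom
    });

    // Second slot: skipped when a single `jmp` can reach any address
    // (e.g. 32-bit x86) and no ReverseWrapper is present.
    if needs_conversion || !jmp_reaches_any_address {
        parts.push(StubPart::BranchToOriginal);
    }

    // Third slot: only needed for calling convention conversion.
    if needs_conversion {
        parts.push(StubPart::Wrapper);
    }

    parts
}
```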

### Examples

!!! info "Using x86 Assembly."

#### Before

```asm
; x86 Assembly
originalCaller:
; Some code...
call originalFunction
; More code...
originalFunction:
; Function implementation...
```

### After (Fast Mode)
#### After (Fast Mode)

```asm
; x86 Assembly
originalCaller:
; Some code...
call newFunction
call userFunction ; To user method
; More code...
newFunction:
userFunction:
; New function implementation...
call originalFunction ; Optional.
originalFunction:
; Original function implementation...
```

### After
#### After

```asm
; x86 Assembly
@@ -134,46 +166,73 @@ originalCaller:
; More code...
stub:
; == BranchToCustom ==
jmp newFunction
; nop padding to 8 bytes (if needed)
; == BranchToCustom ==
; == BranchToOriginal ==
jmp originalFunction
; == BranchToOriginal ==
newFunction:
; New function implementation...
call originalFunction ; Optional.
```

originalFunction:
; Original function implementation...
#### After (with Calling Convention Conversion)

```asm
; x86 Assembly
originalCaller:
; Some code...
call stub
; More code...
stub:
; == ReverseWrapper ==
; implementation..
call userFunction
; ..implementation
; == ReverseWrapper ==
; == Wrapper ==
; implementation ..
jmp originalFunction
; .. implementation
; == Wrapper ==
; == BranchToOriginal ==
jmp originalFunction ; Whenever disabled :wink:
; == BranchToOriginal ==
userFunction:
; New function implementation...
call wrapper ; (See Above)
```

### After (with Calling Convention Conversion)
#### After (Disabled)

```asm
; x86 Assembly
originalCaller:
; Some code...
call wrapper
call stub
; More code...
wrapper:
; call convention conversion implementation
call newFunction
; call convention conversion implementation
ret
stub:
<jmp to `jmp originalFunction`> ; We disable the hook by branching to instruction that branches to original
jmp originalFunction ; Whenever disabled :wink:
newFunction:
; New function implementation...
call reverseWrapper ; Optional.
reverseWrapper:
; call convention conversion implementation
call originalFunction
; call convention conversion implementation
ret
call originalFunction ; Optional.
originalFunction:
; Original function implementation...
```

## Thread Safety & Memory Layout
### Switching State

!!! info "When transitioning between Enabled/Disabled state, we place a temporary branch at `stub`; this allows us to manipulate the remaining code safely."

Emplacing the jump to the stub and patching within the stub are atomic operations on all supported platforms.
!!! info "Read [Thread Safe Enable/Disable of Hooks](../common.md#thread-safe-enabledisable-of-hooks) for more info."