InCTF 2020 ArchRide Writeup • Super Pork Chop

This past month, I played InCTF, a mid-level CTF organized by India’s premier team, bi0s. I managed to solve one challenge called ArchRide - my first solve of a 1000-point challenge.

The file given is a bzip that contains an ELF. Loading the ELF into Ghidra, we see a simple main function that reads a 14-character key and runs it through two check function. If the checks pass, the binary will print “Surprise!” and create a new file called surprise.

The first check function looks like this in Ghidra:

ulong check_func1(byte *buf_15bytes)
{
  ulong uVar1;

  if ((int)(char)(buf_15bytes[2] ^ *buf_15bytes ^ buf_15bytes[4]) == DAT_00302020) {
    uVar1 = (ulong)DAT_00302024;
    if ((((((int)(char)(buf_15bytes[6] ^ buf_15bytes[2] ^ buf_15bytes[4]) == DAT_00302024) &&
          (uVar1 = (ulong)DAT_00302028,
          (int)(char)(buf_15bytes[8] ^ buf_15bytes[4] ^ buf_15bytes[6]) == DAT_00302028)) &&
         (uVar1 = (ulong)DAT_0030202c,
         (int)(char)(buf_15bytes[10] ^ buf_15bytes[6] ^ buf_15bytes[8]) == DAT_0030202c)) &&
        (((uVar1 = (ulong)DAT_00302030,
          (int)(char)(buf_15bytes[0xc] ^ buf_15bytes[8] ^ buf_15bytes[10]) == DAT_00302030 &&
          (uVar1 = (ulong)DAT_00302034,
          (int)(char)(buf_15bytes[1] ^ buf_15bytes[10] ^ buf_15bytes[0xc]) == DAT_00302034)) &&
         ((uVar1 = (ulong)DAT_00302038,
          (int)(char)(buf_15bytes[3] ^ buf_15bytes[0xc] ^ buf_15bytes[1]) == DAT_00302038 &&
          ((uVar1 = (ulong)DAT_0030203c,
           (int)(char)(buf_15bytes[5] ^ buf_15bytes[1] ^ buf_15bytes[3]) == DAT_0030203c &&
           (uVar1 = (ulong)DAT_00302040,
           (int)(char)(buf_15bytes[7] ^ buf_15bytes[3] ^ buf_15bytes[5]) == DAT_00302040)))))))) &&
       ((uVar1 = (ulong)DAT_00302044,
        (int)(char)(buf_15bytes[9] ^ buf_15bytes[5] ^ buf_15bytes[7]) == DAT_00302044 &&
        ((((uVar1 = (ulong)DAT_00302048,
           (int)(char)(buf_15bytes[0xb] ^ buf_15bytes[7] ^ buf_15bytes[9]) == DAT_00302048 &&
           (uVar1 = (ulong)DAT_0030204c,
           (int)(char)(buf_15bytes[0xd] ^ buf_15bytes[9] ^ buf_15bytes[0xb]) == DAT_0030204c)) &&
          (uVar1 = (ulong)DAT_00302050,
          (int)(char)(*buf_15bytes ^ buf_15bytes[0xb] ^ buf_15bytes[0xd]) == DAT_00302050)) &&
         (uVar1 = (ulong)DAT_00302054,
         (int)(char)(buf_15bytes[2] ^ buf_15bytes[0xd] ^ *buf_15bytes) == DAT_00302054)))))) {
      uVar1 = 1;
    }
  }
  else {
    uVar1 = 0;
  }
  return uVar1;
}}

Hmm, not too pretty. Seems like a job for a symbolic execution engine!

Because I didn’t feel like entering all the constraints, I tried angr first, but didn’t have much success. It took forever to run and was exploring the solution space extremely slowly. So after trying different things and getting more and more angry at angr, I ended up sticking all the constraints in a z3 script.

from z3 import *

flag = [BitVec('{}'.format(i), 8) for i in range(0xf)]
s = Solver()

check_value = [ 0x6e, 0x6e, 0x5b, 0x7e, 0x01,
                0x32, 0x47, 0x25, 0x27, 0x7f,
                0x67, 0x6d, 0x4e, 0x44 ]

s.add((flag[2] ^ flag[0] ^ flag[4]) == check_value[0])
s.add((flag[6] ^ flag[2] ^ flag[4]) == check_value[1])
s.add((flag[8] ^ flag[4] ^ flag[6]) == check_value[2])
s.add((flag[10] ^ flag[6] ^ flag[8]) == check_value[3])
s.add((flag[0xc] ^ flag[8] ^ flag[10]) == check_value[4])
s.add((flag[1] ^ flag[10] ^ flag[0xc]) == check_value[5])
s.add((flag[3] ^ flag[0xc] ^ flag[1]) == check_value[6])
s.add((flag[5] ^ flag[1] ^ flag[3]) == check_value[7])
s.add((flag[7] ^ flag[3] ^ flag[5]) == check_value[8])
s.add((flag[9] ^ flag[5] ^ flag[7]) == check_value[9])
s.add((flag[0xb] ^ flag[7] ^ flag[9]) == check_value[10])
s.add((flag[0xd] ^ flag[9] ^ flag[0xb]) == check_value[11])
s.add((flag[0] ^ flag[0xb] ^ flag[0xd]) == check_value[12])
s.add((flag[2] ^ flag[0xd] ^ flag[0]) == check_value[13])

if s.check() == sat:
    print("Found solution")
    model = s.model()
    flag_str = ''.join([chr(int(str(model[flag[i]]))) for i in range(len(model))])
    print(flag_str)
    s.add(flag[0] != s.model()[flag[0]])
else:
    print("No solution")

It immediately spits out a working key (we didn’t even need to add constraints for the second check). Entering the key results in this output. Uh-oh.

This time, unzipping the bzip results in a 32-bit x86 executable file. Not a big deal. We load it into Ghidra to see that the file is mostly the same, but the check function has different check values that needed to be fulfilled. After looking at it for a while, I realized that order of flag bytes xored together was the same as the first binary, so I didn’t need to change those. The only thing that changed was the check values.

After taking apart a couple more x86 surprise binaries, I eventually tried to find an automated way of dumping these check values, as there would likely be many more turtles on the way down. Looking at where the check values were stored in memory, I noticed this:

The check values are almost at the top of the .data section! Meaning that if we dump the .data section and do little cleaning, we can easily pull the check values into python. I ended up using readelf, although objdump was a close runner-up.

$ readelf -x .data surprise_bin2 | head | tail -n +5 | \ 
    cut -d' ' -f4,5,6,7
1e000000 7b000000 2d000000 37000000
06000000 04000000 39000000 14000000
72000000 53000000 29000000 43000000
3a000000 6a000000 00000000 00000000
66000000 3e000000 46000000 6c000000
51000000 4f000000 33000000

From there, it’s relatively easy to parse the values in python:

readelf_cmd = "readelf -x .data " + binname +" | head -n8 | tail -n +5"
    " | cut -d' ' -f4,5,6,7"
dump = subprocess.check_output(readelf_cmd, shell=True)
dump_nums = dump.split()
check_value = [int(dump_nums[i][:2], 16) for i in range(14)]

Excellent! But when we get to the 4th binary, there’s a somewhat unwelcome surprise:

Now we need a way to run a 64-bit ARM binary. Searching around online, I found a way to setup a aarch64 qemu docker image. Docker provides some official images.

After getting the qemu-arm-static/qemu-aarch64-static binary from https://github.com/multiarch/qemu-user-static/releases, it was just a matter of running:

docker run -v $(pwd)/qemu-aarch64-static:/usr/bin/qemu-aarch64-static -it arm64v8/debian:stretch-slim bash

I copied the binary into the docker container and ran it. Fortunately, the binary has the exact same checks as the x86 surprise binaries, so the z3 part of the script remained the same. The location of the check values shifted somewhat in memory, however, so after digging around a little in Ghidra, I noticed that there was a named xor1 symbol in the aarch64 binary that corresponded to the check values. Surely gdb would recognize it too? Indeed it does, and its output is much easier to parse than that of readelf.

$ gdb -n --batch surprise_bin17 -ex "x/14wx &xor1" | cut -d':' -f2
0x00000020      0x0000002a      0x00000033      0x00000045
0x0000004e      0x00000038      0x0000001e      0x00000001
0x0000006e      0x00000070      0x00000044      0x00000052
0x0000006b      0x00000059

This showcases a pair of my favorite arguments to gdb, --batch and -ex. The --batch option lets you run gdb in a “one-shot” fashion so it exits after running the command. This allows you to use gdb’s powerful capabilities without having to manually interact with its user interface. The -ex command allows you to specify a command to be executed, and it can be repeated to issue a series of commands. Calling it from python gives us the check values with minimal effort:

gdb_cmd = "gdb -n --batch " + binname + " -ex 'x/14wx &xor1'"
    " | cut -d':' -f2"
dump = subprocess.check_output(gdb_cmd, shell=True)
dump_nums = dump.split()
check_value = [int(dump_nums[i], 16) for i in range(14)]

After unraveling a couple more x86, x86_64, and aarch64 binaries, the 8th surprise binary shows something new:

Time for a new docker container:

docker run -v $(pwd)/qemu-arm-static:/usr/bin/qemu-arm-static -ti arm32v7/debian:stretch-slim bash

Other than the new docker container, no other changes are needed to crack 32-bit ARM binaries.

Everything goes swimmingly until we hit the 17th surprise binary.

The most popular 64-bit PowerPC docker image on DockerHub is ppc64le/debian. But upon trying to run the binary in this image, it fails. Why?

Little did I know that ppc64le doesn’t stand for something like “PowerPC64 limited edition”, but rather “PowerPC64 little endian". Oops. The file command said “MSB”, which is big endian. I almost gave up at this point, but after searching a bit more, I found a not-too-suspicious big endian docker image:

docker run -v $(pwd)/qemu-ppc64-static:/usr/bin/qemu-ppc64-static -it powerpc64/debian bash

It worked! I was worried there would be 32-bit PowerPC or MIPS binaries further down the line, but this was the last new architecture.

At this point, it was close to 1am, and the flower girl pretended to lock the bedroom door and go to sleep without me.

Up to now, I had been manually running the binaries in docker and copy-pasting the keys. I was at the 30th binary, and it was getting tiring. I needed to fully automate the process somehow. I’m a docker newbie, but I finally figured out how to send the key in an automated fashion. It looked like the following for aarch64:

os.system("docker cp " + binname + " " + ARM64_DOCKER_ID + ":/")
os.system("docker exec " + ARM64_DOCKER_ID + " bash -c 'echo " +
    flag_str + " | ./" + binname + "'")
os.system("docker cp " + ARM64_DOCKER_ID + ":/surprise .")
extract_and_rename_bin()

For easier debugging, my script would automatically solve one level of the binary per call, so I called it in a loop:

for i in {0..100}; do python surprise-meta-solver.py; done

Around the 127th binary or so, it failed, which means something interesting happened. Hey look, it’s the flag!

Later, I found that if I had read the decryption algorithm more carefully and reimplemented it, I could have solved the challenge without docker at all. But on the bright side, I learned a lot about qemu and docker, and ~600 points was not too bad! Full solution script located here.