CSAW Embedded Security Challenges 2020 - Quals' writeup

Aswin

2020-12-20

In this blog post, we will look at reverse engineering and solving a RISC-V binary, which was the challenge for qualifying for the CSAW 2020 Embedded Security Challenge (ESC), using the reverse engineering tool Ghidra SRE​.

Ghidra SRE, out of the box, does not support analyzing binaries for the RISC-V architecture. In order to get around that, we will be setting up a special RISC-V Ghidra Processor Module. For that, we just have to clone the Module’s files into the directory named Processors of the Ghidra’s program files. Simple as that!

You can get the challenge file over here.


Initial Analysis

We start by searching the main function in the folder named Functions inside the Symbol Tree, which is the entry point in the binary. For better clarity and effectiveness in inspecting the disassembly, we can view the function in the form of a control flow graph. For that, pick Function Graphs from the Window menu. A new window will pop up with the control flow graph of the main function.

Firstly, we can see get_uart_input(), being called to allow the user to enter the challenge number, in a while loop, with a condition that the input is between 1 and 4.

Upon further inspection, we can see that get_uart_input() being called again after prompting the user to get the input, and subsequently, three functions: challenge_1, challenge_2, challenge_3, are called, after moving the input, which is in the offset -0x2c from the stack pointer register s0, of the stack frame. We can see that the input passed to each of these functions are the same, from which we can deduce that there’s only a single prompt for all of the three challenges.

Challenge_1

In challenge_1, the first thing that catches anyone’s eye is the three concurrent branchings, just after the function sets up the stack frame and variables. We can see one of the arguments which was passed to the function, presumably our input, being loaded into a register in each of these branches. Upon a closer look, we can also see the offset of the variable that subsequently gets loaded and compared: zero, one, and two. From this, we can conclude that the binary is comparing the first three characters of our input here.

Further down the list, we can see a while loop, also after loading our input, comparing each character of our input with each character of a string, if not equal prints “Incorrect”, and if equal, increments the variable used as the index.

Using the debugger, we can analyze this further. Firstly, we can start a GDB server using qemu.

$ qemu-system-riscv32 -s -S -nographic -machine sifive_e -kernel qual-esc2020.elf

Italian Trulli

Then, connect to it, using GDB multi-arch.

$ gdb-multiarch qual-esc2020.elf

Italian Trulli

Similarly, we can set breakpoints in each of the branching instructions and see what the binary is comparing out input to. We can see that the first three branching that we talked about above, checks for the letters ‘C’ ‘S’ and ‘E’, and then ERA*.+,), in the loop, afterward.

Entering CSEERA*.+,) as input, we solve challenge_1.

Challenge_2

From the main function, this time, the user is asked to input 16 characters, before calling the function challenge_2. Notably, in challenge_2, we can see a string being loaded, just after the stack gets initialized.

Upon closer analysis we can see that the string is ‘ezpzlemonsqeezy’ and it also shows up in the Defined Strings section.

Later, we can see a loop, similar to one we had on challenge_1.

On the very first instruction on the loop, we can see the specific string being loaded, and then the iterator is added, to access the character at the required offset of the string. Then, that character is being XOR-ed with 0x2a, and the result is being compared to our input. We can also see that the program branches off to exit the function if found the comparison was found unequal.

This can be easily scripted in Python, to find the flag.

s = 'ezpzlemonsqueezy'
print ''.join ([chr (ord(i)*0x2a) for i in s])

This brings us to OPZPFOGEDY[_OOPS, which can be used to solve this challenge.

It is also worth noting that the string can also be found by attaching the debugger to QEMU, like we did in challenge 1 and see what the branching instruction is comparing to, in each iteration.

Challenge_3

In challenge 3, we can see that the string ‘ezpzlemonsqueezy’ is loaded and passed into a function called arx. This function performs various operations on the string and produces a transformed string. The input for this challenge requires 16 characters and the reverse of the string produced by the arx() is compared with our input.

The working of arx() is as follows, There is a loop which iterates 10 times and in each iteration, the following actions are performed in the string:

And once the loop comes to an end, each character is replaced by its modulo with 0x3c added with 0x22

The code for arx() would be as follows:

void arx(char * st, int length) {
    for (int i = 0; i < 10; i++) {
        //Adding 0x11 to each character
        for (int j = 0; j < length; j++) {
            st[j] = st[j] + 0x11;
        }
        // Left rotation of the string
        char ch = st[0];
        for (int j = 0; j < length; j++) {
            st[j] = st[j + 1];
        }
        st[length - 1] = ch;
        //XOR each character with 0xd    
        for (int j = 0; j < length; j++) {
            st[j] = st[j] ^ 0xd;
        }
    }
    // Modulo and addition for each character
    for (int i = 0; i < length; i++) {
        st[i] = st[i] % 0x3c;
        st[i] = st[i] + 0x22;
    }
    return;
}

The required input can be obtained by making a simple python script implementing this logic:

def arx(s):
    l = [ord(i) for i in s]
    for _ in range(10):
        l = [ ((i+0x11)%0x100) for i in l]
        l = l[1:] + [l[0]]
        l = [ (i ^ 0xd) for i in l]
    l = [ ( (i%0x3c) + 0x22 ) for i in l ]
    return ''.join([chr(i) for i in l])

s = "ezpzlemonsqueezy"
print arx(s)[::-1]

The required input we obtain is: I0E;3.<2<3G<33C#

Conclusion

Here, we have presented the approach we used to solve the three given challenges in the RISC-V binary given as part of CSAW Embedded Security Challenge 2020. We have done so using the Ghidra SRE along with the RISC-V processor module using techniques such as static analysis and dynamic analysis.