Thursday, November 12, 2015

Strengths and Weaknesses of LLVM's SafeStack Buffer Overflow Protection

by Samuel Groß

Introduction

In June 2015, a new memory corruption exploit mitigation named SafeStack was merged into the LLVM development branch by Peter Collingbourne from Google and will be available with the upcoming 3.8 release. SafeStack was developed as part of the Code Pointer Integrity (CPI) project but is also available as a stand-alone mitigation. We like to stay ahead of the curve on security, so this post discusses the inner workings and the security benefits of SafeStack, with an eye to future attacks and possible improvements to the feature.

SafeStack in a Nutshell

SafeStack is a mitigation similar to (but potentially more powerful than) Stack Cookies. It tries to protect critical data on the stack by separating the native stack into two areas: a safe stack, which is used for control flow information as well as data that is only ever accessed in a safe way (as determined through static analysis), and an unsafe stack, which is used for everything else that is stored on the stack. The two stacks are located in different memory regions in the process's address space, which prevents a buffer overflow on the unsafe stack from corrupting anything on the safe stack.

SafeStack promises generally good protection against common stack-based memory corruption attacks while introducing only a low performance overhead (around 0.1% on average according to the documentation).

When SafeStack is enabled, the stack pointer register (esp/rsp on x86/x64 respectively) will be used for the safe stack while the unsafe stack is tracked by a thread-local variable. The unsafe stack is allocated during initialization of the binary by mmap'ing a region of readable and writable memory and preceding this region with a guard page, presumably to catch stack overflows in the unsafe stack region.
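The allocation described above can be approximated in a few lines of C. This is a minimal sketch and not the actual runtime code: the names unsafe_stack_create, unsafe_stack_current and unsafe_stack_ptr are hypothetical, and placing the initial pointer at the high end of the mapping assumes a stack that grows downwards.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict ISO modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical name: per-thread pointer to the current top of the unsafe stack. */
static _Thread_local char *unsafe_stack_ptr;

/* Map `size` bytes of readable/writable memory preceded by one inaccessible
 * guard page. Returns the initial (highest) unsafe stack pointer, or NULL. */
char *unsafe_stack_create(size_t size)
{
    assert(size > 0);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return NULL;
    size_t guard = (size_t)page_size;

    char *base = mmap(NULL, guard + size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The lowest page becomes the guard page: running off the end of the
     * downward-growing unsafe stack faults instead of silently continuing. */
    if (mprotect(base, guard, PROT_NONE) != 0) {
        munmap(base, guard + size);
        return NULL;
    }

    unsafe_stack_ptr = base + guard + size;
    return unsafe_stack_ptr;
}

/* Accessor for the thread-local unsafe stack pointer. */
char *unsafe_stack_current(void)
{
    return unsafe_stack_ptr;
}
```

A thread would call unsafe_stack_create once during start-up. Everything below the returned pointer, down to the guard page, is then usable as unsafe stack space.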

SafeStack is (currently) incompatible with Stack Cookies and disables them when it is used.

Implementation Details

SafeStack is implemented as an LLVM instrumentation pass; the main logic is implemented in lib/Transforms/Instrumentation/SafeStack.cpp. The instrumentation pass runs as one of the last steps before (native) code generation.

More technically: clang first compiles the code into LLVM's intermediate representation (IR) and later, after various instrumentation and optimization passes, translates the IR into machine code. The SafeStack instrumentation pass works by examining all "alloca" instructions in the IR of a function. An "alloca" instruction allocates space on the stack for a local variable or array. The pass then traverses the list of instructions that make use of this variable and determines whether these accesses are safe or not. If any access is determined to be "unsafe", the "alloca" instruction is replaced by code that allocates space on the unsafe stack instead, and the instructions using the variable are updated accordingly.

The IsSafeStackAlloc function is responsible for deciding whether a stack variable can ever be accessed in an "unsafe" way. The definition of "unsafe" is currently rather conservative: a variable is relocated to the unsafe stack in the following cases:

  • a pointer to the variable is stored somewhere in memory
  • an element of an array is accessed with a non-constant index (i.e. another variable)
  • a variable sized array is accessed (with constant or non-constant index)
  • a pointer to the variable is given to a function as argument

The SafeStack runtime support, which is responsible for allocating and initializing the unsafe stack, can be found here. As previously mentioned, the unsafe stack is just a regular mmap'ed region.

Exploring SafeStack: Implementation in Practice

Let's now look at a very simple example to understand how SafeStack works under the hood. For my testing I compiled clang/llvm from source following this guide: http://clang.llvm.org/get_started.html

We'll use the following C code snippet:

void function(char *str) {
    char buffer[16];
    strcpy(buffer, str);
}

Let's start by looking at the generated assembly when no stack protection is used. For that we compile with "clang -O1 example.c" (optimization is enabled to reduce noise):

0000000000400580 <function>:
  400580:    48 83 ec 18            sub    rsp,0x18
  400584:    48 89 f8               mov    rax,rdi
  400587:    48 8d 3c 24            lea    rdi,[rsp]
  40058b:    48 89 c6               mov    rsi,rax
  40058e:    e8 bd fe ff ff         call   400450 <strcpy@plt>
  400593:    48 83 c4 18            add    rsp,0x18
  400597:    c3                     ret

Easy enough. The function allocates space on the stack for the buffer at 400580, then calls strcpy with a pointer to the buffer at 40058e. 

Now let's look at the assembly code generated when using Stack Cookies. For that we need to use the -fstack-protector flag (available in gcc and clang): "clang -O1 -fstack-protector example.c":


00000000004005f0 <function>:
  4005f0:    48 83 ec 18            sub    rsp,0x18
  4005f4:    48 89 f8               mov    rax,rdi
  4005f7:    64 48 8b 0c 25 28 00   mov    rcx,QWORD PTR fs:0x28
  4005fe:    00 00
  400600:    48 89 4c 24 10         mov    QWORD PTR [rsp+0x10],rcx
  400605:    48 8d 3c 24            lea    rdi,[rsp]
  400609:    48 89 c6               mov    rsi,rax
  40060c:    e8 9f fe ff ff         call   4004b0 <strcpy@plt>
  400611:    64 48 8b 04 25 28 00   mov    rax,QWORD PTR fs:0x28
  400618:    00 00
  40061a:    48 3b 44 24 10         cmp    rax,QWORD PTR [rsp+0x10]
  40061f:    75 05                  jne    400626 <function+0x36>
  400621:    48 83 c4 18            add    rsp,0x18
  400625:    c3                     ret
  400626:    e8 95 fe ff ff         call   4004c0 <__stack_chk_fail@plt>

At 4005f7 the master cookie (the reference value of the cookie) is read from the Thread Control Block (TCB, a per-thread data structure provided by libc) and, at 400600, put on the stack below the return address. Later, at 40061a, that value is compared with the value in the TCB before the function returns. If the two values do not match, __stack_chk_fail is called, which terminates the process with a message similar to this one: "*** stack smashing detected ***: ./example terminated".
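The prologue/epilogue logic can be mimicked in portable C. The following is only a simulation under stated assumptions: the struct frame stands in for the real stack layout, master_cookie is a fixed made-up value instead of the random per-process value read from fs:0x28, and the function returns -1 where real code would call __stack_chk_fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the master cookie; a fixed example value, not a real one. */
static const uint64_t master_cookie = 0x5bd1e995c0ffee00ULL;

/* Simulated stack frame: the cookie sits directly above the buffer. */
struct frame {
    char     buffer[16];
    uint64_t cookie;
};

/* Copies n attacker-controlled bytes to the start of the frame.
 * Returns 0 if the cookie survived, -1 if it was clobbered. */
int copy_and_check(const void *src, size_t n)
{
    struct frame f;
    assert(n <= sizeof f);      /* keep the simulation inside the struct */

    f.cookie = master_cookie;   /* "prologue": place the cookie */
    memcpy(&f, src, n);         /* the unchecked copy */

    /* "epilogue": compare against the master value before returning */
    return f.cookie == master_cookie ? 0 : -1;
}
```

Copying 16 bytes or fewer leaves the cookie untouched. Anything longer runs into it and is detected by the final comparison.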

Now we'll enable SafeStack by using the -fsanitize=safe-stack flag: "clang -O1 -fsanitize=safe-stack example.c":

0000000000410d70 <function>:
  410d70:   41 56                  push   r14
  410d72:   53                     push   rbx
  410d73:   50                     push   rax
  410d74:   48 89 f8               mov    rax,rdi
  410d77:   4c 8b 35 6a 92 20 00   mov    r14,QWORD PTR [rip+0x20926a]
  410d7e:   64 49 8b 1e            mov    rbx,QWORD PTR fs:[r14]
  410d82:   48 8d 7b f0            lea    rdi,[rbx-0x10]
  410d86:   64 49 89 3e            mov    QWORD PTR fs:[r14],rdi
  410d8a:   48 89 c6               mov    rsi,rax
  410d8d:   e8 be 00 ff ff         call   400e50 <strcpy@plt>
  410d92:   64 49 89 1e            mov    QWORD PTR fs:[r14],rbx
  410d96:   48 83 c4 08            add    rsp,0x8
  410d9a:   5b                     pop    rbx
  410d9b:   41 5e                  pop    r14
  410d9d:   c3                     ret

At 410d7e the current value of the unsafe stack pointer is retrieved from Thread Local Storage (TLS). Since each thread has its own unsafe stack, the stack pointer for the unsafe stack is stored as a thread-local variable. Next, at 410d82, the program allocates space for our buffer on the unsafe stack and writes the new value back to the TLS (410d86). It then calls the strcpy function with a pointer into the unsafe stack. In the function epilogue (410d92), the old value of the unsafe stack pointer is written back into TLS (basically, these instructions do the equivalent of "sub rsp, x; ... add rsp, x", but for the unsafe stack) and the function returns.
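Expressed in C, the instrumented function behaves roughly like the sketch below. The names unsafe_region, unsafe_sp, unsafe_top and unsafe_stack_in_use are assumptions for illustration, and a static thread-local array replaces the mmap'ed region so that the example is self-contained. The function also returns the copied length, which the original does not, so that its effect can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the per-thread unsafe stack and its stack pointer. */
static _Thread_local char  unsafe_region[4096];
static _Thread_local char *unsafe_sp;

static char *unsafe_top(void)
{
    return unsafe_region + sizeof unsafe_region;
}

/* Bytes currently allocated on this thread's unsafe stack. */
size_t unsafe_stack_in_use(void)
{
    return unsafe_sp ? (size_t)(unsafe_top() - unsafe_sp) : 0;
}

/* Rough C equivalent of the instrumented function(char *str). */
size_t function_instrumented(const char *str)
{
    assert(strlen(str) < 16);      /* this sketch only takes inputs that fit */

    if (unsafe_sp == NULL)         /* lazy init; the runtime does this at startup */
        unsafe_sp = unsafe_top();

    char *saved  = unsafe_sp;      /* mov rbx, fs:[r14]   */
    char *buffer = saved - 16;     /* lea rdi, [rbx-0x10] */
    unsafe_sp    = buffer;         /* mov fs:[r14], rdi   */

    strcpy(buffer, str);           /* buffer lives on the unsafe stack */
    size_t len = strlen(buffer);

    unsafe_sp = saved;             /* mov fs:[r14], rbx   */
    return len;
}
```

Note that the return address of function_instrumented is never stored in unsafe_region. That separation is the whole point of the scheme.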

If we compile our program with the "-fsanitize=safe-stack" option and an overflow occurs, the saved return address (on the safe stack) is unaffected, and the program likely segfaults as it tries to write past the end of the unsafe stack into unmapped or unwritable memory.

Security Details: Stack Cookies vs. SafeStack

While Stack Cookies provide fairly good protection against stack corruption exploits, the security measure in general has a few weaknesses. In particular, bypasses are possible in at least the following scenarios:

  • The vulnerability in code is a non-linear overflow/arbitrary relative write on the stack. In this case the cookie can simply be "skipped over".
  • Data (e.g. function pointers) further up the stack can be corrupted and is used before the function returns.
  • The attacker has access to an information leak. Depending on the nature of the leak, the attacker can either leak the cookie from the stack directly or leak the master cookie. Once obtained, the attacker overflows the stack and overwrites the cookie again with the value obtained in the information leak.
  • The cookie is generated with weak entropy. If not enough entropy is available during generation of the cookie value, an attacker may be able to calculate the correct cookie value.
  • In the case of a forking service, the stack cookie value will stay the same for all child processes. This may make it possible to brute-force the stack cookie value byte by byte, overwriting only a single byte of the cookie and observing whether the process crashes (wrong guess) or continues past the next return statement (correct guess). This would require at most 255 tries per unknown stack cookie byte.
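The byte-by-byte search from the last bullet can be illustrated with a purely local simulation. Nothing here touches a real process: child_survives is a hypothetical oracle that models "the forked child did not crash", and the 8-byte cookie length is an assumption matching x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COOKIE_LEN 8

/* Hypothetical oracle: the child survives if the first n overwritten
 * cookie bytes match the real cookie. */
static int child_survives(const unsigned char *secret,
                          const unsigned char *guess, size_t n)
{
    return memcmp(secret, guess, n) == 0;
}

/* Recovers the cookie one byte at a time and returns the number of
 * oracle queries used. */
unsigned recover_cookie(const unsigned char secret[COOKIE_LEN],
                        unsigned char out[COOKIE_LEN])
{
    unsigned tries = 0;

    for (size_t i = 0; i < COOKIE_LEN; i++) {
        int found = 0;
        /* This naive loop tests up to 256 values per byte; inferring the
         * last remaining value without testing it would save one query. */
        for (unsigned b = 0; b < 256 && !found; b++) {
            out[i] = (unsigned char)b;
            tries++;
            found = child_survives(secret, out, i + 1);
        }
        assert(found);
    }
    return tries;
}
```

The total cost grows linearly with the cookie length (a few thousand queries at most) instead of the 2^64 guesses needed when the whole value has to be hit at once.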

It is important to note, however, that most stack-based overflows caused by functions operating on C strings (e.g. strcpy) are unexploitable when compiled with stack cookies enabled. Most stack cookie implementations force one of the bytes of the cookie to be a zero byte, which makes it impossible to write past the cookie with a C string (it is still possible with a network buffer and a raw memory copy, though).
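The effect of the zero byte can be shown with a small simulation, assuming the attacker already knows the cookie. The struct frame layout, the helper names, and the choice of the cookie's first byte in memory as the zero byte are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated frame: buffer, then cookie, then saved return address. */
struct frame {
    unsigned char buffer[16];
    unsigned char cookie[8];
    unsigned char ret[8];
};

void frame_init(struct frame *f, const unsigned char cookie[8])
{
    memset(f->buffer, 0, sizeof f->buffer);
    memcpy(f->cookie, cookie, sizeof f->cookie);
    memset(f->ret, 'R', sizeof f->ret);          /* "original" return address */
}

/* Payload: 16 filler bytes, the correct cookie, then a new return address. */
void build_payload(unsigned char out[32], const unsigned char cookie[8])
{
    memset(out, 'A', 16);
    memcpy(out + 16, cookie, 8);
    memset(out + 24, 'B', 8);
}

/* strcpy-like overflow: stops after copying the first zero byte. */
void overflow_as_c_string(struct frame *f, const unsigned char *payload,
                          size_t payload_len)
{
    unsigned char *dst = (unsigned char *)f;
    assert(payload_len <= sizeof *f);
    for (size_t i = 0; i < payload_len; i++) {
        dst[i] = payload[i];
        if (payload[i] == 0)
            break;
    }
}

/* Length-based overflow (e.g. memcpy from a network buffer). */
void overflow_raw(struct frame *f, const unsigned char *payload,
                  size_t payload_len)
{
    assert(payload_len <= sizeof *f);
    memcpy(f, payload, payload_len);
}
```

With a cookie that starts with a zero byte, the string-based copy stops at that byte and never reaches ret, while the raw copy rewrites ret and leaves a valid-looking cookie behind.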

Possible implementation bugs aside, SafeStack is, at least in theory, immune to all of these due to the separation of the memory regions.

However, what SafeStack (by design) does not protect against is corruption of data on the unsafe stack. Or, phrased differently, the security of SafeStack is based around the assumption that no critical data is stored on the unsafe stack.

Moreover, in contrast to Stack Cookies, SafeStack does not prevent the callee from corrupting data of the caller (more precisely, Stack Cookies prevent the caller from using the corrupted data after the callee returns). The following example demonstrates this:

#include <stdio.h>
#include <string.h>

void smash_me() {
    char buffer[16];
    gets(buffer);
}

int main() {
    char buffer[16];
    memset(buffer, 0, sizeof(buffer));
    smash_me();
    puts(buffer);
    return 0;
}

Compiling this code with "-fsanitize=safe-stack" and supplying more than 16 bytes as input will overflow into the buffer of main() and corrupt its content. In contrast, when compiled with "-fstack-protector", the overflow will be detected and the process terminated before main() uses the corrupted buffer.

This weakness could be (partially) addressed by using Stack Cookies in addition to SafeStack. In this scenario, the master cookie could even be stored on the safe stack and regenerated for every function call (or chain of function calls). This would further protect against some of the weaknesses of plain Stack Cookies as described above.

The lack of unsafe stack protections, combined with the conservativeness of the current definition of "unsafe" in the implementation, potentially provides an attacker with enough critical data on the unsafe stack to compromise the application. As an example, we'll devise a more or less realistic piece of code that will result in the (security critical) variable 'pl' being placed on the unsafe stack, above 'buffer' (although it seems that enabling optimization during compilation causes fewer variables to be placed on the unsafe stack):

#include <stdio.h>

void determine_privilege_level(int *pl) {
    // dummy function
    *pl = 0x42;
}

int main() {
    int pl;
    char buffer[16];
    determine_privilege_level(&pl);
    gets(buffer);             // This can overflow and corrupt 'pl'
    printf("privilege level: %x\n", pl);
    return 0;
}

This "data-only" attack is possible due to the fact that the current implementation never recurses into called functions but rather considers (most) function arguments as unsafe.

The risk of corrupting critical data on the unsafe stack can however be greatly decreased through improved static analysis, variable reordering, and, as mentioned above, by protecting the callee's unsafe stack frame.

It should also be noted that the current implementation does not protect the safe stack in any way other than system-level ASLR. This means that an information leak combined with an arbitrary write primitive will still allow an attacker to overwrite the return address (or other data) on the safe stack. See the comment at the top of the runtime support implementation for more information. Finally, we should mention that there has been an academic study that points out some additional details regarding CPI.

Conclusion

With the exceptions noted above, SafeStack's implemented security measures are a superset of those of Stack Cookies, allowing it to prevent exploitation of stack-based vulnerabilities in many scenarios. This, combined with the low performance overhead, could make SafeStack a good choice during compilation in the future.

SafeStack is still in its early stages, but it looks to be a very promising new addition to a developer's arsenal of compiler provided exploit mitigations. We wouldn't call it the end-all of buffer overflows, but it's a significant hurdle for attackers to overcome.
