Friday, August 19, 2016

Introducing: SafeURL - A set of SSRF Protection Libraries

Code by IncludeSec team, with contributions by our Intern Mohammad Al Amin

At Include Security, we believe that a reactive approach to security can fall short when it's not backed by proactive roots. We see new offensive tools for pen-testing and vulnerability analysis being created and released all the time. With regard to SSRF vulnerabilities, we saw an opportunity to release code that helps developers protect against these sorts of security issues, so we're releasing a new set of language-specific libraries to do just that. In this blog post, we'll introduce the concept of SafeURL, with details about how it works, how developers can use it, and our plans for rewarding those who find vulnerabilities in it!

Overview

  1. Preface: Server Side Request Forgery
  2. Our Proposed Solution
  3. Installation
  4. Usage
  5. Demo and Bug Bounty Contest

Preface: Server Side Request Forgery

Server Side Request Forgery (SSRF) is a vulnerability that gives an attacker the ability to create requests from a vulnerable server. SSRF attacks are commonly used to target not only the host server itself, but also hosts on the internal network that would normally be inaccessible due to firewalls.
SSRF allows an attacker to:
  • Scan and attack systems from the internal network that are not normally accessible
  • Enumerate and attack services that are running on these hosts
  • Exploit host-based authentication services
As is the case with many web application vulnerabilities, SSRF is possible because of a lack of user input validation. For example, a web application that accepts a URL input in order to go fetch that resource from the internet can be given a valid URL such as http://google.com
But the application may also accept URLs such as:
  • http://localhost
  • http://10.0.0.1
  • file:///localhost/example.txt
  • 127.0.0.1:22
When those kinds of inputs are not validated, attackers are able to access internal resources that are not intended to be public.
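The kind of check that is missing in such an application can be sketched in a few lines of Python. This is a minimal illustration and not SafeURL's implementation: the function name `is_url_allowed`, the injectable `resolve` parameter, and the choice of allowed schemes are all assumptions made for the example. It refuses the four problem inputs listed above while accepting an ordinary public URL.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption for this sketch: only plain web schemes are acceptable.
ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url, resolve=socket.gethostbyname):
    """Return True only if the URL uses an allowed scheme and its host
    resolves to a publicly routable IP address."""
    try:
        parts = urlsplit(url)
        if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
            return False
        address = ipaddress.ip_address(resolve(parts.hostname))
    except (OSError, ValueError):
        # Malformed URL or unresolvable host: refuse rather than guess.
        return False
    # is_global is False for loopback, private, and link-local addresses.
    return address.is_global
```

Note that a check like this only covers the moment of validation. If the hostname is resolved again when the request is actually made, it may resolve to a different address, so a real library also has to make sure the request goes to the address that was checked.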

Our Proposed Solution

SafeURL is a library, originally conceptualized as "SafeCURL" by Jack Whitton (aka @fin1te), that protects against SSRF by validating each part of the URL against a white or black list before making the request. SafeURL can also be used to validate URLs. SafeURL intends to be a simple replacement for libcurl methods in PHP and Python as well as java.net.URLConnection in Scala.
The source for the libraries is available on our Github:
  1. SafeURL for PHP - Primarily developed by @fin1te
  2. SafeURL for Python - Ported by @nicolasrod
  3. SafeURL for Scala - Ported by @saelo
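
To make the idea of checking each part of a URL against a white or black list more concrete, here is a minimal Python sketch. It is not taken from any of the SafeURL libraries: the `UrlPolicy` class, its `add` and `violations` methods, and the rule that a non-empty whitelist must match are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlsplit


class UrlPolicy:
    """Hypothetical policy: one whitelist and one blacklist of regular
    expressions per URL part (scheme, domain, port)."""

    def __init__(self):
        self.lists = {
            part: {"whitelist": [], "blacklist": []}
            for part in ("scheme", "domain", "port")
        }

    def add(self, list_name, part, pattern):
        self.lists[part][list_name].append(re.compile(pattern))

    def violations(self, url):
        """Return the names of the URL parts that fail the policy."""
        parts = urlsplit(url)
        default_port = {"http": 80, "https": 443}.get(parts.scheme, "")
        values = {
            "scheme": parts.scheme,
            "domain": parts.hostname or "",
            "port": str(parts.port or default_port),
        }
        failed = []
        for part, value in values.items():
            white = self.lists[part]["whitelist"]
            black = self.lists[part]["blacklist"]
            # A non-empty whitelist must match; any blacklist match rejects.
            if white and not any(p.fullmatch(value) for p in white):
                failed.append(part)
            elif any(p.fullmatch(value) for p in black):
                failed.append(part)
        return failed


policy = UrlPolicy()
policy.add("whitelist", "scheme", r"https?")
policy.add("whitelist", "port", r"80|443")
policy.add("blacklist", "domain", r"(.*\.)?fin1te\.net")
```

With this policy, `policy.violations(url)` returns an empty list for a URL that passes every check, and the names of the offending parts otherwise, so the caller can refuse the request before it is ever sent.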

Other Mitigation Techniques

Our approach is focused on protection on the application layer. Other techniques used by some Silicon Valley companies to combat SSRF include:
  • Setting up wrappers for HTTP client calls that forward every request to a single-purpose proxy, where firewall rules prevent the proxied HTTP requests from reaching any internal hosts
  • At the application server layer, hijacking all socket connections to ensure they meet a developer-configured policy, enforced through iptables rules or more advanced interactions with the app server's networking layer

Installation

PHP

SafeURL can be included in any PHP project by cloning the repository on our Github and importing it into your project.

Python

SafeURL can be used in Python apps by cloning the repository on our Github and importing it like this:
  from safeurl import safeurl

Scala

To use SafeURL in Scala applications, clone the repository, store it in the app/ folder of your Play application, and import it:
  import com.includesecurity.safeurl._

Usage

PHP

SafeURL is designed to be a drop-in replacement for the curl_exec() function in PHP. It can simply be replaced with SafeURL::execute() wrapped in a try {} catch {} block.
  try {
      $url = "http://www.google.com";

      $curlHandle = curl_init();
      // Your usual cURL options
      curl_setopt($curlHandle, CURLOPT_USERAGENT, "Mozilla/5.0 (SafeURL)");

      // Execute using SafeURL
      $response = SafeURL::execute($url, $curlHandle);
  } catch (Exception $e) {
      // URL wasn't safe
  }
Options such as white and black lists can be modified. For example:
  $options = new Options();
  $options->addToList("blacklist", "domain", "(.*)\.fin1te\.net");
  $options->addToList("whitelist", "scheme", "ftp");

  // This will now throw an InvalidDomainException
  $response = SafeURL::execute("http://example.com", $curlHandle, $options);

  // Whilst this will be allowed, and return the response
  $response = SafeURL::execute("ftp://example.com", $curlHandle, $options);

Python

SafeURL serves as a replacement for PyCurl in Python.
  import sys

  try:
      su = safeurl.SafeURL()
      res = su.execute("https://example.com")
  except:
      print "Unexpected error:", sys.exc_info()
Example of modifying options:
  try:
      su = safeurl.SafeURL()
      opt = safeurl.Options()
      # Start from empty lists, then allow only the two domains below
      opt.clearList("whitelist")
      opt.clearList("blacklist")
      opt.setList("whitelist", ["google.com", "youtube.com"], "domain")
      su.setOptions(opt)
      res = su.execute("http://www.youtube.com")
  except:
      print "Unexpected error:", sys.exc_info()

Scala

SafeURL replaces the JVM Class URLConnection that is normally used in Scala.
  import scala.concurrent.Await
  import scala.concurrent.duration._

  try {
    val resp = SafeURL.fetch("http://google.com")
    val r = Await.result(resp, 500 millis)
  } catch {
    case e: Exception => // URL wasn't safe
  }
Options:
  SafeURL.defaultConfiguration.lists.ip.blacklist ::= "12.34.0.0/16"
  SafeURL.defaultConfiguration.lists.domain.blacklist ::= "example.com"

Demo, Bug Bounty Contest, and Further Contributions

An important question to ask is: Is SafeURL really safe? Don't take our word for it. Try to hack it yourself! We're hosting live demo apps in each language for anyone to try and bypass SafeURL and perform a successful SSRF attack. On each site there is a file called key.txt on the server's local filesystem with the following .htaccess policy:
<Files key.txt>
  Order deny,allow
  Deny from all
  Allow from 127.0.0.1

  ErrorDocument 403 /oops.html
</Files>
If you can read the contents of the file through a flaw in SafeURL and tell us how you did it (patch plz?), we will contact you about your reward. As a thank you to the community, we're going to reward up to one Bitcoin for any security issues. If you find a non-security bug in the source of any of our libraries, please contact us as well; you'll have our thanks and a shout-out.
The challenges are being hosted at the following URLs:
PHP: safeurl-php.excludesecurity.com
Python: safeurl-python.excludesecurity.com
Scala: safeurl-scala.excludesecurity.com

If you can contribute a Pull Request and port the SafeURL concept to other languages (such as Java, Ruby, C#, etc.), we could throw you some Bitcoin as a thank you.

Good luck and thanks for helping us improve SafeURL!

Thursday, November 12, 2015

Strengths and Weaknesses of LLVM's SafeStack Buffer Overflow Protection

by Samuel Groß

Introduction

In June 2015, a new memory corruption exploit mitigation named SafeStack was merged into the llvm development branch by Peter Collingbourne from Google and will be available with the upcoming 3.8 release. SafeStack was developed as part of the Code Pointer Integrity (CPI) project but is also available as stand-alone mitigation. We like to stay ahead of the curve on security, so this post aims to discuss the inner workings and the security benefits of SafeStack for consideration in future attacks and possible future improvements to the feature.

SafeStack in a Nutshell

SafeStack is a mitigation similar to (but potentially more powerful than) Stack Cookies. It tries to protect critical data on the stack by separating the native stack into two areas: a safe stack, which is used for control flow information as well as data that is only ever accessed in a safe way (as determined through static analysis), and an unsafe stack, which is used for everything else that is stored on the stack. The two stacks are located in different memory regions in the process's address space, which prevents a buffer overflow on the unsafe stack from corrupting anything on the safe stack.

SafeStack promises a generally good protection against common stack based memory corruption attacks while introducing only a low performance overhead (around 0.1% on average according to the documentation) when implemented.

When SafeStack is enabled, the stack pointer register (esp/rsp on x86/x64 respectively) will be used for the safe stack while the unsafe stack is tracked by a thread-local variable. The unsafe stack is allocated during initialization of the binary by mmap'ing a region of readable and writable memory and preceding this region with a guard page, presumably to catch stack overflows in the unsafe stack region.

SafeStack is (currently) incompatible with Stack Cookies and disables them when it is used.

Implementation Details

SafeStack is implemented as an llvm instrumentation pass; the main logic is implemented in lib/Transforms/Instrumentation/SafeStack.cpp. The instrumentation pass runs as one of the last steps before (native) code generation.

More technically: The instrumentation pass works by examining all "alloca" instructions in the intermediate representation (IR) of a function (clang first compiles the code into llvm's intermediate representation and later, after various instrumentation/optimization passes, translates the IR into machine code). An "alloca" instruction allocates space on the stack for a local variable or array. The SafeStack instrumentation pass then traverses the list of instructions that make use of this variable and determines whether these accesses are safe or not. If any access is determined to be "unsafe" by the instrumentation pass, the "alloca" instruction is replaced by code that allocates space on the unsafe stack instead and the instructions using the variable are updated accordingly.

The IsSafeStackAlloc function is responsible for deciding whether a stack variable can ever be accessed in an "unsafe" way. The definition of "unsafe" is currently rather conservative: a variable is relocated to the unsafe stack in the following cases:

  • a pointer to the variable is stored somewhere in memory
  • an element of an array is accessed with a non-constant index (i.e. another variable)
  • a variable sized array is accessed (with constant or non-constant index)
  • a pointer to the variable is given to a function as argument

The SafeStack runtime support, which is responsible for allocating and initializing the unsafe stack, can be found here. As previously mentioned, the unsafe stack is just a regular mmap'ed region.

Exploring SafeStack: Implementation in Practice

Let's now look at a very simple example to understand how SafeStack works under the hood. For my testing I compiled clang/llvm from source following this guide: http://clang.llvm.org/get_started.html

We'll use the following C code snippet:

void function(char *str) {
    char buffer[16];
    strcpy(buffer, str);
}

Let's start by looking at the generated assembly when no stack protection is used. For that we compile with "clang -O1 example.c" (optimization is enabled to reduce noise)

0000000000400580 <function>:
  400580:    48 83 ec 18            sub    rsp,0x18
  400584:    48 89 f8               mov    rax,rdi
  400587:    48 8d 3c 24            lea    rdi,[rsp]
  40058b:    48 89 c6               mov    rsi,rax
  40058e:    e8 bd fe ff ff         call   400450 <strcpy@plt>
  400593:    48 83 c4 18            add    rsp,0x18
  400597:    c3                     ret

Easy enough. The function allocates space on the stack for the buffer at 400580, then calls strcpy with a pointer to the buffer at 40058e. 

Now let's look at the assembly code generated when using Stack Cookies. For that we need to use the -fstack-protector flag (available in gcc and clang): "clang -O1 -fstack-protector example.c":


00000000004005f0 <function>:
  4005f0:    48 83 ec 18            sub    rsp,0x18
  4005f4:    48 89 f8               mov    rax,rdi
  4005f7:    64 48 8b 0c 25 28 00   mov    rcx,QWORD PTR fs:0x28
  4005fe:    00 00
  400600:    48 89 4c 24 10         mov    QWORD PTR [rsp+0x10],rcx
  400605:    48 8d 3c 24            lea    rdi,[rsp]
  400609:    48 89 c6               mov    rsi,rax
  40060c:    e8 9f fe ff ff         call   4004b0 <strcpy@plt>
  400611:    64 48 8b 04 25 28 00   mov    rax,QWORD PTR fs:0x28
  400618:    00 00
  40061a:    48 3b 44 24 10         cmp    rax,QWORD PTR [rsp+0x10]
  40061f:    75 05                  jne    400626 <function+0x36>
  400621:    48 83 c4 18            add    rsp,0x18
  400625:    c3                     ret
  400626:    e8 95 fe ff ff         call   4004c0 <__stack_chk_fail@plt>

At 4005f7 the master cookie (the reference value of the cookie) is read from the Thread Control Block (TCB, a per-thread data structure provided by libc) and, at 400600, put on the stack below the return address. Later, at 40061a, that value is compared with the value in the TCB before the function returns. If the two values do not match, __stack_chk_fail is called, which terminates the process with a message similar to this one: "*** stack smashing detected ***: ./example terminated".
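
The logic of that prologue and epilogue can be modeled in a few lines of Python. This is only a toy model of the memory layout, not what the compiler emits: the function `call_with_cookie`, the exception name, and the 16/8/8-byte frame layout are assumptions chosen to mirror the example above.

```python
BUF_SIZE = 16
COOKIE_SIZE = 8


class StackSmashingDetected(Exception):
    """Stands in for the process being terminated by __stack_chk_fail."""


def call_with_cookie(master_cookie, user_input, return_address):
    """Model one protected call with the frame laid out as
    buffer | cookie | return address (low to high addresses)."""
    # Prologue: copy the master cookie into the frame.
    frame = bytearray(BUF_SIZE) + master_cookie + return_address
    # Body: an unchecked copy, standing in for strcpy(buffer, str).
    frame[0:len(user_input)] = user_input
    # Epilogue: compare the stored cookie with the master copy.
    if frame[BUF_SIZE:BUF_SIZE + COOKIE_SIZE] != master_cookie:
        raise StackSmashingDetected()
    start = BUF_SIZE + COOKIE_SIZE
    return bytes(frame[start:start + len(return_address)])
```

An input of up to 16 bytes leaves the cookie intact and the function "returns" to the original address. A longer input has to run through the cookie before it can reach the return address, so the comparison fails first.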

Now we'll enable SafeStack by using the -fsanitize=safe-stack flag: "clang -O1 -fsanitize=safe-stack example.c":

0000000000410d70 <function>:
  410d70:   41 56                  push   r14
  410d72:   53                     push   rbx
  410d73:   50                     push   rax
  410d74:   48 89 f8               mov    rax,rdi
  410d77:   4c 8b 35 6a 92 20 00   mov    r14,QWORD PTR [rip+0x20926a]
  410d7e:   64 49 8b 1e            mov    rbx,QWORD PTR fs:[r14]
  410d82:   48 8d 7b f0            lea    rdi,[rbx-0x10]
  410d86:   64 49 89 3e            mov    QWORD PTR fs:[r14],rdi
  410d8a:   48 89 c6               mov    rsi,rax
  410d8d:   e8 be 00 ff ff         call   400e50 <strcpy@plt>
  410d92:   64 49 89 1e            mov    QWORD PTR fs:[r14],rbx
  410d96:   48 83 c4 08            add    rsp,0x8
  410d9a:   5b                     pop    rbx
  410d9b:   41 5e                  pop    r14
  410d9d:   c3                     ret

At 410d7e the current value of the unsafe stack pointer is retrieved from Thread Local Storage (TLS). Since each thread also has its own unsafe stack, the stack pointer for the unsafe stack is stored as a thread-local variable. Next, at 410d82, the program allocates space for our buffer on the unsafe stack and writes the new value back to TLS (410d86). It then calls the strcpy function with a pointer into the unsafe stack. In the function epilogue (410d92), the old value of the unsafe stack pointer is written back into TLS (basically, these instructions do the equivalent of "sub rsp, x; ... add rsp, x", but for the unsafe stack) and the function returns.

If we compile our program with the "-fsanitize=safe-stack" option and an overflow occurs, the saved return address (on the safe stack) is unaffected and the program likely segfaults as it tries to write past the end of the unsafe stack into unmapped or unwritable memory.

Security Details: Stack Cookies vs. SafeStack

While Stack Cookies provide fairly good protection against stack corruption exploits, the security measure in general has a few weaknesses. In particular, bypasses are possible in at least the following scenarios:

  • The vulnerability in code is a non-linear overflow/arbitrary relative write on the stack. In this case the cookie can simply be "skipped over".
  • Data (e.g. function pointers) further up the stack can be corrupted and are used before the function returns. 
  • The attacker has access to an information leak. Depending on the nature of the leak, the attacker can either leak the cookie from the stack directly or leak the master cookie. Once obtained, the attacker overflows the stack and overwrites the cookie again with the value obtained in the information leak.
  • In the case of weak entropy. If not enough entropy is available during generation of the cookie value, an attacker may be able to calculate the correct cookie value.
  • In the case of a forking service, the stack cookie value will stay the same for all child processes. This may make it possible to bruteforce the stack cookie value byte-by-byte, overwriting only a single byte of the cookie and observing whether the process crashes (wrong guess) or continues past the next return statement (correct guess). This would require at most 255 tries per unknown stack cookie byte.
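
The byte-by-byte attack from the last scenario can be simulated as follows. The "service" here is a hypothetical stand-in (`make_forking_oracle`) that only reports whether a child process would survive a given partial overwrite of the cookie; no real process is forked, and the 8-byte cookie size is an assumption.

```python
COOKIE_SIZE = 8


def make_forking_oracle(secret_cookie):
    """Hypothetical stand-in for a forking service: every child shares the
    same cookie, and a child survives only if the bytes written over the
    cookie match the real values."""
    def child_survives(overwrite):
        return secret_cookie[:len(overwrite)] == overwrite
    return child_survives


def brute_force_cookie(child_survives):
    """Recover the cookie one byte at a time; returns (cookie, tries)."""
    known = b""
    tries = 0
    for _ in range(COOKIE_SIZE):
        for guess in range(256):
            tries += 1
            if child_survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no byte value survived; oracle is unreliable")
    return known, tries
```

Because each byte is confirmed independently, the cost grows linearly with the cookie length instead of exponentially. This sketch tests all 256 values per byte in the worst case; the figure of 255 above follows from the fact that the last remaining value does not need to be tested.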

It is important to note, however, that most stack-based overflows caused by functions operating on C strings (e.g. strcpy) are unexploitable when compiled with stack cookies enabled, because most stack cookie implementations force one of the bytes of the stack cookie to be a zero byte. This makes it impossible to write past the cookie with a C string (it is still possible with a network buffer and a raw memory copy, though).

Possible implementation bugs aside, SafeStack is, at least in theory, immune to all of these due to the separation of the memory regions.

However, what SafeStack (by design) does not protect against is corruption of data on the unsafe stack. Or, phrased differently, the security of SafeStack is based around the assumption that no critical data is stored on the unsafe stack.

Moreover, in contrast to Stack Cookies, SafeStack does not prevent the callee from corrupting data of the caller (more precisely, Stack Cookies prevent the caller from using the corrupted data after the callee returns). The following example demonstrates this:

#include <stdio.h>
#include <string.h>

void smash_me() {
    char buffer[16];
    gets(buffer);
}

int main() {
    char buffer[16];
    memset(buffer, 0, sizeof(buffer));
    smash_me();
    puts(buffer);
    return 0;
}

Compiling this code with "-fsanitize=safe-stack" and supplying more than 16 bytes as input will overflow into the buffer of main() and corrupt its content. In contrast, when compiled with "-fstack-protector", the overflow will be detected and the process terminated before main() uses the corrupted buffer.
This weakness could be (partially) addressed by using Stack Cookies in addition to SafeStack. In this scenario, the master cookie could even be stored on the safe stack and regenerated for every function call (or chain of function calls). This would further protect against some of the weaknesses of plain Stack Cookies as described above.

The lack of unsafe stack protections combined with the conservativeness of the current definition of "unsafe" in the implementation potentially provides an attacker with enough critical data on the unsafe stack to compromise the application. As an example, we'll devise a more or less realistic piece of code that will result in the (security-critical) variable 'pl' being placed on the unsafe stack, above 'buffer' (although it seems that enabling optimization during compilation causes fewer variables to be placed on the unsafe stack):

#include <stdio.h>

void determine_privilege_level(int *pl) {
    // dummy function
    *pl = 0x42;
}

int main() {
    int pl;
    char buffer[16];
    determine_privilege_level(&pl);
    gets(buffer);             // This can overflow and corrupt 'pl'
    printf("privilege level: %x\n", pl);
    return 0;
}

This "data-only" attack is possible due to the fact that the current implementation never recurses into called functions but rather considers (most) function arguments as unsafe.

The risk of corrupting critical data on the unsafe stack can however be greatly decreased through improved static analysis, variable reordering, and, as mentioned above, by protecting the callee's unsafe stack frame.

It should also be noted that the current implementation does not protect the safe stack in any other way besides system level ASLR. This means that an information leak combined with an arbitrary write primitive will still allow an attacker to overwrite the return address (or other data) on the safe stack. See the comment at the top of the runtime support implementation for more information. Finally we should mention there has been an academic study that points out some additional detail regarding CPI.

Conclusion

With the exceptions noted above, SafeStack's implemented security measures are a superset of those of Stack Cookies, allowing it to prevent exploitation of stack based vulnerabilities in many scenarios. This combined with the low performance overhead could make SafeStack a good choice during compilation in the future.

SafeStack is still in its early stages, but it looks to be a very promising new addition to a developer's arsenal of compiler provided exploit mitigations. We wouldn't call it the end-all of buffer overflows, but it's a significant hurdle for attackers to overcome.

Thursday, November 5, 2015

Firmware dumping technique for an ARM Cortex-M0 SoC

by Kris Brosch

One of the first major goals when reversing a new piece of hardware is getting a copy of the firmware. Once you have access to the firmware, you can reverse engineer it by disassembling the machine code.

Sometimes you can get access to the firmware without touching the hardware, by downloading a firmware update file for example. More often, you need to interact with the chip where the firmware is stored. If the chip has a debug port that is accessible, it may allow you to read the firmware through that interface. However, most modern chips have security features that, when enabled, prevent firmware from being read through the debugging interface. In these situations, you may have to resort to decapping the chip, or to introducing glitches into the hardware logic by manipulating inputs such as power or clock sources and leveraging the resulting behavior to bypass these security implementations.

This blog post is a discussion of a new technique that we've created to dump the firmware stored on a particular Bluetooth system-on-chip (SoC), and how we bypassed that chip's security features to do so by only using the debugging interface of the chip. We believe this technique is a vulnerability in the code protection features of this SoC and as such have notified the IC vendor prior to publication of this blog post.

The SoC

The SoC in question is a Nordic Semiconductor nRF51822. The nRF51822 is a popular Bluetooth SoC with an ARM Cortex-M0 CPU core and built-in Bluetooth hardware. The chip's manual is available here.

Chip security features that prevent code readout vary in implementation among the many microcontrollers and SoCs available from various manufacturers, even among those that use the same ARM cores. The nRF51822's code protection allows the developer to prevent the debugging interface from reading either all of the code and memory (flash and RAM) sections, or just a subsection of these areas. Additionally, some chips have options to prevent debugger access entirely. The nRF51822 doesn't provide such a feature to developers; it just disables memory accesses through the debugging interface.

The nRF51822 has a serial wire debug (SWD) interface, a two-wire (in addition to ground) debugging interface available on many ARM chips. Many readers may be familiar with JTAG as a physical interface that often provides access to hardware and software debugging features of chips. Some ARM cores support a debugging protocol that works over the JTAG physical interface; SWD is a different physical interface that can be used to access the same software debugging features of a chip that ARM JTAG does. OpenOCD is an open source tool that can be used to access the SWD port.

This document contains a pinout diagram of the nRF51822. Luckily the hardware target we were analyzing has test points connected to the SWDIO and SWDCLK chip pins with PCB traces that were easy to follow. By connecting to these test points with a SWD adapter, we can use OpenOCD to access the chip via SWD. There are many debug adapters supported by OpenOCD, some of which support SWD.

Exploring the Debugger Access

Once OpenOCD is connected to the target, we can run debugging commands and read/write some ARM registers; however, we are prevented from reading out the code section. In the example below, we connect to the target with OpenOCD and attempt to read memory sections from the target chip. We proceed to reset the processor and read from the address 0x00000000 and from the address held in the program counter (pc) register (0x000114cc), but nothing except zeros is returned. Of course we know there is code there, but the code protection counter-measures are preventing us from accessing it:

> reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114cc msp: 0x20001bd0
> mdw 0x00000000
0x00000000: 00000000
> mdw 0x000114cc 10
0x000114cc: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0x000114ec: 00000000 00000000

We can however read and write CPU registers, including the program counter (pc), and we can single-step through instructions (we just don't know what instructions, since we can't read them):

> reg r0 0x12345678
r0 (/32): 0x12345678
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114ce msp: 0x20001bd0
> reg pc 0x00011500
pc (/32): 0x00011500
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x00011502 msp: 0x20001bd0


We can also read a few of the memory-mapped configuration registers. Here we are reading a register named "RBPCONF" (short for readback protection) in a collection of registers named "UICR" (User Information Configuration Registers); you can find the address of this register in the nRF51 Series Reference Manual:

> mdw 0x10001004
0x10001004: ffff00ff


According to the manual, a value of 0xffff00ff in the RBPCONF register means "Protect all" (PALL) is enabled (bits 15..8, labeled "B" in this table, are set to 0), and "Protect region 0" (PR0) is disabled (bits 7..0, labeled "A", are set to 1):


The PALL feature being enabled is what is responsible for preventing us from accessing the code section and subsequently causing our read commands to return zeros.
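
As a small worked example, the register value read above can be decoded like this. The function name `decode_rbpconf` is made up for illustration, and the sketch only encodes the two field values mentioned in this post (all zeros meaning enabled, all ones meaning disabled); anything else is reported as unknown rather than guessed.

```python
def decode_rbpconf(value):
    """Decode the two protection fields described above.

    Bits 7..0 hold PR0 and bits 15..8 hold PALL. Per the values quoted in
    the text, 0x00 means enabled and 0xFF means disabled.
    """
    def field(bits):
        return {0x00: "enabled", 0xFF: "disabled"}.get(bits, "unknown")

    return {
        "PR0": field(value & 0xFF),
        "PALL": field((value >> 8) & 0xFF),
    }
```

Calling `decode_rbpconf(0xFFFF00FF)` reports PALL as enabled and PR0 as disabled, matching the reading of the manual given above.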

The other protection feature, PR0, is not enabled in this case, but it's worth mentioning because the protection bypass discussed in this article could bypass PR0 as well. If enabled, it would prevent the debugger from reading memory below a configurable address. Note that flash (and therefore the firmware we want) exists at a lower address than RAM. PR0 also prevents code running outside of the protected region from reading any data within the protected region.

Unfortunately, it is not possible to disable PALL without erasing the entire chip, wiping away the firmware with it. However, it is possible to bypass this readback protection by leveraging our debug access to the CPU.

Devising a Protection Bypass

An initial plan to dump the firmware via a debugging interface might be to load some code into RAM that reads the firmware from flash into a RAM buffer that we could then read. However, we don't have access to RAM because PALL is enabled. Even if PALL were disabled, PR0 could have been enabled, which would prevent our code in RAM (which would be the unprotected region) from reading flash (in the protected region). This plan won't work if either PALL or PR0 is enabled.

To bypass the memory protections, we need a way to read the protected data and we need a place to write it that we can access. In this case, only code that exists in protected memory can read protected memory. So our method of reading data will be to jump to an instruction in protected memory using our debugger access, and then to execute that instruction. The instruction will read the protected data into a CPU register, at which time we can then read the value out of the CPU register using our debugger access. How do we know what instruction to jump to? We'll have to blindly search protected memory for a load instruction that will read from an address we supply in a register. Once we've found such an instruction, we can exploit it to read out all of the firmware.
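
The read primitive described above can be sketched in Python against a simulated target. Everything here is an assumption made for illustration: `SimulatedTarget`, `dump_words`, the gadget address, and the choice of r0 and r1 are hypothetical. Against real hardware, the three methods would wrap the OpenOCD `reg` and `step` commands shown earlier.

```python
class SimulatedTarget:
    """Hypothetical stand-in for the debug session: memory is hidden, but
    registers can be written and read, and the core can be single-stepped."""

    GADGET_PC = 0x100  # made-up address of a made-up 'ldr r0, [r1]'

    def __init__(self, flash_words):
        self._flash = flash_words  # not readable through the debug API
        self.regs = {"r0": 0, "r1": 0, "pc": 0}

    def write_reg(self, name, value):
        self.regs[name] = value

    def read_reg(self, name):
        return self.regs[name]

    def step(self):
        # Only the gadget has an effect in this simulation.
        if self.regs["pc"] == self.GADGET_PC:
            self.regs["r0"] = self._flash[self.regs["r1"] // 4]
        self.regs["pc"] += 2


def dump_words(target, gadget_pc, addr_reg, data_reg, start, count):
    """Read `count` 32-bit words by re-executing one load instruction."""
    words = []
    for address in range(start, start + 4 * count, 4):
        target.write_reg(addr_reg, address)  # where to read from
        target.write_reg("pc", gadget_pc)    # jump to the load instruction
        target.step()                        # execute just that instruction
        words.append(target.read_reg(data_reg))
    return words
```

The loop never reads memory through the debugger. It only sets registers, steps once, and reads a register back, which are exactly the operations the readback protection leaves available.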

Finding a Load Instruction

Our debugger access lets us write to the pc register in order to jump to any instruction, and it lets us single step the instruction execution. We can also read and write the contents of the general purpose CPU registers. In order to read from the protected memory, we have to find a load word instruction with a register operand, set the operand register to a target address, and execute that one instruction. Since we can't read the flash, we don't know what instructions are where, so it might seem difficult to find the right instruction. However, all we need is an instruction that reads memory from an address in some register to a register, which is a pretty common operation. A load word instruction would work, or a pop instruction, for example.

We can search for the right instruction using trial and error. First, we set the program counter to somewhere we guess a useful instruction might be. Then, we set all the CPU registers to an address we're interested in and then single step. Next we examine the registers. If we are lucky, the instruction we just executed loaded data from an address stored in another register. If one of the registers has changed to a value that might exist at the target address, then we may have found a useful load instruction.
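
This trial-and-error search can be automated. The sketch below runs against a simulated target, so the class `SimulatedTarget`, the helper names, and the three fake instructions are all assumptions made for illustration; they are not the chip's real code. The search probes each candidate address twice with different register contents and keeps instructions whose result depends on the probed address.

```python
GP_REGS = ["r%d" % i for i in range(8)]


class SimulatedTarget:
    """Hypothetical stand-in for the debug session. The instruction at each
    address is a made-up Python function that mutates the registers."""

    def __init__(self, memory, program):
        self._memory = memory    # address -> word, hidden from the debugger
        self._program = program  # pc -> function(regs, memory)
        self.regs = {}

    def write_reg(self, name, value):
        self.regs[name] = value

    def read_reg(self, name):
        return self.regs[name]

    def step(self):
        self._program[self.regs["pc"]](self.regs, self._memory)
        self.regs["pc"] += 2


def probe(target, pc, address):
    """Fill every register with `address`, execute one instruction at `pc`,
    and return the registers whose value changed."""
    for reg in GP_REGS:
        target.write_reg(reg, address)
    target.write_reg("pc", pc)
    target.step()
    return {reg: target.read_reg(reg) for reg in GP_REGS
            if target.read_reg(reg) != address}


def find_load_candidate(target, pcs, addr_a, addr_b):
    """Return (pc, register) for the first instruction whose result depends
    on the probed address, or None. Candidates still need manual review."""
    for pc in pcs:
        first = probe(target, pc, addr_a)
        second = probe(target, pc, addr_b)
        for reg, value in first.items():
            if reg not in second or second[reg] == value:
                continue  # same result for both probes: not address-dependent
            if second[reg] - addr_b == value - addr_a:
                continue  # constant offset: looks like arithmetic, not a load
            return pc, reg
    return None


memory = {0x0: 0x20001BD0, 0x4: 0x000114CD}
program = {
    0x114CC: lambda regs, mem: regs.update(r3=0x10001014),       # constant
    0x114CE: lambda regs, mem: regs.update(r2=regs["r2"] + 1),   # arithmetic
    0x114D0: lambda regs, mem: regs.update(r0=mem[regs["r1"]]),  # load
}
target = SimulatedTarget(memory, program)
```

In this simulation, `find_load_candidate(target, sorted(program), 0x0, 0x4)` skips the first two instructions and settles on the third. The first one behaves like the instruction observed at the reset vector below: it produces the same value whatever the registers contain, so it is of no use for reading memory.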

We might as well start at the reset vector - at least we know there are valid instructions there. Here we're resetting the CPU, setting the general purpose registers and stack pointer to zero (the address we're trying), and single stepping, then examining the registers:

> reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114cc msp: 0x20001bd0
> reg r0 0x00000000
r0 (/32): 0x00000000
> reg r1 0x00000000
r1 (/32): 0x00000000
> reg r2 0x00000000
r2 (/32): 0x00000000
> reg r3 0x00000000
r3 (/32): 0x00000000
> reg r4 0x00000000
r4 (/32): 0x00000000
> reg r5 0x00000000
r5 (/32): 0x00000000
> reg r6 0x00000000
r6 (/32): 0x00000000
> reg r7 0x00000000
r7 (/32): 0x00000000
> reg r8 0x00000000
r8 (/32): 0x00000000
> reg r9 0x00000000
r9 (/32): 0x00000000
> reg r10 0x00000000
r10 (/32): 0x00000000
> reg r11 0x00000000
r11 (/32): 0x00000000
> reg r12 0x00000000
r12 (/32): 0x00000000
> reg sp 0x00000000
sp (/32): 0x00000000
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114ce msp: 00000000
> reg
===== arm v7m registers
(0) r0 (/32): 0x00000000
(1) r1 (/32): 0x00000000
(2) r2 (/32): 0x00000000
(3) r3 (/32): 0x10001014
(4) r4 (/32): 0x00000000
(5) r5 (/32): 0x00000000
(6) r6 (/32): 0x00000000
(7) r7 (/32): 0x00000000
(8) r8 (/32): 0x00000000
(9) r9 (/32): 0x00000000
(10) r10 (/32): 0x00000000
(11) r11 (/32): 0x00000000
(12) r12 (/32): 0x00000000
(13) sp (/32): 0x00000000
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000114CE
(16) xPSR (/32): 0xC1000000
(17) msp (/32): 0x00000000
(18) psp (/32): 0xFFFFFFFC
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)

Looks like r3 was set to 0x10001014. Is that the value at address zero? Let's see what happens when we load the registers with four instead:
> reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114cc msp: 0x20001bd0
> reg r0 0x00000004
r0 (/32): 0x00000004
> reg r1 0x00000004
r1 (/32): 0x00000004
> reg r2 0x00000004
r2 (/32): 0x00000004
> reg r3 0x00000004
r3 (/32): 0x00000004
> reg r4 0x00000004
r4 (/32): 0x00000004
> reg r5 0x00000004
r5 (/32): 0x00000004
> reg r6 0x00000004
r6 (/32): 0x00000004
> reg r7 0x00000004
r7 (/32): 0x00000004
> reg r8 0x00000004
r8 (/32): 0x00000004
> reg r9 0x00000004
r9 (/32): 0x00000004
> reg r10 0x00000004
r10 (/32): 0x00000004
> reg r11 0x00000004
r11 (/32): 0x00000004
> reg r12 0x00000004
r12 (/32): 0x00000004
> reg sp 0x00000004
sp (/32): 0x00000004
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114ce msp: 0x00000004
> reg
===== arm v7m registers
(0) r0 (/32): 0x00000004
(1) r1 (/32): 0x00000004
(2) r2 (/32): 0x00000004
(3) r3 (/32): 0x10001014
(4) r4 (/32): 0x00000004
(5) r5 (/32): 0x00000004
(6) r6 (/32): 0x00000004
(7) r7 (/32): 0x00000004
(8) r8 (/32): 0x00000004
(9) r9 (/32): 0x00000004
(10) r10 (/32): 0x00000004
(11) r11 (/32): 0x00000004
(12) r12 (/32): 0x00000004
(13) sp (/32): 0x00000004
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000114CE
(16) xPSR (/32): 0xC1000000
(17) msp (/32): 0x00000004
(18) psp (/32): 0xFFFFFFFC
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)

Nope, r3 gets the same value, so we're not interested in the first instruction. Let's continue on to the second:

> reg r0 0x00000000
r0 (/32): 0x00000000
> reg r1 0x00000000
r1 (/32): 0x00000000
> reg r2 0x00000000
r2 (/32): 0x00000000
> reg r3 0x00000000
r3 (/32): 0x00000000
> reg r4 0x00000000
r4 (/32): 0x00000000
> reg r5 0x00000000
r5 (/32): 0x00000000
> reg r6 0x00000000
r6 (/32): 0x00000000
> reg r7 0x00000000
r7 (/32): 0x00000000
> reg r8 0x00000000
r8 (/32): 0x00000000
> reg r9 0x00000000
r9 (/32): 0x00000000
> reg r10 0x00000000
r10 (/32): 0x00000000
> reg r11 0x00000000
r11 (/32): 0x00000000
> reg r12 0x00000000
r12 (/32): 0x00000000
> reg sp 0x00000000
sp (/32): 0x00000000
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114d0 msp: 00000000
> reg
===== arm v7m registers
(0) r0 (/32): 0x00000000
(1) r1 (/32): 0x00000000
(2) r2 (/32): 0x00000000
(3) r3 (/32): 0x20001BD0
(4) r4 (/32): 0x00000000
(5) r5 (/32): 0x00000000
(6) r6 (/32): 0x00000000
(7) r7 (/32): 0x00000000
(8) r8 (/32): 0x00000000
(9) r9 (/32): 0x00000000
(10) r10 (/32): 0x00000000
(11) r11 (/32): 0x00000000
(12) r12 (/32): 0x00000000
(13) sp (/32): 0x00000000
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000114D0
(16) xPSR (/32): 0xC1000000
(17) msp (/32): 0x00000000
(18) psp (/32): 0xFFFFFFFC
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)

OK, this time r3 was set to 0x20001BD0. Is that the value at address zero? Let's see what happens when we run the second instruction with the registers set to 4:
> reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114cc msp: 0x20001bd0
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114ce msp: 0x20001bd0
> reg r0 0x00000004
r0 (/32): 0x00000004
> reg r1 0x00000004
r1 (/32): 0x00000004
> reg r2 0x00000004
r2 (/32): 0x00000004
> reg r3 0x00000004
r3 (/32): 0x00000004
> reg r4 0x00000004
r4 (/32): 0x00000004
> reg r5 0x00000004
r5 (/32): 0x00000004
> reg r6 0x00000004
r6 (/32): 0x00000004
> reg r7 0x00000004
r7 (/32): 0x00000004
> reg r8 0x00000004
r8 (/32): 0x00000004
> reg r9 0x00000004
r9 (/32): 0x00000004
> reg r10 0x00000004
r10 (/32): 0x00000004
> reg r11 0x00000004
r11 (/32): 0x00000004
> reg r12 0x00000004
r12 (/32): 0x00000004
> reg sp 0x00000004
sp (/32): 0x00000004
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114d0 msp: 0x00000004
> reg
===== arm v7m registers
(0) r0 (/32): 0x00000004
(1) r1 (/32): 0x00000004
(2) r2 (/32): 0x00000004
(3) r3 (/32): 0x000114CD
(4) r4 (/32): 0x00000004
(5) r5 (/32): 0x00000004
(6) r6 (/32): 0x00000004
(7) r7 (/32): 0x00000004
(8) r8 (/32): 0x00000004
(9) r9 (/32): 0x00000004
(10) r10 (/32): 0x00000004
(11) r11 (/32): 0x00000004
(12) r12 (/32): 0x00000004
(13) sp (/32): 0x00000004
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000114D0
(16) xPSR (/32): 0xC1000000
(17) msp (/32): 0x00000004
(18) psp (/32): 0xFFFFFFFC
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)

This time, r3 got 0x000114CD. This value strongly implies we're reading memory. Why? The value is the reset vector. According to the Cortex-M0 documentation, the reset vector is at address 4, and when we reset the chip, the PC is set to 0x000114CC (the least significant bit is set in the reset vector, changing C to D, because the Cortex-M0 operates in Thumb mode).
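
To make the relationship between the vector table entry and the pc concrete, here is a minimal Ruby sketch. The helper names are ours and purely illustrative; the only facts it relies on are the two values seen in the session above.

```ruby
# Hypothetical helpers (not part of OpenOCD or any library).
# Bit 0 of a Cortex-M vector table entry is the Thumb flag, not part of
# the address, so the core starts executing at the entry with bit 0 cleared.
def vector_to_pc(vector_entry)
  vector_entry & ~1
end

# True when the entry has the Thumb flag set.
def thumb_entry?(vector_entry)
  (vector_entry & 1) == 1
end

reset_vector = 0x000114CD # the value that landed in r3 when loading from address 4
puts "pc after reset: 0x%08x" % vector_to_pc(reset_vector)
```

Clearing the low bit of the value read from address 4 gives the pc that OpenOCD reports right after `reset halt`, which is what ties the r3 value to the reset vector.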

Let's try reading the two instructions we just were testing:

> reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114cc msp: 0x20001bd0
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114ce msp: 0x20001bd0
> reg r0 0x000114cc
r0 (/32): 0x000114CC
> reg r1 0x000114cc
r1 (/32): 0x000114CC
> reg r2 0x000114cc
r2 (/32): 0x000114CC
> reg r3 0x000114cc
r3 (/32): 0x000114CC
> reg r4 0x000114cc
r4 (/32): 0x000114CC
> reg r5 0x000114cc
r5 (/32): 0x000114CC
> reg r6 0x000114cc
r6 (/32): 0x000114CC
> reg r7 0x000114cc
r7 (/32): 0x000114CC
> reg r8 0x000114cc
r8 (/32): 0x000114CC
> reg r9 0x000114cc
r9 (/32): 0x000114CC
> reg r10 0x000114cc
r10 (/32): 0x000114CC
> reg r11 0x000114cc
r11 (/32): 0x000114CC
> reg r12 0x000114cc
r12 (/32): 0x000114CC
> reg sp 0x000114cc
sp (/32): 0x000114CC
> step
target state: halted
target halted due to single-step, current mode: Thread
xPSR: 0xc1000000 pc: 0x000114d0 msp: 0x000114cc
> reg r3
r3 (/32): 0x681B4B13

The r3 register has the value 0x681B4B13. That disassembles to two load instructions, the first relative to the pc, the second relative to r3:

$ printf "\x13\x4b\x1b\x68" > /tmp/armcode
$ arm-none-eabi-objdump -D --target binary -Mforce-thumb -marm /tmp/armcode

/tmp/armcode:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   4b13            ldr     r3, [pc, #76]   ; (0x50)
   2:   681b            ldr     r3, [r3, #0]

In case you don't read Thumb assembly, that second instruction is a load register instruction (ldr); it's taking an address from the r3 register, adding an offset of zero, and loading the value from that address into the r3 register.
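
If you don't have an ARM toolchain handy, the two encodings can also be checked by hand. The following is a rough Ruby sketch that decodes only the two 16-bit Thumb load forms that appear above (LDR literal and LDR immediate); the function names are made up for this example, and anything else is reported as unknown.

```ruby
# Decode the two 16-bit Thumb load encodings seen in the disassembly.
# This is not a general disassembler.
def decode_thumb16(hw)
  case hw >> 11
  when 0b01001 # LDR Rt, [PC, #imm8*4]
    rt  = (hw >> 8) & 0x7
    imm = (hw & 0xFF) * 4
    "ldr r#{rt}, [pc, ##{imm}]"
  when 0b01101 # LDR Rt, [Rn, #imm5*4]
    imm = ((hw >> 6) & 0x1F) * 4
    rn  = (hw >> 3) & 0x7
    rt  = hw & 0x7
    "ldr r#{rt}, [r#{rn}, ##{imm}]"
  else
    "unknown (0x%04x)" % hw
  end
end

# The word read through r3 is little-endian, so the low halfword is the
# instruction at the lower address.
def split_word(word)
  [word & 0xFFFF, word >> 16]
end

split_word(0x681B4B13).each { |hw| puts decode_thumb16(hw) }
```

Run against 0x681B4B13, this prints the same two ldr instructions that objdump shows.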

We've found a load instruction that lets us read memory from an arbitrary address. Again, this is useful because only code in the protected memory can read the protected memory. The trick is that being able to read and write CPU registers using OpenOCD lets us execute those instructions however we want. If we hadn't been lucky enough to find the load word instruction so close to the reset vector, we could have reset the processor and written a value to the pc register (jumping to an arbitrary address) to try more instructions. Since we were lucky though, we can just step through the first instruction.

Dumping the Firmware

Now that we've found a load instruction that we can execute to read from arbitrary addresses, our firmware dumping process is as follows:
  1. Reset the CPU
  2. Single step (we don't care about the first instruction)
  3. Put the address we want to read from into r3
  4. Single step (this loads from the address in r3 to r3)
  5. Read the value from r3
Here's a ruby script to automate the process:

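#!/usr/bin/env ruby

require 'net/telnet'

# OpenOCD's command interface listens for telnet connections on port 4444
debug = Net::Telnet::new("Host" => "localhost", "Port" => 4444)

# Open in binary mode so the packed words are written byte-for-byte
dumpfile = File.open("dump.bin", "wb")

((0x00000000/4)...(0x00040000)/4).each do |i|
  address = i * 4
  debug.cmd("reset halt")
  debug.cmd("step")                            # skip the first instruction
  debug.cmd("reg r3 0x#{address.to_s 16}")     # address we want to read
  debug.cmd("step")                            # executes ldr r3, [r3, #0]
  response = debug.cmd("reg r3")
  value = response.match(/: 0x([0-9a-fA-F]{8})/)[1].to_i 16
  dumpfile.write([value].pack("V"))            # store as little-endian
  puts "0x%08x: 0x%08x" % [address, value]
end

dumpfile.close
debug.close
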
The ruby script connects to the OpenOCD user interface, which is available via a telnet connection on localhost. It then loops through addresses that are multiples of four, using the load instruction we found to read data from those addresses.
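
The parsing and packing steps can be tried without any hardware attached. This is a small sketch that reuses the same regular expression and pack format as the script; the two helper names are ours, and the sample reply is the r3 line from the session above.

```ruby
# Pull the 32-bit value out of an OpenOCD "reg" reply such as
# "r3 (/32): 0x681B4B13". Returns nil when the reply holds no such value.
def parse_reg_value(response)
  m = response.match(/: 0x([0-9a-fA-F]{8})/)
  m && m[1].to_i(16)
end

# Pack a word as 4 little-endian bytes, the same layout it has in flash.
def word_to_bytes(value)
  [value].pack("V")
end

value = parse_reg_value("r3 (/32): 0x681B4B13")
puts word_to_bytes(value).unpack("H*").first # hex of the bytes that would go into dump.bin
```

The bytes come out as 13 4b 1b 68, the same sequence we fed to objdump earlier. Returning nil for an unexpected reply is a design choice of this sketch; the script as written would raise on a reply without a register value.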

Vendor Response

IncludeSec contacted NordicSemi via their customer support channel where they received a copy of this blog post. From NordicSemi customer support: "We take this into consideration together with other factors, and the discussions around this must be kept internal."

We additionally reached out to the only engineer who had security in his title and he didn't really want a follow-up Q&A call or further info and redirected us to only talk to customer support. So that's about all we can do for coordinated disclosure on our side.


Conclusion

Once we have a copy of the firmware image, we can do whatever disassembly or reverse engineering we want with it. We can also now disable the chip's PALL protection in order to more easily debug the code. To disable PALL, you need to erase the chip, but that's not a problem since we can immediately re-flash the chip using the dumped firmware. Once the chip has been erased and re-programmed to disable the protection, we can freely use the debugger to read and write RAM, set breakpoints, and so on. We can even attach GDB to OpenOCD, and debug the firmware that way.

The technique described here won't work on all microcontrollers or SoCs; it only applies to situations where you have access to a debugging interface that can read and write CPU registers but not protected memory. Despite the limitation though, the technique can be used to dump firmware from nRF51822 chips and possibly others that use similar protections. We feel this is a vulnerability in the design of the nRF51822 code protection.

Are you using other cool techniques to dump firmware? Do you know of any other microcontrollers or SoCs that might be vulnerable to this type of code protection bypass? Let us know in the comments.

Wednesday, August 19, 2015

A light-weight forensic analysis of the AshleyMadison Hack

by Erik Cabetas

-----------[Intro]

So Ashley Madison (AM) got hacked. It was first announced about a month ago, and the attackers claimed they'd drop the full monty of user data if the AM website did not cease operations. The AM parent company Avid Life Media (ALM) did not cease business operations for the site, and true to their word it seems the attackers have leaked everything they promised on August 18th 2015, including:

  • full database dumps of user data
  • emails
  • internal ALM documents
  • as well as a limited number of user passwords


Back in college I used to do forensics contests for the "Honey Net Project" and thought this might be a fun nostalgic trip to try and recreate my pseudo-forensics investigation style on the data within the AM leak.

Disclaimer: I will not be releasing any personal or confidential information
within this blog post that may be found in the AM leak. The purpose of
this blog post is to provide an honest holistic forensic analysis and minimal
statistical analysis of the data found within the leak. Consider this a
journalistic exploration more than anything.

Also note that the credit card files were deleted and not reviewed as part of this write-up.

-----------[Grabbing the Leak]

First we go find where on the big bad dark web the release site is located. Thankfully knowing a shady guy named Boris pays off for me, and we find a torrent file for the release of the August 18th Ashley Madison user data dump. The torrent file we found has the following SHA1 hash.
e01614221256a6fec095387cddc559bffa832a19  impact-team-ashley-release.torrent

After extracting all the files we have the following sizes and
file hashes for evidence audit purposes:

$  du -sh *
4.0K    74ABAA38.txt
9.5G    am_am.dump
2.6G    am_am.dump.gz
4.0K    am_am.dump.gz.asc
13G     aminno_member.dump
3.1G    aminno_member.dump.gz
4.0K    aminno_member.dump.gz.asc
1.7G    aminno_member_email.dump
439M    aminno_member_email.dump.gz
4.0K    aminno_member_email.dump.gz.asc
111M    ashleymadisondump/
37M     ashleymadisondump.7z
4.0K    ashleymadisondump.7z.asc
278M    CreditCardTransactions.7z
4.0K    CreditCardTransactions.7z.asc
2.3G    member_details.dump
704M    member_details.dump.gz
4.0K    member_details.dump.gz.asc
4.2G    member_login.dump
2.7G    member_login.dump.gz
4.0K    member_login.dump.gz.asc
4.0K    README
4.0K    README.asc

$ sha1sum *
a884c4fcd61e23aecb80e1572254933dc85e2b4a  74ABAA38.txt
e4ff3785dbd699910a512612d6e065b15b75e012  am_am.dump
e0020186232dad71fcf92c17d0f11f6354b4634b  am_am.dump.gz
b7363cca17b05a2a6e9d8eb60de18bc98834b14e  am_am.dump.gz.asc
d412c3ed613fbeeeee0ab021b5e0dd6be1a79968  aminno_member.dump
bc60db3a78c6b82a5045b797e6cd428f367a18eb  aminno_member.dump.gz
8a1c328142f939b7f91042419c65462ea9b2867c  aminno_member.dump.gz.asc
2dcb0a5c2a96e4f3fff5a0a3abae19012d725a7e  aminno_member_email.dump
ab5523be210084c08469d5fa8f9519bc3e337391  aminno_member_email.dump.gz
f6144f1343de8cc51dbf20921e2084f50c3b9c86  aminno_member_email.dump.gz.asc
sha1sum: ashleymadisondump: Is a directory
26786cb1595211ad3be3952aa9d98fbe4c5125f9  ashleymadisondump.7z
eb2b6f9b791bd097ea5a3dca3414a3b323b8ad37  ashleymadisondump.7z.asc
0ad9c78b9b76edb84fe4f7b37963b1d956481068  CreditCardTransactions.7z
cb87d9fb55037e0b1bccfe50c2b74cf2bb95cd6c  CreditCardTransactions.7z.asc
11e646d9ff5d40cc8e770a052b36adb18b30fd52  member_details.dump
b4849cec980fe2d0784f8d4409fa64b91abd70ef  member_details.dump.gz
3660f82f322c9c9e76927284e6843cbfd8ab8b4f  member_details.dump.gz.asc
436d81a555e5e028b83dcf663a037830a7007811  member_login.dump
89fbc9c44837ba3874e33ccdcf3d6976f90b5618  member_login.dump.gz
e24004601486afe7e19763183934954b1fc469ef  member_login.dump.gz.asc
4d80d9b671d95699edc864ffeb1b50230e1ec7b0  README
a9793d2b405f31cc5f32562608423fffadc62e7a  README.asc
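
For anyone who wants to keep a comparable manifest in a script rather than a shell session, here is a rough Ruby sketch. The function name is ours; it uses only the standard Digest library and skips directories, which is why it would not produce the "Is a directory" line seen above.

```ruby
require 'digest/sha1'

# Return [[hexdigest, name], ...] for the regular files directly inside dir,
# sorted by name. Directories and other non-regular entries are skipped.
def sha1_manifest(dir)
  Dir.entries(dir).sort
     .select { |name| File.file?(File.join(dir, name)) }
     .map    { |name| [Digest::SHA1.file(File.join(dir, name)).hexdigest, name] }
end

# Example usage, printing in the same "hash  name" layout as sha1sum:
#   sha1_manifest(".").each { |digest, name| puts "#{digest}  #{name}" }
```

Hashes produced this way should match the sha1sum output for the same files, so the list above can be used to confirm you are looking at the same release.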

-----------[Attacker Identity & Attribution]

The attackers make it clear they have no desire to bridge their dark web identities with their real-life identities and have taken many measures to ensure this does not occur.

The torrent file and messaging were released via the anonymous Tor network through an Onion web server which serves only HTML/TXT content. If the attacker took proper OPSEC precautions while setting up the server, law enforcement and AM may never find them. That being said, hackers have been known to get sloppy and slip up on their OPSEC. The two most famous cases of this are Sabu of Anonymous and, separately, the Dread Pirate Roberts of SilkRoad; both were caught even though they primarily used Tor for their internet activities.

Within the dump we see that the files are signed with PGP. Signing a file in this manner is a way of saying "I did this" even though we don't know the real-life identity of the person/group claiming to have done it (there is a bunch of crypto and math that makes this possible). As a result, we can be more confident that files signed by this PGP key were released by the same person/group.

In my opinion, this is done for two reasons. First, the leaker wants to claim responsibility in an identity attributable manner, but not reveal their real-life identity. Secondly, the leaker wishes to dispel statements regarding "false leaks" made by the Ashley Madison team. The AM executive and PR teams have been in crisis communications mode explaining that there have been many fake leaks.

The "Impact Team" is using the following public PGP key to sign their releases.
$ cat ./74ABAA38.txt
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.12 (GNU/Linux)

mQINBFW25a4BEADt5OKS5F36aACyyPc4UMZAnhLnbImhxv5A2n7koTKg1QhyA1mI
InLLriKW3GR0Y4Fx+84pvjbYdoJAnuqMemI0oP+2VAJqwC0LYVVcFHKK6ZElYiN8
4/3e5WWYv6vzrHwB+3NbQ1O9bbUjgk9ky2RsdTe+vDBhKwKS0kPSb28h0oMpAs87
pJcgWZ57jjtvyUEIKXQZAqLvFo5xayS8dEp8tRgNLauQ0SafKGsxjW5cRd2Ok3Z5
QtIS44WnYECe3tqqFYSOo4kdHBeswC8zaKapYaNzxsHw9msdZvx/rkrMgXtJye/o
vmf2RdLIcvqK0Nwf1LDLhweCBP61wVn8gWqSrzww+as1ObE6b64hYKHFzdIMcqJ3
sbAErRrfZMqZ6ihWnlSjzDDx2L3n5T16ZIDxGx5Mt0KDYIo8RqDdF+VKLCT7Eq/C
g/Ax+06Eez4rVnY+xeW6Tj+1iBAlrGRIcRHCX89fNwLxr4Bcq/q1KKrCwVsgonBK
+3Mzzs2/b9XQ/Z6bDHFnMWUTDhomBmNcZOz9sHrZZI9XUzx/bfS6CoQ3MIqDhNM+
l7cKZ/Icfs6IDoOsYIS3QeTWC8gv3IBTvtfKFnf1o6JnkP0Qv6SrckslztNA4HDL
2iIMMGs34vDc11ddTzMBBkig1NgtiaHqHhG5T8OoOD9c3hEmTQzir7iCPQARAQAB
tCRJbXBhY3QgVGVhbSA8aW1wYWN0dGVhbUBtYWlsdG9yLm5ldD6JAjgEEwECACIF
AlW25a4CGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJECQ3PNV0q6o445UQ
AKYIVyrpVKKBA4jliarqngKvkEBRd62CXHY42ZdjFmubLvRw5nC0nDdGUyGPRYOl
0RddL2C7ROqW9lCYfNl3BAQYEXMADDjoBMEQkepIxeIVehat46ksbJuFZ0+uI6EB
aVcJCR4S2C+hJP09q9tn/7RKacIolfeT0+s9IteFghKKK0c8Aot52A/hExrqjldo
fsMX6liSFQjDQpPhQpqiAJ8z9N3eeFwcAAc/gqNz9bE0Wug/OXh0OAHUQk3fS57a
uIi8medOr+kAqHziuO79+5Hkachsp+8c58jBtIzZM4bO6e42aEa2yHv0FGG5MhoB
x7MH0ympFdwbgebpF6kpH371GIsJcyumwQ3Yn4Sy2kp2XmB8xOQo2W8tWRtLW1dI
yGAXHXXy5UI5FJek7G1KvQXCy4pa756RGDFiqdqigq0KC27A/at02M8CP6R9RxC9
YSnru0Qrl7JeATekWM3w8sKs8r6yMEDFAcpK2NHaYzF6/o6t/HEqUWD41DZ2cqqg
9i4uoXpkAB3vAG/snNg1B8g89b3vbVUf6hSIcU89G3lgj9hh87Q/TSsISRJ+yq0N
sLEeVmDmOdf+xb44g3RuRJ9yh0h3j8jdQOq0FvvwW3UHKIVDQlFB3kgHY478TCIa
5MMCtMovGv/ukGKlU8aELKV0/sVsliMh8HDdFQICTd0MuQINBFW25a4BEADIh8Vg
tMGfByY/+IgPd9l3u0I4FZLHqKGKOIpfFEeA31jPAhfOqQyBRcnEN/TxLwJ8NLnL
+GdQ+0z1YncZPxpHU/z8zyMwGpZM/hMbkixA9ysyu06S7hna4YMfifT+lOe1lGSo
Tz3Fz1u2OGH+2UzVk5+Rv0FqDl6X1ZoqhMTswzW0jYR7JLLJip5MTMrLD0rSl0b5
a2XvF9Tpjzy9KWubsJk4W7x00Egu2EU9NhEZXaY18H3rxvYgXT7JMjq/y+IUp2Cd
Bv/XCNWmzl66/ZSLC8hzlcxmAYpmBkxafYNdptMeVzsH/xHmN2zSFjuBNx0Mkk+R
TrOxK/boS9onrGsSQ3zItWJAmodo2qYFjlirtu9pURSdYEINNQ5DgWymg43iAIfp
Xp5/yGBj4BlWE80qEAVsBB2BIRs7QHvpd34xETP08dXMsswIrMn/XxvHumyPoimj
mcNvIpvnAZqt6xppo6BSZ3y7MU4cSIRsZzLuSvkwGk97Jv2sMNvXlPRxzpU9ozsI
iYJAk6/n8kbQiTJk/SeiCTbf6e+BzbZbgIE3O9iPKhfW+6zWjC4TL+lBeyWTy1PP
PcQTT+najDqIwysz2BFuPozwuUQsnfQnyRytSjcI5m1fDoYpJPH8NNRIu9lzp+RN
YENVKXiCfnUCMCnSzxP3Kij3Wt227JLZQqnBUQARAQABiQIfBBgBAgAJBQJVtuWu
AhsMAAoJECQ3PNV0q6o4C2EP/29Bis5Skt9NxHVUBpC1OgRL8V+JD5TjNurMT6Pu
E75szLsMZ84z0MQ6n74ADIgEuznPDIa9hMZGK9DwlsQfFOlC/jyTYxSpgAgN6LAl
qoJztVzLRnMd2gZjOj6wajUy616b8u3Q3zovHcEKll5niUyNwHXovZcCzukFqJBF
a3JU/tkPvBuj2PEWf4ytuO6He2ERuSnsi+7mil8rTAAV/PPy7N2R/T7OUa6ERoGg
hqIGythWizRtZBVPRzush+8L181GBU2ps7nJ1resZ7T0OsCFL67J6t8r8IpmjWWt
fiiV05E71UAyNWLOWriS57qAwNcQ0W2UYKkFFKor+oWaBB+hCpvb8Za5867wpH8l
O6gpS/G17e+MKHTn60hw64xIVFJn7pka+OdAINjPRo5B5qVyvM3puEjRepx1piOG
HKOan00quI0dhF2Gia59zrBHK/agdF4FjkJSjER8uf/jJpo184p38zuQ7kyMXUxY
ExpGcVMVjVOoWKVRPGXYEz2nc9HIZ6mHbvhzsWQEAVwwIxZCos5dW1AMW3Otn30A
uFqPsx4jh/ANGhqUASz18bBrZ8DW3zceVs2zelkMpdL0z7ifU/UNn2rtDlpgLwFl
9ggUtPwXnSxqB7doSxfJyPJUum+bZxMb4Iq5BNNa/tme7TeWGl9bmsVwcQXSQlY2
uZnr
=v0qe
-----END PGP PUBLIC KEY BLOCK-----
The key has the following meta-data:
Old: Public Key Packet(tag 6)(525 bytes)
        Ver 4 - new
        Public key creation time - Mon Jul 27 22:15:10 EDT 2015
        Pub alg - RSA Encrypt or Sign(pub 1)
        RSA n(4096 bits) - ...
        RSA e(17 bits) - ...
Old: User ID Packet(tag 13)(36 bytes)
        User ID - Impact Team <impactteam@mailtor.net>
Old: Signature Packet(tag 2)(568 bytes)
        Ver 4 - new
        Sig type - Positive certification of a User ID and Public Key packet(0x13).
        Pub alg - RSA Encrypt or Sign(pub 1)
        Hash alg - SHA1(hash 2)
        Hashed Sub: signature creation time(sub 2)(4 bytes)
                Time - Mon Jul 27 22:15:10 EDT 2015
        Hashed Sub: key flags(sub 27)(1 bytes)
                Flag - This key may be used to certify other keys
                Flag - This key may be used to sign data
        Hashed Sub: preferred symmetric algorithms(sub 11)(5 bytes)
                Sym alg - AES with 256-bit key(sym 9)
                Sym alg - AES with 192-bit key(sym 8)
                Sym alg - AES with 128-bit key(sym 7)
                Sym alg - CAST5(sym 3)
                Sym alg - Triple-DES(sym 2)
        Hashed Sub: preferred hash algorithms(sub 21)(5 bytes)
                Hash alg - SHA256(hash 8)
                Hash alg - SHA1(hash 2)
                Hash alg - SHA384(hash 9)
                Hash alg - SHA512(hash 10)
                Hash alg - SHA224(hash 11)
        Hashed Sub: preferred compression algorithms(sub 22)(3 bytes)
                Comp alg - ZLIB <RFC1950>(comp 2)
                Comp alg - BZip2(comp 3)
                Comp alg - ZIP <RFC1951>(comp 1)
        Hashed Sub: features(sub 30)(1 bytes)
                Flag - Modification detection (packets 18 and 19)
        Hashed Sub: key server preferences(sub 23)(1 bytes)
                Flag - No-modify
        Sub: issuer key ID(sub 16)(8 bytes)
                Key ID - 0x24373CD574ABAA38
        Hash left 2 bytes - e3 95
        RSA m^d mod n(4096 bits) - ...
                -> PKCS-1
Old: Public Subkey Packet(tag 14)(525 bytes)
        Ver 4 - new
        Public key creation time - Mon Jul 27 22:15:10 EDT 2015
        Pub alg - RSA Encrypt or Sign(pub 1)
        RSA n(4096 bits) - ...
        RSA e(17 bits) - ...
Old: Signature Packet(tag 2)(543 bytes)
        Ver 4 - new
        Sig type - Subkey Binding Signature(0x18).
        Pub alg - RSA Encrypt or Sign(pub 1)
        Hash alg - SHA1(hash 2)
        Hashed Sub: signature creation time(sub 2)(4 bytes)
                Time - Mon Jul 27 22:15:10 EDT 2015
        Hashed Sub: key flags(sub 27)(1 bytes)
                Flag - This key may be used to encrypt communications
                Flag - This key may be used to encrypt storage
        Sub: issuer key ID(sub 16)(8 bytes)
                Key ID - 0x24373CD574ABAA38
        Hash left 2 bytes - 0b 61
        RSA m^d mod n(4095 bits) - ...
                -> PKCS-1

We can verify the released files are attributable to the PGP public key
in question using the following commands:

$ gpg --import ./74ABAA38.txt
$ gpg --verify ./member_details.dump.gz.asc ./member_details.dump.gz
gpg: Signature made Sat 15 Aug 2015 11:23:32 AM EDT using RSA key ID 74ABAA38
gpg: Good signature from "Impact Team <impactteam@mailtor.net>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6E50 3F39 BA6A EAAD D81D  ECFF 2437 3CD5 74AB AA38

This also tells us at what date the dump was signed and packaged.
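
One quick consistency check is that the key IDs shown in the packet dump and in the gpg output are just the tail end of the fingerprint. For a version 4 OpenPGP key, the 64-bit key ID is the low 64 bits of the fingerprint, and the short ID is the low 32 bits. A small Ruby sketch (the function name is ours):

```ruby
# Derive the long (64-bit) and short (32-bit) key IDs from a version 4
# OpenPGP fingerprint given as 40 hex digits, with or without spaces.
def key_ids(fingerprint)
  hex = fingerprint.delete(" ").upcase
  raise ArgumentError, "expected 40 hex digits" unless hex =~ /\A[0-9A-F]{40}\z/
  { long: hex[-16..-1], short: hex[-8..-1] }
end

ids = key_ids("6E50 3F39 BA6A EAAD D81D  ECFF 2437 3CD5 74AB AA38")
puts ids[:long]  # compare with the "issuer key ID" in the packet dump
puts ids[:short] # compare with the key ID in the gpg --verify output
```

Both values line up with what we saw above: 0x24373CD574ABAA38 in the signature packets and 74ABAA38 in the verification output.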

-----------[Catching the attackers]

The PGP key's meta-data shows a user ID for the mailtor dark web email service. The last known location of which was:
http://mailtoralnhyol5v.onion

Don't bother emailing the email address found in the PGP key as it does not have a valid MX record. The fact that this exists at all seems to be one of those interesting artifacts of what happens when Internet tools like GPG get used on the dark web.

If the AM attackers were to be caught; here (in no particular order) are the most likely ways this would happen:

  • The person(s) responsible tells somebody. Nobody keeps something like this a secret; if the attackers tell anybody, they're likely going to get caught.
  • If the attackers review email from a web browser, they might get revealed via federal law enforcement or private investigation/IR teams hired by AM. The FBI is known to have these capabilities.
  • If the attackers slip up with their diligence in messaging only via TXT and HTML on the web server. Meta-data sinks ships kids -- don't forget.
  • If the attackers slip up with their diligence on configuring their server. One bad config of a web server leaks an internal IP, or worse!
  • If the attackers slipped up during their persistent attack against AM and investigators hired by AM find evidence leading back to the attackers.
  • If the attackers have not masked their writing or image creation style and leave some semantic fingerprint from which they can be profiled.

If none of those things happen, I don't think these attackers will ever be caught. The cyber-crime fighters have a daunting task in front of them. I've helped out a couple of FBI and NYPD cyber-crime fighters and I do not envy the difficult and frustrating job they have -- good luck to them! Today we're living in the Wild West days of the Internet.

-----------[Leaked file extraction and evidence gathering]

Now to document the information seen within this data leak we proceed with a couple of commands to gather the file size and we'll also check the file hashes to ensure the uniqueness of the files. Finally we review the meta-data of some of the compressed files. The meta-data shows the time-stamp embedded into the various compressed files. Although meta-data can easily be faked, it is usually not.
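
For the .gz files, the embedded time-stamp lives in the MTIME field of the gzip header. As a rough illustration (this covers gzip only, not the 7z archives, and the function name is ours), Ruby's standard Zlib library can read it back:

```ruby
require 'zlib'

# Read the MTIME field from a gzip header and return it as a Time.
# The value is whatever the compressor chose to write, so treat it as a
# hint rather than proof.
def gzip_mtime(io)
  gz = Zlib::GzipReader.new(io) # the header is parsed when the reader is created
  gz.mtime
ensure
  gz.finish if gz # release zlib state without closing the underlying IO
end

# Example usage against one of the leaked files:
#   File.open("member_details.dump.gz", "rb") { |f| puts gzip_mtime(f) }
```

Only the header is read, so this is quick even on multi-gigabyte dumps.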

Next we'll extract these files and examine their file size to take a closer look.

$ 7z e ashleymadisondump.7z

Within the extracted 7zip file we find another 7zip file,
"swappernet_User_Table.7z", which we also extracted.

We now have the following file sizes and SHA1 hashes for evidence
integrity & auditing purposes:

$ du -sh ashleymadisondump/*
68K     20131002-domain-list.xlsx
52K     ALMCLUSTER (production domain) computers.txt
120K    ALMCLUSTER (production domain) hashdump.txt
68K     ALM - Corporate Chart.pptx
256K    ALM Floor Plan - ports and names.pdf
8.0M    ALM - January 2015 - Company Overview.pptx
1.8M    ALM Labs Inc. Articles of Incorporation.pdf
708K    announcement.png
8.0K    Areas of concern - customer data.docx
8.0K    ARPU and ARPPU.docx
940K    Ashley Madison Technology Stack v5(1).docx
16K     Avid Life Media - Major Shareholders.xlsx
36K     AVIDLIFEMEDIA (primary corporate domain) computers.txt
332K    AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
1.7M    Avid Org Chart 2015 - May 14.pdf
24K     Banks.xlsx
6.1M    Copies of Option Agreements.pdf
8.0K    Credit useage.docx
16K     CSF Questionnaire (Responses).xlsx
132K    Noel's loan agreement.pdf
8.0K    Number of traveling man purchases.docx
1.5M    oneperday_am_am_member.txt
940K    oneperday_aminno_member.txt
672K    oneperday.txt
44K     paypal accounts.xlsx
372K    printer@avidlifemedia.com_20101103_133855.pdf
16K     q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
8.0K    README.txt
8.0K    Rebill Success Rate Queries.docx
8.0K    Rev by traffic source rebill broken out.docx
8.0K    Rev from organic search traffic.docx
4.0K    Sales Queries
59M     swappernet_QA_User_Table.txt #this was extracted from swappernet_User_Table.7z in the same dir
17M     swappernet_User_Table.7z

$ sha1sum ashleymadisondump/*
f0af9ea887a41eb89132364af1e150a8ef24266f  20131002-domain-list.xlsx
30401facc68dab87c98f7b02bf0a986a3c3615f0  ALMCLUSTER (production domain) computers.txt
c36c861fd1dc9cf85a75295e9e7bcf6cf04c7d2c  ALMCLUSTER (production domain) hashdump.txt
6be635627aa38462ebcba9266bed5b492a062589  ALM - Corporate Chart.pptx
4dec7623100f59395b68fd13d3dcbbff45bef9c9  ALM Floor Plan - ports and names.pdf
601e0b462e1f43835beb66743477fe94bbda5293  ALM - January 2015 - Company Overview.pptx
d17cb15a5e3af15bc600421b10152b2ea1b9c097  ALM Labs Inc. Articles of Incorporation.pdf
1679eca2bc172cba0b5ca8d14f82f9ced77f10df  announcement.png
6a618e7fc62718b505afe86fbf76e2360ade199d  Areas of concern - customer data.docx
91f65350d0249211234a52b260ca2702dd2eaa26  ARPU and ARPPU.docx
50acee0c8bb27086f12963e884336c2bf9116d8a  Ashley Madison Technology Stack v5(1).docx
71e579b04bbba4f7291352c4c29a325d86adcbd2  Avid Life Media - Major Shareholders.xlsx
ef8257d9d63fa12fb7bc681320ea43d2ca563e3b  AVIDLIFEMEDIA (primary corporate domain) computers.txt
ec54caf0dc7c7206a7ad47dad14955d23b09a6c0  AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
614e80a1a6b7a0bbffd04f9ec69f4dad54e5559e  Avid Org Chart 2015 - May 14.pdf
c3490d0f6a09bf5f663cf0ab173559e720459649  Banks.xlsx
1538c8f4e537bb1b1c9a83ca11df9136796b72a3  Copies of Option Agreements.pdf
196b1ba40894306f05dcb72babd9409628934260  Credit useage.docx
2c9ba652fb96f6584d104e166274c48aa4ab01a3  CSF Questionnaire (Responses).xlsx
0068bc3ee0dfb796a4609996775ff4609da34acb  Noel's loan agreement.pdf
c3b4d17fc67c84c54d45ff97eabb89aa4402cae8  Number of traveling man purchases.docx
9e6f45352dc54b0e98932e0f2fe767df143c1f6d  oneperday_am_am_member.txt
de457caca9226059da2da7a68caf5ad20c11de2e  oneperday_aminno_member.txt
d596e3ea661cfc43fd1da44f629f54c2f67ac4e9  oneperday.txt
37fdc8400720b0d78c2fe239ae5bf3f91c1790f4  paypal accounts.xlsx
2539bc640ea60960f867b8d46d10c8fef5291db7  printer@avidlifemedia.com_20101103_133855.pdf
5bb6176fc415dde851262ee338755290fec0c30c  q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
5435bfbf180a275ccc0640053d1c9756ad054892  README.txt
872f3498637d88ddc75265dab3c2e9e4ce6fa80a  Rebill Success Rate Queries.docx
d4e80e163aa1810b9ec70daf4c1591f29728bf8e  Rev by traffic source rebill broken out.docx
2b5f5273a48ed76cd44e44860f9546768bda53c8  Rev from organic search traffic.docx
sha1sum: Sales Queries: Is a directory
0f63704c118e93e2776c1ad0e94fdc558248bf4e  swappernet_QA_User_Table.txt
9d67a712ef6c63ae41cbba4cf005ebbb41d92f33  swappernet_User_Table.7z


-----------[Quick summary of each of the leaked files]

The following files are MySQL data dumps of the main AM database:
  • member_details.dump.gz
  • aminno_member.dump.gz
  • member_login.dump.gz
  • aminno_member_email.dump.gz
  • CreditCardTransactions.7z
Also included was another AM database which contains user info (separate from the emails):
  • am_am.dump.gz

In the top level directory you can also find these additional files:
  • 74ABAA38.txt
    Impact Team's Public PGP key used for signing the releases (The .asc files are the signatures)
  • ashleymadisondump.7z
    This contains various internal and corporate private files.
  • README
    Impact Team's justification for releasing the user data.
  • Various .asc files such as "member_details.dump.gz.asc"
    These are all PGP signature files to prove that one or more persons who are part of the "Impact Team" attackers released them.

Within the ashleymadisondump.7z we can extract and view the following files:
  • Number of traveling man purchases.docx
    SQL queries to investigate high-travel user's purchases.
  • q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
    Per-employee compensation listings.
  • AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
  • AVIDLIFEMEDIA (primary corporate domain) computers.txt
    The output of the dnscmd windows command executing on what appears to be a primary domain controller. The timestamp indicates that the command was run on July 1st 2015. There is also a "pwdump" style export of 1324 user accounts which appear to be from the ALM domain controller. These passwords will be easy to crack as NTLM hashes aren't the strongest.
  • Noel's loan agreement.pdf
    A promissory note for the CEO to pay back ~3MM in Canadian monies.
  • Areas of concern - customer data.docx
    Appears to be a risk profile of the major security concerns that ALM has regarding their customers' data. And yes, a major user data dump is on the list of concerns.
  • Banks.xlsx
    A listing of all ALM associated bank account numbers and the biz which owns them.
  • Rev by traffic source rebill broken out.docx
  • Rebill Success Rate Queries.docx
    Both of these are SQL queries to investigate Rebilling of customers.
  • README.txt
    Impact Team statement regarding their motivations for the attack and leak.
  • Copies of Option Agreements.pdf
    Agreements for what appears to be all of the company's outstanding options.
  • paypal accounts.xlsx
    Various user/passes for ALM paypal accounts (16 in total)
  • swappernet_QA_User_Table.txt
  • swappernet_User_Table.7z
    This file is a database export in CSV format. It appears to be from a QA server.
  • ALMCLUSTER (production domain) computers.txt
    The output of the dnscmd windows command executing on what appears to be a production domain controller. The timestamp indicates that the command was run on July 1st 2015.
  • ALMCLUSTER (production domain) hashdump.txt
    A "pwdump" style export of 1324 user accounts which appear to be from the ALM domain controller. These passwords will be easy to crack as NTLM hashes aren't the strongest.
  • ALM Floor Plan - ports and names.pdf
    Seating map of main office, this type of map is usually used for network deployment purposes.
  • ARPU and ARPPU.docx
    A listing of SQL commands which provide revenue and other macro financial health info.
    Presumably these queries would run on the primary DB or a biz intel slave.
  • Credit useage.docx
    SQL queries to investigate credit card purchases.
  • Avid Org Chart 2015 - May 14.pdf
    A per-team organizational chart of what appears to be the entire company.
  • announcement.png
    The graphic created by Impact Team to announce their demand for ALM to shut down its flagship website AM.
  • printer@avidlifemedia.com_20101103_133855.pdf
    Contract outlining the terms of a purchase of the biz Seekingarrangement.com.
  • CSF Questionnaire (Responses).xlsx
    Company exec Critical Success Factors spreadsheet. It answers questions like "In what area would you hate to see something go wrong?", and the CTO's response is about hacking.
  • ALM - January 2015 - Company Overview.pptx
    This is a very detailed breakdown of current biz health, marketing spend, and future product plans.
  • Ashley Madison Technology Stack v5(1).docx
    A detailed walk-through of all major servers and services used in the ALM production environment.
  • oneperday.txt
  • oneperday_am_am_member.txt
  • oneperday_aminno_member.txt
    These three files have limited leak info as a "teaser" for the .dump files that are found in the highest level directory of the AM leak.
  • Rev from organic search traffic.docx
    SQL queries to explore the revenue generated from search traffic.
  • 20131002-domain-list.xlsx
    A list of the 1083 domain names that are, have been, or are seeking to be owned by ALM.
  • Sales Queries/
    Empty Directory
  • ALM Labs Inc. Articles of Incorporation.pdf
    The full 109 page Articles of Incorporation, covering every aspect of initial company formation.
  • ALM - Corporate Chart.pptx
    A detailed block diagram defining the relationship between various tax and legal business entity names related to ALM businesses.
  • Avid Life Media - Major Shareholders.xlsx
    A listing of each major shareholder and their equity stake

-----------[File meta-data analysis]

First we'll take a look at the 7zip file in the top level directory.
$ 7z l ashleymadisondump.7z
Listing archive: ashleymadisondump.7z
----
Path = ashleymadisondump.7z
Type = 7z
Method = LZMA
Solid = +
Blocks = 1
Physical Size = 37796243
Headers Size = 1303

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2015-07-09 12:25:48 ....A     17271957     37794940  swappernet_User_Table.7z
2015-07-10 12:14:35 ....A       723516               announcement.png
2015-07-01 18:03:56 ....A        51222               ALMCLUSTER (production domain) computers.txt
2015-07-01 17:58:55 ....A       120377               ALMCLUSTER (production domain) hashdump.txt
2015-06-25 22:59:22 ....A        35847               AVIDLIFEMEDIA (primary corporate domain) computers.txt
2015-06-14 21:18:11 ....A       339221               AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
2015-07-18 15:23:34 ....A       686533               oneperday.txt
2015-07-18 15:20:43 ....A       959099               oneperday_aminno_member.txt
2015-07-18 19:00:45 ....A      1485289               oneperday_am_am_member.txt
2015-07-19 17:01:11 ....A         6031               README.txt
2015-07-07 11:41:36 ....A         6042               Areas of concern - customer data.docx
2015-07-07 12:14:42 ....A         5907               Sales Queries/ARPU and ARPPU.docx
2015-07-07 12:04:35 ....A       960553               Ashley Madison Technology Stack v5(1).docx
2015-07-07 12:14:42 ....A         5468               Sales Queries/Credit useage.docx
2015-07-07 12:14:43 ....A         5140               Sales Queries/Number of traveling man purchases.docx
2015-07-07 12:14:47 ....A         5489               Sales Queries/Rebill Success Rate Queries.docx
2015-07-07 12:14:43 ....A         5624               Sales Queries/Rev by traffic source rebill broken out.docx
2015-07-07 12:14:42 ....A         6198               Sales Queries/Rev from organic search traffic.docx
2015-07-08 23:17:19 ....A       259565               ALM Floor Plan - ports and names.pdf
2012-10-19 16:54:20 ....A      1794354               ALM Labs Inc. Articles of Incorporation.pdf
2015-07-07 12:04:10 ....A      1766350               Avid Org Chart 2015 - May 14.pdf
2012-10-20 12:23:11 ....A      6344792               Copies of Option Agreements.pdf
2013-09-18 14:39:25 ....A       132798               Noel's loan agreement.pdf
2015-07-07 10:16:54 ....A       380043               printer@avidlifemedia.com_20101103_133855.pdf
2012-12-13 15:26:58 ....A        67816               ALM - Corporate Chart.pptx
2015-07-07 12:14:28 ....A      8366232               ALM - January 2015 - Company Overview.pptx
2013-10-07 10:30:28 ....A        67763               20131002-domain-list.xlsx
2013-07-15 15:20:14 ....A        13934               Avid Life Media - Major Shareholders.xlsx
2015-07-09 11:57:58 ....A        22226               Banks.xlsx
2015-07-07 11:41:41 ....A        15703               CSF Questionnaire (Responses).xlsx
2015-07-09 11:57:58 ....A        42511               paypal accounts.xlsx
2015-07-07 12:04:44 ....A        15293               q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
2015-07-18 13:54:40 D....            0            0  Sales Queries
------------------- ----- ------------ ------------  ------------------------
                              41968893     37794940  32 files, 1 folders
If we're to believe this meta-data, the newest file is from July 19th 2015 and the oldest is from October 19th 2012. The timestamp for the file announcement.png shows a creation date of July 10th 2015. This file is the graphical announcement from the leakers. The file swappernet_User_Table.7z has a timestamp of July 9th 2015. Since this file is a database dump, one might presume that these two files were created for the original release and the other files were copied from a file-system that preserves timestamps.
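
To double check the oldest and newest timestamps without eyeballing the listing, the entry rows can be filtered and sorted. This is only a sketch: the function name listing_date_range is our own invention, and it assumes the "7z l" column layout shown above, where file rows start with a date and folder rows carry a "D" attribute.

```shell
# Hypothetical helper (not part of 7z): read `7z l` output on stdin and
# print the oldest and the newest file timestamp, one per line.
# Entry rows start with "YYYY-MM-DD HH:MM:SS"; header and separator rows do not.
# Field 3 is the attribute column, so rows whose attribute starts with "D"
# (folders) are skipped. ISO-style timestamps sort correctly as plain text.
listing_date_range() {
  grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} ' |
    awk '$3 !~ /^D/ { print $1, $2 }' |
    LC_ALL=C sort |
    sed -n '1p;$p'
}

# Usage, assuming the archive sits in the current directory:
#   7z l ashleymadisondump.7z | listing_date_range
```

Keep in mind that these are whatever timestamps the archive recorded, so the same caveat applies: they are only as trustworthy as the person who built the archive.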

Within that 7zip file we've found another which looks like:
$ 7z l ashleymadisondump/swappernet_User_Table.7z
Listing archive: ./swappernet_User_Table.7z
----
Path = ./swappernet_User_Table.7z
Type = 7z
Method = LZMA
Solid = -
Blocks = 1
Physical Size = 17271957
Headers Size = 158

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2015-06-27 18:39:40 ....A     61064200     17271799  swappernet_QA_User_Table.txt
------------------- ----- ------------ ------------  ------------------------
                              61064200     17271799  1 files, 0 folders

Within the ashleymadisondump directory extracted from ashleymadisondump.7z we've got
the following file types that we'll examine for meta-data:
8 txt
8 docx
6 xlsx
6 pdf
2 pptx
1 png
1 7z
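
The per-type counts above can be reproduced by tallying file extensions. A minimal sketch, assuming the archive was extracted into a directory named ashleymadisondump; count_extensions is a name we made up for illustration.

```shell
# Hypothetical helper: read file paths on stdin and print "count extension"
# rows, most common extension first.
# The sed step keeps only the text after the last dot of the file name;
# paths whose file name has no dot are dropped.
count_extensions() {
  sed -n 's/.*\.\([^./]*\)$/\1/p' |
    LC_ALL=C sort | uniq -c |
    awk '{ print $1, $2 }' |
    LC_ALL=C sort -k1,1nr -k2,2
}

# Usage:
#   find ashleymadisondump -type f | count_extensions
```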

The PNG didn't seem to have any EXIF meta-data, and we've already covered the 7z file.

Plain text files usually don't yield anything to us meta-data wise.

In the MS Word docx files we have the following meta-data:
  • Areas of concern - customer data.docx
    No Metadata
  • ARPU and ARPPU.docx
    No Metadata
  • Ashley Madison Technology Stack v5(1).docx
    Created by Michael Morris, created and last modified on Sep 17 2013.
  • Credit useage.docx
    No Metadata
  • Number of traveling man purchases.docx
    No Metadata
  • Rebill Success Rate Queries.docx
    No Metadata
  • Rev by traffic source rebill broken out.docx
    No Metadata
  • Rev from organic search traffic.docx
    No Metadata

In the MS Powerpoint pptx files we have the following meta-data:
  • ALM - Corporate Chart.pptx
    Created by "Diana Horvat" on Dec 5 2012 and last updated by "Tatiana Kresling"
    on Dec 13th 2012
  • ALM - January 2015 - Company Overview.pptx
    Created by Rizwan Jiwan on Jan 21 2011 and last modified on Jan 20 2015.

In the MS Excel xlsx files we have the following meta-data:
  • 20131002-domain-list.xlsx
    Written by Kevin McCall, created and last modified Oct 2nd 2013
  • Avid Life Media - Major Shareholders.xlsx
    Jamal Yehia, created and last modified July 15th 2013
  • Banks.xlsx
    Created by "Elena" and Keith Lalonde, created Dec 15 2009 and last modified Feb 26th 2010
  • CSF Questionnaire (Responses).xlsx
    No Metadata
  • paypal accounts.xlsx
    Created by Keith Lalonde, created Oct 28 2010 and last modified Dec 22nd 2010
  • q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
    No Metadata
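
For readers who want to repeat this on the Office files: docx, xlsx and pptx files are ZIP containers, and the author and date fields normally live in a docProps/core.xml member. The sketch below is one way to pull them out, not necessarily how the values above were obtained. core_props is a hypothetical helper name, and the sample invocation assumes unzip is installed.

```shell
# Hypothetical helper: read an OOXML docProps/core.xml document on stdin and
# print the creator, last modifier, and created/modified timestamps.
# grep -o emits each opening tag together with its text content;
# sed then reduces that to "tag: text".
core_props() {
  grep -oE '<(dc:creator|cp:lastModifiedBy|dcterms:created|dcterms:modified)[^>]*>[^<]*' |
    sed -E 's/^<([^ >]*)[^>]*>(.*)$/\1: \2/'
}

# Usage:
#   unzip -p "ALM - Corporate Chart.pptx" docProps/core.xml | core_props
```

If none of those four elements are present in core.xml, the helper prints nothing.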

And finally within the PDF files we also see additional meta-data:
  • ALM Floor Plan - ports and names.pdf
    Written by Martin Price in MS Visio, created and last modified April 23 2015
  • ALM Labs Inc. Articles of Incorporation.pdf
    Created with DocsCorp Pty Ltd (www.docscorp.com), created and last modified on Oct 17 2012
  • Avid Org Chart 2015 - May 14.pdf
    Created and last modified on May 14 2015
  • Copies of Option Agreements.pdf
    OmniPage CSDK 16 OcrToolkit, created and last modified on Oct 16 2012
  • Noel's loan agreement.pdf
    Created and last modified on Sep 18 2013
  • printer@avidlifemedia.com_20101103_133855.pdf
    Created and last modified on Jul 7 2015

-----------[MySQL Dump file loading and evidence gathering]

At this point all of the dump files have been decompressed with gunzip or 7zip. The dump files are standard MySQL backup files (aka dump files). The version info in their headers implies that they were taken from multiple servers:
$ grep 'MySQL dump' *.dump
am_am.dump:-- MySQL dump 10.13  Distrib 5.5.33, for Linux (x86_64)
aminno_member.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
aminno_member_email.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
member_details.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
member_login.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
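
To make the version split easier to see, the grep output can be regrouped by the "Distrib" value. This is a small sketch that assumes the header format shown above; dumps_by_version is a name we made up.

```shell
# Hypothetical helper: read `grep 'MySQL dump' *.dump` output on stdin and
# print "distrib-version file" pairs, sorted so files with the same version
# end up next to each other.
dumps_by_version() {
  sed -n 's/^\([^:]*\):-- MySQL dump .*Distrib \([^,]*\),.*/\2 \1/p' |
    LC_ALL=C sort
}

# Usage:
#   grep 'MySQL dump' *.dump | dumps_by_version
```

With the headers above, am_am.dump is the only file reporting 5.5.33, while the other four report 5.5.40-36.1.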

Also within the dump files was info indicating that the dump was executed from localhost. This implies an attacker was on the database server in question.

Of course, all of this info is just text and can easily be faked, but it's interesting nonetheless considering the possibility that it might be correct and unaltered.

To load up the MySQL dumps we'll start with a fresh MySQL database instance
on a decently powerful server and run the following commands:
-- As root MySQL user
CREATE DATABASE aminno;
CREATE DATABASE am;
CREATE USER 'am'@'localhost' IDENTIFIED BY 'loyaltyandfidelity';
GRANT ALL PRIVILEGES ON aminno.* TO 'am'@'localhost';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'localhost';

Now back at the command line we'll execute these to import the main dumps:

$ mysql -D aminno -uam -ployaltyandfidelity < aminno_member.dump

$ mysql -D aminno -uam -ployaltyandfidelity < aminno_member_email.dump

$ mysql -D aminno -uam -ployaltyandfidelity < member_details.dump

$ mysql -D aminno -uam -ployaltyandfidelity < member_login.dump

$ mysql -D am -uam -ployaltyandfidelity < am_am.dump

Now that you've got the data loaded up you can recreate some of the findings ksugihara made with his analysis here [Edit: It appears ksugihara has taken this offline, I don't have a mirror]. We didn't have much more to add for holistic statistics analysis than what he's already done, so check out his blog post for more on the primary data dumps. There is still one final database export though...

Within the file ashleymadisondump/swappernet_QA_User_Table.txt we have a final database export, but this one is not in the MySQL dump format. It is instead in CSV format. The file name implies this was an export from a QA Database server.

This file has the following columns (left to right in the CSV):

  • recid
  • id
  • username
  • userpassword
  • refnum
  • disable
  • ipaddress
  • lastlogin
  • lngstatus
  • strafl
  • ap43
  • txtCoupon
  • bot

Sadly within the file we see user passwords are in clear text which is always a bad security practice. At the moment though we don't know if these are actual production user account passwords, and if so how old they are. My guess is that these are from an old QA server when AM was a smaller company and hadn't moved to secure password hashing practices like bcrypt.

These commands show us there are 765,607 records in this database export and
only four of them have a blank password. Many of the passwords repeat, and
only 387,974 of them are unique.

$ cut -d , -f 4 < swappernet_QA_User_Table.txt |wc -l
765607
$ cut -d , -f 4 < swappernet_QA_User_Table.txt | sed '/^\s*$/d' |wc -l
765603
$ cut -d , -f 4 < swappernet_QA_User_Table.txt | sed '/^\s*$/d' |sort -u |wc -l
387974
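
The same three numbers can be had in a single pass over the file. This is a sketch under the same assumption the cut commands make, namely that every comma is a field separator (a quoted field containing a comma would shift the columns). password_stats is a hypothetical name.

```shell
# Hypothetical helper: read the CSV on stdin and report the number of rows,
# the number of rows with a blank 4th column (userpassword), and the number
# of distinct non-blank passwords.
password_stats() {
  awk -F, '
    { total++ }
    $4 ~ /^[[:space:]]*$/ { blank++; next }   # blank or whitespace-only password
    !seen[$4]++ { unique++ }                  # first time this password appears
    END { printf "total=%d blank=%d unique=%d\n", total, blank, unique }
  '
}

# Usage:
#   password_stats < swappernet_QA_User_Table.txt
```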

Next we see the top 25 most frequently used passwords in this database export
using the command:
$ cut -d , -f 4 < swappernet_QA_User_Table.txt |sort|uniq -c |sort -rn|head -25
   5882 123456
   2406 password
    950 pussy
    948 12345
    943 696969
    917 12345678
    902 fuckme
    896 123456789
    818 qwerty
    746 1234
    734 baseball
    710 harley
    699 swapper
    688 swinger
    647 football
    645 fuckyou
    641 111111
    538 swingers
    482 mustang
    482 abc123
    445 asshole
    431 soccer
    421 654321
    414 1111
    408 hunter
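
To put the list in perspective, the counts can be summed and compared against the total number of records. A minimal sketch; top_share is a hypothetical helper that takes the record total as its argument and reads "uniq -c" style rows on stdin.

```shell
# Hypothetical helper: $1 is the total number of records; stdin holds
# "count value" rows as printed by `uniq -c`. Prints the share of records
# covered by those rows as a percentage.
top_share() {
  awk -v total="$1" '{ sum += $1 } END { printf "%.2f%%\n", 100 * sum / total }'
}

# Usage:
#   cut -d , -f 4 < swappernet_QA_User_Table.txt | sort | uniq -c | sort -rn |
#     head -25 | top_share 765607
```

The 25 counts listed above add up to 23,793, which is roughly 3.1% of the 765,607 records.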

After importing the CSV into MS Excel we can use sort and filter to make some
additional statements based on the data.
  1. The only users with a "lastlogin" value in the year 2015 are the
    following:
    SIMTEST101
    SIMTEST130
    JULITEST2
    JULITEST3
    swappernetwork
    JULITEST4
    HEATSEEKERS

  2. The final and most recent login was from AvidLifeMedia's office IP range.
  3. 275,285 of these users have an entry for txtCoupon.
  4. All users with the "bot" column set to TRUE have one of two passwords:
    "statueofliberty" or "cake".
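
Some of these checks don't strictly need Excel. The sketch below is an approximation from the command line, with several assumptions: the column order listed earlier (userpassword is field 4, txtCoupon is field 12, bot is field 13), a literal TRUE in the bot column, and no commas embedded inside fields. Both function names are hypothetical.

```shell
# Hypothetical helper: read the CSV on stdin and print each distinct password
# used by rows whose 13th column ("bot") is TRUE. A trailing carriage return
# is stripped first in case the export uses Windows line endings.
bot_passwords() {
  awk -F, '{ sub(/\r$/, "") } $13 == "TRUE" { print $4 }' | LC_ALL=C sort -u
}

# Hypothetical helper: count rows with a non-empty 12th column ("txtCoupon").
coupon_count() {
  awk -F, '$12 !~ /^[[:space:]]*$/ { n++ } END { print n + 0 }'
}

# Usage:
#   bot_passwords < swappernet_QA_User_Table.txt
#   coupon_count  < swappernet_QA_User_Table.txt
```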