Wednesday, August 19, 2015

A light-weight forensic analysis of the AshleyMadison Hack

by Erik Cabetas

-----------[Intro]

So Ashley Madison (AM) got hacked. The breach was first announced about a month ago, and the attackers claimed they'd drop the full monty of user data if the AM website did not cease operations. The AM parent company, Avid Life Media (ALM), did not shut the site down, and true to their word the attackers appear to have leaked everything they promised. The release came on August 18th 2015 and includes:

  • full database dumps of user data
  • emails
  • internal ALM documents
  • as well as a limited number of user passwords


Back in college I used to do forensics contests for the Honeynet Project, and I thought it might be a fun nostalgic trip to recreate my pseudo-forensics investigation style on the data within the AM leak.

Disclaimer: I will not be releasing any personal or confidential information
within this blog post that may be found in the AM leak. The purpose of
this blog post is to provide an honest holistic forensic analysis and minimal
statistical analysis of the data found within the leak. Consider this a
journalistic exploration more than anything.

Also note that the credit card files were deleted and not reviewed as part of this write-up.

-----------[Grabbing the Leak]

First we go find where on the big bad dark web the release site is located. Thankfully knowing a shady guy named Boris pays off for me, and we find a torrent file for the release of the August 18th Ashley Madison user data dump. The torrent file we found has the following SHA1 hash.
e01614221256a6fec095387cddc559bffa832a19  impact-team-ashley-release.torrent

After extracting all the files we have the following sizes and
file hashes for evidence audit purposes:

$  du -sh *
4.0K    74ABAA38.txt
9.5G    am_am.dump
2.6G    am_am.dump.gz
4.0K    am_am.dump.gz.asc
13G     aminno_member.dump
3.1G    aminno_member.dump.gz
4.0K    aminno_member.dump.gz.asc
1.7G    aminno_member_email.dump
439M    aminno_member_email.dump.gz
4.0K    aminno_member_email.dump.gz.asc
111M    ashleymadisondump/
37M     ashleymadisondump.7z
4.0K    ashleymadisondump.7z.asc
278M    CreditCardTransactions.7z
4.0K    CreditCardTransactions.7z.asc
2.3G    member_details.dump
704M    member_details.dump.gz
4.0K    member_details.dump.gz.asc
4.2G    member_login.dump
2.7G    member_login.dump.gz
4.0K    member_login.dump.gz.asc
4.0K    README
4.0K    README.asc

$ sha1sum *
a884c4fcd61e23aecb80e1572254933dc85e2b4a  74ABAA38.txt
e4ff3785dbd699910a512612d6e065b15b75e012  am_am.dump
e0020186232dad71fcf92c17d0f11f6354b4634b  am_am.dump.gz
b7363cca17b05a2a6e9d8eb60de18bc98834b14e  am_am.dump.gz.asc
d412c3ed613fbeeeee0ab021b5e0dd6be1a79968  aminno_member.dump
bc60db3a78c6b82a5045b797e6cd428f367a18eb  aminno_member.dump.gz
8a1c328142f939b7f91042419c65462ea9b2867c  aminno_member.dump.gz.asc
2dcb0a5c2a96e4f3fff5a0a3abae19012d725a7e  aminno_member_email.dump
ab5523be210084c08469d5fa8f9519bc3e337391  aminno_member_email.dump.gz
f6144f1343de8cc51dbf20921e2084f50c3b9c86  aminno_member_email.dump.gz.asc
sha1sum: ashleymadisondump: Is a directory
26786cb1595211ad3be3952aa9d98fbe4c5125f9  ashleymadisondump.7z
eb2b6f9b791bd097ea5a3dca3414a3b323b8ad37  ashleymadisondump.7z.asc
0ad9c78b9b76edb84fe4f7b37963b1d956481068  CreditCardTransactions.7z
cb87d9fb55037e0b1bccfe50c2b74cf2bb95cd6c  CreditCardTransactions.7z.asc
11e646d9ff5d40cc8e770a052b36adb18b30fd52  member_details.dump
b4849cec980fe2d0784f8d4409fa64b91abd70ef  member_details.dump.gz
3660f82f322c9c9e76927284e6843cbfd8ab8b4f  member_details.dump.gz.asc
436d81a555e5e028b83dcf663a037830a7007811  member_login.dump
89fbc9c44837ba3874e33ccdcf3d6976f90b5618  member_login.dump.gz
e24004601486afe7e19763183934954b1fc469ef  member_login.dump.gz.asc
4d80d9b671d95699edc864ffeb1b50230e1ec7b0  README
a9793d2b405f31cc5f32562608423fffadc62e7a  README.asc
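
If you'd rather script the hash manifest than eyeball sha1sum output, something like the following is a minimal sketch. The function names are my own invention, and it only looks at regular files directly inside one directory, which also sidesteps the "Is a directory" complaint seen above.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 of a file, read in chunks so multi-GB dumps never sit in RAM."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha1_manifest(directory):
    """Map each regular file directly inside `directory` to its SHA1, skipping sub-directories."""
    return {
        entry.name: sha1_of_file(entry)
        for entry in sorted(Path(directory).iterdir())
        if entry.is_file()
    }
```

Comparing the returned dictionary against a previously saved one is a quick way to confirm that none of the evidence files changed between analysis sessions.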

-----------[Attacker Identity & Attribution]

The attackers make it clear they have no desire to bridge their dark web identities with their real-life identities and have taken many measures to ensure this does not occur.

The torrent file and messaging were released via the anonymous Tor network through an Onion web server which serves only HTML/TXT content. If the attackers took proper OPSEC precautions while setting up the server, law enforcement and AM may never find them. That being said, hackers have been known to get sloppy and slip up on their OPSEC. The two most famous cases are Sabu of Anonymous and, separately, the Dread Pirate Roberts of Silk Road; both were caught even though they primarily used Tor for their internet activities.

Within the dump we see that the files are signed with PGP. Signing a file in this manner is a way of saying "I did this" even though we don't know the real-life identity of the person/group making the claim (there is a bunch of crypto and math that makes this possible). As a result, we can be more confident that any file signed by this PGP key was released by the same person/group.

In my opinion, this is done for two reasons. First, the leaker wants to claim responsibility in an identity-attributable manner without revealing their real-life identity. Second, the leaker wishes to dispel statements regarding "false leaks" made by the Ashley Madison team. The AM executive and PR teams have been in crisis communications mode, explaining that there have been many fake leaks.

The "Impact Team" is using the following public PGP key to sign their releases.
$ cat ./74ABAA38.txt
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.12 (GNU/Linux)

mQINBFW25a4BEADt5OKS5F36aACyyPc4UMZAnhLnbImhxv5A2n7koTKg1QhyA1mI
InLLriKW3GR0Y4Fx+84pvjbYdoJAnuqMemI0oP+2VAJqwC0LYVVcFHKK6ZElYiN8
4/3e5WWYv6vzrHwB+3NbQ1O9bbUjgk9ky2RsdTe+vDBhKwKS0kPSb28h0oMpAs87
pJcgWZ57jjtvyUEIKXQZAqLvFo5xayS8dEp8tRgNLauQ0SafKGsxjW5cRd2Ok3Z5
QtIS44WnYECe3tqqFYSOo4kdHBeswC8zaKapYaNzxsHw9msdZvx/rkrMgXtJye/o
vmf2RdLIcvqK0Nwf1LDLhweCBP61wVn8gWqSrzww+as1ObE6b64hYKHFzdIMcqJ3
sbAErRrfZMqZ6ihWnlSjzDDx2L3n5T16ZIDxGx5Mt0KDYIo8RqDdF+VKLCT7Eq/C
g/Ax+06Eez4rVnY+xeW6Tj+1iBAlrGRIcRHCX89fNwLxr4Bcq/q1KKrCwVsgonBK
+3Mzzs2/b9XQ/Z6bDHFnMWUTDhomBmNcZOz9sHrZZI9XUzx/bfS6CoQ3MIqDhNM+
l7cKZ/Icfs6IDoOsYIS3QeTWC8gv3IBTvtfKFnf1o6JnkP0Qv6SrckslztNA4HDL
2iIMMGs34vDc11ddTzMBBkig1NgtiaHqHhG5T8OoOD9c3hEmTQzir7iCPQARAQAB
tCRJbXBhY3QgVGVhbSA8aW1wYWN0dGVhbUBtYWlsdG9yLm5ldD6JAjgEEwECACIF
AlW25a4CGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJECQ3PNV0q6o445UQ
AKYIVyrpVKKBA4jliarqngKvkEBRd62CXHY42ZdjFmubLvRw5nC0nDdGUyGPRYOl
0RddL2C7ROqW9lCYfNl3BAQYEXMADDjoBMEQkepIxeIVehat46ksbJuFZ0+uI6EB
aVcJCR4S2C+hJP09q9tn/7RKacIolfeT0+s9IteFghKKK0c8Aot52A/hExrqjldo
fsMX6liSFQjDQpPhQpqiAJ8z9N3eeFwcAAc/gqNz9bE0Wug/OXh0OAHUQk3fS57a
uIi8medOr+kAqHziuO79+5Hkachsp+8c58jBtIzZM4bO6e42aEa2yHv0FGG5MhoB
x7MH0ympFdwbgebpF6kpH371GIsJcyumwQ3Yn4Sy2kp2XmB8xOQo2W8tWRtLW1dI
yGAXHXXy5UI5FJek7G1KvQXCy4pa756RGDFiqdqigq0KC27A/at02M8CP6R9RxC9
YSnru0Qrl7JeATekWM3w8sKs8r6yMEDFAcpK2NHaYzF6/o6t/HEqUWD41DZ2cqqg
9i4uoXpkAB3vAG/snNg1B8g89b3vbVUf6hSIcU89G3lgj9hh87Q/TSsISRJ+yq0N
sLEeVmDmOdf+xb44g3RuRJ9yh0h3j8jdQOq0FvvwW3UHKIVDQlFB3kgHY478TCIa
5MMCtMovGv/ukGKlU8aELKV0/sVsliMh8HDdFQICTd0MuQINBFW25a4BEADIh8Vg
tMGfByY/+IgPd9l3u0I4FZLHqKGKOIpfFEeA31jPAhfOqQyBRcnEN/TxLwJ8NLnL
+GdQ+0z1YncZPxpHU/z8zyMwGpZM/hMbkixA9ysyu06S7hna4YMfifT+lOe1lGSo
Tz3Fz1u2OGH+2UzVk5+Rv0FqDl6X1ZoqhMTswzW0jYR7JLLJip5MTMrLD0rSl0b5
a2XvF9Tpjzy9KWubsJk4W7x00Egu2EU9NhEZXaY18H3rxvYgXT7JMjq/y+IUp2Cd
Bv/XCNWmzl66/ZSLC8hzlcxmAYpmBkxafYNdptMeVzsH/xHmN2zSFjuBNx0Mkk+R
TrOxK/boS9onrGsSQ3zItWJAmodo2qYFjlirtu9pURSdYEINNQ5DgWymg43iAIfp
Xp5/yGBj4BlWE80qEAVsBB2BIRs7QHvpd34xETP08dXMsswIrMn/XxvHumyPoimj
mcNvIpvnAZqt6xppo6BSZ3y7MU4cSIRsZzLuSvkwGk97Jv2sMNvXlPRxzpU9ozsI
iYJAk6/n8kbQiTJk/SeiCTbf6e+BzbZbgIE3O9iPKhfW+6zWjC4TL+lBeyWTy1PP
PcQTT+najDqIwysz2BFuPozwuUQsnfQnyRytSjcI5m1fDoYpJPH8NNRIu9lzp+RN
YENVKXiCfnUCMCnSzxP3Kij3Wt227JLZQqnBUQARAQABiQIfBBgBAgAJBQJVtuWu
AhsMAAoJECQ3PNV0q6o4C2EP/29Bis5Skt9NxHVUBpC1OgRL8V+JD5TjNurMT6Pu
E75szLsMZ84z0MQ6n74ADIgEuznPDIa9hMZGK9DwlsQfFOlC/jyTYxSpgAgN6LAl
qoJztVzLRnMd2gZjOj6wajUy616b8u3Q3zovHcEKll5niUyNwHXovZcCzukFqJBF
a3JU/tkPvBuj2PEWf4ytuO6He2ERuSnsi+7mil8rTAAV/PPy7N2R/T7OUa6ERoGg
hqIGythWizRtZBVPRzush+8L181GBU2ps7nJ1resZ7T0OsCFL67J6t8r8IpmjWWt
fiiV05E71UAyNWLOWriS57qAwNcQ0W2UYKkFFKor+oWaBB+hCpvb8Za5867wpH8l
O6gpS/G17e+MKHTn60hw64xIVFJn7pka+OdAINjPRo5B5qVyvM3puEjRepx1piOG
HKOan00quI0dhF2Gia59zrBHK/agdF4FjkJSjER8uf/jJpo184p38zuQ7kyMXUxY
ExpGcVMVjVOoWKVRPGXYEz2nc9HIZ6mHbvhzsWQEAVwwIxZCos5dW1AMW3Otn30A
uFqPsx4jh/ANGhqUASz18bBrZ8DW3zceVs2zelkMpdL0z7ifU/UNn2rtDlpgLwFl
9ggUtPwXnSxqB7doSxfJyPJUum+bZxMb4Iq5BNNa/tme7TeWGl9bmsVwcQXSQlY2
uZnr
=v0qe
-----END PGP PUBLIC KEY BLOCK-----
The key carries the following meta-data:
Old: Public Key Packet(tag 6)(525 bytes)
        Ver 4 - new
        Public key creation time - Mon Jul 27 22:15:10 EDT 2015
        Pub alg - RSA Encrypt or Sign(pub 1)
        RSA n(4096 bits) - ...
        RSA e(17 bits) - ...
Old: User ID Packet(tag 13)(36 bytes)
        User ID - Impact Team <impactteam@mailtor.net>
Old: Signature Packet(tag 2)(568 bytes)
        Ver 4 - new
        Sig type - Positive certification of a User ID and Public Key packet(0x13).
        Pub alg - RSA Encrypt or Sign(pub 1)
        Hash alg - SHA1(hash 2)
        Hashed Sub: signature creation time(sub 2)(4 bytes)
                Time - Mon Jul 27 22:15:10 EDT 2015
        Hashed Sub: key flags(sub 27)(1 bytes)
                Flag - This key may be used to certify other keys
                Flag - This key may be used to sign data
        Hashed Sub: preferred symmetric algorithms(sub 11)(5 bytes)
                Sym alg - AES with 256-bit key(sym 9)
                Sym alg - AES with 192-bit key(sym 8)
                Sym alg - AES with 128-bit key(sym 7)
                Sym alg - CAST5(sym 3)
                Sym alg - Triple-DES(sym 2)
        Hashed Sub: preferred hash algorithms(sub 21)(5 bytes)
                Hash alg - SHA256(hash 8)
                Hash alg - SHA1(hash 2)
                Hash alg - SHA384(hash 9)
                Hash alg - SHA512(hash 10)
                Hash alg - SHA224(hash 11)
        Hashed Sub: preferred compression algorithms(sub 22)(3 bytes)
                Comp alg - ZLIB <RFC1950>(comp 2)
                Comp alg - BZip2(comp 3)
                Comp alg - ZIP <RFC1951>(comp 1)
        Hashed Sub: features(sub 30)(1 bytes)
                Flag - Modification detection (packets 18 and 19)
        Hashed Sub: key server preferences(sub 23)(1 bytes)
                Flag - No-modify
        Sub: issuer key ID(sub 16)(8 bytes)
                Key ID - 0x24373CD574ABAA38
        Hash left 2 bytes - e3 95
        RSA m^d mod n(4096 bits) - ...
                -> PKCS-1
Old: Public Subkey Packet(tag 14)(525 bytes)
        Ver 4 - new
        Public key creation time - Mon Jul 27 22:15:10 EDT 2015
        Pub alg - RSA Encrypt or Sign(pub 1)
        RSA n(4096 bits) - ...
        RSA e(17 bits) - ...
Old: Signature Packet(tag 2)(543 bytes)
        Ver 4 - new
        Sig type - Subkey Binding Signature(0x18).
        Pub alg - RSA Encrypt or Sign(pub 1)
        Hash alg - SHA1(hash 2)
        Hashed Sub: signature creation time(sub 2)(4 bytes)
                Time - Mon Jul 27 22:15:10 EDT 2015
        Hashed Sub: key flags(sub 27)(1 bytes)
                Flag - This key may be used to encrypt communications
                Flag - This key may be used to encrypt storage
        Sub: issuer key ID(sub 16)(8 bytes)
                Key ID - 0x24373CD574ABAA38
        Hash left 2 bytes - 0b 61
        RSA m^d mod n(4095 bits) - ...
                -> PKCS-1

We can verify the released files are attributable to the PGP public key
in question using the following commands:

$ gpg --import ./74ABAA38.txt
$ gpg --verify ./member_details.dump.gz.asc ./member_details.dump.gz
gpg: Signature made Sat 15 Aug 2015 11:23:32 AM EDT using RSA key ID 74ABAA38
gpg: Good signature from "Impact Team <impactteam@mailtor.net>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6E50 3F39 BA6A EAAD D81D  ECFF 2437 3CD5 74AB AA38

This also tells us at what date the dump was signed and packaged.
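
As a small cross-check, the key IDs that gpg prints can be derived from the fingerprint: for a version 4 OpenPGP key, the long key ID is the low 64 bits of the fingerprint and the short ID is the low 32 bits. The sketch below (the function name is mine) applies that to the fingerprint shown above.

```python
def key_ids_from_fingerprint(fingerprint):
    """Return (long_id, short_id) for a v4 OpenPGP fingerprint given as hex text."""
    hex_digits = "".join(fingerprint.split()).upper()
    if len(hex_digits) != 40:
        raise ValueError("a v4 fingerprint is 160 bits, i.e. 40 hex digits")
    int(hex_digits, 16)  # raises ValueError if any character is not hexadecimal
    return hex_digits[-16:], hex_digits[-8:]


fingerprint = "6E50 3F39 BA6A EAAD D81D  ECFF 2437 3CD5 74AB AA38"
long_id, short_id = key_ids_from_fingerprint(fingerprint)
print(long_id, short_id)  # prints: 24373CD574ABAA38 74ABAA38
```

Both values line up with the "issuer key ID" in the key meta-data and with the "RSA key ID 74ABAA38" in the gpg output, so the signature, the key file name, and the fingerprint all point at the same key.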

-----------[Catching the attackers]

The PGP key's meta-data shows a user ID for the mailtor dark web email service. The last known location of which was:
http://mailtoralnhyol5v.onion

Don't bother emailing the address found in the PGP key, as its domain does not have a valid MX record. The fact that the address exists at all seems to be one of those interesting artifacts of what happens when Internet tools like GPG get used on the dark web.

If the AM attackers were to be caught, here (in no particular order) are the most likely ways this would happen:

  • The person(s) responsible tell somebody. Nobody keeps something like this a secret, and if the attackers tell anybody, they're likely going to get caught.
  • The attackers review email from a web browser and get unmasked by federal law enforcement or by private investigation/IR teams hired by AM. The FBI is known to have these capabilities.
  • The attackers slip up in their diligence about messaging only via TXT and HTML on the web server. Meta-data sinks ships kids -- don't forget.
  • The attackers slip up in their diligence about configuring their server. One bad web server config leaks an internal IP, or worse!
  • The attackers slipped up during their persistent attack against AM, and investigators hired by AM find evidence leading back to them.
  • The attackers have not masked their writing or image creation style and leave some semantic fingerprint from which they can be profiled.

If none of those things happen, I don't think these attackers will ever be caught. The cyber-crime fighters have a daunting task in front of them. I've helped out a couple of FBI and NYPD cyber-crime fighters and I do not envy the difficult and frustrating job they have -- good luck to them! Today we're living in the Wild West days of the Internet.

-----------[Leaked file extraction and evidence gathering]

Now to document the information seen within this data leak we proceed with a couple of commands to gather the file size and we'll also check the file hashes to ensure the uniqueness of the files. Finally we review the meta-data of some of the compressed files. The meta-data shows the time-stamp embedded into the various compressed files. Although meta-data can easily be faked, it is usually not.

Next we'll extract these files and examine their file size to take a closer look.

$ 7z e ashleymadisondump.7z

Within the extracted 7zip file we find another 7zip file,
"swappernet_User_Table.7z", which was also extracted.

We now have the following file sizes and SHA1 hashes for evidence
integrity & auditing purposes:

$ du -sh ashleymadisondump/*
68K     20131002-domain-list.xlsx
52K     ALMCLUSTER (production domain) computers.txt
120K    ALMCLUSTER (production domain) hashdump.txt
68K     ALM - Corporate Chart.pptx
256K    ALM Floor Plan - ports and names.pdf
8.0M    ALM - January 2015 - Company Overview.pptx
1.8M    ALM Labs Inc. Articles of Incorporation.pdf
708K    announcement.png
8.0K    Areas of concern - customer data.docx
8.0K    ARPU and ARPPU.docx
940K    Ashley Madison Technology Stack v5(1).docx
16K     Avid Life Media - Major Shareholders.xlsx
36K     AVIDLIFEMEDIA (primary corporate domain) computers.txt
332K    AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
1.7M    Avid Org Chart 2015 - May 14.pdf
24K     Banks.xlsx
6.1M    Copies of Option Agreements.pdf
8.0K    Credit useage.docx
16K     CSF Questionnaire (Responses).xlsx
132K    Noel's loan agreement.pdf
8.0K    Number of traveling man purchases.docx
1.5M    oneperday_am_am_member.txt
940K    oneperday_aminno_member.txt
672K    oneperday.txt
44K     paypal accounts.xlsx
372K    printer@avidlifemedia.com_20101103_133855.pdf
16K     q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
8.0K    README.txt
8.0K    Rebill Success Rate Queries.docx
8.0K    Rev by traffic source rebill broken out.docx
8.0K    Rev from organic search traffic.docx
4.0K    Sales Queries
59M     swappernet_QA_User_Table.txt #this was extracted from swappernet_User_Table.7z in the same dir
17M     swappernet_User_Table.7z

$ sha1sum ashleymadisondump/*
f0af9ea887a41eb89132364af1e150a8ef24266f  20131002-domain-list.xlsx
30401facc68dab87c98f7b02bf0a986a3c3615f0  ALMCLUSTER (production domain) computers.txt
c36c861fd1dc9cf85a75295e9e7bcf6cf04c7d2c  ALMCLUSTER (production domain) hashdump.txt
6be635627aa38462ebcba9266bed5b492a062589  ALM - Corporate Chart.pptx
4dec7623100f59395b68fd13d3dcbbff45bef9c9  ALM Floor Plan - ports and names.pdf
601e0b462e1f43835beb66743477fe94bbda5293  ALM - January 2015 - Company Overview.pptx
d17cb15a5e3af15bc600421b10152b2ea1b9c097  ALM Labs Inc. Articles of Incorporation.pdf
1679eca2bc172cba0b5ca8d14f82f9ced77f10df  announcement.png
6a618e7fc62718b505afe86fbf76e2360ade199d  Areas of concern - customer data.docx
91f65350d0249211234a52b260ca2702dd2eaa26  ARPU and ARPPU.docx
50acee0c8bb27086f12963e884336c2bf9116d8a  Ashley Madison Technology Stack v5(1).docx
71e579b04bbba4f7291352c4c29a325d86adcbd2  Avid Life Media - Major Shareholders.xlsx
ef8257d9d63fa12fb7bc681320ea43d2ca563e3b  AVIDLIFEMEDIA (primary corporate domain) computers.txt
ec54caf0dc7c7206a7ad47dad14955d23b09a6c0  AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
614e80a1a6b7a0bbffd04f9ec69f4dad54e5559e  Avid Org Chart 2015 - May 14.pdf
c3490d0f6a09bf5f663cf0ab173559e720459649  Banks.xlsx
1538c8f4e537bb1b1c9a83ca11df9136796b72a3  Copies of Option Agreements.pdf
196b1ba40894306f05dcb72babd9409628934260  Credit useage.docx
2c9ba652fb96f6584d104e166274c48aa4ab01a3  CSF Questionnaire (Responses).xlsx
0068bc3ee0dfb796a4609996775ff4609da34acb  Noel's loan agreement.pdf
c3b4d17fc67c84c54d45ff97eabb89aa4402cae8  Number of traveling man purchases.docx
9e6f45352dc54b0e98932e0f2fe767df143c1f6d  oneperday_am_am_member.txt
de457caca9226059da2da7a68caf5ad20c11de2e  oneperday_aminno_member.txt
d596e3ea661cfc43fd1da44f629f54c2f67ac4e9  oneperday.txt
37fdc8400720b0d78c2fe239ae5bf3f91c1790f4  paypal accounts.xlsx
2539bc640ea60960f867b8d46d10c8fef5291db7  printer@avidlifemedia.com_20101103_133855.pdf
5bb6176fc415dde851262ee338755290fec0c30c  q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
5435bfbf180a275ccc0640053d1c9756ad054892  README.txt
872f3498637d88ddc75265dab3c2e9e4ce6fa80a  Rebill Success Rate Queries.docx
d4e80e163aa1810b9ec70daf4c1591f29728bf8e  Rev by traffic source rebill broken out.docx
2b5f5273a48ed76cd44e44860f9546768bda53c8  Rev from organic search traffic.docx
sha1sum: Sales Queries: Is a directory
0f63704c118e93e2776c1ad0e94fdc558248bf4e  swappernet_QA_User_Table.txt
9d67a712ef6c63ae41cbba4cf005ebbb41d92f33  swappernet_User_Table.7z


-----------[Quick summary of each of the leaked files]

The following files are MySQL data dumps of the main AM database:
  • member_details.dump.gz
  • aminno_member.dump.gz
  • member_login.dump.gz
  • aminno_member_email.dump.gz
  • CreditCardTransactions.7z (deleted without review, as noted in the intro, so its format is not confirmed here)
Also included was another AM database which contains user info (separate from the emails):
  • am_am.dump.gz

In the top level directory you can also find these additional files:
  • 74ABAA38.txt
    Impact Team's Public PGP key used for signing the releases (The .asc files are the signatures)
  • ashleymadisondump.7z
    This contains various internal and corporate private files.
  • README
    Impact Team's justification for releasing the user data.
  • Various .asc files such as "member_details.dump.gz.asc"
    These are all PGP signature files to prove that one or more persons who are part of the "Impact Team" attackers released them.
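
Since every data file in the top level directory ships with a detached .asc signature, the whole set can be checked in one pass. This is a rough sketch with function names of my own choosing; it assumes gpg is on the PATH and that the Impact Team public key has already been imported.

```python
import subprocess
from pathlib import Path


def signature_pairs(directory):
    """Return (signature, signed_file) pairs for every detached .asc whose data file exists."""
    pairs = []
    for signature in sorted(Path(directory).glob("*.asc")):
        signed_file = signature.with_suffix("")  # strips only the trailing ".asc"
        if signed_file.is_file():
            pairs.append((signature, signed_file))
    return pairs


def verify_all(directory):
    """Run `gpg --verify` on each pair; True means gpg exited with status 0."""
    results = {}
    for signature, signed_file in signature_pairs(directory):
        completed = subprocess.run(
            ["gpg", "--verify", str(signature), str(signed_file)],
            capture_output=True,
            text=True,
        )
        results[signed_file.name] = completed.returncode == 0
    return results
```

Calling verify_all(".") in the release directory would give one True/False per signed file. A True only says the file matches the signature made by the imported key; as gpg's own warning points out, it says nothing about who actually controls that key.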

Within the ashleymadisondump.7z we can extract and view the following files:
  • Number of traveling man purchases.docx
    SQL queries to investigate high-travel users' purchases.
  • q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
    Per-employee compensation listings.
  • AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
  • AVIDLIFEMEDIA (primary corporate domain) computers.txt
    The output of the dnscmd Windows command executed on what appears to be a primary domain controller. The timestamp indicates that the command was run on July 1st 2015. There is also a "pwdump" style export of 1324 user accounts which appear to be from the ALM domain controller. These passwords will be easy to crack, as NTLM hashes are unsalted and fast to brute-force.
  • Noel's loan agreement.pdf
    A promissory note for the CEO to pay back ~3MM in Canadian monies.
  • Areas of concern - customer data.docx
    Appears to be a risk profile of the major security concerns that ALM has regarding their customers' data. And yes, a major user data dump is on the list of concerns.
  • Banks.xlsx
    A listing of all ALM associated bank account numbers and the biz which owns them.
  • Rev by traffic source rebill broken out.docx
  • Rebill Success Rate Queries.docx
    Both of these are SQL queries to investigate Rebilling of customers.
  • README.txt
    Impact Team statement regarding their motivations for the attack and leak.
  • Copies of Option Agreements.pdf
    Agreements for what appear to be all of the company's outstanding options.
  • paypal accounts.xlsx
    Various user/passes for ALM paypal accounts (16 in total)
  • swappernet_QA_User_Table.txt
  • swappernet_User_Table.7z
    A database export in CSV format (the .7z is the compressed copy of the .txt). It appears to be from a QA server.
  • ALMCLUSTER (production domain) computers.txt
    The output of the dnscmd windows command executing on what appears to be a production domain controller. The timestamp indicates that the command was run on July 1st 2015.
  • ALMCLUSTER (production domain) hashdump.txt
    A "pwdump" style export of 1324 user accounts which appear to be from the ALM domain controller. These passwords will be easy to crack as NTLM hashes aren't the strongest.
  • ALM Floor Plan - ports and names.pdf
    Seating map of the main office. This type of map is usually used for network deployment purposes.
  • ARPU and ARPPU.docx
    A listing of SQL commands which provide revenue and other macro financial health info.
    Presumably these queries would run on the primary DB or a biz intel slave.
  • Credit useage.docx
    SQL queries to investigate credit card purchases.
  • Avid Org Chart 2015 - May 14.pdf
    A per-team organizational chart of what appears to be the entire company.
  • announcement.png
    The graphic created by Impact Team to announce their demand for ALM to shut down its flagship website AM.
  • printer@avidlifemedia.com_20101103_133855.pdf
    Contract outlining the terms of a purchase of the biz Seekingarrangement.com
  • CSF Questionnaire (Responses).xlsx
    Company exec Critical Success Factors spreadsheet, answering questions like "In what area would you hate to see something go wrong?"; the CTO's response is about hacking.
  • ALM - January 2015 - Company Overview.pptx
    This is a very detailed breakdown of current biz health, marketing spend, and future product plans.
  • Ashley Madison Technology Stack v5(1).docx
    A detailed walk-through of all major servers and services used in the ALM production environment.
  • oneperday.txt
  • oneperday_am_am_member.txt
  • oneperday_aminno_member.txt
    These three files have limited leak info as a "teaser" for the .dump files that are found in the highest level directory of the AM leak.
  • Rev from organic search traffic.docx
    SQL queries to explore the revenue generated from search traffic.
  • 20131002-domain-list.xlsx
    A list of the 1083 domain names that are, have been, or are seeking to be owned by ALM.
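  • Sales Queries/
    Empty directory after extraction. The "7z e" command used above extracts without directory paths, and the archive listing shows that the six SQL query .docx files belong in this folder.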
  • ALM Labs Inc. Articles of Incorporation.pdf
    The full 109 page Articles of Incorporation, covering every aspect of initial company formation.
  • ALM - Corporate Chart.pptx
    A detailed block diagram defining the relationship between various tax and legal business entity names related to ALM businesses.
  • Avid Life Media - Major Shareholders.xlsx
    A listing of each major shareholder and their equity stake.

-----------[File meta-data analysis]

First we'll take a look at the 7zip file in the top level directory.
$ 7z l ashleymadisondump.7z
Listing archive: ashleymadisondump.7z
----
Path = ashleymadisondump.7z
Type = 7z
Method = LZM
Solid = +
Blocks = 1
Physical Size = 37796243
Headers Size = 1303

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2015-07-09 12:25:48 ....A     17271957     37794940  swappernet_User_Table.7z
2015-07-10 12:14:35 ....A       723516               announcement.png
2015-07-01 18:03:56 ....A        51222               ALMCLUSTER (production domain) computers.txt
2015-07-01 17:58:55 ....A       120377               ALMCLUSTER (production domain) hashdump.txt
2015-06-25 22:59:22 ....A        35847               AVIDLIFEMEDIA (primary corporate domain) computers.txt
2015-06-14 21:18:11 ....A       339221               AVIDLIFEMEDIA (primary corporate domain) user information and hashes.txt
2015-07-18 15:23:34 ....A       686533               oneperday.txt
2015-07-18 15:20:43 ....A       959099               oneperday_aminno_member.txt
2015-07-18 19:00:45 ....A      1485289               oneperday_am_am_member.txt
2015-07-19 17:01:11 ....A         6031               README.txt
2015-07-07 11:41:36 ....A         6042               Areas of concern - customer data.docx
2015-07-07 12:14:42 ....A         5907               Sales Queries/ARPU and ARPPU.docx
2015-07-07 12:04:35 ....A       960553               Ashley Madison Technology Stack v5(1).docx
2015-07-07 12:14:42 ....A         5468               Sales Queries/Credit useage.docx
2015-07-07 12:14:43 ....A         5140               Sales Queries/Number of traveling man purchases.docx
2015-07-07 12:14:47 ....A         5489               Sales Queries/Rebill Success Rate Queries.docx
2015-07-07 12:14:43 ....A         5624               Sales Queries/Rev by traffic source rebill broken out.docx
2015-07-07 12:14:42 ....A         6198               Sales Queries/Rev from organic search traffic.docx
2015-07-08 23:17:19 ....A       259565               ALM Floor Plan - ports and names.pdf
2012-10-19 16:54:20 ....A      1794354               ALM Labs Inc. Articles of Incorporation.pdf
2015-07-07 12:04:10 ....A      1766350               Avid Org Chart 2015 - May 14.pdf
2012-10-20 12:23:11 ....A      6344792               Copies of Option Agreements.pdf
2013-09-18 14:39:25 ....A       132798               Noel's loan agreement.pdf
2015-07-07 10:16:54 ....A       380043               printer@avidlifemedia.com_20101103_133855.pdf
2012-12-13 15:26:58 ....A        67816               ALM - Corporate Chart.pptx
2015-07-07 12:14:28 ....A      8366232               ALM - January 2015 - Company Overview.pptx
2013-10-07 10:30:28 ....A        67763               20131002-domain-list.xlsx
2013-07-15 15:20:14 ....A        13934               Avid Life Media - Major Shareholders.xlsx
2015-07-09 11:57:58 ....A        22226               Banks.xlsx
2015-07-07 11:41:41 ....A        15703               CSF Questionnaire (Responses).xlsx
2015-07-09 11:57:58 ....A        42511               paypal accounts.xlsx
2015-07-07 12:04:44 ....A        15293               q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
2015-07-18 13:54:40 D....            0            0  Sales Queries
------------------- ----- ------------ ------------  ------------------------
                              41968893     37794940  32 files, 1 folders
If we're to believe this meta-data, the newest file is from July 19th 2015 and the oldest is from October 19th 2012. The file announcement.png, the graphical announcement from the leakers, has a timestamp of July 10th 2015, and swappernet_User_Table.7z has a timestamp of July 9th 2015. Since the former is the attackers' own graphic and the latter is a database dump, one might presume that these two files were created for the original release and that the other files were copied from a file-system that preserves timestamps.
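
The newest/oldest check can be scripted against the text of a "7z l" listing. The sketch below is my own and makes assumptions: the regular expression relies on the column layout shown above, and it would misread a file name that starts with digits followed by a space. The sample rows are copied from the listing.

```python
import re
from datetime import datetime

# date time, 5-char attribute field, size, optional compressed size, name
ROW = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S{5})\s+(\d+)\s+(?:(\d+)\s+)?(\S.*)$"
)


def parse_listing(text):
    """Return (timestamp, name) for each file row, ignoring directories and non-row lines."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        stamp, attrs, _size, _packed, name = match.groups()
        if attrs.startswith("D"):  # directory entry
            continue
        entries.append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), name.strip()))
    return entries


SAMPLE = """\
2015-07-09 12:25:48 ....A     17271957     37794940  swappernet_User_Table.7z
2015-07-19 17:01:11 ....A         6031               README.txt
2012-10-19 16:54:20 ....A      1794354               ALM Labs Inc. Articles of Incorporation.pdf
2015-07-18 13:54:40 D....            0            0  Sales Queries
"""

entries = parse_listing(SAMPLE)
oldest = min(entries)
newest = max(entries)
print("oldest:", oldest[0], oldest[1])
print("newest:", newest[0], newest[1])
```

On these sample rows the oldest entry is the Articles of Incorporation PDF and the newest is README.txt, matching the dates quoted above.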

Within that 7zip file we've found another which looks like:
$ 7z l ashleymadisondump/swappernet_User_Table.7z
Listing archive: ./swappernet_User_Table.7z
----
Path = ./swappernet_User_Table.7z
Type = 7z
Method = LZMA
Solid = -
Blocks = 1
Physical Size = 17271957
Headers Size = 158

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2015-06-27 18:39:40 ....A     61064200     17271799  swappernet_QA_User_Table.txt
------------------- ----- ------------ ------------  ------------------------
                              61064200     17271799  1 files, 0 folders

Within the ashleymadisondump directory extracted from ashleymadisondump.7z we've got
the following file types that we'll examine for meta-data:
8 txt
8 docx
6 xlsx
6 pdf
2 pptx
1 png
1 7z
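
A tally like the one above is easy to reproduce from a list of file names. This is a minimal sketch; the function name and the "(none)" label for extension-less files are my own choices.

```python
from collections import Counter
from pathlib import PurePath


def count_extensions(names):
    """Count files per lower-cased extension; files without one are grouped as "(none)"."""
    return Counter(
        PurePath(name).suffix.lstrip(".").lower() or "(none)" for name in names
    )


names = [
    "README.txt",
    "Banks.xlsx",
    "paypal accounts.xlsx",
    "announcement.png",
    "Ashley Madison Technology Stack v5(1).docx",
    "printer@avidlifemedia.com_20101103_133855.pdf",
]
for extension, total in count_extensions(names).most_common():
    print(total, extension)
```

Only the text after the last dot counts as the extension, so a name like printer@avidlifemedia.com_20101103_133855.pdf is still tallied as a pdf.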

The PNG didn't seem to have any EXIF meta-data, and we've already covered the 7z file.

Plain text files carry no embedded meta-data of their own, so there is nothing further to pull from the .txt files here.

In the MS Word docx files we have the following meta-data:
  • Areas of concern - customer data.docx
    No Metadata
  • ARPU and ARPPU.docx
    No Metadata
  • Ashley Madison Technology Stack v5(1).docx
    Created by Michael Morris, created and last modified on Sep 17 2013.
  • Credit useage.docx
    No Metadata
  • Number of traveling man purchases.docx
    No Metadata
  • Rebill Success Rate Queries.docx
    No Metadata
  • Rev by traffic source rebill broken out.docx
    No Metadata
  • Rev from organic search traffic.docx
    No Metadata

In the MS Powerpoint pptx files we have the following meta-data:
  • ALM - Corporate Chart.pptx
    Created by "Diana Horvat" on Dec 5 2012 and last updated by "Tatiana Kresling"
    on Dec 13th 2012
  • ALM - January 2015 - Company Overview.pptx
    Created by Rizwan Jiwan on Jan 21 2011 and last modified on Jan 20 2015.

In the MS Excel xlsx files we have the following meta-data:
  • 20131002-domain-list.xlsx
    Written by Kevin McCall, created and last modified Oct 2nd 2013
  • Avid Life Media - Major Shareholders.xlsx
    Written by Jamal Yehia, created and last modified July 15th 2013
  • Banks.xlsx
    Created by "Elena" and Keith Lalonde, created Dec 15 2009 and last modified Feb 26th 2010
  • CSF Questionnaire (Responses).xlsx
    No Metadata
  • paypal accounts.xlsx
    Created by Keith Lalonde, created Oct 28 2010 and last modified Dec 22nd 2010
  • q2 2013 summary compensation detail_managerinput_trevor-s team.xlsx
    No Metadata
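
For the three Office formats (docx, pptx, xlsx), author and date fields like the ones listed above live in a small XML part named docProps/core.xml inside the file, which is really a zip archive. The sketch below is one way to read it with only the standard library; it is not necessarily the tool I used for the listings above, and the function and dictionary names are my own.

```python
import xml.etree.ElementTree as ET
import zipfile

NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(path_or_file):
    """Return the core properties present in an OOXML file, or {} if the part is missing."""
    with zipfile.ZipFile(path_or_file) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

An empty result corresponds to the "No Metadata" entries above. Keep in mind that these fields are plain text inside the archive and are as easy to fake as the timestamps.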

And finally within the PDF files we also see additional meta-data:
  • ALM Floor Plan - ports and names.pdf
    Written by Martin Price in MS Visio, created and last modified April 23 2015
  • ALM Labs Inc. Articles of Incorporation.pdf
    Created with DocsCorp Pty Ltd (www.docscorp.com), created and last modified on Oct 17 2012
  • Avid Org Chart 2015 - May 14.pdf
    Created and last modified on May 14 2015
  • Copies of Option Agreements.pdf
    OmniPage CSDK 16 OcrToolkit, created and last modified on Oct 16 2012
  • Noel's loan agreement.pdf
    Created and last modified on Sep 18 2013
  • printer@avidlifemedia.com_20101103_133855.pdf
    Created and last modified on Jul 7 2015

-----------[MySQL Dump file loading and evidence gathering]

At this point all of the dump files have been decompressed with gunzip or 7zip. They are standard MySQL backup files (aka dump files). The header of each one records the version of the mysqldump tool that wrote it, and the two different versions seen below imply that the dumps were taken from more than one server:
$ grep 'MySQL dump' *.dump
am_am.dump:-- MySQL dump 10.13  Distrib 5.5.33, for Linux (x86_64)
aminno_member.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
aminno_member_email.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
member_details.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
member_login.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)
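
For a larger evidence set the same comparison can be scripted. The following is a minimal Python sketch that groups the grep output lines by their "Distrib" value; the function name and the sample list are ours, not part of the leak. Keep in mind that this field comes from the mysqldump header, so differing values hint at different hosts without proving it.

```python
import re
from collections import defaultdict

# Matches the comment line that mysqldump writes at the top of a dump file.
HEADER_RE = re.compile(r"^-- MySQL dump (\S+)\s+Distrib ([^,\s]+), for (.+)$")

def group_by_distrib(grep_lines):
    """Map each 'Distrib' version string to the dump files that report it."""
    groups = defaultdict(list)
    for line in grep_lines:
        # grep prefixes every match with "filename:"
        filename, _, header = line.partition(":")
        match = HEADER_RE.match(header)
        if match:
            groups[match.group(2)].append(filename)
    return dict(groups)

# Two of the lines from the grep output above, used as sample input.
sample = [
    "am_am.dump:-- MySQL dump 10.13  Distrib 5.5.33, for Linux (x86_64)",
    "aminno_member.dump:-- MySQL dump 10.13  Distrib 5.5.40-36.1, for Linux (x86_64)",
]
versions = group_by_distrib(sample)
print(versions)
```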

The dump files also contain references to being run from localhost, which implies the attacker was on the database server in question.

Of course, all of this info is just text and can easily be faked, but it's interesting nonetheless, given the possibility that it is correct and unaltered.

To load up the MySQL dumps we'll start with a fresh MySQL database instance
on a decently powerful server and run the following commands:
-- As root MySQL user
CREATE DATABASE aminno;
CREATE DATABASE am;
CREATE USER 'am'@'localhost' IDENTIFIED BY 'loyaltyandfidelity';
GRANT ALL PRIVILEGES ON aminno.* TO 'am'@'localhost';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'localhost';

Now back at the command line we'll execute these to import the main dumps:

$ mysql -D aminno -uam -ployaltyandfidelity < aminno_member.dump

$ mysql -D aminno -uam -ployaltyandfidelity < aminno_member_email.dump

$ mysql -D aminno -uam -ployaltyandfidelity < member_details.dump

$ mysql -D aminno -uam -ployaltyandfidelity < member_login.dump

$ mysql -D am -uam -ployaltyandfidelity < am_am.dump

Now that you've got the data loaded up you can recreate some of the findings ksugihara made with his analysis here [Edit: It appears ksugihara has taken this offline, I don't have a mirror]. We didn't have much to add to the holistic statistical analysis he's already done, so check out his blog post for more on the primary data dumps. There is still one final database export, though...

Within the file ashleymadisondump/swappernet_QA_User_Table.txt we have a final database export, but this one is not in the MySQL dump format. It is instead in CSV format. The file name implies this was an export from a QA Database server.

This file has the following columns (left to right in the CSV):

  • recid
  • id
  • username
  • userpassword
  • refnum
  • disable
  • ipaddress
  • lastlogin
  • lngstatus
  • strafl
  • ap43
  • txtCoupon
  • bot

Sadly within the file we see user passwords are in clear text which is always a bad security practice. At the moment though we don't know if these are actual production user account passwords, and if so how old they are. My guess is that these are from an old QA server when AM was a smaller company and hadn't moved to secure password hashing practices like bcrypt.

These commands show us there are 765,607 records in this database export and
only four of them have a blank password. Many of the passwords repeat, and
only 387,974 of them are unique.

$ cut -d , -f 4 < swappernet_QA_User_Table.txt |wc -l
765607
$ cut -d , -f 4 < swappernet_QA_User_Table.txt | sed '/^\s*$/d' |wc -l
765603
$ cut -d , -f 4 < swappernet_QA_User_Table.txt | sed '/^\s*$/d' |sort -u |wc -l
387974

Next we see the top 25 most frequently used passwords in this database export
using the command:
$ cut -d , -f 4 < swappernet_QA_User_Table.txt |sort|uniq -c |sort -rn|head -25
   5882 123456
   2406 password
    950 pussy
    948 12345
    943 696969
    917 12345678
    902 fuckme
    896 123456789
    818 qwerty
    746 1234
    734 baseball
    710 harley
    699 swapper
    688 swinger
    647 football
    645 fuckyou
    641 111111
    538 swingers
    482 mustang
    482 abc123
    445 asshole
    431 soccer
    421 654321
    414 1111
    408 hunter
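
If you'd rather not rely on cut (which splits on every comma, including any that sit inside quoted fields), the counts and the ranking can be approximated with Python's csv module. This is a sketch under assumptions: the function name and the sample rows are made up for illustration, column index 3 corresponds to "userpassword" in the column list above, and the numbers it yields on the real file may differ slightly from the shell output because of the different parsing.

```python
import csv
import io
from collections import Counter

def password_stats(csv_text, column=3, top=25):
    """Return (total rows, non-blank passwords, unique passwords, top list)."""
    reader = csv.reader(io.StringIO(csv_text))
    # Rows that are too short are treated as having a blank password.
    passwords = [row[column] if len(row) > column else "" for row in reader]
    non_blank = [p for p in passwords if p.strip()]
    counts = Counter(non_blank)
    return len(passwords), len(non_blank), len(counts), counts.most_common(top)

# Invented sample rows, NOT data from the leak.
sample = (
    "1,10,alice,123456,r1\n"
    "2,11,bob,password,r2\n"
    "3,12,carol,123456,r3\n"
    "4,13,dave,,r4\n"
)
total, non_blank, unique, top_passwords = password_stats(sample, top=2)
print(total, non_blank, unique, top_passwords)
```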

After importing the CSV into MS Excel we can use sort and filter to make some
additional statements based on the data.
  1. The only accounts with a "lastlogin" value in the year 2015 are the
    following users:
    SIMTEST101
    SIMTEST130
    JULITEST2
    JULITEST3
    swappernetwork
    JULITEST4
    HEATSEEKERS

  2. The final and most recent login was from AvidLifeMedia's office IP range.
  3. 275,285 of these users have an entry for the txtCoupon column.
  4. All users with the "bot" column set to TRUE have one of two passwords:
    "statueofliberty" or "cake"

Wednesday, August 13, 2014

Reversing the Dropcam Part 3: Digging into compiled Lua functionality

Contribs from Nico Rodriguez, Kris Brosch, and Erik Cabetas

In Part 1 & Part 2 of this RE blog series you saw how we reverse engineered the Dropcam and got access to the file system. In this final post of the series we'll examine some of the binaries found on the file system and play a bit with Lua code we found there. As usual we'll talk about some of the lessons learned from some failures in the analysis process as well as successes. We'll conclude with a release of a small tool that can aid reversers who are looking at Lua disassembly.

The Lua code we found on the system is packed inside the Dropcam's /usr/bin/connect binary, which was obtained from the rooted Dropcam as described in our previous blog post (Part 2). We unpacked the connect binary; it's compressed/packed with UPX but that is trivial to undo. Once unpacked we loaded the binary in our trusty IDA and looked around a little bit. We noticed it was writing a file named /tmp/connect.bin and then running this command via a call to system():

rm -rf /tmp/connect && mkdir /tmp/connect && tar zx -f /tmp/connect.bin -C /tmp/connect && rm /tmp/connect.bin

So it looks like /usr/bin/connect is decompressing a tar.gz file hidden inside the connect binary itself. The IDA screenshot below shows the function that writes the file and then calls the shell command. This function is called with the arguments 0x8393c (the address of the connect.bin data in memory) and 0x29203 (the length of the file):

We extracted the file using dd:

dd if=./connect.decompressed of=connect.tar.gz bs=1 skip=473404 count=168451
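
The numbers in that dd command follow from the two function arguments. Here is a small Python sketch of the arithmetic, plus an in-memory equivalent of the carve. The 0x10000 difference between the memory address and the file offset is simply what the figures in this post work out to; reading it as the gap between where that part of the file is mapped and where it sits on disk is our assumption, and carve() and looks_like_gzip() are illustrative helper names.

```python
VADDR = 0x8393C     # address of the embedded data passed to the function
LENGTH = 0x29203    # length passed to the function
DD_SKIP = 473404    # skip= value from the dd command
DD_COUNT = 168451   # count= value from the dd command

# Difference between the in-memory address and the offset inside the file.
delta = VADDR - DD_SKIP

def carve(blob, offset, length):
    """In-memory equivalent of: dd bs=1 skip=<offset> count=<length>."""
    return blob[offset:offset + length]

def looks_like_gzip(data):
    """gzip streams start with the magic bytes 1f 8b."""
    return data[:2] == b"\x1f\x8b"

print(hex(delta), LENGTH == DD_COUNT)
```

With the unpacked binary read into a bytes object, carve(binary, DD_SKIP, DD_COUNT) should return data for which looks_like_gzip() is true if the offsets are right.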

And then, we unpacked the .tar.gz file and took a look at what was there:
$ ls -la

total 808
drwxrwxrwx 1 nico staff 4096 Feb 21 15:20 .
drwxrwxrwx 1 nico staff 4096 Nov 11 20:35 ..
-rwxrwxrwx 1 nico staff 1504 Apr 23 2013 containers.bin
-rwxrwxrwx 1 nico staff 5879 Apr 23 2013 decoder.bin
-rwxrwxrwx 1 nico staff 1038 Apr 23 2013 descriptor.bin
-rwxrwxrwx 1 nico staff 10376 Apr 23 2013 dispatch.bin
-rwxrwxrwx 1 nico staff 54727 Apr 23 2013 droptalk_pb.bin
-rwxrwxrwx 1 nico staff 9360 Apr 23 2013 encoder.bin
-rwxrwxrwx 1 nico staff 1243 Apr 23 2013 hello.bin
-rwxrwxrwx 1 nico staff 545 Apr 23 2013 hwver.bin
-rwxrwxrwx 1 nico staff 4279 Apr 23 2013 ir.bin
-rwxrwxrwx 1 nico staff 879 Apr 23 2013 list.bin
-rwxrwxrwx 1 nico staff 615 Apr 23 2013 listener.bin
-rwxrwxrwx 1 nico staff 650 Apr 23 2013 main.bin
-rwxrwxrwx 1 nico staff 2363 Apr 23 2013 monitor.bin
-rwxrwxrwx 1 nico staff 708 Apr 23 2013 motion.bin
-rwxrwxrwx 1 nico staff 2010 Oct 29 19:48 net.bin
-rwxrwxrwx 1 nico staff 2607 Apr 23 2013 oldiags.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_01_3D_hwrev_1.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_01_3D_hwrev_2.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_02_3D_hwrev_1.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_02_3D_hwrev_2.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_03_3D_hwrev_1.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_03_3D_hwrev_2.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_04_3D_hwrev_1.bin
-rwxrwxrwx 1 nico staff 17536 Apr 18 2013 ov9715_04_3D_hwrev_2.bin
-rwxrwxrwx 1 nico staff 3280 Apr 23 2013 persistence.bin
-rwxrwxrwx 1 nico staff 329 Apr 23 2013 platform.bin
-rwxrwxrwx 1 nico staff 3365 Apr 23 2013 platform_a5s.bin
-rwxrwxrwx 1 nico staff 551 Apr 23 2013 platform_local.bin
-rwxrwxrwx 1 nico staff 20750 Apr 23 2013 protobuf.bin
-rwxrwxrwx 1 nico staff 191 Apr 23 2013 rtp.bin
-rwxrwxrwx 1 nico staff 643 Apr 23 2013 settings.bin
-rwxrwxrwx 1 nico staff 9931 Apr 23 2013 states.bin
-rwxrwxrwx 1 nico staff 912 Apr 23 2013 status.bin
-rwxrwxrwx 1 nico staff 3822 Apr 23 2013 streams.bin
-rwxrwxrwx 1 nico staff 1505 Apr 23 2013 text_format.bin
-rwxrwxrwx 1 nico staff 1525 Apr 23 2013 type_checkers.bin
-rwxrwxrwx 1 nico staff 3047 Apr 23 2013 update.bin
-rwxrwxrwx 1 nico staff 601 Apr 23 2013 usb.bin
-rwxrwxrwx 1 nico staff 2602 Apr 23 2013 util.bin
-rwxrwxrwx 1 nico staff 1468 Apr 23 2013 watchdog.bin
-rwxrwxrwx 1 nico staff 3620 Apr 23 2013 wire_format.bin

Inspecting the first .bin file we see these are Lua byte-code files. The first five bytes were those of a Lua Bytecode Header:
+------------------------+
| 1B | 4C | 75 | 61 | 52 | => Lua 0x52
+------------------------+

These files contain compiled Lua bytecode that supplements the logic in the connect binary. From the initial examination, we saw the bytecode was Lua 5.2 bytecode. The structure of a Lua bytecode file is extensively documented; we'll just cover the necessary information in this post (for a quick overview take a look at this link).
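
Checking that header across all of the extracted files is easy to script. A minimal Python sketch follows (the function name is ours): the signature is the escape byte followed by "Lua", and the fifth byte packs the major and minor version into its two nibbles, so 0x52 reads as 5.2.

```python
LUA_SIGNATURE = b"\x1bLua"  # 1B 4C 75 61

def lua_bytecode_version(data):
    """Return (major, minor) if data starts with a Lua bytecode header, else None."""
    if len(data) < 5 or data[:4] != LUA_SIGNATURE:
        return None
    version = data[4]
    # High nibble is the major version, low nibble the minor version.
    return version >> 4, version & 0x0F

header = bytes([0x1B, 0x4C, 0x75, 0x61, 0x52])
print(lua_bytecode_version(header))
```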

Of course we'd like to know what functionality is hidden in these files, so we tried every decompiler we could get our hands on. Unfortunately they all complained about the byte-code version or died trying to interpret the bytes in the files, because none of them were up to date for Lua 5.2. This version of Lua adds a couple of instructions to the VM, but the semantics and the byte-code format seem to be the same.

We tried several decompilers, unluac among them. Since they all rejected the files as-is, we tried to hack up the files to trick the decompilers into working with them, but alas, nothing worked: the decompilers just died with errors stating that the chunk of code did not correspond to valid Lua code. Note: Pay careful attention to endianness when hacking up byte-code files. We even considered patching a tool like unluac to support Lua 5.2 bytecode, as it looked like the most mature of the ones we tried, but this wouldn't be a trivial task and would require major surgery. We didn't have much time, so we went lower-level to a bytecode disassembler.

Enter: LuaAssemblyTools(LAT) - https://github.com/mlnlover11/LuaAssemblyTools.
This Lua library allowed us to parse and disassemble the byte-code regardless of version and/or endianness. We were able to disassemble the Lua 5.2 byte-code used in the connect binary into LASM (LAT's representation of the Lua VM's instructions).

Now we have disassembly, but it's ugly -- like DNSSec level of ugly. So our next challenge was what to do with the disassembled code. The way tables and constants are handled in Lua's VM is great for machine consumption but human readable it is not! How many levels of indirection can one really keep track of in their head at the same time?

Using LAT's LASM Decompiler we disassembled descriptor.bin into this:

; Decompiled to lasm by LASM Decompiler v1.0
; Decompiler Copyright (C) 2012 LoDC
; Main code
.name ""
.options 0 0 1 2
; Above contains: Upvalue count, Argument count, Vararg flag, Max Stack Size
; Constants
.const "module"
.const "descriptor"
.const "FieldDescriptor"
.const "TYPE_DOUBLE"
.const 1
.const "TYPE_FLOAT"
.const 2
.const "TYPE_INT64"
.const 3
.const "TYPE_UINT64"
.const 4
.const "TYPE_INT32"
.const 5
.const "TYPE_FIXED64"
.const 6
.const "TYPE_FIXED32"
.const 7
.const "TYPE_BOOL"
.const 8
.const "TYPE_STRING"
.const 9
.const "TYPE_GROUP"
.const 10
.const "TYPE_MESSAGE"
.const 11
.const "TYPE_BYTES"
.const 12
.const "TYPE_UINT32"
.const 13
.const "TYPE_ENUM"
.const 14
.const "TYPE_SFIXED32"
.const 15
.const "TYPE_SFIXED64"
.const 16
.const "TYPE_SINT32"
.const 17
.const "TYPE_SINT64"
.const 18
.const "MAX_TYPE"
.const "CPPTYPE_INT32"
.const "CPPTYPE_INT64"
.const "CPPTYPE_UINT32"
.const "CPPTYPE_UINT64"
.const "CPPTYPE_DOUBLE"
.const "CPPTYPE_FLOAT"
.const "CPPTYPE_BOOL"
.const "CPPTYPE_ENUM"
.const "CPPTYPE_STRING"
.const "CPPTYPE_MESSAGE"
.const "MAX_CPPTYPE"
.const "LABEL_OPTIONAL"
.const "LABEL_REQUIRED"
.const "LABEL_REPEATED"
.const "MAX_LABEL"
; Upvalues
.upval '' 1 0
; Instructions
gettabup 0 0 256
loadk 1 1
call 0 2 1
newtable 0 0 25
settable 0 259 260
settable 0 261 262
settable 0 263 264
settable 0 265 266
settable 0 267 268
settable 0 269 270
settable 0 271 272
settable 0 273 274
settable 0 275 276
settable 0 277 278
settable 0 279 280
settable 0 281 282
settable 0 283 284
settable 0 285 286
settable 0 287 288
settable 0 289 290
settable 0 291 292
settable 0 293 294
settable 0 295 294
settable 0 296 260
settable 0 297 262
settable 0 298 264
settable 0 299 266
settable 0 300 268
settable 0 301 270
settable 0 302 272
settable 0 303 274
settable 0 304 276
settable 0 305 278
settable 0 306 278
settable 0 307 260
settable 0 308 262
settable 0 309 264
settable 0 310 264
settabup 0 258 0
return 0 1 0

To understand this as quickly as possible we need something to make LASM a bit more sane. Time to write some code to do it ourselves! Lua is a register-based virtual machine, so that makes our life a little easier.

We wrote a simple script that rewrites the LASM code into something more human readable. It reorganizes the disassembly into a much more readable form, so consider the output of this tool somewhere in the middle of the spectrum between a straight disassembler and a decompiler (restructured disassembly?).
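
To give a feel for the kind of rewriting involved, here is a tiny Python sketch. It is not our LasmRewriter.py, just an illustration with made-up function names that handles two opcodes. The one VM detail it relies on is that in Lua 5.2 an "RK" operand of 256 or more refers to the constant table (index = operand - 256), while smaller values are register numbers.

```python
BITRK = 256  # RK operands >= 256 index the constant table in Lua 5.2

def rk(operand, constants):
    """Render an RK operand as a constant or as a register reference."""
    if operand >= BITRK:
        return str(constants[operand - BITRK])
    return "regs[%d]" % operand

def rewrite(instruction, constants):
    """Rewrite one LASM instruction into a pseudo-assignment where we know how."""
    parts = instruction.split()
    op, args = parts[0], [int(x) for x in parts[1:]]
    if op == "newtable":
        return "regs[%d] = []" % args[0]
    if op == "settable":
        a, b, c = args
        return "regs[%d][%s] = %s" % (a, rk(b, constants), rk(c, constants))
    return instruction  # anything else is passed through untouched

# First five constants from the descriptor.bin listing above.
constants = ["module", "descriptor", "FieldDescriptor", "TYPE_DOUBLE", 1]
print(rewrite("newtable 0 0 25", constants))
print(rewrite("settable 0 259 260", constants))
```

Resolving the constant indices is most of what makes the output readable: "settable 0 259 260" stops being three opaque numbers and becomes an assignment into the table held in register 0.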

If you're interested to learn more, here are a few presentations showing the internals of a Lua VM that came in handy for this task (http://luaforge.net/docman/83/98/ANoFrillsIntroToLua51VMInstructions.pdf and http://www.dcc.ufrj.br/~fabiom/lua/ were a huge help).

The resulting code from our tool can't be compiled (so it's not a true decompiler) but it was so much easier to follow than a straight disassembly. You can find the tool published on our Github here.

Here we can see the descriptor.bin output after using our LasmRewriter.py script:

function main(...)
    module(descriptor)
    regs[0] = []
    regs[0][TYPE_DOUBLE] = 1
    regs[0][TYPE_FLOAT] = 2
    regs[0][TYPE_INT64] = 3
    regs[0][TYPE_UINT64] = 4
    regs[0][TYPE_INT32] = 5
    regs[0][TYPE_FIXED64] = 6
    regs[0][TYPE_FIXED32] = 7
    regs[0][TYPE_BOOL] = 8
    regs[0][TYPE_STRING] = 9
    regs[0][TYPE_GROUP] = 10
    regs[0][TYPE_MESSAGE] = 11
    regs[0][TYPE_BYTES] = 12
    regs[0][TYPE_UINT32] = 13
    regs[0][TYPE_ENUM] = 14
    regs[0][TYPE_SFIXED32] = 15
    regs[0][TYPE_SFIXED64] = 16
    regs[0][TYPE_SINT32] = 17
    regs[0][TYPE_SINT64] = 18
    regs[0][MAX_TYPE] = 18
    regs[0][CPPTYPE_INT32] = 1
    regs[0][CPPTYPE_INT64] = 2
    regs[0][CPPTYPE_UINT32] = 3
    regs[0][CPPTYPE_UINT64] = 4
    regs[0][CPPTYPE_DOUBLE] = 5
    regs[0][CPPTYPE_FLOAT] = 6
    regs[0][CPPTYPE_BOOL] = 7
    regs[0][CPPTYPE_ENUM] = 8
    regs[0][CPPTYPE_STRING] = 9
    regs[0][CPPTYPE_MESSAGE] = 10
    regs[0][MAX_CPPTYPE] = 10
    regs[0][LABEL_OPTIONAL] = 1
    regs[0][LABEL_REQUIRED] = 2
    regs[0][LABEL_REPEATED] = 3
    regs[0][MAX_LABEL] = 3
    return regs[0]
end

This gets the disassembly to the point where we can easily understand it, compared to what we had before, which was just horrible. Now that we can disassemble the files we see that they control the logic of the device, while the hardware access is done at a lower level. Moreover, the System-on-a-Chip has some interesting features: setting up the parameters of the video input and output and the image post-processing are both done by the hardware, which is much more efficient.

Lua on an embedded device such as the Dropcam is compact and safer to write than C, so that's a good idea from the security front. The Linux kernel and its device drivers running on the device take care of everything real-time related and they expose this functionality to Lua the Unix way, i.e. everything is a file. You can open a /dev/ file to access the stream of video and manipulate camera functionality. Everything for image conversion, filtering, etc. is taken care of in the low-level drivers. (Note: a bit more detail on this topic can be found in SynAck's recent presentation, which was published after the research you're reading in this blog post was conducted.)

This way of using Lua on embedded devices is a little different from projects like eLua (http://www.eluaproject.net/), which take the Lua VM and make it run on small embedded devices (to check the supported CPUs, see http://www.eluaproject.net/overview/status). We've seen that approach used on other embedded devices we hack on.

Well, that's the conclusion of this blog post series; we hope you got a bit of insight into reversing embedded devices. We didn't publish any 0day vulns in these posts. 0days are a given in every product if you look hard enough; this blog series was meant to give the beginner/intermediate IoT reverser some guidance.

Reminder: You can find the Lua disassembly rewriter tool on our Github here.

Thursday, July 17, 2014

Hacking your hacking tools: When you absolutely must decode ProtoBuf

Earlier this year we did a web application assessment where our client made extensive use of protobufs sent over HTTP. For those who haven't come across it, Protobuf is a library developed by Google for serializing messages to a compact binary format. Protobufs are often used for developing different types of network protocols, and sometimes they are used to serialize data that will be sent over HTTP, a situation where encoding data in a human-readable format like JSON or XML is more common.

We like to use Burp Suite when auditing anything that works over HTTP, and when applications serialize data in a human-readable format, it's easy to use Burp to modify that data. With a binary format like protobufs, however, modifying an encoded message by hand is tedious and error-prone, so we decided to try the Burp Protobuf Decoder plugin by Marcin Wielgoszewski. This post details our experience working with the Burp Protobuf Decoder plugin, the problems we had getting Burp set up to test this particular web app, and how we solved those problems.

As we started testing, our Burp session filled with binary data in our proxy history. When we loaded the plugin into that session, it didn't add any “protobuf” tabs or decode anything. We quickly realized that this was because the plugin was looking for messages with a content-type header of "application/x-protobuf", while the application was using a slightly different content-type. Changing the plugin code to look for the modified content-type header let us see the contents of the protobufs more easily, but we still couldn't edit them.

We wanted to edit the contents of the messages, but to see why we couldn't, and what we would have to do to be able to edit them, let's back up and look at how protobufs are defined. Protobuf message formats are defined in the protobuf language and stored as .proto files. The .proto files are then compiled into source code for the language where you want to use them. The Burp Protobuf Decoder plugin allows you to modify protobufs once you've loaded the message definition .proto files; without them, it falls back on using the protoc tool to decode messages.

The protoc tool can decode binary messages without access to the original .proto definition files, but it doesn't support re-encoding messages. This is because some information is lost when encoding the messages, making encoding messages without the message type definition difficult. When you only have the information in the binary messages to go by, the message field types are ambiguous, and it also isn't always clear whether some fields are optional or can be repeated. Of course, the names of fields and enumerated values are not included in binary messages either.
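
To make that ambiguity concrete, here is a minimal schema-less decoder sketched in Python. It is an illustration rather than what protoc or the Burp plugin actually run, and the function names are ours. Each field on the wire starts with a varint key holding the field number and a 3-bit wire type, and that is all the structure a binary message carries.

```python
def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return value, pos
        shift += 7

def decode_fields(data):
    """Return [(field_number, wire_type, raw_value), ...] without a .proto file."""
    pos, fields = 0, []
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint: int32? uint64? bool? enum? sint32?
            value, pos = read_varint(data, pos)
        elif wire_type == 1:    # 64-bit: fixed64? sfixed64? double?
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited: string? bytes? nested message?
            length, pos = read_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:    # 32-bit: fixed32? sfixed32? float?
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields

def zigzag(n):
    """How the same varint reads if the field was declared sint32/sint64."""
    return (n >> 1) ^ -(n & 1)

message = bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69])
fields = decode_fields(message)
print(fields, zigzag(fields[0][2]))
```

Field 1 decodes to 150 if it is a plain integer type but to 75 if it was declared as a zigzag-encoded sint32, and field 2 could equally be a string, raw bytes, or an embedded message. Without the .proto file there is no way to know which one to use when re-encoding.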

We were lucky because we were doing a greybox assessment, meaning we had access to the .proto files (as well as the rest of the application source code). At the same time we were unlucky - when we tried to load the .proto files into the Burp plugin, some of them would refuse to load, instead causing Java exceptions to be thrown with the message "Method code too large!"

The Protobuf Decoder plugin loads message definitions by first compiling the .proto files into Python code using the standard protoc command and then importing the Python files on the fly. Burp extensions written in Python are run using the Jython implementation of Python, and it turns out that the JVM doesn't support methods whose bytecode is larger than 64 KB. This is the reason we were getting the "Method code too large!" exception - Jython was trying to load the Python code generated by protoc into Java methods, but they were too big for Java.

For most developers, the solution to the "Method code too large!" exception is to break up their python code into smaller files and methods. In this situation however, our python code was generated by protoc, and it wasn't very clear how to split it up. Instead, we decided to try splitting up the problematic .proto files into multiple smaller .proto files so that each generated python file would be smaller. This solution eventually worked.

The problem with this solution is that it's not necessarily easy to split up .proto files because of dependencies between type definitions. Protobuf messages can have fields that contain other message types. A message definition can reference another message definition in the same .proto file, or in a .proto file that it imports, but protoc can't handle circular dependencies between .proto files.

For example, let's say you're trying to split a.proto into a1.proto and a2.proto. If you have a2.proto import a1.proto, you can't have a1.proto import a2.proto. That means that you have to split the file so that none of the message definitions in a1.proto depend on those in a2.proto.

Say this is a.proto:
message Foo {
    required Bar bar = 1;
}
message Bar {
    optional Qux qux = 1;
}
message Baz {
    repeated Foo foo = 1;
}
message Qux {
    required int32 q = 1;
}
To safely split it into two, you have to carefully arrange your message definitions. Here is a1.proto:
message Bar {
    optional Qux qux = 1;
}
message Qux {
    required int32 q = 1;
}
And here's a2.proto:
import "a1.proto";
message Foo {
    required Bar bar = 1;
}
message Baz {
    repeated Foo foo = 1;
}
Doing this programmatically would require code to parse and re-write .proto files. Luckily, there were only a few .proto files that were giving us trouble, and we were able to split them up by hand relatively easily. We split each of them into two .proto files, which compiled to make Python files small enough for Jython to load. We loaded the smaller .proto files into the Burp plugin, allowing us to view and edit messages in Burp and finally do the tests that we wanted to try.
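
We never automated this, but the dependency rule itself is easy to express. The Python sketch below assumes the message dependencies have already been extracted into a dict (parsing the .proto files is the hard part and is not shown), and split_messages is a hypothetical helper name. Starting from a seed message it pulls in everything that message depends on, so the first file never needs to import the second.

```python
def split_messages(deps, seed):
    """Split message names into (first, second) so first never depends on second.

    deps maps each message name to the message types used by its fields.
    seed lists the messages that should end up in the first file.
    """
    first, stack = set(), list(seed)
    while stack:
        name = stack.pop()
        if name not in first:
            first.add(name)
            # Everything this message references has to live in the same file.
            stack.extend(deps.get(name, ()))
    second = set(deps) - first
    return first, second

# Dependencies of the a.proto example above.
deps = {"Foo": ["Bar"], "Bar": ["Qux"], "Baz": ["Foo"], "Qux": []}
a1, a2 = split_messages(deps, ["Bar"])
print(sorted(a1), sorted(a2))
```

For the example this yields Bar and Qux for a1.proto and Foo and Baz for a2.proto, matching the manual split. A bad seed (Baz, say) would drag every message into the first file, so picking the seed still takes some judgment.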

In this case we were unlucky that the .proto files we were given were big enough to cause trouble, but we were able to use Wielgoszewski's plugin and some .proto file hacking to get our hacking done. We hope sharing this experience will save you or another web app hacker some headaches when trying to work with protobufs in Burp!



Tuesday, June 3, 2014

Exploiting CVE-2014-0196 a walk-through of the Linux pty race condition PoC

By Samuel Groß

Introduction

Recently a severe vulnerability in the Linux kernel was publicly disclosed and patched. In this post we'll analyze what this particular security vulnerability looks like in the Linux kernel code and walk you through the publicly published proof-of-concept exploit code by Matthew Daley released May 12th 2014.

The original post by the SUSE security team to oss-security announced that the vuln was found accidentally by a customer in production! You can find the patch at this link.

The core issue is located in the pty subsystem of the kernel and has been there for about five years. There was about one year in the middle where the vuln was not present; we'll talk about that a bit later in this post.

Background on the pty/tty subsystem

In order to fully understand the vuln we'll have to dive into the pty/tty subsystem of the Linux kernel, so let's start there.

A tty is "..an electromechanical typewriter paired with a communication channel." Back in the day a tty was made up of a keyboard for the input, a screen or similar display for the output and an OS process that was attached to this concept of tty. The process would then receive the input and its output would be redirected to the screen. Those days are long gone but command line applications are not (thankfully!) and today we mostly use pseudo terminals. The main difference here is that instead of a keyboard and screen another process sits at the master side of the pty (for example a terminal emulator). Think of a pty as a bidirectional pipe or socket with some additional hooks in place (for example if you type a ctrl-c on the master side the kernel will interpret it instead of sending it to the slave. In this case the kernel will send a SIGINT signal to the slave process which will often cause it to terminate execution).

It's the pty subsystem's job to take input from either side of the pty, look for specific bytes in the byte stream (e.g. a ctrl-c), process them and deliver everything else to the other side. There is additional logic involved here which is not present in other IPC concepts such as pipes or sockets. This logic ensures that characters you type at the master end are echoed back to it, that pressing the backspace key actually removes previously typed characters on the display, and that signals like SIGINT are sent when ctrl-c is typed. This logic is called the line discipline (ldisc for short). Upon receiving data from either side the kernel will store the data in a temporary buffer (struct tty_buffer) and queue a work item to process the incoming data (flush it to the line discipline) at a later point and deliver it to the client side. I assume this is mainly done for "real" terminals, whose input arrives in interrupt context (i.e. keyboard press, USB packet, ...) and should thus be handled as fast as possible. In this vuln we'll be racing one of these worker processes while it processes data to find the exploitable condition.


You can learn more about the pty subsystem here: http://www.linusakesson.net/programming/tty/

The vulnerability

For background we'll first need to introduce some important structures from include/linux/tty.h. (all source code excerpts were taken from Linux 3.2.58 except if stated otherwise):

struct tty_buffer {
    struct tty_buffer *next;
    char *char_buf_ptr;
    unsigned char *flag_buf_ptr;
    int used;
    int size;
    int commit;
    int read;
    /* Data points here */
    unsigned long data[0];
};
As seen above a tty_buffer data structure temporarily holds a fixed number (well under normal circumstances) of bytes that have arrived at one end of the tty and still need to be processed.
tty_buffer is a dynamically sized object, so the char_buf_ptr will always point at the first byte right after the struct and flag_buf_ptr will point to that address plus 'size'. tty_buffer.size (which is only the size of the char buffer) can be any of the following: 256, 512, 768, 1024, 1280, 1536 and 1792 (TTY_BUFFER_PAGE).

The actual size of the object is then calculated as follows: 2 x size (for characters + flags) + sizeof(tty_buffer) (for the header), causing the tty_buffer to live in one of the following three kernel heap slabs: kmalloc-1024, kmalloc-2048 or kmalloc-4096.
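
The mapping from buffer size to slab is quick to tabulate. The Python sketch below makes two assumptions: sizeof(struct tty_buffer) is 40 bytes (three pointers plus four ints on x86_64), and the allocator rounds up to the next power of two, which holds for the sizes in question even though kmalloc also has a few non-power-of-two caches for small objects.

```python
HEADER_SIZE = 40  # assumed sizeof(struct tty_buffer) on x86_64: 3 pointers + 4 ints

def object_size(size):
    """Total allocation: char buffer + flag buffer + header."""
    return 2 * size + HEADER_SIZE

def kmalloc_slab(nbytes):
    """Smallest power-of-two kmalloc cache that fits nbytes."""
    slab = 8
    while slab < nbytes:
        slab *= 2
    return "kmalloc-%d" % slab

# Every possible tty_buffer.size: 256, 512, ... up to 1792 (TTY_BUFFER_PAGE).
table = {size: kmalloc_slab(object_size(size)) for size in range(256, 1793, 256)}
for size, slab in table.items():
    print(size, object_size(size), slab)
```

Note that the smallest buffer (256 bytes of characters) already occupies 552 bytes and therefore lands in kmalloc-1024, leaving a good amount of slack at the end of the chunk.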

struct tty_bufhead {
    struct work_struct work;
    spinlock_t lock;
    struct tty_buffer *head;    /* Queue head */
    struct tty_buffer *tail;    /* Active buffer */
    struct tty_buffer *free;    /* Free queue head */
    int memory_used;            /* Buffer space used excluding free queue */
};
A tty_bufhead is, as the name implies, the head (or first) data structure for tty_buffers. It keeps a list of active buffers (head) while also storing a direct pointer to the last buffer (the currently active one) to improve performance. You will often see references to bufhead->tail in the kernel source code, meaning the currently active buffer is requested. It also keeps its own freelist for buffers smaller than 512 bytes (see drivers/tty/tty_buffer.c:tty_buffer_free()).

struct tty_struct {
    int magic;
    struct kref kref;
    struct device *dev;
    struct tty_driver *driver;
    const struct tty_operations *ops;
    /* ... */
    struct tty_bufhead buf;     /* Locked internally */
    /* ... */
};
The tty_struct data structure represents a tty/pty in kernel space. For the sake of this post all you need to know is that it stores the tty_bufhead and thus the buffers.


Alright, let's start with the function mentioned in the commit message, tty_insert_flip_string_fixed_flag() in drivers/tty/tty_buffer.c.
It is responsible for storing the given bytes in a tty_buffer of the tty device, allocating a new one if required:

The call chain leading up to this function roughly looks like this: write(pty_fd) in userspace -> sys_write() in kernelspace -> tty_write() -> pty_write() -> tty_insert_flip_string_fixed_flag()
int tty_insert_flip_string_fixed_flag(struct tty_struct *tty,
        const unsigned char *chars, char flag, size_t size)
{
    int copied = 0;
    do {
        int goal = min_t(size_t, size - copied, TTY_BUFFER_PAGE);
        int space = tty_buffer_request_room(tty, goal);     /* -1- */
        struct tty_buffer *tb = tty->buf.tail;
        /* If there is no space then tb may be NULL */
        if (unlikely(space == 0))
            break;
        memcpy(tb->char_buf_ptr + tb->used, chars, space);  /* -2- */
        memset(tb->flag_buf_ptr + tb->used, flag, space);
        tb->used += space;                                  /* -3- */
        copied += space;
        chars += space;
        /* There is a small chance that we need to split the data over
           several buffers. If this is the case we must loop */
    } while (unlikely(size > copied));
    return copied;
}
This function is fairly straightforward: At -1- tty_buffer_request_room ensures that enough space is available in the currently active buffer (tty_bufhead->tail), allocating a new one if required. At -2- the incoming data is written to the active buffer and at -3- the 'used' member is incremented. Note that tb->used is used as an index into the buffer.

The commit message mentions that two separate processes (a kernel worker process echoing data previously written to the master end and the process at the slave end writing to the pty directly) can enter this function at the same time due to a missing lock, thus causing a race condition.
So what could happen here? The commit message provides us with the following scenario:

            A                                       B 
__tty_buffer_request_room               
                                        __tty_buffer_request_room 
memcpy(buf(tb->used), ...) 
tb->used += space; 
                                        memcpy(buf(tb->used), ...) ->BOOM

Here we see two processes (A and B) writing to the pty at the same time. Since the first process updates tb->used first, the memcpy() of the second process will write past the end of the buffer (assuming the first write already filled the buffer) and thus cause the memory corruption.
Now this looks reasonable at first but is actually only part of the story.
Here are some observations that don't quite fit with this scenario:
- When running a simple PoC the kernel seems to crash very fast (on older kernels at least), while the scenario above seems relatively hard to achieve
- Looking at the debugger shows that often multiple pages of kernel data have been overwritten upon crashing. This can hardly be the case when only sending e.g. 2 x 4096 bytes at once

Also take a look at the following (slightly shortened) stacktrace, produced by setting a breakpoint at tty_insert_flip_string_fixed_flag()

 
#0  tty_insert_flip_string_fixed_flag (tty=tty@entry=0xffff880107a82800, 
    chars=0x0, flag=flag@entry=0 '\000', size=1)                      /* -1- */
#1  tty_insert_flip_string (size=<optimized out>, 
    chars=<optimized out>, tty=0xffff880107a82800)
#2  pty_write (tty=0xffff880117cd3800, buf=<optimized out>, c=<optimized out>)
#3  tty_put_char (tty=tty@entry=0xffff880117cd3800, ch=66 'B')        /* -2- */
#4  process_echoes (tty=0xffff880117cd3800)
#6  n_tty_receive_char (c=<optimized out>, tty=0xffff880117cd3800)
#7  n_tty_receive_buf (tty=0xffff880117cd3800, 
    cp=0xffff880117a78828 'B' ..., fp=0xffff880117a78a2d "", count=512)
#8  flush_to_ldisc (work=0xffff880117cd3910)
#9  process_one_work (worker=worker@entry=0xffff880118f507c0, 
    work=0xffff880117cd3910)
#10 worker_thread (__worker=__worker@entry=0xffff880118f507c0)
#11 kthread (_create=0xffff880118ed9d80)
#12 kernel_thread_helper ()

This is the code path a worker process takes when performing a flush to the line discipline. As can be seen at -1- and -2- the echoing is actually done byte by byte.
Clearly we can't cause much harm by only overwriting a buffer with a single byte when the chunk still has unused space left (as will be the case for tty_buffer objects).

In the following we will now assume that the race went something like this: Process A wrote 256 bytes, process B (performing an echo) entered tty_buffer_request_room() before A updated tb->used, causing it to not allocate a fresh buffer. Afterwards B wrote another byte to the same buffer and incremented tb->used further.

To understand what is really causing the memory corruption take a look at the tty_buffer_request_room() function called by tty_insert_flip_string_fixed_flag().

int tty_buffer_request_room(struct tty_struct *tty, size_t size)
{
    struct tty_buffer *b, *n;
    int left;                                           /* -1- */
    unsigned long flags;

    spin_lock_irqsave(&tty->buf.lock, flags);           /* -2- */

    /* OPTIMISATION: We could keep a per tty "zero" sized buffer to
       remove this conditional if its worth it. This would be invisible
       to the callers */
    if ((b = tty->buf.tail) != NULL)
        left = b->size - b->used;                       /* -3- */
    else
        left = 0;

    if (left < size) {                                  /* -4- */
        /* This is the slow path - looking for new buffers to use */
        if ((n = tty_buffer_find(tty, size)) != NULL) {
            if (b != NULL) {
                b->next = n;
                b->commit = b->used;
            } else
                tty->buf.head = n;
            tty->buf.tail = n;
        } else
            size = left;
    }

    spin_unlock_irqrestore(&tty->buf.lock, flags);
    return size;
}

Now things start to get interesting. Note how at -1- 'left' has type int while 'size' is of type size_t (aka unsigned long). Assuming we previously won the race and have written 257 bytes while the buffer was only 256 bytes large, we now have the following situation:
b->size is 256
b->used is 257

Looking at the code above, at -3- 'left' will now equal -1 and at -4- it will be converted to an unsigned value, resulting in 18446744073709551615 (assuming 64 bit long), which is definitely larger than the given size. The check 'left < size' therefore fails, the following block is skipped and no new buffer is allocated for the current request even though the current buffer is more than full.
At this point sending more data to the pty will result in the data being put into the same buffer, overflowing it further (remember 'used' is used as an index into the buffer). Since b->used will still be incremented for each byte we can now overflow as much data as we want.
Also note that this function is locked internally (at -2-), thus serializing access to it.
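
The comparison at -4- can be reproduced in user space. The following is a minimal sketch, not kernel code: the helper name needs_new_buffer() is made up for illustration, and the explicit cast only spells out the conversion that the compiler performs implicitly in the kernel function above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the check at -4-.
 * Returns 1 when the slow path (looking for a new buffer) would be taken. */
static int needs_new_buffer(int buf_size, int buf_used, size_t request)
{
    int left = buf_size - buf_used;   /* negative once used > size */

    assert(buf_size >= 0 && buf_used >= 0);

    /* 'left' is converted to size_t for the comparison, so a negative
     * value turns into a huge unsigned number and the test is false. */
    return (size_t)left < request;
}
```

With a size of 256, a request of one byte takes the slow path when 'used' is exactly 256, but not when 'used' is 257: the overfull buffer looks as if it had almost unlimited room.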

Now we are ready to draw an updated scenario that leads to an overflow:
        A (Slave)                          B (Echo)

tty_buffer_request_room                 
        |                     // waiting for A to release the lock
                              tty_buffer_request_room 
                              // tb->used < tb->size,
                              // no new buffer is allocated
memcpy(.., 256);
                                        
                              memcpy(.., 1);

tb->used += space; 
                              tb->used += space;    
                              // tb->used is now larger than tb->size


Note that we will win the race as soon as the echoing process enters tty_buffer_request_room and calculates 'left' before the first process gets to update tb->used. Since the whole memcpy() operation is in between, that time frame is relatively large.

So as far as race condition scenarios go, the single case mentioned in the commit message is only one possible way that can result in memory corruption (and only if A fills the buffer completely).
In general any sequence that results in tb->used being larger than tb->size will result in memory corruption later on. For that to happen the first process must send enough data to completely fill a buffer (i.e. tb->size bytes in total) while the echoing process must enter tty_buffer_request_room() before the first process updates tb->used (which leads to tty_buffer_request_room() not allocating a fresh buffer). The corruption is then caused by sending more data to the pty, which will continue to overflow the same buffer.
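
The interleaving described above can be replayed deterministically in user space. This is only a sketch under simplifying assumptions: struct toy_buffer, toy_fits() and toy_race() are invented names, no data is actually copied, and the two "processes" are just two steps executed in the problematic order.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct tty_buffer; only the two counters matter here. */
struct toy_buffer {
    int size;   /* capacity in bytes */
    int used;   /* bytes written so far, also the write index */
};

/* Same decision as the room check: returns 1 if the request is judged
 * to fit into the current buffer (i.e. no new buffer is allocated). */
static int toy_fits(const struct toy_buffer *b, size_t request)
{
    int left = b->size - b->used;
    return !((size_t)left < request);
}

/* Replays the race: both writers pass the check before either one
 * updates 'used'. Returns the final value of 'used'. */
static int toy_race(struct toy_buffer *b, size_t a_len, size_t b_len)
{
    int a_ok, b_ok;

    assert(b != NULL);
    a_ok = toy_fits(b, a_len);   /* A: room check */
    b_ok = toy_fits(b, b_len);   /* B: room check, 'used' is still stale */

    if (a_ok)
        b->used += (int)a_len;   /* A: memcpy() would happen here */
    if (b_ok)
        b->used += (int)b_len;   /* B: memcpy() would happen here */
    return b->used;
}
```

Starting from an empty 256-byte buffer, toy_race(&buf, 256, 1) leaves 'used' at 257, and from then on toy_fits() reports that any request fits, which matches the behaviour described for the real function.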

At this point the vuln turns into a standard kernel heap overflow.

And we'll conclude this section with a fun fact: The race in this vuln can actually be won by using just one process. This stems from the fact that we are racing a kernel worker process and not a second user-land process.

Getting to root - The exploit

Here we want to quickly analyze the published exploit code which will hopefully be easy to understand now that the details of the vuln are known.

Going step-by-step through the PoC's console output we see...

[+] Resolving symbols

Yep, that's what it's doing. Note that some modern distributions (notably Ubuntu) set /proc/sys/kernel/kptr_restrict to 1, which hides the addresses in /proc/kallsyms from unprivileged users. For repository kernels this is merely an inconvenience though, since the kernel image (and System.map) can be downloaded locally and the addresses taken from there.
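
Both /proc/kallsyms and System.map use one "address type name" entry per line, so resolving a symbol boils down to a simple line match. The sketch below is not taken from the PoC; match_symbol() is a hypothetical helper and the address used in the usage note is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parses one line in kallsyms/System.map format ("<hex address> <type> <name>")
 * and returns the address if the name matches, 0 otherwise. A result of 0
 * also covers the case where the addresses are hidden and read as zeros. */
static unsigned long long match_symbol(const char *line, const char *wanted)
{
    unsigned long long addr;
    char type;
    char name[256];

    assert(line != NULL && wanted != NULL);

    if (sscanf(line, "%llx %c %255s", &addr, &type, name) != 3)
        return 0;
    return strcmp(name, wanted) == 0 ? addr : 0;
}
```

For example, match_symbol("ffffffff8108b7a0 T commit_creds", "commit_creds") yields 0xffffffff8108b7a0, while a zeroed line such as "0000000000000000 T commit_creds" yields 0 and signals that another source for the address is needed.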

[+] Doing once-off allocations

Stabilizing the heap. We need to make sure existing holes are filled to maximize the chances of getting objects laid out linearly in the address space. We want our target buffer to be followed by one of our target objects (struct tty_struct).

[+] Attempting to overflow into a tty_struct... 

Now we are racing.

This is fairly straightforward: open a pty, spawn a new thread and write to both ends at the same time. Afterwards the child thread will send the data needed to overflow into the adjacent chunk. Assuming the race has been won at the start, there is no time pressure on these operations, as discussed above.
Also note that only one byte is sent to the master end, this is done so the number of bytes that has yet to be sent can be calculated.

The exploit targets tty_struct structures which end up in the kmalloc-1024 slab cache. The buffer we will overflow will thus have to be in that cache as well (so tb->size = 256 which is also the minimum size). Before writing to the slave end the first time (to allocate a fresh buffer) the exploit creates a bunch of new pty's, thus allocating tty_structs in kernel space. It will then close one of them in hopes that the newly allocated buffer will end up in the freed chunk. If this works out we will have a bunch of tty_structs, followed by the buffer followed by more tty_structs in the kernel address space.

Let's take a quick look at the function executed by the new thread to overflow into the following chunk:

void *overwrite_thread_fn(void *p)
{
    write(slave_fd, buf, 511);
    write(slave_fd, buf, 1024 - 32 - (1 + 511 + 1));
    write(slave_fd, &overwrite, sizeof(overwrite));
}

The first write here will fill the previously allocated buffer (right after closing one of the ptys we allocated a new buffer by writing one byte to the slave fd). Note that the author assumes the buffer to hold 512 bytes while its size is 256 (MIN_TTYB_SIZE). The reason for that is that on newer releases the kernel can use the flag buffer for data as well (if it knows the flags won't be needed), so the usable size of the buffer is doubled.

The next write fills the memory chunk of the buffer completely. The chunk is 1024 bytes large and so far we have written 32 bytes (sizeof(struct tty_buffer)) + 511 + 1 (the first write to the slave fd) + 1 (the echoed byte from the master fd).

The final write overflows into the next heap chunk with a previously created fake tty_struct.
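
The length of the second write can be double-checked with a little arithmetic. The constants below are taken from the text above (a 1024 byte chunk, a 32 byte tty_buffer header); the enum names and padding_to_chunk_end() are invented for this sketch.

```c
#include <assert.h>

enum {
    CHUNK_SIZE  = 1024, /* kmalloc-1024 object size */
    HEADER_SIZE = 32,   /* sizeof(struct tty_buffer), as stated in the text */
    FIRST_WRITE = 1,    /* byte written to the slave fd to allocate the buffer */
    ECHOED_BYTE = 1,    /* byte echoed from the master fd */
    FILL_WRITE  = 511   /* first write in overwrite_thread_fn() */
};

/* Bytes still needed to reach the end of the chunk after the first write. */
static int padding_to_chunk_end(void)
{
    int pad = CHUNK_SIZE - HEADER_SIZE - (FIRST_WRITE + FILL_WRITE + ECHOED_BYTE);

    assert(pad > 0);
    return pad;
}
```

The result is 479 bytes, and 32 + 1 + 1 + 511 + 479 adds up to exactly 1024, so the third write starts at the first byte of the neighbouring chunk.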

Now remember that tty_struct has a member 'ops' that is a pointer to a tty_operations struct? Well, those ops members in the Linux kernel are always pointers to structures holding function pointers themselves (if you're familiar with C++, this is similar to the vtable pointer of C++ objects). These function pointers correspond to actions performed on the device: there's one for open(), one for close(), one for ioctl() and so on. Now assuming we have overwritten the object, 'ops' will be under our control, pointing into user space. There we have prepared an array of function pointers pointing to our kernel payload.
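
The dispatch mechanism can be illustrated with a harmless user-space analogue. Everything in this sketch (struct toy_ops, struct toy_tty, the handlers) is invented for illustration and has nothing to do with the real kernel structures beyond the general pattern of calling through an 'ops' pointer.

```c
#include <assert.h>
#include <string.h>

/* Toy analogue of struct tty_operations: a table of function pointers. */
struct toy_ops {
    int (*ioctl)(int cmd);
};

/* Toy analogue of struct tty_struct: the object stores a pointer to
 * its operations table. */
struct toy_tty {
    const struct toy_ops *ops;
};

static int real_ioctl(int cmd)    { return cmd; }            /* normal handler */
static int payload_ioctl(int cmd) { (void)cmd; return -1; }  /* stand-in for a payload */

static const struct toy_ops real_ops    = { real_ioctl };
static const struct toy_ops payload_ops = { payload_ioctl };

/* Models the final write: the victim object is replaced byte for byte. */
static void toy_overwrite(struct toy_tty *victim, const struct toy_tty *fake)
{
    assert(victim != NULL && fake != NULL);
    memcpy(victim, fake, sizeof(*victim));
}

/* Dispatches through whatever table 'ops' currently points to. */
static int toy_do_ioctl(const struct toy_tty *tty, int cmd)
{
    return tty->ops->ioctl(cmd);
}
```

Before the overwrite, toy_do_ioctl() ends up in real_ioctl(); after toy_overwrite() with a fake object whose 'ops' points to payload_ops, the very same call lands in payload_ioctl() without the caller noticing any difference.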

Now as soon as we perform an ioctl on the tty device we will hijack the kernel control flow and redirect it into the payload. There we'll execute the standard prepare_kernel_cred(0) followed by commit_creds(), elevating our privileges to root:

[+] Got it :)

# id
 uid=0(root) gid=0(root) groups=0(root)


Note that SMEP/SMAP, as well as grsecurity, will prevent this exploit, since they stop the kernel from accessing user-land data (SMAP) and from executing user-land code (SMEP).

Limitations

Unlike most other race conditions, in the case of this vuln the attacker is only able to control one of the two processes. Kernel worker processes will check for new work items regularly but can't really be affected by user space. This seems to make a huge difference for different kernel versions, on 3.2 it usually only takes a couple seconds to win the race while on 3.14 it can take multiple minutes.

As mentioned in the PoC code another thing that limits the reliability is the slab cache size in use. As previously discussed the buffer can only be in one of the following slabs: kmalloc-1024, kmalloc-2048 and kmalloc-4096. At sizes this big the chance of hitting the last chunk in the last page of a slab becomes more likely, further limiting the reliability. When that happens the code will overflow into uncontrolled data. This might have no consequences (no important data has been overwritten), lead to a crash later on (some object has been overwritten that is referenced at some point in the future) or even lead to an immediate panic/Oops (for example when the next page is mapped read only).

As also mentioned in the PoC exploit, the flags cause some trouble on older kernels (before the commit acc0f67f307f52f7aec1cffdc40a786c15dd21d9) as b->size bytes following the overwritten part will always be cleared to zero. Thus when overwriting a controlled object either the whole object needs to be restored (and the zeros written into unused space before the end of the chunk) or an object needs to be found where parts of it can safely be overwritten with zeros.

For the last part it might be possible to target tty_buffer objects when exploiting the vuln on pre 3.14 kernels. Here the header can be overwritten, yielding an arbitrary write (overwrite char_buf_ptr and afterwards send data to the pty) while the zeroes can safely be written into the buffer space and won't cause any trouble.

Is Android vulnerable?

As stated in the advisory the vulnerability dates back to 2.6.x kernels, roughly 5 years old. That would imply that pretty much every Android device out there is vulnerable to this issue. Running a quick PoC on a newer device (for example the Nexus 5, HTC One or Galaxy S4), it seems the race can never be won there though. Let's again take a look at some kernel source code, this time from the HTC One (m7) Cyanogenmod kernel source.

int tty_insert_flip_string_fixed_flag(struct tty_struct *tty,
        const unsigned char *chars, char flag, size_t size)
{
    int copied = 0;
    do {
        int goal = min_t(size_t, size - copied, TTY_BUFFER_PAGE);
        int space;
        unsigned long flags;
        struct tty_buffer *tb;

        spin_lock_irqsave(&tty->buf.lock, flags);       /* -1- */
        space = __tty_buffer_request_room(tty, goal);
        tb = tty->buf.tail;
        if (unlikely(space == 0)) {
            spin_unlock_irqrestore(&tty->buf.lock, flags);
            break;
        }
        memcpy(tb->char_buf_ptr + tb->used, chars, space);
        memset(tb->flag_buf_ptr + tb->used, flag, space);
        tb->used += space;
        spin_unlock_irqrestore(&tty->buf.lock, flags);
        copied += space;
        chars += space;
    } while (unlikely(size > copied));
    return copied;
}

The interesting difference is that at -1- we see that the function here is actually locked internally. Now, as stated above, to win the race the second process needs to enter __tty_buffer_request_room() before the first process updates tb->used. This is not possible if the function is locked like this.
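
The effect of holding the lock across the room check and the update of 'used' can be sketched with POSIX threads. This is a user-space analogue under simplifying assumptions, not the kernel code: struct locked_buffer and all function names are invented, a mutex stands in for the spinlock, and a full buffer simply rejects data instead of chaining a new buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct locked_buffer {
    pthread_mutex_t lock;
    int size;
    int used;
};

/* Room check and update of 'used' happen under the same lock, so no other
 * writer can act on a stale 'used'. Returns the number of bytes accepted. */
static int locked_append(struct locked_buffer *b, int len)
{
    int space;

    pthread_mutex_lock(&b->lock);
    space = b->size - b->used;
    if (space > len)
        space = len;
    b->used += space;   /* the memcpy() would sit right before this line */
    pthread_mutex_unlock(&b->lock);
    return space;
}

static void *writer(void *arg)
{
    struct locked_buffer *b = arg;

    for (int i = 0; i < 100000; i++)
        locked_append(b, 1);
    return NULL;
}

/* Runs two concurrent writers and returns the final value of 'used'. */
static int run_two_writers(int size)
{
    struct locked_buffer b;
    pthread_t t1, t2;
    int rc;

    b.size = size;
    b.used = 0;
    rc = pthread_mutex_init(&b.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, writer, &b);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, writer, &b);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&b.lock);
    return b.used;
}
```

No matter how the two writers are scheduled, run_two_writers(256) ends with 'used' at exactly 256, never above it. Depending on the toolchain the program may need to be built with -pthread.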

Taking a look at the git history of the Linux kernel, it turns out that all kernels between c56a00a165712fd73081f40044b1e64407bb1875 (March 2012) and 64325a3be08d364a62ee8f84b2cf86934bc2544a (January 2013) are not affected by this vuln, as tty_insert_flip_string_fixed_flag() was internally locked there.

For Android that means quite a few of the newer devices are not vulnerable to this issue. Most of the older ones are though, and some newer ones have integrated the 64325a3be08d364a62ee8f84b2cf86934bc2544a Linux kernel patch, making them vulnerable again.

Conclusion

Kernel exploits are hard, making them reliable is even harder! This concludes our analysis of CVE-2014-0196, we hope you have gained some deeper understanding of this vuln and kernel level security in general. For more details on Linux kernel exploitation you can take a look at our last post: How to exploit the x32 recvmmsg() kernel vulnerability CVE 2014-0038

If you have feedback or have worked on something similar let us know, you can email us at: info/at\includesecurity.com