Wednesday, February 8, 2012

I can still see your actions on Google Maps over SSL

A while ago, yours truly gave two talks on SSL traffic analysis: one at 44Con and one at RuxCon. A demonstration of the tool was also given at last year's BlackHat Arsenal by two of my co-workers. The presented research and tool may not have been as groundbreaking as some of the other talks at those conferences, but attendees seemed to like it, so I figured it might make some good blog content. 



Traffic analysis is definitely not a new field, neither in general nor when applied to SSL; a lot of great work has been done by reputable research outlets, such as Microsoft Research with researchers like George Danezis. What recent traffic analysis research has tried to show is that there are enormous amounts of useful information to be obtained by an attacker who can monitor the encrypted communication stream. 

A great example of this can be found in the paper with the slightly cheesy title Side-Channel Leaks in Web Applications: a Reality Today, a Challenge Tomorrow. The paper discusses some approaches to traffic analysis on SSL-encrypted web applications and applies them to real-world systems. One of the approaches enables an attacker to build a database that contains traffic patterns of the autocomplete function in drop-down form fields (like Google's autocomplete). Another great example is the ability, for a specific type of stock management web application, to reconstruct pie charts in a couple of days and figure out the contents of someone's stock portfolio.


After discussing these attack types with some of our customers, I noticed that most of them seemed to have some difficulty grasping the potential impact of traffic analysis on their web applications. The research papers I referred them to are quite dry and written in dense, scientific language that does nothing to ease understanding. So I decided to spend some of my dedicated research time on a proof-of-concept tool built around a web application that everyone knows and understands: Google Maps.

Since ignorance is bliss, I decided to just jump in and try to build something without even running the numbers on whether it would make any sense to try. I started by running Firefox and Firebug in an effort to make sense of all the JavaScript voodoo going on there. I quickly figured out that Google Maps works by using a grid system in which PNG images (referred to as tiles) are laid out. Latitude and longitude coordinates are converted to x and y values depending on the selected zoom level; this gives a three-dimensional coordinate system in which each separate (x, y, z) triplet represents two PNG images. The first image is called the overlay image and contains the town, river, and highway names and so forth; the second image contains the actual satellite data.
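
To make the coordinate conversion a bit more concrete, here is a minimal Python sketch. It assumes the widely used Web Mercator tiling scheme, in which zoom level z divides the world into 2^z by 2^z tiles; the exact conversion inside the Google Maps JavaScript is not spelled out in this post, and the function name latlon_to_tile is made up for illustration.

```python
import math


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Convert latitude/longitude in degrees to an (x, y, z) tile triplet.

    Assumption: standard Web Mercator tiling with 2**zoom tiles per axis,
    x growing eastwards and y growing southwards. The projection is only
    meaningful for latitudes between roughly -85.05 and +85.05 degrees.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)

    # Longitude maps linearly onto the x axis.
    x = int((lon_deg + 180.0) / 360.0 * n)
    # Latitude goes through the Mercator projection before scaling.
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)

    # Clamp to the valid tile range so edge values stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y, zoom


if __name__ == "__main__":
    # Approximate coordinates of central Paris at zoom level 10.
    print(latlon_to_tile(48.8566, 2.3522, 10))
```

Each zoom step doubles the number of tiles per axis, so the number of possible triplets (and of images to scrape) grows by a factor of four per zoom level.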

Once I had this figured out, the approach became simple: scrape a lot of satellite tiles and build a database of the image sizes using the tool GMapCatcher. I then built a tool that uses libpcap to approximate the image sizes by monitoring the SSL-encrypted traffic on the wire. The tool tries to match the image sizes to the recorded (x, y, z) triplets in the database and then tries to cluster the results into a specific region. This is notoriously difficult to do, since the number of false positives grows as the database gets bigger. Add to this the fact that it is next to impossible to scrape the entire Google Maps database since, first, they will ban you for generating so much traffic and, second, you will have to store many petabytes of image data.
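
The matching step can be sketched roughly as follows in Python. This is not the code of the actual tool: the record layout, the function names, the byte tolerance, and the tile sizes in the example are all made up for illustration, and a simple majority vote per region stands in for the clustering described above.

```python
from collections import Counter, defaultdict


def build_size_index(records):
    """Index scraped tiles by image size.

    `records` is an iterable of (region, (x, y, z), size_in_bytes) tuples,
    a hypothetical layout for the scraped database.
    """
    index = defaultdict(list)
    for region, triplet, size in records:
        index[size].append((region, triplet))
    return index


def candidates(index, observed_size, tolerance=0):
    """Return every (region, triplet) whose recorded size lies within
    `tolerance` bytes of a size estimated from the encrypted stream."""
    matches = []
    for size in range(observed_size - tolerance, observed_size + tolerance + 1):
        matches.extend(index.get(size, []))
    return matches


def guess_region(index, observed_sizes, tolerance=0):
    """Let every observed size vote once for each region it could belong to
    and return the region with the most votes (None if nothing matched)."""
    votes = Counter()
    for size in observed_sizes:
        regions = {region for region, _ in candidates(index, size, tolerance)}
        votes.update(regions)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Invented sizes; two tiles deliberately collide to show a false positive.
    db = [
        ("Paris", (518, 352, 10), 15234),
        ("Paris", (519, 352, 10), 14877),
        ("Berlin", (550, 335, 10), 15234),
        ("Berlin", (551, 335, 10), 16002),
    ]
    index = build_size_index(db)
    print(guess_region(index, [15236, 14875], tolerance=2))
```

A single size can match tiles in several regions, which is exactly the false-positive problem; the guess only becomes usable once several tiles from the same page load point at the same region. Widening the tolerance makes the match more robust against imprecise size estimates but also lets in more false positives.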

With a little bit of cheating (I used a large browser screen so I would have more data to work with), I managed to make the movie Proof of Concept - SSL Traffic Analysis of Google Maps.



As shown in the movie, the tool has a database that contains city profiles including Paris, Berlin, Amsterdam, Brussels, and Geneva. The tool runs on the right and on the left is the browser accessing Google Maps over SSL. On the first attempt, I load the city of Paris and zoom in a couple of times. On the second attempt, I navigate to Berlin and zoom in a few times. On both occasions the tool manages to correctly guess the locations that the browser is accessing.

Please note that this is a shoddy proof of concept, but it illustrates SSL traffic analysis pretty well. It also might be easier to understand for less technically inclined people, as in "An attacker can still figure out what you're looking at on Google Maps" (with the addendum that it's never going to be 100% perfect and that my proof of concept has lots of room for improvement).

For more specific details on this please refer to the IOActive white paper Traffic Analysis on Google Maps with GMaps-Trafficker or send me a tweet at @santaragolabs.

10 comments:

  1. Yes, but if Wi-Fi is used I can look over your shoulder and see what town you are looking at. SCNR...

    Is this also possible if the data is VPN encrypted?

  2. It depends; the tool I made obviously won't work anymore, since it requires capturing the actual SSL stream. However, if I can analyze the VPN traffic and extract the traffic patterns that correspond to Google Maps traffic, it *might* be doable. But there are too many other factors in play that might prevent an attacker from doing it. I wouldn't bet on it not being possible, though.

  3. This is great research, Vincent. The Google-related VPN traffic analysis would be epic if you can pull it off. Very good work, man, and totally original!

  4. Hey Dillon, thanks for the compliments!

    Yeah, I suspect there are possibilities there. There's a cool tool by Michal Zalewski (lcamtuf for the uninitiated) called fl0p.

    Quote: "fl0p is a passive, layer 7 flow fingerprinter that does not look at packet payloads, only at their relative sizes, direction, and timing. It can be used to peek into encrypted tunnels [...] and much more."

    You can download it at http://lcamtuf.coredump.cx. I think that would be a good starting point to try and do something like this. The biggest problem one first has to overcome is to differentiate between all the other traffic going over the VPN and the traffic going to Google Maps; my gut instinct tells me that it will be very hard, if not impossible. But who knows?

  5. ssh -vCND 33333 your.server.tld

  6. Some specific Web Mapping Servers have been built with this in mind, and the standard allows for the fluffing up of the payloads in a specific way (one that is as compressible as the actual payload, so as not to cause noticeable entropy flux changes). Furthermore, some of the known web mapping services will go a step further and tile in a fractal-style circling pattern that differs based on history rather than region of interest. It would be nice to repeat this with a WMS/1.2 service thus configured.

    Dw.

  7. @dirkx: I definitely didn't know that. That's interesting. Thanks for the pointers. It sounds like they've got it all taken care of then. Maybe I'll get back to this topic one day and check it out to make sure. Thanks again!

  8. Chrome uses SPDY (not plain HTTP), which includes interleaving requests and persistent connections, thus wrecking a large part of your opportunity, but hey, not *totally* wrecking it, and it's a minor market share anyhow.

  9. @yawn: It only uses SPDY on Google Maps and other Google services. It's going to take a long time (if ever) before everyone uses SPDY. There's no reason as far as I can see now why this wouldn't work on Chrome accessing Bing Maps.

  10. Wondering whether the interesting data isn't already exposed by the URLs used?
