Homeland Security Watch

News and analysis of critical issues in homeland security

July 9, 2013

How to spy on yourself without really trying.

Filed under: Intelligence and Info-Sharing — by Christopher Bellavita on July 9, 2013

A friend sent me an email this morning with this subject line: “This is Amazing.”

The message said:

Check this metadata app (you can only use it if you use a gmail account): immersion.media.mit.edu 

I wasn’t the only one to learn about this new creation from the MIT Media Lab.  A lot of people wanted to try it out. So it took a long time to get through. But eventually I did.

I gave the Media Lab permission to see the metadata from my gmail account. Yes, you have to surrender your privacy to see what surrendering your privacy could be like. But what the hell. It’s only metadata. Metadata’s innocuous.

If you’d like to try Immersion, but either don’t use gmail or don’t want to share your account with MIT, here’s a link to an Immersion demonstration:  https://immersion.media.mit.edu/demo

And here is a link to a seven-minute video explaining Immersion: https://vimeo.com/69464265

Here’s what the Media Lab’s Immersion Project showed me about my gmail metadata, covering 2004 through July 2013 (names removed):

[Image: Immersion network map of my gmail metadata, names removed]

Interesting, but what could it mean?

I found James Vincent’s description of the Immersion Project in The Independent:

Plugging your Gmail address into MIT’s Immersion allows the system to scrape your email account for its metadata, and produces a complex bubble map showing who you talk to, how much you talk to them, and what your relationships with your contacts are.

Vincent’s article led me to a blog post by Ethan Zuckerman, describing how he used the tool.

Among his observations:

The Obama administration and supporters have responded to criticism of these programs [identified by Snowden] by assuring Americans that the information collected is “metadata”, information on who is talking to whom, not the substance of conversations. As Senator Dianne Feinstein put it, “This is just metadata. There is no content involved.” By analyzing the metadata, officials claim, they can identify potential suspects then seek judicial permission to access the content directly. Nothing to worry about. You’re not being spied on by your government – they’re just monitoring the metadata.

Sociologist Kieran Healy shows another set of applications of these techniques, using a much smaller, historical data set. He looks at a small number of 18th century colonists and the societies in Boston they were members of to identify Paul Revere as a key bridge tie between different organizations. In Healy’s brilliant piece, he writes in the voice of a junior analyst reporting his findings to superiors in the British government, and suggests that his superiors consider investigating Revere as a traitor. He closes with this winning line: “…if a mere scribe such as I — one who knows nearly nothing — can use the very simplest of these methods to pick the name of a traitor like Paul Revere from those of two hundred and fifty four other men, using nothing but a list of memberships and a portable calculating engine, then just think what weapons we might wield in the defense of liberty one or two centuries from now.”
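
The kind of analysis described above, picking out a bridging person from nothing but membership lists, can be sketched in a few lines. This is a minimal illustration, not Healy's actual method or data: the club names and surnames are invented, and ranking people by how many distinct others they share a group with is a simpler stand-in for the centrality measures a real analysis would use.

```python
from itertools import combinations

# Hypothetical membership lists (invented names, not historical data).
MEMBERSHIPS = {
    "Club A": {"Adams", "Brown", "Rivers"},
    "Club B": {"Clark", "Davis", "Rivers"},
    "Club C": {"Brown", "Evans", "Rivers"},
}

def shared_groups(memberships):
    """For each pair of people, count how many groups they both belong to."""
    ties = {}
    for members in memberships.values():
        for a, b in combinations(sorted(members), 2):
            ties[(a, b)] = ties.get((a, b), 0) + 1
    return ties

def bridge_candidates(memberships):
    """Rank people by the number of distinct others they share a group with."""
    neighbours = {}
    for a, b in shared_groups(memberships):
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Most connections first; ties broken alphabetically for a stable order.
    return sorted(neighbours, key=lambda p: (-len(neighbours[p]), p))

ranking = bridge_candidates(MEMBERSHIPS)
print(ranking[0])  # the person linking the most otherwise separate groups
```

No one's letters or conversations appear anywhere in the input. The only data is who belongs to what.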

Zuckerman published the Immersion Project’s image of his gmail account, along with an analysis.

[Image: Zuckerman’s Immersion network map]

The largest node in the graph, the person I exchange the most email with, is my wife, Rachel. I find this reassuring, but [two people involved with Immersion] have told me that people’s romantic partners are rarely their largest node. Because I travel a lot, Rachel and I have a heavily email-dependent relationship, but many people’s romantic relationships are conducted mostly face to face and don’t show up clearly in metadata. But the prominence of Rachel in the graph is, for me, a reminder that one of the reasons we might be concerned about metadata is that it shows strong relationships, whether those relationships are widely known or are secret.

The Immersion image of my emails allowed me to identify people who are key in my network. Here’s an image of one of them; again, I have removed the names:

[Image: Immersion view centered on one person in my network, names removed]

I am also able to see, based on the thickness of the connecting lines, who in my network has the strongest ties to this central person. And that’s just scratching the metadata surface.
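
To make concrete how little is needed to draw such a map, here is a minimal sketch that builds node sizes and tie strengths from header fields alone. It is not Immersion's code. The addresses, the `HEADERS` sample, and the `build_network` helper are all hypothetical, and the counting rule (one point per message a contact appears on, one point per message two contacts share) is an assumption chosen for simplicity.

```python
from collections import Counter
from itertools import combinations

# Hypothetical header-only records: (sender, recipients). No subjects, no bodies.
ME = "me@example.com"
HEADERS = [
    ("me@example.com", ["ann@example.com", "bob@example.com"]),
    ("ann@example.com", ["me@example.com", "bob@example.com"]),
    ("cat@example.com", ["me@example.com"]),
    ("me@example.com", ["ann@example.com"]),
]

def build_network(headers, owner):
    """Return (node_size, tie_weight) computed from sender/recipient fields only."""
    node_size = Counter()   # how many messages each contact appears on
    tie_weight = Counter()  # how many messages each pair of contacts shares
    for sender, recipients in headers:
        people = sorted({sender, *recipients} - {owner})
        node_size.update(people)
        for a, b in combinations(people, 2):
            tie_weight[(a, b)] += 1
    return node_size, tie_weight

node_size, tie_weight = build_network(HEADERS, ME)
print(node_size.most_common(1))  # the largest node in the map
print(tie_weight.most_common(1))  # the thickest connecting line
```

With nine years of real headers instead of four invented ones, the same counting would surface the clusters and central people described above.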

Back to Zuckerman’s blog. After describing some additional implications of his Immersion-generated social network image, he writes:

My point here isn’t to elucidate all the peculiarities of my social network (indeed, analyzing these diagrams is a bit like analyzing your dreams – fascinating to you, but off-putting to everyone else). It’s to make the case that this metadata paints a very revealing portrait of oneself. And while there’s currently a waiting list to use Immersion, this is data that’s accessible to NSA analysts and to the marketing teams at Google. [my emphasis] That makes me uncomfortable, and it makes me want to have a public conversation about what’s okay and what’s not okay to track.

Jonathan O’Donnell commented on Zuckerman’s post with a brief literature review about the consequences of data tracking (see the original posting for links to the cited research):

For me, the classic paper in this area is Paul Ohm’s analysis of why anonymization doesn’t work. He shows that small amounts of metadata, and a modicum of known facts, will reveal big amounts of private information (Ohm, 2010).

For example:
In 2007, two students at Massachusetts Institute of Technology (MIT) analyzed the Facebook profiles of 6,000 past and present MIT students. They demonstrated that they were able to predict, with a very high degree of certainty, whether someone was gay or not, based on their friendship group (Jernigan & Mistree, 2009).

In 2009, Acquisti and Gross demonstrated that they could ‘guess’ a large number of American social security numbers using just the birth date and place of a person (Acquisti and Gross, 2009).

In 2009, Zheleva and Getoor demonstrated that friendship and group affiliation on social networks could be used to recover the information of private-profile users. They found that they could predict (with reasonable degrees of success) country of residence (Flickr), gender (Facebook), breed of dog (Dogster) and whether someone was a spammer (BibSonomy), even when 50% of the sample group were private-profile users (Zheleva and Getoor, 2009).

In 2011, Calandrino and others demonstrated that you could use the “You might also like” feature on Hunch, Last.fm, LibraryThing, and Amazon to predict individual purchasing, listening and reading habits of users of these systems. As long as you knew a small number of items that were true about a person, you could use the system to investigate their private behaviour on these sites (Calandrino et al, 2011).

…I’m pretty sure that these techniques can be chained, so that if you are a prolific user of social networks, people can tell your gender, sexual orientation, country of residence, breed of dog, purchasing, listening, reading and spamming activities, your social security number and your name, even if you were anonymous.

But so what, if you’ve done nothing wrong? Why be concerned?

Some of my colleagues ask me that.

I know of at least one major police department that is concerned the ease of social network tracking is making life more dangerous for its undercover officers. The officers practice safe social networking. But they have little control over the social network practices of other people in their professional and social networks — let alone control over the people in the friends of their friends networks.  It gets megacomplex really quickly.

A few months ago, Bruce Schneier wrote that it’s too late to talk about control.  The Internet won, he says.  Privacy lost.

The Internet is a surveillance state. Whether we admit it to ourselves or not, and whether we like it or not, we’re being tracked all the time. … [It] is ubiquitous surveillance: All of us being watched, all the time, and that data being stored forever. This is what a surveillance state looks like, and it’s efficient beyond the wildest dreams of George Orwell.

Sure, we can take measures to prevent this. We can limit what we search on Google from our iPhones, and instead use computer web browsers that allow us to delete cookies. We can use an alias on Facebook. We can turn our cell phones off and spend cash. But increasingly, none of it matters.

There are simply too many ways to be tracked. The Internet, e-mail, cell phones, web browsers, social networking sites, search engines: these have become necessities, and it’s fanciful to expect people to simply refuse to use them just because they don’t like the spying, especially since the full extent of such spying is deliberately hidden from us and there are few alternatives being marketed by companies that don’t spy.

So, we’re done. Welcome to a world where Google knows exactly what sort of porn you all like, and more about your interests than your spouse does. Welcome to a world where your cell phone company knows exactly where you are all the time. Welcome to the end of private conversations, because increasingly your conversations are conducted by e-mail, text, or social networking sites.

And welcome to a world where all of this, and everything else that you do or is done on a computer, is saved, correlated, studied, passed around from company to company without your knowledge or consent; and where the government accesses it at will without a warrant.

Welcome to an Internet without privacy, and we’ve ended up here with hardly a fight.

Oh well, there’s always Pong.  Pong’s innocuous.



Comment by Philip J. Palin

July 9, 2013 @ 7:09 am


Very nice. What does it mean? For me it means that over the last four-plus years I have had four mostly separate networks organized around six dominant nodes. One of these nodes has been you.

When I compare the aggregated data against the data for the last year, three of my four networks are in consistent decline. Only the smallest — most personal — network is close to unchanged.

Especially given that my communications and connections are heavily email oriented this is, I think, significant and, probably, indicates a major shift in professional direction… probably less robust than over the prior period.

An additional bit of vulnerability analysis.

Comment by Donald Quixote

July 9, 2013 @ 12:13 pm

With so many social media addicts sharing every immediate and possible thought and observation electronically, does a majority of the population really care at the end of the day? It may be a generational concern, declining as does the older population that remembers previous abuses and their consequences.

I now know to never email you for I have a tremendous fear of black helicopters over my house.

Comment by Philip J. Palin

July 9, 2013 @ 3:11 pm


With a bit more time to look at the data, several more bits of analysis including:

My four mostly independent networks could be characterized as:

State/Local Homeland Security (largest)
Private sector homeland security (almost as large)
Federal (half as big)
Personal (really small)

I am surprised how separate the first three networks seem to be. Why is there not more overlap?

I also notice that while my sent emails have increased significantly over the last year, my received emails have declined, even as my overall level of interaction has declined. So I seem to be doing a monologue instead of having a dialogue. Not a good sign.

So… potentially helpful in many ways.

As I have otherwise noted, in many ways I wish Google, the government, and others could not track me so easily (I recently purchased a book by C.G. Jung from Amazon; it is amazing the offers and ads this has spawned). But given the track-ability that is built into systems on which I increasingly depend, the issue is not if but how and for what purpose the tracking is done. Oversight, checks-and-balances, laws, regulations, and self-awareness are all encouraged.
