Email privacy crash course – Part 3: Metadata and Anonymity


“Although people like to talk about ‘just’ metadata, metadata is extraordinarily intrusive. As an analyst, I’d prefer to be looking at metadata rather than content because it’s quicker and it’s easier and it doesn’t lie.”

Edward Snowden

 

What is metadata and why should I care?

“Metadata” sounds technical and boring (unless you are a geek), and you may be bewildered by Snowden’s cryptic quotation. Most privacy-sensitive people will just use end-to-end PGP encryption and assume that their privacy is protected. While you definitely should encrypt your email, you should not assume that this guarantees your privacy. For this, you also need to protect your metadata. Let us tell you about metadata in non-geeky terms.

 


Metadata is the stuff contained in a “header” (see the picture below) that is attached to each of your email messages. Email clients hide the header and display only the message body and attachments. The header has two parts: the bookkeeping part and the personal part. The former contains no personally identifiable information (PII), and is of no interest to us here. The personal part of the header, on the other hand, contains PII that is disastrous for your privacy. Specifically, it usually includes all of the following:

  1. Your IP address
  2. Your email address
  3. Email addresses of all recipients, including those on cc and bcc
  4. The email client that you use
  5. The unique ID of your computer
  6. The language in which you write your emails
  7. The Subject line
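
To make the list above concrete, here is a minimal sketch that reads the header of a raw message with Python's standard `email` module. The message is invented for illustration: the addresses, the IP address, the client name and the helper `personal_header_fields` are all hypothetical, and real headers vary between clients and providers. Note that the body is only an encrypted placeholder, yet every header field is readable as plain text.

```python
from email import message_from_string

# Hypothetical raw message; every value below is made up for illustration.
RAW_MESSAGE = """\
Received: from [203.0.113.7] by smtp.example.org; Mon, 01 Feb 2016 10:15:00 +0000
From: alice@example.org
To: bob@example.com
Cc: carol@example.net
Subject: Appointment with Dr. Smith
User-Agent: ExampleMail 1.0
Content-Language: en-US
Message-ID: <1234@alice-laptop.example.org>

-----BEGIN PGP MESSAGE-----
(encrypted body omitted)
-----END PGP MESSAGE-----
"""

def personal_header_fields(raw):
    """Return the header fields that can identify the sender and recipients."""
    msg = message_from_string(raw)
    names = ["Received", "From", "To", "Cc", "Subject",
             "User-Agent", "Content-Language", "Message-ID"]
    # Keep only the fields that are actually present in this message.
    return {name: msg[name] for name in names if msg[name] is not None}

fields = personal_header_fields(RAW_MESSAGE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Nothing here requires a key or a password: anyone who handles the message in transit or in storage can read these fields the same way.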

 

[Picture: example of an email header]

The header is not encrypted even if you use PGP: when email protocols and PGP were created in the ’80s and the ’90s, people did not realize that metadata creates a host of privacy threats.

Some of the threats are obvious. Your IP address allows anyone to geolocate you within seconds, without GPS. Your email address is often synonymous with your identity, and the combination of the email and IP addresses will surely give you away. The email addresses of the recipients expose your social graph. The ID of your computer further helps to nail you down, and so does the language. Not to mention the subject line, which describes the content of your message even if the content itself is encrypted.

However, metadata-based privacy intrusion can go much further, using automated analysis over time. For example, the web services that you register with routinely send you emails with telltale subject lines. The frequency of your email exchanges reveals the strength of your relationships with people. Persistent email exchanges with medical care providers reveal your medical condition. The list goes on.
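
As a rough illustration of how little is needed for this kind of analysis, the sketch below ranks correspondents by how often messages are exchanged with them, using nothing but sender and recipient addresses. The log entries and the function name `contact_frequency` are hypothetical, and real profiling systems are certainly more elaborate than this.

```python
from collections import Counter

# Hypothetical metadata log: (sender, recipient) pairs only, no message content.
METADATA_LOG = [
    ("you@example.org", "friend@example.com"),
    ("you@example.org", "clinic@example.net"),
    ("friend@example.com", "you@example.org"),
    ("you@example.org", "friend@example.com"),
    ("clinic@example.net", "you@example.org"),
    ("you@example.org", "lawyer@example.org"),
]

def contact_frequency(log, owner):
    """Count how many messages the owner exchanged with each correspondent."""
    counts = Counter()
    for sender, recipient in log:
        if sender == owner:
            counts[recipient] += 1
        elif recipient == owner:
            counts[sender] += 1
    return counts

# Correspondents sorted from most to least frequent.
ranking = contact_frequency(METADATA_LOG, "you@example.org").most_common()
for contact, n in ranking:
    print(contact, n)
```

Even this toy ranking hints at a close friend, a medical provider and a lawyer, without a single message body being read.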

Email metadata analysis, and the generation from it of your unique profile and pattern of life, have been thoroughly researched, practically demonstrated and described in this very strong paper. Such a profile may include, but is not limited to:

  • Your friends
  • Your business associates and the nature of your relationship with them
  • Your love life
  • Your family relationships
  • Your place of work and when you change it (or when you are looking for a new job)
  • Your interests
  • The web services you subscribe to (see the Rosebutt breach)
  • Your online purchases
  • Your health problems
  • Your habits
  • Who you meet, when and where
  • When, where and how often you travel and your travel arrangements
  • Events that you attend
  • Who pays you and who gets paid by you on PayPal
  • Time and frequency of your consultations with your lawyer and accountant

 

All this is exposed by looking at metadata only, while ignoring the content of your emails. Such analysis is dirt cheap in terms of computing power. Not to mention the enormous intrusive analysis possibilities arising from correlating your email metadata with your telephone call and web browsing metadata. Maybe the above quotation of Snowden makes more sense now.

Where is my metadata?

The short answer is: in Gmail. Almost everybody uses Gmail, and even if you don’t, it is used by almost all the people that communicate with you. Your messages to and from them are likely to be stuck in their Gmail mailboxes forever (Gmail provides practically unlimited storage), and you can’t erase them.

Who is interested in my metadata?

Gmail and the like. Gmail’s official privacy policy states, loud and clear, that they scan your email. So does Yahoo’s, and it does not help that they had to settle a related class action – they continue to do it. Gmail’s purpose is to profile you and monetize your profile through targeted advertising. Hundreds of billions of dollars are at stake.

Governments. Nothing provides a cheaper, faster solution to satisfying the spy agencies’ passion to “collect it all” (or, as they say on the other side of the pond, “collect everything”) than email metadata. In the US they do not even need a warrant if the email is older than 180 days (remember, it is on Gmail forever). The legislative attempts to change this do not seem to go anywhere. Trusting that spy agencies will always get a warrant to collect metadata that is less than 180 days old is a judgment call.

Using email, governments are effectively maintaining a constantly updated dossier on you – yes you, who have “nothing to hide”. They do it by permanently collecting and quickly analyzing your metadata. Or rather, Gmail does the collection for them. Every time you send an email, encrypted or not, you are feeding another piece of info into your dossier.

Hackers. They are usually interested in only one piece of your metadata, your email address, and they obtain it when they breach the websites where you registered with it. Depending on the website, the effects can range from the inconvenience of having to replace your password at several websites to fatal.

What is anonymity?

Anonymity means hiding your identity. In the context of email, there are two kinds of anonymity: account anonymity and alias anonymity. The former is stronger than the latter. Most people do not understand the nuanced difference between the two.

Account anonymity means that nobody knows that you are the owner of an email account. To achieve account anonymity you must at least make sure that (a) when you create the account you are not required to provide any personally identifiable information and (b) when you create the account and when you connect to your account to send or read emails, your IP address is not disclosed to the service provider. This can be difficult to achieve. Many people think that if they create an anonymous “burner” account, they are safe. They usually are not, because the email service provider knows their IP address (unless the service is accessible via Tor both when you register and when you use the account).

Alias anonymity means that the header of the message that you send shows an alias instead of your real email address. This provides very weak anonymity: your email service provider knows that you are the owner of the account and that the alias belongs to you. Inspecting the header of your aliased message allows anyone to trace the message back to your service provider. One subpoena to the service provider, and you are toast. Nevertheless, alias anonymity may enable you to hide your identity from non-government actors.
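
The tracing step can be sketched with Python's standard `email` module. The message, the host names and the helper `relay_hosts` below are hypothetical, and the sketch assumes the simple case where each `Received` line begins with `from <host>`; real `Received` lines are messier and can be forged by intermediate relays.

```python
from email import message_from_string

# Hypothetical aliased message; all hosts and addresses are made up.
RAW = """\
Received: from mx.recipient.example by inbox.recipient.example; Tue, 02 Feb 2016 09:00:02 +0000
Received: from smtp.provider.example by mx.recipient.example; Tue, 02 Feb 2016 09:00:01 +0000
From: shadow-alias@provider.example
To: bob@recipient.example
Subject: Hello

Body text.
"""

def relay_hosts(raw):
    """List the hosts named in the Received lines, newest hop first."""
    msg = message_from_string(raw)
    hosts = []
    for line in msg.get_all("Received", []):
        parts = line.split()
        # Assumes the simple form "from <host> by <host>; <date>".
        if len(parts) >= 2 and parts[0] == "from":
            hosts.append(parts[1])
    return hosts

hosts = relay_hosts(RAW)
# Each relay adds its line at the top, so the last entry is the earliest hop.
origin = hosts[-1]
print("Message entered the mail system at:", origin)
```

The alias in the `From` line hides your real address, but the earliest hop still names the provider that knows who owns the alias.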

So what are my options?

If you and the people you communicate with are paranoid about metadata and anonymity, you can use Tor-based email services such as SIGAINT. Anonymity is first-rate but they do not support end-to-end encryption. If the government cracks down on SIGAINT one day, the content of your emails will be exposed.

If you use a PGP client, your metadata is as unsafe as if you were sending your emails in clear text. PGP does not support metadata protection.

If you use an end-to-end email encryption provider such as Tutanota or Protonmail, and if you communicate with subscribers of the same service, your metadata is better taken care of simply because these providers naturally have a stronger commitment to privacy than the likes of Gmail, and are easier to trust. However, they do not employ any zero-knowledge technology with respect to metadata. This, combined with the fact that, unlike SIGAINT, governments know where to find Protonmail and Tutanota, means that if they are subpoenaed your metadata is toast.

Moreover, Protonmail requires your real email address when you register over Tor unless you are subscribing to their paid plan – which means that you cannot use Protonmail anonymously unless you use their paid service. Tutanota does accept anonymous registration over Tor but puts you on a 48-hour approval delay. We have not tested whether you are indeed approved after the 48-hour period and can then use the service while remaining anonymous.

You could use experimental messaging services like Ricochet or Bitmessage, but this is not email. We will cover them in our next article. Other email privacy services are being developed that provide end-to-end anonymity and metadata protection as well as encryption. In the final article of this series (Part 6: Make your choice) we will provide an overall comparison of all the available tools.

Stay tuned for our next article on email privacy – Part 4 – Usability vs. Security.

