What’s wrong with Signal’s contact discovery

:: security, orwell, computer

After WhatsApp’s threatened change to their terms of service, which may allow them to leak information to Facebook, many people are moving to Signal, a tool which purports to be more secure. If you want security which is not at least partly theatrical, you should not use Signal.

WhatsApp

On or about the 6th of January 2021, WhatsApp users were required to agree to new terms of service or to stop using the service by the 8th of February. These terms of service were at best confusing, but given that WhatsApp is owned by Facebook, a company whose entire business model is selling its users’ souls to its customers and which has been heavily implicated in that other thing that happened on the 6th of January 2021, the conclusion was unlikely to be a good one.

I’m glad to say this seems to have been a disaster for WhatsApp: so many users changed to Signal — an app which sells itself as being more secure — that it fell over under the load for a while on the 15th of January. People are apparently leaving WhatsApp in droves, and moving to Signal and other platforms.

WhatsApp / Facebook were so alarmed by this that they’ve both issued a number of clarifications, delayed the implementation date until the 15th of May — probably in the hope that people will have forgotten by then — and made clear that the changes do not apply in Europe, where there are reasonable privacy laws, nor even, for now, in the UK, which has not yet completed its transition to Boris Johnson’s hereditary feudal fiefdom.

So that’s, perhaps, good, right? Lots of people were driven to Signal which is ever so much more secure and written and run by very nice people who understand and care about security.

Signal

Well, the people who wrote Signal and run its infrastructure care about their users’ security only as far as it suits them. Yes, they make a great deal of noise about how secure and safe it is: their website is covered in quotes from people like Edward Snowden and Bruce Schneier and generally makes a very big deal about the security of the platform. If you don’t read what they write quite carefully you could be forgiven for thinking that Signal was completely safe, and completely private.

It’s not. And it’s not safe by design: the Signal people know it is not safe, and they don’t care.

Signal’s contact discovery

Here is a sketch of how contact discovery works in Signal. If you are a Signal user you have some identity on the system, and that identity is derived from your phone number. In particular, if you know the phone number you can work out the identity[1]. If you allow Signal access to your contacts (which it will ask you for), then every once in a while it will compute something equivalent to the identities corresponding to your contacts, upload them, ephemerally, to Signal’s infrastructure, and compute the intersection with the set of registered identities. Once it’s done that, you know which of your contacts have Signal.
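
To make the shape of this concrete, here is a toy sketch in Python. The hash-based identity derivation and all the names in it are my illustrative assumptions, not Signal’s actual scheme (their early contact discovery did, I believe, use truncated hashes of phone numbers; the current one is more elaborate): the point is only that anyone who holds a list of phone numbers can run the query.

    import hashlib

    def identity(phone_number):
        # Illustrative derivation: any computable function of the
        # phone number has the property that matters here, namely
        # that knowing the number gives you the identity.
        return hashlib.sha256(phone_number.encode("utf-8")).hexdigest()

    # The identities of everyone registered with the service.
    registered = {identity(n) for n in ["+15551230001", "+15551230002"]}

    def discover(contact_numbers):
        # Compute identities for a contact list and intersect them
        # with the registered set.  In the real system this runs,
        # ephemerally, on Signal's infrastructure.
        mine = {identity(n): n for n in contact_numbers}
        return [mine[i] for i in mine.keys() & registered]

    # Whoever holds the phone numbers learns who is registered; the
    # people being looked up learn nothing.
    print(discover(["+15551230001", "+15559990000"]))  # ['+15551230001']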

There are several problems with this approach. The most obvious is that if any of the data on your contacts leaks, even in encrypted form — if someone attacks Signal’s infrastructure, or if Signal themselves are not trustworthy, say — then that is, obviously, a bad thing. And Signal have gone to heroic lengths to protect against it. Here is their initial outline of how the protection works (the following text is quoted from their announcement):

Private contact discovery using SGX is fairly simple at a high level:

  1. Run a contact discovery service in a secure SGX enclave.
  2. Clients that wish to perform contact discovery negotiate a secure connection over the network all the way through the remote OS to the enclave.
  3. Clients perform remote attestation to ensure that the code which is running in the enclave is the same as the expected published open source code.
  4. Clients transmit the encrypted identifiers from their address book to the enclave.
  5. The enclave looks up a client’s contacts in the set of all registered users and encrypts the results back to the client.

There is much more description of this. And it’s all fine: it really does go to very great lengths to make it very hard for Signal themselves, or any other malicious actor who might be able to compromise their systems, to gain access to your contacts, let alone your messages. And that’s all very wonderful.
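
To fix intuitions, here is a toy simulation of the five steps quoted above. Nothing in it is Signal’s real code or API: the ‘enclave’ is an ordinary object, the ‘attestation’ is a string comparison standing in for SGX’s measured launch, and the encryption of steps 2 and 4 is elided.

    # Hash of the published open-source enclave build, the reference
    # value for step 3.  Purely illustrative.
    PUBLISHED_MEASUREMENT = "measurement-of-published-build"

    class ToyEnclave:
        # Stands in for the SGX enclave of steps 1 and 5: it holds
        # the registered identities and answers intersection queries.
        measurement = PUBLISHED_MEASUREMENT

        def __init__(self, registered):
            self.registered = set(registered)

        def discover(self, queried):
            # Step 5: look the client's contacts up among all
            # registered users; only the intersection comes back.
            return self.registered & set(queried)

    def client_discover(enclave, contact_identities):
        # Step 3: remote attestation.  Real SGX attestation is a
        # cryptographic protocol; this comparison is a stand-in.
        if enclave.measurement != PUBLISHED_MEASUREMENT:
            raise RuntimeError("enclave is not running the published code")
        # Steps 2 and 4: in the real system the identifiers travel,
        # encrypted, over a channel terminating inside the enclave.
        return enclave.discover(contact_identities)

    enclave = ToyEnclave({"id-1", "id-2"})
    print(client_discover(enclave, {"id-2", "id-3"}))  # {'id-2'}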

Now you’re probably expecting me to spout some conspiracy theory about how the SGX enclaves themselves have been compromised at the hardware level by some state-level entity, possibly with a three-letter name, so everything is worthless. There have certainly been rumours that that sort of thing has happened. But it probably hasn’t: the conspiracy theories probably are just conspiracy theories, as they usually are. Even if they are true, defending against state-level entities, with or without three-letter names, is generally futile: if these people are interested enough in what’s on your phone they will probably find out, either by fancy technology or by more traditional techniques, possibly involving a rubber hose.

No, that’s not the problem. The problem is laughably simpler than that.

Alice and Elizabeth

Let’s imagine two people: Alice and Elizabeth, her partner. Alice is physically violent towards Elizabeth, who lives in serious fear of her, is regularly beaten by her, and is terrified that worse things will happen soon. Elizabeth desperately wants and needs to escape from the relationship before something really bad happens, but she doesn’t know how: she needs to talk to someone privately. Alice, needless to say, doesn’t want this to happen.

Elizabeth realises that she can install Signal on her phone and then use it to communicate, privately, with people who might be able to help her — the police, perhaps. She does so.

Unbeknownst to her, Alice already has Signal, perhaps on a phone whose number Elizabeth does not know. Signal’s contact discovery promptly tells Alice that Elizabeth has installed Signal, and since Alice is running it on a phone which doesn’t appear in Elizabeth’s contacts, Elizabeth doesn’t know this. And this story ends with Alice beating Elizabeth to death.

Vladimir and the dissidents

Or let’s imagine Vladimir. Vladimir runs a country which was once, briefly, a democracy but now, once more and inevitably, is a kleptocracy and a police state. Many, many people in Vladimir’s country don’t like him: his problem is knowing which ones to have dealt with. Well, this is easy. Vladimir extracts from the telephone companies the phone numbers of the people he’s interested in — either with bribes or with pliers, it does not matter which. He then buys a burner phone, puts all these numbers in its contact list, and installs Signal. Now he knows which of his enemies have Signal, but since his burner phone is most certainly not in their contact lists, they have no idea that he knows they have it and thus cannot run. Doors are knocked on at three in the morning, people vanish, and their assets are acquired by Vladimir, who uses them to build another vast, tasteless palace.

Unsafe at any speed

What Signal have done is to produce a beautifully secure implementation of a contact discovery algorithm which is designed to be unsafe: it allows anyone who knows your phone number to know whether you have Signal, and if you don’t know their phone number — if they are, for instance, stalking you — it will not, and cannot, tell you that they know this. The contact discovery algorithm is designed to leak information.

And they know this, and they don’t care. I’ll repeat that: they know that their product enables stalking, and they do not care about that. I don’t know why they made these choices, but I don’t expect the reasons are very good ones.

Some ideas which are mostly useless

It’s tempting to say that, well, the contact discovery algorithm should be mutual: it should only tell me that you have Signal if both you are in my contacts list and I am in yours. That can’t work, because the only way to do this would be to allow my contact list (in encrypted form) to persist, indefinitely, on Signal’s infrastructure, which would leave it open to attack.
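
To see why, consider a minimal sketch (all names invented): to answer ‘is each of us in the other’s contacts?’ the server must consult both contact lists at query time, which means keeping them.

    # contact_lists is exactly the data Signal goes to such lengths
    # never to retain: for the mutual check to be answerable it has
    # to live, indefinitely, on the server.
    contact_lists = {
        "alice": {"bob", "carol"},
        "bob": {"alice"},
    }

    def mutually_discoverable(a, b):
        # True only if each party appears in the other's contacts;
        # note that both lookups hit persistent server-side state.
        return (b in contact_lists.get(a, set())
                and a in contact_lists.get(b, set()))

    print(mutually_discoverable("alice", "bob"))    # True
    print(mutually_discoverable("alice", "carol"))  # False: no list for carol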

Another approach would be to have a bit you could set on your identity which says ‘this identity should not partake in contact discovery’: if it was set then Signal would neither allow that identity to be discovered nor allow it to discover others, the second restriction existing to prevent people from deliberately setting the bit so they could stalk other people while not themselves being discoverable. This is closer to working: it protects against users of the service, but it does not protect against people who can acquire its data: they can simply strip the privacy bits from the identities they’ve captured and run contact discovery on their own copy of the infrastructure.
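
A sketch of the failure, again with invented names: the service honours the bit, but an attacker who has captured the registration data runs the same query on their own copy and simply ignores it.

    # Toy registration records: an identity plus the proposed bit.
    registered = {
        "id-alice": {"discoverable": False},
        "id-bob": {"discoverable": True},
    }

    def honest_discover(queried):
        # The service's view: identities with the privacy bit set
        # are excluded from discovery.
        return [i for i in queried
                if i in registered and registered[i]["discoverable"]]

    def attacker_discover(queried):
        # The attacker's view of captured data: the bit is just a
        # field they need not consult.
        return [i for i in queried if i in registered]

    print(honest_discover(["id-alice", "id-bob"]))    # ['id-bob']
    print(attacker_discover(["id-alice", "id-bob"]))  # ['id-alice', 'id-bob']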

Strangely, something which should make Signal’s stalking problem less serious is Facebook’s catastrophic misjudgement over WhatsApp’s privacy policy: large numbers of users have migrated from WhatsApp to Signal or, at least, have installed Signal and thus now have identifiers in the system. Stalking someone by discovering they have Signal installed now tells you a lot less about them than it did previously. Of course Elizabeth has Signal, and Vladimir may discover that both his real and potential enemies also have it[2]. This makes things, at least, less bad, although it does not make them good.

One idea which is not useless

The underlying problem is that Signal uses phone numbers as identifiers, where phone numbers are essentially public information. This enables stalking and worse.

Well, instead, the system could use completely random identifiers, not tied in any way to phone numbers. This would make the users of the system completely anonymous: the only way you could discover someone’s identifier is if they gave it to you. For added value it could be made possible, optionally and not by default, to attach things like phone numbers and email addresses to the random identifiers, whereupon they would become discoverable by an algorithm essentially identical to Signal’s. Using such a system you could choose either to be completely undiscoverable or, only if you wanted to be, more-or-less discoverable.
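
Here is a sketch of such a scheme, every detail of which is a hypothetical illustration: identities are random, so the only way to learn one is to be given it, and attaching a phone number is a separate, explicitly opt-in act.

    import secrets

    def new_identity():
        # Random identities: nothing about a phone number lets you
        # compute one.
        return secrets.token_urlsafe(16)

    # Opt-in discoverability: a user may, explicitly and not by
    # default, bind a phone number to their identity.
    bindings = {}  # phone number -> identity

    def bind(phone_number, identity):
        bindings[phone_number] = identity

    def discover(contact_numbers):
        # Essentially Signal's algorithm, but only over numbers
        # whose owners chose to bind them.
        return {n: bindings[n] for n in contact_numbers if n in bindings}

    alice = new_identity()        # never bound: undiscoverable
    bob = new_identity()
    bind("+15551230002", bob)     # bob chose to be discoverable

    print(discover(["+15551230001", "+15551230002"]))  # finds bob only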

That would be easy, wouldn’t it? The Signal people, who are clearly ever so smart, must have thought of that, and decided not to do it: I wonder why?

Well, of course, other people — people who actually care about the safety of these sorts of systems — have not only thought about doing it this way, they have done it this way. Threema is one such app[3].

The theatre of the absurd

Signal’s authors make a lot of noise about how secure it is. But they know it is, by design, not safe. If you care about safety you should use tools which really are safe rather than tools whose authors treat safety as a matter of theatre.


  1. Whether you can go the other way is not clear: ideally the answer would be ‘no’, but the space of phone numbers is so small (a typical national numbering plan contains only on the order of 10^10 possible numbers) that it’s not completely implausible simply to search it by brute force and find out which identities correspond to which numbers, if you have the computational resources to do so. However this does not matter here.

  2. Vladimir is not the sort of person who has friends. 

  3. This article is not an advertisement for Threema: it just happens to be a system I know of which does this. I do not personally use it, although it does appear to be very competently designed and implemented by people who really do care about safety rather than merely pretending to do so. I am sure there are other similar systems.