Everyone who has a supercomputer in their pocket today has access to a critical part of modern global communication: secure, end-to-end encrypted messaging. But encroaching artificial intelligence systems pose a fundamental risk to the confidentiality of these communications.
There are many varieties of secure messenger, including Meta's globally dominant WhatsApp, the security gold standard Signal, upstarts with interesting architectural properties of their own, and built-in tooling that handles both secure and insecure messages, such as Apple's Messages or Google's Messages.
The one baseline thing that every secure messaging app is supposed to do is to keep the contents of every message confidential, so that only you and the people you're exchanging messages with can read them. This confidentiality is essential for private communication, and the ability to communicate privately is a baseline for a society that respects civil liberties and freedom more broadly.
Meta, however, has announced that it will introduce AI processing for WhatsApp messages. While it might seem convenient to ask Meta's large language models (LLMs) to summarize the 50 messages in your group chat, or to propose an answer for you to send, it's important to understand that adding this functionality brings new risks to the architecture that protects our privacy and security. Meta's LLMs don't run locally on your phone, so WhatsApp will have to send all your supposedly secure messages to Meta's servers so that the LLM can process them. What does this mean for confidentiality? If you've sent Meta's servers the contents of your messages, can Meta read them?
Think about what happens if you paste the contents of your secure messages into a networked LLM service like ChatGPT: the operator of the service (in ChatGPT's case, OpenAI) can read your messages, breaking their confidentiality. (And it's not just you: if anyone who receives your messages sends them to ChatGPT, the same concern applies. Everyone in the chat has to decide to avoid this leakage.)
In theory, if you want to run AI analysis on private messages, you could run your own local AI model on your device (which already has access to your messages), rather than sending them to a networked online model such as ChatGPT. Local models are getting smaller and more powerful by the day, though they would still make the app bulkier, and they would probably require higher-end hardware, or at least run much better on it. Still, folks who want the supposed benefits of AI could in theory get them without the risks to privacy by using a local model.
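As a rough sketch of what that could look like, here is a minimal example using the Hugging Face transformers library and a small open-weight summarization model (the model name and messages are illustrative, and the model weights are assumed to already be cached on the device; after that, inference runs entirely locally):

```python
# Sketch: summarizing chat messages entirely on-device, so the plaintext
# never has to leave the user's hardware. Assumes the `transformers` library
# and a small open-weight model whose weights are already cached locally.
from transformers import pipeline

# The model name is illustrative; any locally cached summarization model works.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

messages = [
    "Alice: can we move the meetup to Thursday?",
    "Bob: Thursday works, same place?",
    "Carol: yes, 7pm at the usual cafe.",
]

# All computation happens in this process; nothing is sent to a remote server.
summary = summarizer("\n".join(messages), max_length=60, min_length=10)
print(summary[0]["summary_text"])
```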
Local models aside, the privacy situation gets even worse if the operating system that the chat app runs on embeds a network-connected AI service at a low enough level. In that case, no messenger would be secure, as Signal's president Meredith Whittaker has warned. In other words, if Apple or Google were to integrate a networked "agentic AI" into their phones that could read your text messages, that would mean "breaking the blood-brain barrier between the operating system and the application layer," as Whittaker put it, and nothing the developers of Signal do would protect the privacy of your chats. This is because the operating system itself would be sending all of your information (including your secure messages) to its AI service, regardless of whether your messaging app offers an AI feature of its own.
But getting back to WhatsApp: to deal with the privacy issues of integrating a secure messenger with a networked AI, Meta has designed a system it calls "Private Processing," which is supposed to defend the confidentiality of your messages even as they are handled by Meta's servers.
A lot of engineering work appears to have gone into "Private Processing," and the promises Meta is making are attractive. But it's worth unpacking some of the infrastructural features the promises depend on. I'll set aside whether you or the people you chat with actually want to be chatting with an AI-assisted person, as opposed to chatting with the actual unfiltered human on the other end of the connection. For the sake of the argument, I'll pretend that passing your chats through AI is somehow attractive and convenient, and focus only on the security and privacy implications of mixing a networked AI service with secure messaging.
There are at least three significant promises made by Meta's solution (and by any technology of this sort):
- Data Confidentiality means that any data handled by the machine cannot be copied off the machine.
- Code Integrity means that the software running on the machine is exactly what the user expects it to be.
- Attestation means that the client using the service gets some form of mathematical proof about the state of the service, which they can confirm without having access to the hardware themselves.
To be clear, to trust a networked AI service with confidential data, we need at least these three properties, all together, which can be lumped generally under the idea of a "Trusted Execution Environment," or TEE. (Several other kinds of promises are also possible, as outlined by the Confidential Computing Consortium, or CCC, but these three are the most relevant for this discussion.)
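To make the "all together" requirement concrete, here is a minimal, hypothetical sketch (not Meta's actual client code; the names are made up) of the go/no-go decision a messaging client would have to make before shipping plaintext to a remote TEE:

```python
# Sketch of the trust decision a client must make before sending plaintext to
# a remote TEE. The boolean fields are placeholders for real checks: verifying
# vendor-signed attestation documents, measured code hashes, and so on.
from dataclasses import dataclass

@dataclass
class AttestationEvidence:
    confidentiality_ok: bool  # data cannot be copied off the machine
    code_integrity_ok: bool   # the running code is exactly what was published
    attestation_ok: bool      # the evidence is signed by keys the client trusts

def safe_to_send_plaintext(evidence: AttestationEvidence) -> bool:
    # All three promises must hold at once; any single failure breaks the model.
    return (
        evidence.confidentiality_ok
        and evidence.code_integrity_ok
        and evidence.attestation_ok
    )

# If even one property cannot be verified, the only safe answer is "no".
print(safe_to_send_plaintext(AttestationEvidence(True, True, False)))  # False
```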
The unreliability of confidentiality for data processed on AI servers
The problem is that most of these promises don't actually work reliably in the real world against a well-resourced attacker with physical access to the hardware of the TEE. And in the case of WhatsApp, who could be a well-resourced "attacker" with access to the hardware? The biggest concern would be an insider threat at Meta itself. Remember, the whole point of end-to-end encryption is that users don't have to trust anyone with their data, including the companies that run the messaging service. If we could trust that Meta could and would voluntarily protect our data in all circumstances, we wouldn't need any cryptography at all. But Meta (like any large corporation) faces risks of hacking, political pressure, economic incentives, legal compulsion (from any jurisdiction it operates in), billionaire whims, and all sorts of other reasons why trusting these technical claims is not a reasonable long-term strategy.
Let's look at why the confidentiality promises of a TEE are weak. One example of a common technique used as a part of creating a Trusted Execution Environment is to burn a secret key into the hardware when building a machine, with a corresponding public key published by the machine operator. The secret key is used by the TEE to create a digital signature that can be verified by a client of the service. For example, it might sign off on an ephemeral (temporary) encryption key to indicate that the key can be used to safely encrypt data that will be sent to the service. If all works as designed, that ephemeral key will be destroyed once the information is decrypted by the TEE, so no one can ever reuse it to decrypt the message outside the TEE.
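Here is a rough sketch of the client side of that design, using the Python `cryptography` library. The key types, the function name `encrypt_for_tee`, and the labels are illustrative assumptions, not WhatsApp's actual wire protocol:

```python
# Sketch: a client verifies that the TEE's hardware-held identity key signed
# an ephemeral key, then encrypts its message to that ephemeral key.
import os

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def encrypt_for_tee(
    tee_identity_pub: Ed25519PublicKey,  # public key published by the operator
    ephemeral_pub_bytes: bytes,          # ephemeral key offered by the TEE
    signature: bytes,                    # TEE's signature over that key
    plaintext: bytes,
) -> tuple[bytes, bytes, bytes]:
    # 1. Check that the hardware-held identity key really signed this
    #    ephemeral key. Raises InvalidSignature if it did not.
    tee_identity_pub.verify(signature, ephemeral_pub_bytes)

    # 2. Encrypt the message so that, in theory, only the holder of the
    #    ephemeral private key -- supposedly the TEE, which promises to
    #    destroy it after use -- can decrypt.
    ephemeral_pub = X25519PublicKey.from_public_bytes(ephemeral_pub_bytes)
    client_priv = X25519PrivateKey.generate()
    shared_secret = client_priv.exchange(ephemeral_pub)
    aes_key = HKDF(
        algorithm=hashes.SHA256(), length=32, salt=None,
        info=b"private-processing-sketch",
    ).derive(shared_secret)
    nonce = os.urandom(12)
    ciphertext = AESGCM(aes_key).encrypt(nonce, plaintext, None)

    # The client's public key and nonce travel with the ciphertext so the
    # TEE can derive the same AES key on its side.
    client_pub_bytes = client_priv.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw
    )
    return client_pub_bytes, nonce, ciphertext
```

Notice that the only thing anchoring the whole scheme is that signature check in step 1: everything downstream assumes the signing key never leaves the hardware.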
But (in just one example) an attack called "TPM-Fail" demonstrated that some hardware that is supposed to protect secret keys in this way allowed the secret key to be extracted by an attacker who is close to the machine. If Meta were able to extract such a key from one of its Private Processing servers, then that key could be used by any computer (including a compromised one) to sign an encryption key that is held outside the TEE, claiming it was an ephemeral key held within the TEE. A WhatsApp application on a user's device would be convinced to encrypt data to that key, thinking it will only be used by the "Private Processing" service. Then whoever holds a copy of that key could decrypt the contents of this supposedly secure message.
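A companion sketch (same caveats as above; the "extracted" key below is just a freshly generated stand-in) shows why the client cannot defend itself once the identity key leaks: a forged signature is indistinguishable from a genuine one.

```python
# Sketch of why an extracted identity key defeats the scheme. The "extracted"
# key here stands in for a key pulled out of the TEE hardware by a
# TPM-Fail-style attack; nothing in the signature tells the client where the
# corresponding ephemeral private key actually lives.
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

# Stand-in for the secret key an attacker has extracted from the hardware.
extracted_identity_key = Ed25519PrivateKey.generate()

# The attacker generates an "ephemeral" key that they keep for themselves...
attacker_ephemeral = X25519PrivateKey.generate()
attacker_ephemeral_pub = attacker_ephemeral.public_key().public_bytes(
    serialization.Encoding.Raw, serialization.PublicFormat.Raw
)

# ...and signs it with the extracted key, exactly as the real TEE would.
forged_signature = extracted_identity_key.sign(attacker_ephemeral_pub)

# The client-side check from the previous sketch passes: verify() raises no
# exception, so the client would encrypt its messages to a key the attacker
# holds outside any TEE and can use at leisure.
extracted_identity_key.public_key().verify(forged_signature, attacker_ephemeral_pub)
print("forged attestation accepted; the client cannot tell the difference")
```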
And it's not just TPM-Fail! Researchers have demonstrated other attacks that permitted extraction of secret key material related to two different services that were counting on those keys to remain secret. And another attack uses low-cost hardware to read the contents of supposedly encrypted memory, breaking open supposedly secure memory from other vendors (such as AMD's SEV-SNP) as well.
Indeed, results like these demonstrate that physical access to the hardware can generally be used to bypass the strongest hardware protections we know how to build and operate at the sort of economies of scale that Meta uses.
Meta even implicitly acknowledges that using its AI features will leak data: WhatsApp has a setting called "Advanced Chat Privacy" that turns on heightened protections, and that setting blocks everyone in the chat from "using messages for AI features."
The unreliability of attestation and code integrity
The Attestation aspect of TEE deployment is typically also performed by client validation of digital signatures. These signatures are an assertion from "the hardware manufacturer of the TEE." Meta says it will depend on hardware backed by AMD and NVIDIA, but if either of those two vendors can be coerced or tricked into leaking hardware signing keys, or if either one has a bug in their hardware implementation, then the attestation promises don't hold. Furthermore, if a WhatsApp user wants to see a log of those attestations to be able to compare them externally, they need to request a report, which most users are unlikely to do.
Even if we had plausible defenses against this litany of Data Confidentiality and Attestation attacks, WhatsApp users who feed their chats to AI would still be dependent on Code Integrity. Ensuring that code integrity serves user needs (as opposed to Meta's legal, political, or business incentives) would require substantial independent auditing efforts to be reliable: the overwhelming majority of WhatsApp users won't actually read the code. But Meta hasn't even committed to publishing the source code for their "Private Processing" machines, offering only "image binaries" to unspecified "researchers," alongside "source code for certain components of the system."
Evaluation of only some components is insufficient to assess the behavior of the system as a whole, and binaries are more complex to evaluate than source code. Even worse, a substantial part of the software being run in this TEE is an LLM, a class of tool that is notoriously difficult to audit even when conditions are optimal. And the work of doing this system-wide evaluation (even with full source access to all components, model weights, etc.) is expensive, especially as the system gets more complex. Who is going to fund this kind of oversight? So while Meta is gesturing in the direction of the kind of supervision necessary for trust, it is far from meeting even the basic bar.
People today put some of the most private aspects of their lives into text messaging services. They need to know exactly how much they can trust that no one else will be able to access their messages, photos, and other content. As Meta and perhaps other companies explore how to integrate AI into these services, users need to know that if Meta wants to cheat, or is forced to, it can probably peek at the information that has been shipped to its AI from WhatsApp. Rolling out these risky mechanisms by default to billions of users represents a profound break in the baseline expectation of privacy that is critical for civil liberties in the modern global communications network. The promised conveniences of AI here are not worth the substantial risks.