“Privacy is necessary for an open society in the electronic age. Privacy is not secrecy. A private matter is something one doesn’t want the whole world to know, but a secret matter is something one doesn’t want anybody to know. Privacy is the power to selectively reveal oneself to the world.”
A Cypherpunk’s Manifesto, by Eric Hughes.
In 1993, “A Cypherpunk’s Manifesto” was published on the cypherpunk mailing list. It described the importance of privacy in the digital realm and stated that cypherpunks were dedicated to building anonymous systems, including electronic money.
Fast forward 30 years: bitcoin is already 14 years old and the lightning network’s first channel was opened 5 years ago. Does the lightning network live up to the words of Eric Hughes? In this piece, we’ll go through how private the lightning network is and what kinds of heuristics or attacks can be used to de-anonymize users.
What are the issues with the networking aspect? How can the first layer compromise the privacy of payment channels? How can routing be exploited to retrieve information about payments? We’ll also analyze privacy from different perspectives: the sender and the receiver. Lastly, we’ll see technologies that can mitigate some privacy issues.
Before going any further, let’s define what the word “privacy” means in this specific context. Usually, for computer systems, privacy can be thought of in terms of “information security”, which in turn can be broken down into three properties: confidentiality, integrity, and availability. Here, we’ll be focusing on the first property, confidentiality, which is the assurance that information only reaches its intended recipients.
Another useful concept for this article is the notion of an “anonymity set”. This is the set of identities that, from an attacker’s point of view, an action could correspond to. The theme of this article is: “how can an attacker use the properties of the Lightning Network to reduce the anonymity set of a given user or even de-anonymize them?”
The lightning network is a set of connected computers that route bitcoin payments to one another. To route payments, these computers must be able to find each other over the internet, so when a new lightning node joins the network, it announces its address and its node ID. There are currently two types of addresses that can be used: a public IP or a Tor onion address.
A public IP exposes sensitive data about the identity running the node, and you don’t have to be any sort of hacker to access this information: a quick search on an IP location tool reveals the approximate location and the internet service provider for a given IP.
A solution for this problem is using a Tor onion address. This is a special kind of internet address that cannot be traced back to the user’s IP address. The tradeoff is that the node is hosted on the Tor network, which is less reliable than the regular internet and can make the success rate of your payments drop significantly. This might be acceptable for an end-user, but for routing nodes the tradeoff might not be worth it, as the service becomes less reliable and peers will always prefer to route payments through reliable routers.
Cross-Layer Data Leakage
The lightning network is a “layer two” protocol. This means that lightning is built on top of another protocol: bitcoin. This is similar to how data transmission on the internet works: protocols stacked on top of each other that abstract complexity from the user.
The ideal situation would be for the higher level to see the lower level as a simplified and self-contained unit, but in practice, the inner workings of the lower level often become apparent in the higher level, creating a problem known as “leaky abstractions.”
Lightning uses the bitcoin blockchain to anchor its payment channels. This can be used to tie UTXOs to lightning nodes. Let’s explore this in further detail.
A funding transaction, as explained here, is a bitcoin transaction that locks the inputs into a Pay-to-Witness-Script-Hash (P2WSH) output. The script that locks the bitcoin is a 2-of-2 multi-signature. If this output is unspent, and if the attacker only has access to on-chain data, there is no way of differentiating a funding transaction from any other transaction with a P2WSH output. This is because only the hash of the script is published on-chain. Therefore, the attacker will also need to listen to lightning gossip to start linking funding transactions and their UTXOs with payment channels.
When a public payment channel is created, the node sends a `channel_announcement` message to other nodes in the network through the gossip protocol. To avoid spam, the node sending the message must prove that the payment channel exists in the blockchain. This is done by sending the location of the funding transaction along with some other data that can be used by third parties to validate the ownership of the funding transaction.
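This is how the attacker links a funding transaction with lightning nodes. The location of the funding transaction can be retrieved from the “short channel id” in the `channel_announcement` message. It encodes three numbers: the height of the block containing the funding transaction, the index of the transaction within that block, and the index of the funding output. In its human-readable form, these are written separated by “x”, as in `<block height>x<transaction index>x<output index>`.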
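To make the link between gossip and on-chain data concrete, here is a minimal Python sketch of how a short channel id packs and unpacks those three numbers (3 bytes of block height, 3 bytes of transaction index, 2 bytes of output index, as laid out in BOLT 7). The function names and the example values are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def encode_short_channel_id(block_height: int, tx_index: int, output_index: int) -> int:
    """Pack the three fields into a single 8-byte integer."""
    return (block_height << 40) | (tx_index << 16) | output_index


def decode_short_channel_id(scid: int) -> Tuple[int, int, int]:
    """Split an 8-byte short channel id into its three fields."""
    block_height = scid >> 40            # most significant 3 bytes
    tx_index = (scid >> 16) & 0xFFFFFF   # next 3 bytes
    output_index = scid & 0xFFFF         # least significant 2 bytes
    return block_height, tx_index, output_index


def to_human_readable(scid: int) -> str:
    """Render the id in the usual 'height x index x output' form."""
    return "x".join(str(part) for part in decode_short_channel_id(scid))


# Hypothetical channel: funding output 1 of transaction 1234 in block 700000.
example_scid = encode_short_channel_id(700_000, 1_234, 1)
print(to_human_readable(example_scid))
```

With the block height and transaction index in hand, anyone running a bitcoin node can look up the funding transaction and the exact output that backs the channel.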
Thus, it’s relatively easy for the attacker to link lightning nodes with announced channels with UTXOs on-chain. Lightning was designed so all nodes can make that link when receiving a new `channel_announcement` message.
The biggest privacy issue comes when the bitcoins used in the funding transaction are linked to an identity, usually by KYC. If the attacker has this information as well, this is enough to de-anonymize the identity that owns the lightning node. Knowing the lightning node means that the attacker also knows the following:
- The total amount of bitcoin locked in payment channels;
- The number of payment channels;
- The node IDs of all of your channel partners;
- The ISP and approximate location of the node if it’s running without Tor.
Closing Channel Transactions
Just as the act of opening a channel necessitates an on-chain transaction, the act of closing a channel also necessitates one. There are two types of channel closures: cooperative closes and non-cooperative closes (also known as force closes). Both kinds spend the funds in the 2-of-2 multi-signature P2WSH output from the funding transaction.
A cooperative close looks like any other transaction that spends from a P2WSH output encoding a 2-of-2 multi-signature. With only on-chain data, there’s no way of distinguishing the closing transaction from any other such spend. Nevertheless, a 2-of-2 multi-signature spend can still be an indication that this was a lightning channel. Taproot with MuSig2 can fix this by making the cooperative close transaction indistinguishable from a spend of a single-signature output.
For force closes the transaction will look different. As covered in this post, the Bitcoin blockchain acts as a judge to solve any possible disputes over who owns what after a force close. This means that this transaction will have some extra data in it, so it’s possible to solve any disputes.
Force close transactions rely on specialized scripts that are quite unique to lightning. Therefore, a force close reveals that the P2WSH output was in fact used to open a lightning channel. This is especially harmful to private channels (channels not announced to other peers in the network), since the attacker now knows that the P2WSH output from the previous transaction was used for a payment channel.
If Alice uses her output from the force close transaction to open a public channel in the future, the attacker can now infer that the lightning node associated with this new public channel was one of the partners in the “private” channel, and the same goes for Bob. If both channel partners use their outputs for announced channels, then both partners of the “private” channel are now known to the attacker.
By tracking the UTXOs related to these lightning nodes, the attacker can also infer with high likelihood that another transaction that looks like a funding transaction, but has no corresponding `channel_announcement` message on lightning, is the node opening another private channel.
“Private channel”, then, isn’t really the right choice of words. Private channels are just lightning channels that weren’t announced publicly in the lightning network through a `channel_announcement` message. But as you can see, this doesn’t mean that they can’t be de-anonymized.
Lightning as a Closed System
Now that we’ve discussed how on-chain data can be used to de-anonymize lightning network users, this section will look only at the lightning network as a closed system. At first glance, the lightning network seems to offer better privacy guarantees than bitcoin’s blockchain, since transactions are made off-chain. This is a naive oversimplification.
While it’s true that lightning takes payments off-chain, one has to look at how this is done to be able to analyze the consequences of the chosen design for privacy. How the user ends up actually using this second layer also matters.
The attacker’s main goal at this level is to correlate lightning payments with their senders and receivers.
As discussed here, the lightning network uses an onion encryption scheme called Sphinx. This means that when payments are being routed through the network, intermediary nodes only know whom the onion came from, where it’s going next, and the amount being forwarded. Ideally, intermediary nodes shouldn’t know who the senders and receivers are, nor how long the route is.
It’s worth noting that, even though Sphinx routing provides some privacy guarantees, in some cases this is not enough. A large payment, for example, will naturally have fewer possible routes than a small payment. If there’s only one route with enough liquidity for that payment, the path is obvious to all nodes along it and also to external observers. The bigger the payment, the smaller the anonymity set.
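The effect of payment size on the anonymity set can be illustrated with a small Python sketch. It counts the routes an observer would consider plausible, assuming (as a simplification) that a channel can carry a payment whenever its announced capacity is at least the payment amount. The graph, node names, and capacities are hypothetical, and real pathfinding also depends on fees, actual balances, and per-channel limits.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical toy graph: (node_a, node_b) -> announced capacity in satoshis.
CHANNELS: Dict[Tuple[str, str], int] = {
    ("Alice", "Bob"): 5_000_000,
    ("Alice", "Carol"): 500_000,
    ("Bob", "Dave"): 5_000_000,
    ("Carol", "Dave"): 1_000_000,
    ("Bob", "Carol"): 1_000_000,
}


def neighbors(node: str, amount_sat: int) -> Iterator[str]:
    """Yield peers reachable over channels whose capacity fits the amount."""
    for (a, b), capacity in CHANNELS.items():
        if capacity < amount_sat:
            continue
        if a == node:
            yield b
        elif b == node:
            yield a


def candidate_routes(source: str, target: str, amount_sat: int,
                     path: Optional[List[str]] = None) -> List[List[str]]:
    """Enumerate all loop-free routes that could carry the amount."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    routes: List[List[str]] = []
    for peer in neighbors(source, amount_sat):
        if peer not in path:
            routes.extend(candidate_routes(peer, target, amount_sat, path))
    return routes


for amount in (100_000, 800_000, 2_000_000):
    routes = candidate_routes("Alice", "Dave", amount)
    print(amount, len(routes), routes)
```

In this toy graph the number of plausible routes from Alice to Dave shrinks as the amount grows, until a single route remains and the path is no longer ambiguous.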
Besides that, there are also heuristics that can be applied by attackers to link payments to senders and receivers. Let’s see what those are and how they work.
Every public channel announces its capacity (the total amount of bitcoin locked in the payment channel). But there is no information available for how the capacity is currently distributed in a channel. This is where payment probing comes in. The technique allows a lightning node to “probe” other nodes in the network, in order to find how the liquidity is distributed in their payment channels.
The process is very simple. Let’s suppose that Alice and Bob have a 1 BTC capacity channel. The attacker sends a payment of 1 BTC to Bob with an invalid payment hash. There are two possible scenarios:
- The payment reaches Bob, and as he didn’t request any payment that matches the payment hash, his node returns an “unknown payment hash error” back to the attacker;
- The payment doesn’t reach Bob because Alice doesn’t have enough outbound liquidity. In this case, Alice will return an “insufficient balance error”.
If the attacker receives an “insufficient balance error”, he now knows that Alice doesn’t have 1 BTC of outbound liquidity, so he can try again with smaller amounts until he receives an “unknown payment hash error”.
In the GIF above, the attacker uses probing to find out that Alice’s local balance is at least 0.5 BTC and less than 0.75 BTC. Since the total capacity has to be 1 BTC, the attacker can also infer Bob’s local balance. The attacker can also keep the process going, with smaller increments in each step, until he finds that Alice has exactly 0.5 BTC in her local balance.
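The narrowing-down process described above is essentially a binary search. The following Python sketch simulates it against a stand-in for the Alice-Bob channel. The `make_probe` helper and the error strings are hypothetical labels that mirror the two errors named in this article, not real node responses, and the sketch ignores fees, channel reserves, and per-payment limits.

```python
from typing import Callable, Tuple


def make_probe(alice_local_balance_sat: int) -> Callable[[int], str]:
    """Simulate the channel: return the error the attacker would observe."""
    def probe(amount_sat: int) -> str:
        if amount_sat > alice_local_balance_sat:
            return "insufficient_balance"   # Alice cannot forward this much
        return "unknown_payment_hash"       # payment reached Bob and was rejected
    return probe


def estimate_balance(probe: Callable[[int], str], capacity_sat: int) -> Tuple[int, int]:
    """Binary-search Alice's local balance; return (balance, probes used)."""
    low, high = 0, capacity_sat  # invariant: low <= balance <= high
    probes_used = 0
    while low < high:
        mid = (low + high + 1) // 2
        probes_used += 1
        if probe(mid) == "unknown_payment_hash":
            low = mid        # balance is at least mid
        else:
            high = mid - 1   # balance is below mid
    return low, probes_used


# 1 BTC channel (100,000,000 sat) where Alice holds 0.5 BTC on her side.
probe = make_probe(alice_local_balance_sat=50_000_000)
balance, probes_used = estimate_balance(probe, capacity_sat=100_000_000)
print(balance, 100_000_000 - balance, probes_used)
```

Because each probe halves the remaining range, a few dozen probes are enough to pin down the balance of a 1 BTC channel to the satoshi in this simplified model.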
While this mechanism can be used to discover channel balances for a specific payment channel, it can also be used to take a broader picture of how liquidity is distributed in the network. The attacker can probe all the nodes in the network to discover how the liquidity is distributed over the network at a specific time. He can keep doing this periodically and compare each snapshot of the network to see where liquidity flowed.
Compare the two snapshots above and see if you can identify how much Alice paid Bob and which channels the payment passed through.
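Comparing two probing snapshots can be sketched in a few lines of Python. The balances below are hypothetical and do not reproduce the snapshots in the figure. The sketch also assumes that exactly one payment happened between the snapshots and that routing fees are zero, which real traffic would not guarantee.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[Tuple[str, str], int]  # (node_a, node_b) -> node_a's local balance in sat

# Hypothetical probing results taken at two points in time.
before: Snapshot = {
    ("Alice", "Carol"): 600_000,
    ("Carol", "Dave"): 400_000,
    ("Dave", "Bob"): 700_000,
    ("Carol", "Erin"): 300_000,
}
after: Snapshot = {
    ("Alice", "Carol"): 500_000,
    ("Carol", "Dave"): 300_000,
    ("Dave", "Bob"): 600_000,
    ("Carol", "Erin"): 300_000,
}


def balance_shifts(old: Snapshot, new: Snapshot) -> Snapshot:
    """Return how much node_a's balance dropped in each channel that changed."""
    return {ch: old[ch] - new[ch] for ch in old if old[ch] != new[ch]}


def infer_path(shifts: Snapshot) -> List[str]:
    """Chain the shifted channels into a single sender-to-receiver path."""
    hops = {}
    for (a, b), delta in shifts.items():
        src, dst = (a, b) if delta > 0 else (b, a)  # funds moved src -> dst
        hops[src] = dst
    start = next(node for node in hops if node not in hops.values())
    path = [start]
    while path[-1] in hops:
        path.append(hops[path[-1]])
    return path


shifts = balance_shifts(before, after)
print(infer_path(shifts), set(shifts.values()))
```

Here every shifted channel moved by the same amount, so the attacker would conclude that Alice paid Bob that amount through Carol and Dave, while the untouched Carol-Erin channel is ruled out.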
The bright side is that, as the network scales, this kind of attack becomes more difficult and expensive to achieve, as more nodes will have to be probed in the same amount of time.
HTLCs require that all nodes in the route use the same payment hash. If an attacker controls more than one node in the route, it’ll know that the payment being routed is the same payment just by comparing the payment hash. In certain scenarios, this can be enough to correlate the payment to the sender and receiver.
One such scenario is when both sender and receiver have only one connection each, and this connection is to a node controlled by the attacker, as shown in the image below. If the attacker knows that these nodes only have one connection, the payment must be coming from and going to them.
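The matching step itself is simple, as this Python sketch suggests. It assumes the attacker keeps a log of forwarded HTLCs on each of two nodes it controls. The log format, field names, peers, and amounts are all hypothetical. The only protocol fact relied on is that the payment hash is the SHA-256 of the payment preimage and is identical at every hop.

```python
import hashlib
from typing import Dict, List, Tuple


def payment_hash(preimage: bytes) -> str:
    """HTLCs commit to SHA-256(preimage); the same hash is used on every hop."""
    return hashlib.sha256(preimage).hexdigest()


HASH_A = payment_hash(b"hypothetical preimage A")
HASH_B = payment_hash(b"hypothetical preimage B")

# Hypothetical forwarding logs from two attacker-controlled nodes.
entry_node_log: List[Dict] = [
    {"payment_hash": HASH_A, "prev_peer": "Alice", "amount_sat": 50_000},
    {"payment_hash": HASH_B, "prev_peer": "Carol", "amount_sat": 20_000},
]
exit_node_log: List[Dict] = [
    {"payment_hash": HASH_A, "next_peer": "Bob", "amount_sat": 49_990},
]


def correlate(entry_log: List[Dict], exit_log: List[Dict]) -> List[Tuple[str, str, int]]:
    """Join both logs on payment hash: (peer before entry, peer after exit, amount)."""
    by_hash = {entry["payment_hash"]: entry for entry in entry_log}
    links = []
    for outgoing in exit_log:
        incoming = by_hash.get(outgoing["payment_hash"])
        if incoming is not None:
            links.append((incoming["prev_peer"], outgoing["next_peer"],
                          outgoing["amount_sat"]))
    return links


print(correlate(entry_node_log, exit_node_log))
```

If Alice and Bob are known to have a single channel each, the matched pair is the sender and receiver rather than just two more hops in a longer route.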
Mobile wallets are the most vulnerable to this kind of attack because of the unique way that they operate in the network in comparison with lightning nodes hosted in servers:
- They don’t have stable IP addresses;
- They are offline most of the time;
- They are not routing payments.
Therefore, it’s quite easy for direct peers to discover that they are connected to mobile wallets. When the mobile wallet sends a payment for the peer to route, the peer knows that the mobile wallet is the sender. The same is valid for receivers: when a peer routes to another peer which is known to be a mobile wallet, it knows that this peer is the receiver.
If both the sender’s direct peer and the receiver’s direct peer are controlled by the attacker, the attacker knows who the sender is, who the receiver is, and also the amount of the payment being made.
It is typical for mobile wallets to maintain a single connection to a Lightning Service Provider (LSP). As a result, should an attacker assume the role of the LSP, and the transaction in question involves nodes that are directly connected to it, the correlation of the payment to the specific nodes in question becomes a relatively simple task, as it doesn’t require the attacker to control more nodes besides the LSP.
Another way to correlate payments is by measuring how much time a payment took and, based on what the node knows about the topology of the network, making assumptions about who the sender and the receiver are.
This attack begins with data collection. The attacker sends a batch of fake payments to different lightning nodes at different degrees of separation. It then measures the average time that each of these payments takes and saves this data for later comparison. When the attacker receives a payment to route, it starts a clock, forwards the payment, and stops the clock when it receives the “receipt” for the payment. It then compares the measured time with the data it collected to make assumptions about who the sender and the receiver of that payment are.
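The comparison step can be sketched in Python as a nearest-match lookup against the calibration data. All latency figures below are hypothetical, and the sketch assumes that latency grows with the number of hops clearly enough for the averages to be told apart, which real network jitter may not allow.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical calibration data: hops to destination -> round-trip times (ms)
# measured by the attacker with fake payments.
calibration_ms: Dict[int, List[float]] = {
    1: [110, 95, 105],
    2: [210, 190, 205],
    3: [330, 300, 310],
}


def build_profile(samples: Dict[int, List[float]]) -> Dict[int, float]:
    """Average the measured times for each hop distance."""
    return {hops: mean(times) for hops, times in samples.items()}


def estimate_hops(observed_ms: float, profile: Dict[int, float]) -> int:
    """Pick the hop distance whose average time is closest to the observation."""
    return min(profile, key=lambda hops: abs(profile[hops] - observed_ms))


profile = build_profile(calibration_ms)

# Time between forwarding a real payment and seeing its "receipt" come back.
observed_ms = 215.0
print(estimate_hops(observed_ms, profile))
```

An estimated distance of two hops, combined with the public network graph, narrows the receiver down to the nodes that sit two hops past the attacker on the outgoing side.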
Researchers at the Technical University of Berlin published a study demonstrating that the leading nodes in the network can determine the origin and destination of a substantial proportion of payments, between 50% and 72%.
Privacy from Different Perspectives
The lightning network currently provides different levels of privacy for people sending and receiving payments. BOLT 11 and BOLT 4, which specify how payments are encoded and how routing works, respectively, can be harmful to the privacy of the person receiving the payment. Let’s explore some of the guarantees and nuances for both senders and receivers.
Senders have better privacy than receivers. Their identity is protected by onion routing, meaning that nodes along the path don’t know who the sender is. But, as discussed earlier, there are techniques that can be used by an attacker to de-anonymize the sender of a payment.
Yet, complex attacks are not the only way to de-anonymize the sender. There’s always the possibility that the sender shoots himself in the foot. One way to do that is to send a one-hop payment from a node with only one channel.
As previously mentioned, when the sender only has one channel, a two-hop payment can also be easily linked to the sender by the routing node. This is a typical scenario for mobile lightning wallets, which often only have one payment channel to an LSP.
Senders may jeopardize their privacy when sending payments if they’re not careful, but receivers are even more exposed. It all begins with the invoice, which has the public key of the receiver embedded in it. Everyone with access to an invoice can easily discover the node associated with it.
A decoded invoice from lightningdecoder.com
If the receiver is being paid through an unannounced channel, they will also need to embed routing hints in the invoice. These hints include the short channel ID of the unannounced payment channel, so invoices can also leak UTXO data about an unannounced channel.
There is still work to do in order to preserve users’ privacy in the lightning network. Almost every issue listed in this article has at least one theorized mitigation, and some are being implemented as you read this. Let’s go over some of them:
- Schnorr signatures and MuSig2 can make lightning channels indistinguishable from any normal single-signature output in the cooperative case.
- PTLCs (Point Time-Locked Contracts) enable routing nodes to use different secrets when routing the same payments, which fixes the issue with attackers identifying payments by matching secrets seen by more than one node they control.
- Multi-Path-Payments, which are already the default in some implementations, can split a payment into smaller chunks and route it through different routes, making it more difficult to use probing snapshots to determine senders and receivers.
- Alternative routing techniques such as Rendezvous, Route Blinding, and Trampoline routing can be used to protect the privacy of the receiver, enabling them to safely request payments without exposing their public keys in the invoice.
- Other routing schemes such as Public Key Routing can protect the privacy of unannounced channels by avoiding the use of the short channel id in routing hints when receiving through a private channel.
- Random time delays between hops can be used to break the heuristics of the timing attacks. This is controversial since speed is one of the premises of the network, but given this can be implemented as an opt-in feature, users can use it as they wish.
All items listed here deserve a post of their own. Also, keep in mind that this list only scratches the surface of proposed mitigations. The pace of innovation in the lightning network is fast, and it’s difficult to keep up with it. The key takeaway from this section is that developers are aware of the privacy pitfalls and are working on solutions as you read this.
The lightning network still falls short of the words of Eric Hughes in “A Cypherpunk’s Manifesto”. While at first glance lightning seems more private than layer one payments by taking payments off-chain, one has to take into consideration how lightning enables these off-chain payments in the first place to understand how users’ privacy might be affected when using the network.
Nodes have to be able to find each other over the internet to route payments. The ones that choose to use their public IP addresses over Tor reveal sensitive information such as the ISP and the approximate location of the node.
Because the lightning network is built on top of Bitcoin’s blockchain, data can leak between those layers and reveal valuable information to attackers. Funding and channel close transactions can be used to link UTXOs to lightning nodes. Not even private channels are protected from this leakage, as sometimes they publish information on-chain that reveals that the transaction was in fact related to lightning network activity.
When looking only at lightning as a closed system, it’s possible to correlate payments to senders and receivers in different ways. An attacker can control multiple nodes in the network and use seen payment hashes to try to link payments to senders and receivers. Attackers can also use probing snapshots to determine where funds flowed to in the network over a period of time. Timing attacks also make it possible to link payments by measuring how long a payment took and analyzing it against the known topology of the network and average delays between hops.
Mobile users are even more exposed because of the way these nodes operate in the network. They’re offline most of the time, and commonly also have one payment channel open with an LSP, which makes it trivial for the LSP to identify senders and receivers if they are directly connected to them.
Senders and receivers have different privacy guarantees. Receivers are more exposed because of how invoices work: their public key is directly embedded in them. But senders can shoot themselves in the foot if they’re not careful.
Does this mean that the lightning network was a mistake? No. Developers are aware of privacy pitfalls and are working on solutions. The upside is that the protocol was designed so it can be easily upgraded with opt-in features, so there’s plenty of room for innovation. It’s also important for the community to be aware of the privacy pitfalls, not only to protect their privacy but to create an incentive for developers to work on these solutions.
By the definition of privacy used in this assessment, it’s fair to say that the Lightning Network is not very private as it is right now. Ultimately, the privacy you get from using the lightning network will depend on how you use it. Do you use Tor? Do you send more than you receive? Are your lightning channels linked to KYC’d Bitcoin? Do you run your own lightning node? You can use the information in this post to enhance your privacy. Power users who are aware of the risks will always have better privacy than regular users.