$$ \renewcommand{\textsc}[1]{\style{font-variant-caps: small-caps}{\text{#1}}} \renewcommand{\textsf}[1]{\style{font-family: "Linux Biolinum"}{\text{#1}}} \renewcommand{\textit}[1]{\style{font-family: "Linux Libertine"; font-style: italic}{\text{#1}}} \newcommand{\Set}[1]{\left\{#1\right\}} \newcommand{\s}[1]{\texttt{#1}} \newcommand{\Len}[1]{\left|#1\right|} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\ENC}{\textsf{Enc}} \newcommand{\DEC}{\textsf{Dec}} \newcommand{\GEN}{\textsf{Gen}} \newcommand{\SIGN}{\textsf{Sign}} \newcommand{\VER}{\textsf{Ver}} \let\MAC\relax \newcommand{\MAC}{\textsf{Mac}} \newcommand{\Gen}[1]{\GEN\!\left(#1\right)} \newcommand{\Enc}[2]{\ENC_{#1}\!\left(#2\right)} \newcommand{\Dec}[2]{\DEC_{#1}\!\left(#2\right)} \newcommand{\Mac}[2]{\MAC_{#1}\!\left(#2\right)} \newcommand{\Sign}[2]{\SIGN_{#1}\!\left(#2\right)} \newcommand{\Ver}[2]{\VER_{#1}\!\left(#2\right)} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\FF}{\mathbb{F}} \newcommand{\OO}{\mathcal{O}} \newcommand{\gen}[1]{\left\langle #1 \right\rangle} \newcommand{\pk}{\Id{pk}} \newcommand{\sk}{\Id{sk}} \newcommand{\Id}[1]{\textit{#1}} \DeclareMathOperator{\ord}{ord} \let\E\relax \DeclareMathOperator*{\E}{E} \DeclareMathOperator*{\Var}{Var} \newcommand{\concat}{\mathrel{\hspace{1pt}||\hspace{1pt}}} \renewcommand{\P}{\textsf{P}} \renewcommand{\NP}{\textsf{NP}} \renewcommand{\RP}{\textsf{RP}} \renewcommand{\BPP}{\textsf{BPP}} \renewcommand{\ZPP}{\textsf{ZPP}} \renewcommand{\gets}{:=} \newcommand{\getsrandom}{\mathrel{\hspace{1pt}\leftarrow\hspace{1pt}}} \newcommand{\isequal}{\mathrel{=\hspace*{0.1pt}=}} \newcommand{\BitAnd}{\mathbin{\&}} \newcommand{\BitOr}{\mathbin{|}} \newcommand{\shl}{\ll} \newcommand{\shr}{\gg} \newcommand{\sra}{\ggg} \newcommand{\BitNeg}{\mathord{\sim}} \definecolor{tomred}{RGB}{200, 40, 41} $$

Perfect secrecy

Relevant readings: Katz and Lindell 2.1–2.3

The notion of perfect secrecy

We will now attempt to formalize our definition that we formulated in the previous lecture: “Regardless of information an attacker already has, a ciphertext should leak no additional information about the underlying plaintext.”

We first need to tweak our syntax of encryption defined from before by explicitly allowing the encryption algorithm to be randomized (meaning \(\Enc{k}{m}\) might output a different ciphertext when run multiple times), and we write \(c \mathrel{\hspace{1pt}\leftarrow\hspace{1pt}}\Enc{k}{m}\) to denote the possibly probabilistic process of encrypting \(m\) with key \(k\) to get \(c\). On the other hand, if is deterministic, we may write it as \(c :=\Enc{k}{m}\).

Under these changes, the invariant for correctness is now: $$\Pr\left[ \Dec{k}{\Enc{k}{m}} = m \right] = 1$$ for every key \(k \in \mathcal{K}\) and every message \(m \in \mathcal{M}\). This implies that we may assume \(\DEC\) is deterministic since it must give the same output every time it is run, thus we write \(m :=\Dec{k}{c}\) to denote the deterministic process of decrypting \(c\) with key \(k\) to get \(m\).

Let \(K\) be the random variable denoting the value of the key output by . We also let \(M\) be the random variable denoting the message being encrypted, so \(\Pr[M = m]\) denotes the probability that the message takes on the value \(m \in \mathcal{M}\).

Example

Let’s say the enemy knows that Caesar likes to fight in the rain and it is raining today, and the message will be either be stand by or attack now. In fact, the enemy also knows that with probability \(0.7\) the message will be a command to attack and with probability \(0.3\) the message will be a command to stand by, so we have $$\Pr[M = \texttt{attack now}] = 0.7 \text{ and } \Pr[M = \texttt{stand by}] = 0.3.$$

Now, suppose that Caesar sends \(c :=\Enc{k}{m}\) to his generals and the enemy has calculated the following: $$\Pr[M = \texttt{attack now} \mid C = c] = 0.8 \text{ and } \Pr[M = \texttt{stand by} \mid C = c] = 0.2.$$ Did the attacker learn anything useful?

Suppose an adversary knows the likelihood that different messages will be sent (in other words, the probability distribution of \(\mathcal{M}\)), and also the encryption scheme being used, but not the key. For a scheme to be perfectly secret, observing this ciphertext should have no effect on the adversary’s knowledge regarding the actual message that was sent. In other words, the adversary learns absolutely nothing about the plaintext that was encrypted.

Definition

[def:perf_sec] An encryption scheme \(\Pi = (\GEN{}, \ENC{}, \DEC{})\) with message space \(\mathcal{M}\) is perfectly secret if for every probability distribution over \(\mathcal{M}\), every message \(m \in \mathcal{M}\) and every ciphertext \(c \in \mathcal{C}\) for which \(\Pr[C = c] > 0\), $$\Pr[M = m \mid C = c] = \Pr[M = m].$$

Perfect indistinguishability

There is another equivalent definition of perfect secrecy, which relies on a game involving an adversary passively observing a ciphertext and then trying to guess which of two possible messages was encrypted.

Consider the following game:

  1. An adversary \(\mathcal A\) first specifies two arbitrary messages \(m_{\texttt{0}}, m_{\texttt{1}} \in \mathcal M\).

  2. A key \(k\) is generated using \(\GEN\).

  3. One of the two messages specified by \(\mathcal A\) is chosen (each with probability \(1/2\)) and encrypted using \(k\); the resulting ciphertext is given to \(\mathcal A\).

  4. \(\mathcal A\) outputs a “guess” as to which of the two messages was encrypted.

  5. Then \(\mathcal A\) wins the game if it guesses correctly.

Then we say that an encryption scheme is perfectly indistinguishable if no adversary \(\mathcal A\) can succeed with probability better than \(1/2\). Note that, for any encryption scheme, \(\mathcal A\) can succeed with probability \(1/2\) by outputting a uniform guess; the requirement is simply that no attacker can do any better than that.

More formally, for an encryption scheme \(\Pi = (\GEN, \ENC, \DEC)\) we define the IND game (which is the same game described above) as follows:

  1. The adversary \(\mathcal A\) outputs a pair of messages \(m_{\texttt{0}}, m_{\texttt{1}} \in \mathcal M\).

  2. A key \(k\) is generated using , and a uniform bit \(b \in \Set{\texttt{0}, \texttt{1}}\) is chosen. A ciphertext \(c \mathrel{\hspace{1pt}\leftarrow\hspace{1pt}}\Enc{k}{m_b}\) is computed and given to \(\mathcal A\). We refer to \(c\) as the challenge ciphertext.

  3. \(\mathcal A\) outputs a bit \(b^\prime\).

  4. \(\mathcal A\) wins if \(b^\prime = b\).

Definition

[def:perf_ind] An encryption scheme \(\Pi = (\GEN, \ENC, \DEC)\) is perfectly indistinguishable if for every adversary \(\mathcal{A}\), $$\Pr\left[\text{$\mathcal{A}$ wins the IND game}\right] = \tfrac{1}{2}.$$

The following lemma states that Definition [def:perf_ind] is equivalent to Definition [def:perf_sec], but we won’t prove it here:

Lemma

An encryption scheme \(\Pi\) is perfectly secret if and only if it is perfectly indistinguishable.

One-time pad

The one-time pad (OTP) is an encryption technique created by Gilbert Vernam in 1919. At the time of invention, it was not yet proven that it was perfectly secret until about 25 years later when Claude Shannon introduced the definition of perfect secrecy and demonstrated that the one-time pad satisfied that definition.

Let \(a \oplus b\) be the bitwise exclusive-or (XOR) of two binary strings \(a\) and \(b\) of equal length, so if \(a = a_1\cdots a_n\) and \(b_1 \cdots b_n\) are both \(n\)-bit strings, then \(a \oplus b\) is another \(n\)-bit string given by $$a \oplus b = \left(a_1 \oplus b_1\right)\cdots \left(a_n \oplus b_n\right).$$ Here, we use the variable \(n\) to denote the length of a secret key in an encryption scheme. We can use the following properties of the XOR operation to make our lives easier: $$\begin{align} x \oplus x &= \texttt{000}\cdots & \text{XOR'ing a string with itself results in zeroes} \\ x \oplus \texttt{000}\cdots &= x & \text{XOR'ing with zeroes has no effect} \\ x \oplus \texttt{111}\cdots &= \neg x & \text{XOR'ing with ones flips every bit} \\ x \oplus y &= y \oplus x & \text{XOR is commutative} \\ (x \oplus y) \oplus z &= x \oplus (y \oplus z) & \text{XOR is associative}\end{align}$$

The corresponding algorithms for the one-time pad are defined as follows:

  • \( \GEN \): Given \(n\), pick a random \(k \getsrandom \Set{\texttt{0}, \texttt{1}}^n\), then return \(k\).

  • \( \ENC \): Given \(m\) and \(k\), return \(k \oplus m\).

  • \( \DEC \): Given \(c\) and \(k\), return \(k \oplus c\).

Recall that \(k \mathrel{\hspace{1pt}\leftarrow\hspace{1pt}}\Set{\texttt{0}, \texttt{1}}^n\) means that a random \(n\)-bit string is chosen uniformly from the set, so each key is chosen with probability \(2^{-n}\).

Theorem

For all \(k, m \in \Set{\texttt{0}, \texttt{1}}^n\), \(\Dec{k}{\Enc{k}{m}} = m\).

Proof. Simply substitute the definitions of and and use the properties mentioned above. For all \(k, m \in \Set{\texttt{0}, \texttt{1}}^n\), we have $$\begin{align} \Dec{k}{\Enc{k}{m}} &= \Dec{k}{k \oplus m} \\ &= k \oplus (k \oplus m) \\ &= (k \oplus k) \oplus m \\ &= \texttt{0}^n \oplus m \\ &= m. \end{align}$$ ◻

Using Definition [def:perf_sec], we now try to show that the one-time pad is perfectly secret:

Theorem

The one-time pad is perfectly secret.

Proof. We first compute \(\Pr[C = c \mid M = m]\) for arbitrary \(c \in \mathcal{C}\) and \(m \in \mathcal{M}\) with \(\Pr[M = m] > 0\). For the one-time pad, we have $$\begin{align} \Pr[C = c \mid M = m] &= \Pr[K \oplus m = c \mid M = m] & \text{(by definition)} \\ &= \Pr[K = m \oplus c \mid M = m] \\ &= 2^{-n}. \end{align}$$ where the final equality holds because \(K\) is a uniform \(n\)-bit string independent of \(M\). Fix any distribution over \(\mathcal{M}\). Then for any \(c \in \mathcal{C}\), we have $$\begin{align} \Pr[C = c] &= \sum_{m \in \mathcal{M}} \Pr[C = c \mid M = m] \cdot \Pr[M = m] & \text{(law of total probability)} \\ &= 2^{-n} \cdot \sum_{m \in \mathcal{M}} \Pr[M = m] & \text{(from above result)} \\ &= 2^{-n} & \text{(second axiom of probability)} \end{align}$$ where \(\Pr[M = m] \ne 0\). Bayes’ theorem gives us: $$\begin{align} \Pr[M = m \mid C = c] &= \frac{\Pr[C = c \mid M = m] \cdot \Pr[M = m]}{\Pr[C = c]} \\ &= \frac{2^{-n} \cdot \Pr[M = m]}{2^{-n}} \\ &= \Pr[M = m]. \end{align}$$ Therefore the one-time pad is perfectly secret. ◻

Although the one-time pad is perfectly secret, it is rarely used for practical applications, one of main reasons being the key should be as long as the message. This is a problem when sending very long messages, especially when it cannot be predicted in advance how long the message will be.

As the name suggests, the one-time pad is only be secure if used once. Let’s see what happens if two messages \(m, m^\prime\) are encrypted with the same key \(k\). Then Eve upon obtaining \(c = k \oplus m\) and \(c^\prime = k \oplus m^\prime\) can compute $$c \oplus c^\prime = (k \oplus m) \oplus (k \oplus m^\prime) = m \oplus m^\prime$$ and from there, Eve can learn where the two messages differ.

Limitations of perfect secrecy

The drawbacks of the one-time pad are not due to the one-time pad itself, but are inherent limitations of perfect secrecy. It can be proven that any perfectly secret encryption scheme must have a key space that is at least as large as the message space.