Leftover Hash Lemma

Recently, I’ve been investigating computational notions of entropy and had to use the Leftover Hash Lemma (a variant of which is stated below). I first encountered the lemma several years ago but didn’t have a use for it… until now!

The lemma is attributed to Impagliazzo, Levin, and Luby [1]. A corollary of the lemma is that one can convert a source with high enough Rényi entropy into a distribution that is close to uniform. Before stating the lemma, I’ll discuss a few different notions of entropy: the classic Shannon entropy, min-entropy, max-entropy, and so on. See [2] for many applications of these entropy measures.

Entropy Measures

Consider a random variable X. We use x\xleftarrow{R} X to denote that the element x is drawn according to the distribution of X, and we denote the support of X by \text{Supp}(X). Define the sample entropy of x with respect to X as H_X(x) = \log\frac{1}{\mathbb{P}[X=x]}, with the convention that H_X(x) = \infty when x\notin\text{Supp}(X). The sample entropy measures how much randomness is present in the sample x when it is generated according to the distribution of X. We can now state the entropy measures in terms of the sample entropy (a small code sketch follows the list below):

  • Shannon Entropy: H(X) = \mathbb{E}_{x\xleftarrow{R} X}[H_X(x)]
  • Min-Entropy: H_\infty(X) = \min_{x\in\text{Supp}(X)}H_X(x)
  • Rényi Entropy (of order 2, a.k.a. collision entropy): H_2(X) = -\log\sum_{x\in\text{Supp}(X)}\mathbb{P}[X=x]^2
  • Max-Entropy: H_0(X) = \log(|\text{Supp}(X)|)
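
To make these definitions concrete, here is a small Python sketch (my own illustrative code; the function names and the example distribution are mine, and all logarithms are taken base 2) that computes each measure for a distribution given as a dictionary of probabilities:

```python
import math

def sample_entropy(p, x):
    """H_X(x) = log2(1 / P[X = x]); infinite outside the support."""
    return math.inf if p.get(x, 0.0) == 0.0 else -math.log2(p[x])

def shannon_entropy(p):
    """H(X): expectation of the sample entropy under p."""
    return sum(px * -math.log2(px) for px in p.values() if px > 0)

def min_entropy(p):
    """H_inf(X): smallest sample entropy over the support."""
    return -math.log2(max(px for px in p.values() if px > 0))

def renyi2_entropy(p):
    """H_2(X): minus the log of the collision probability sum_x p(x)^2."""
    return -math.log2(sum(px * px for px in p.values() if px > 0))

def max_entropy(p):
    """H_0(X): log2 of the support size."""
    return math.log2(sum(1 for px in p.values() if px > 0))

# A biased coin: P[X = 0] = 3/4, P[X = 1] = 1/4.
p = {0: 0.75, 1: 0.25}
print(min_entropy(p), renyi2_entropy(p), shannon_entropy(p), max_entropy(p))
# roughly 0.415, 0.678, 0.811, 1.0
```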

How should we interpret these measures? The min-entropy is a worst-case measure of how “random” a random variable is: it is determined entirely by the most likely outcome. The Rényi entropy, intuitively, measures how “collision-resistant” a random variable is: 2^{-H_2(X)} is exactly the probability that two independent samples of X collide (think hash functions). In my opinion, max-entropy does not give much information beyond how large the support of the random variable is. These entropy measures are related by the following chain of inequalities:

H_\infty(X)\leq H_2(X) \leq H(X) \leq H_0(X)
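
As a quick sanity check (my own toy example, with logarithms base 2), take X with \mathbb{P}[X=0]=3/4 and \mathbb{P}[X=1]=1/4: then H_\infty(X)=\log\frac{4}{3}\approx 0.415, H_2(X)=-\log\left(\frac{9}{16}+\frac{1}{16}\right)\approx 0.678, H(X)\approx 0.811, and H_0(X)=\log 2=1, so the chain holds with every inequality strict, matching the output of the code sketch above.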

Every inequality above holds with equality if and only if X is uniformly distributed on its support. The statement of the lemma below uses two-universal hash functions. Here is a definition:

A function family \mathcal{H} = \{h:\mathcal{D}\to\mathcal{R}\} is two-universal if \forall x\neq x'\in\mathcal{D}, the following holds: \mathbb{P}_{h\xleftarrow{R}\mathcal{H}}[h(x) = h(x')]\leq 1/|\mathcal{R}|.
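
A standard example of such a family (my own choice for illustration; any two-universal family works for the lemma) is the set of GF(2)-linear maps h_A(x) = Ax for an m\times n bit matrix A: for any x\neq x', \mathbb{P}_A[Ax = Ax'] = \mathbb{P}_A[A(x\oplus x') = 0] = 2^{-m}. Here is a rough sketch, with n-bit strings represented as Python integers:

```python
import random

def random_hash(n, m):
    """Sample h uniformly from the two-universal family {x -> A.x over GF(2)},
    where A is a uniformly random m x n bit matrix (stored as m row bitmasks)."""
    rows = [random.getrandbits(n) for _ in range(m)]
    def h(x):
        # Output bit i is the parity (GF(2) inner product) of row i and x.
        return sum((bin(rows[i] & x).count("1") & 1) << i for i in range(m))
    return h

# Empirical check of two-universality for one fixed pair x != x'.
n, m, trials = 16, 4, 20_000
x, x_prime = 0b1010101010101010, 0b0000111100001111
collisions = 0
for _ in range(trials):
    h = random_hash(n, m)
    if h(x) == h(x_prime):
        collisions += 1
print(collisions / trials, "<= 1/|R| =", 1 / 2 ** m)  # both roughly 0.0625
```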

The Lemma

Statement: Let X be a random variable over \{0, 1\}^n with H_2(X)\geq k, and let \mathcal{H} = \{h:\{0, 1\}^n\to\{0, 1\}^m\} be a two-universal function family. Then, for H drawn uniformly at random from \mathcal{H}, the statistical distance between (H, H(X)) and (H, \mathcal{U}_m) is at most \frac{1}{2}\cdot 2^{(m-k)/2}, where \mathcal{U}_m denotes the uniform distribution over \{0, 1\}^m.

One can interpret the statement above as saying that a two-universal hash function converts a random variable with high enough Rényi entropy into one that is very close to uniform, even given the choice of hash function: as long as the output length m is sufficiently smaller than the entropy k, the bound \frac{1}{2}\cdot 2^{(m-k)/2} is tiny.
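
To see the lemma in action, here is a toy experiment (my own sketch; the parameters n = 12, k = 8, m = 4 and the use of the GF(2)-linear family above are arbitrary choices): take X uniform over a random set of 2^k strings, so that H_2(X) = k exactly, and estimate the statistical distance of (H, H(X)) from (H, \mathcal{U}_m), which equals the average, over the choice of hash function, of the distance of H(X) from uniform. The printed estimate should come out below the lemma’s bound of \frac{1}{2}\cdot 2^{(m-k)/2} = 0.125.

```python
import random

def random_hash(n, m):
    """h(x) = A.x over GF(2) for a uniformly random m x n bit matrix A (as above)."""
    rows = [random.getrandbits(n) for _ in range(m)]
    return lambda x: sum((bin(rows[i] & x).count("1") & 1) << i for i in range(m))

n, k, m = 12, 8, 4                  # input length, Renyi entropy, output length
bound = 0.5 * 2 ** ((m - k) / 2)    # the lemma's bound: (1/2) * 2^((m-k)/2) = 0.125

# X is uniform over a random subset S of {0,1}^n with |S| = 2^k, so H_2(X) = k.
support = random.sample(range(2 ** n), 2 ** k)

def dist_from_uniform(h):
    """Exact statistical distance between h(X) and the uniform distribution on {0,1}^m."""
    counts = [0] * (2 ** m)
    for x in support:
        counts[h(x)] += 1
    return 0.5 * sum(abs(c / len(support) - 1 / 2 ** m) for c in counts)

# SD((H, H(X)), (H, U_m)) equals the average of SD(h(X), U_m) over the choice of h,
# so estimate it by averaging over a few hundred sampled hash functions.
hashes = [random_hash(n, m) for _ in range(300)]
estimate = sum(dist_from_uniform(h) for h in hashes) / len(hashes)
print(f"estimated distance {estimate:.4f} vs bound {bound:.4f}")
```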

References

[1] Russell Impagliazzo, Leonid A. Levin, and Michael Luby. Pseudo-random Generation from One-way Functions. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC), 1989.

[2] Iftach Haitner and Salil Vadhan. The Many Entropies in One-Way Functions. In Tutorials on the Foundations of Cryptography, pages 159–217. Springer International Publishing, 2017.
