The Usefulness of “Useless” Knowledge (and Why AI Makes Flexner Even More Right)

I just finished reading The Usefulness of Useless Knowledge again, this time with the perspective of living through a period of rapid technological acceleration driven by AI. On an earlier reading, Flexner’s defense of curiosity-driven inquiry felt aspirational and almost moral in tone, a principled argument for intellectual freedom. On rereading, it feels more diagnostic. Many of the tensions he identified (i.e., between short-term utility and long-term understanding, between institutional incentives and genuine discovery) now play out daily in how we fund, evaluate, and deploy AI research. What has changed is not the structure of his argument, but its urgency: in a world increasingly optimized for immediate outputs, Flexner’s insistence that transformative advances often arise from questions with no obvious application reads less like an idealistic manifesto and more like a practical warning.

In 1939, on the eve of a world war, Abraham Flexner published a slim, stubbornly optimistic essay with a mischievous title: The Usefulness of Useless Knowledge. His claim is not that practical work is bad. It’s that the deep engine of civilization is often curiosity that doesn’t start with an application in mind, and that trying to force every idea to justify itself immediately is a reliable way to stop the next revolution before it begins.

Robbert Dijkgraaf’s companion essay (and related pieces written from his vantage point at the Institute for Advanced Study) updates Flexner’s argument for a world that is now built out of microelectronics, networks, and software; this is exactly the substrate on which modern AI sits. Reading them together today feels like watching two people describe the same phenomenon across two eras: breakthroughs are usually the delayed interest on “useless” questions.

Below is a guided tour of their core ideas, with a detour through the current AI moment, where “useless” knowledge is quietly doing most of the work.


Flexner’s central paradox: curiosity first, usefulness later

Flexner’s essay is a defense of a particular kind of intellectual freedom: the right to pursue questions without writing an ROI memo first.

Dijkgraaf highlights one of Flexner’s most quoted lines (and the one that best captures the whole stance): “Curiosity… is probably the outstanding characteristic of modern thinking… and it must be absolutely unhampered.”

That “must” is doing a lot of work. Flexner isn’t saying that applications are optional. He’s saying the route to them is often non-linear and hard to predict. He even makes the institutional point: a research institute shouldn’t justify itself by promising inventions on a timeline. Instead: “We make ourselves no promises… [but] cherish the hope that the unobstructed pursuit of useless knowledge” will matter later.

Notice the subtlety: he hopes it will matter, but he refuses to make that the official rationale. Why? Because if you only fund what looks useful today, you’ll underproduce the ideas that define tomorrow.


The “Mississippi” model of discovery (and why it matters for AI)

Flexner is unusually modern in how he describes the innovation pipeline: not as single geniuses striking gold, but as a long chain of partial insights that only later “click.”

He writes: “Almost every discovery has a long and precarious history… Science… begins in a tiny rivulet… [and] is formed from countless sources.”

This is basically an antidote to the myth that research can be managed like a factory. You can optimize a pipeline once you know what the pipeline is. But when you’re still discovering what questions are even coherent, “efficiency” often means “premature narrowing.”

AI is a perfect example of the Mississippi model. Modern machine learning is not one idea; it’s a confluence:

  • mathematical statistics + linear algebra,
  • optimization + numerical computing,
  • information theory + coding,
  • neuroscience metaphors + cognitive science,
  • hardware advances + systems engineering,
  • and now massive-scale data and infrastructure.

Much of that was, at some point, “not obviously useful” until it suddenly was.


Flexner’s warning: the real enemy is forced conformity

Flexner’s defense of “useless knowledge” is not only about technology; it’s about human freedom. He’s writing in a period when universities were being pushed into ideological service, and he argues that the gravest threat is not wrong ideas, but the attempt to prevent minds from ranging freely.

One of his sharpest lines: “The real enemy… is the man who tries to mold the human spirit so that it will not dare to spread its wings.”

If you read that in 2025, it lands uncomfortably close to modern pressures on research:

  • “Only fund what’s immediately commercial.”
  • “Only publish what’s trendy.”
  • “Only study what aligns with the current institutional incentive gradient.”
  • “Only build what can be shipped next quarter.”

And in AI specifically:

  • “Only do work that scales.”
  • “Only do benchmarks.”
  • “Only do applied product wins.”

Flexner isn’t anti-application; he’s anti-premature closure.


Dijkgraaf’s update: society runs on knowledge it can’t fully see anymore

Dijkgraaf’s companion essay takes Flexner’s stance and says, essentially: look around, Flexner won. The modern world is built out of the long tail of basic research.

He gives a crisp late-20th-century example: the World Wide Web began as a collaboration tool for particle physicists at CERN (proposed in 1989, made public in 1993). He ties that to the evolution of grid and cloud computing developed to handle scientific data, technology that now undergirds everyday internet services. Then he makes a claim that matters a lot for AI policy debates: fundamental advances are public goods (i.e., they diffuse beyond any single lab or nation). That’s an especially relevant lens for AI, where:

  • open ideas (architectures, optimization tricks, safety methods) propagate fast,
  • but compute, data, and deployment concentrate power.

If knowledge is a public good, then a society that starves basic research is quietly selling off its future, even if it still “uses” plenty of science in the present.


AI as a case study in “useful uselessness”

Here’s a helpful way to read Flexner in the age of AI:

A) “Useless” questions that became AI infrastructure

Many of the questions that shaped AI looked abstract or niche before they became inevitable:

  • How do high-dimensional models generalize?
  • When does overparameterization help rather than hurt?
  • What is the geometry of optimization landscapes?
  • How can representation learning capture structure without labels?
  • What are the limits of compression, prediction, and inference?

These don’t sound like product requirements. They sound like “useless” theory, until you realize they govern whether your model trains at all, whether it’s robust, whether it leaks private data, whether it can be aligned, and whether it fails safely.

Flexner’s point isn’t that every abstract question pays off. It’s that you can’t pre-identify the ones that will, and trying to do so narrows the search too early.

B) “Tool-making” is often the hidden payoff

Dijkgraaf emphasizes that pathbreaking research yields tools and techniques in indirect ways.
AI progress has been exactly this: tool-making (optimizers, architectures, pretraining recipes, eval frameworks, interpretability methods, privacy-preserving techniques) that later becomes the platform everyone builds on.

C) The scary twist: usefulness for good and bad

Flexner also notes that discoveries can become instruments of destruction when repurposed. He uses chemical and aviation examples to make the point.

AI has the same dual-use character:

  • The same generative model family can draft medical summaries or automate phishing.
  • The same computer vision advances can improve accessibility or expand surveillance.
  • The same inference tools can find scientific patterns or extract sensitive attributes.

Flexner’s framework doesn’t solve dual-use, but it forces honesty: the ethical challenge isn’t a reason to stop curiosity; it’s a reason to pair curiosity with governance, norms, and safeguards.


A Flexnerian reading of the current AI funding wave

We’re currently living through a paradox that Flexner would recognize instantly:

  1. AI is showered with investment because it’s visibly useful now.
  2. That investment creates pressure to define “research” as whatever improves next quarter’s metrics.
  3. But the next conceptual leap in AI may come from areas that look “useless” relative to today’s dominant paradigm.

If you want better long-horizon AI outcomes (robustness, interpretability, privacy, security, alignment, and scientific discovery), Flexner would argue you need institutions that protect inquiry that isn’t instantly legible as profitable.

Or in his words, you need “spiritual and intellectual freedom.”


What to do with this (three practical takeaways)

1) Keep a portfolio: fast product work + slow foundational work

Treat research like an ecosystem. If everything must justify itself immediately, you get brittle progress. Flexner’s “no promises” stance is a feature, not a bug.

2) Reward questions, not only answers

Benchmarks matter, but they can also overfit the field’s imagination. Some of the most important AI work right now is about re-framing the question (e.g., what counts as “understanding,” what counts as “alignment,” what counts as “privacy,” what counts as “truthfulness”).

3) Build institutions that protect intellectual risk

Flexner designed the Institute for Advanced Study around the idea that scholars “accomplish most when enabled” to pursue deep work with minimal distraction.
AI needs its own versions of that: spaces where the incentive is insight, not velocity.


AI is not an argument against Flexner (it’s his exhibit A)

If you hold a smartphone, use a search engine, or interact with modern AI systems, you’re touching the compounded returns of yesterday’s “useless” knowledge.

Flexner’s defense isn’t sentimental. It’s strategic: a society that wants transformative technology must also want the conditions that produce it: freedom, patience, and room for ideas that don’t yet know what they’re for. Or, as Dijkgraaf puts it in summarizing Flexner’s view: fundamental inquiry goes to the “headwaters,” and applications follow, slowly, steadily, and often surprisingly.


Main Source: https://www.ias.edu/ideas/2017/dijkgraaf-usefulness

Statistical Zero-Knowledge Proofs

How would you prove you’ve solved a Sudoku puzzle without revealing the solution? You can construct a zero-knowledge proof showing the grid satisfies the Sudoku rules (each digit appears exactly once in every row, column, and box) without exposing a single cell.

This semester, one of my projects involves zero-knowledge proofs. I’ll try to explain what I’ve learned about this amazing concept and its variants (with particular attention to statistical zero-knowledge). Shout out to Boaz’s cryptography class. Zero-knowledge proofs have found profound use in blockchain technology, authentication, privacy, and so on.

Definition

Intuition: Imagine someone wants to prove they know the solution to a complex problem (e.g., a puzzle or how Trump was going to win the election) without revealing the solution. They use a process that convinces the verifier they have the solution without showing it.

A zero-knowledge proof (ZKP) is a method by which one party (the prover) can demonstrate to another party (the verifier) that a specific statement is true without revealing any additional information about the statement itself.

Key Properties of Zero-Knowledge Proofs

Zero-knowledge comes in a few variants (compared in the table below), but all of them share three key properties:

  1. Completeness:
    If the statement is true, an honest prover can convince the verifier of its truth:
    • If x \in L and the prover P knows a valid witness w, then the honest verifier V accepts the proof with probability at least 1 - \epsilon, where \epsilon is negligible: \Pr[V(x) = \text{accept} \mid x \in L, w \text{ valid}] \geq 1 - \epsilon.
  2. Soundness:
    If the statement is false, no dishonest prover can convince the verifier that it is true (except with an extremely small probability).
    • If x \not\in L, then no cheating prover P^* can convince the honest verifier V to accept, except with negligible probability: \Pr[V(x) = \text{accept} \mid x \not\in L] \leq \epsilon, where \epsilon is a negligible function.
  3. Zero-Knowledge:
    The verifier learns nothing other than the fact that the statement is true. No information about how or why the statement is true is revealed.
    • For every polynomial-time verifier V^*, there exists a polynomial-time simulator S such that the output of S(x) is computationally indistinguishable from the interaction between P and V^* on input x: \{S(x)\}_{x \in L} \approx \{\text{Transcript}(P \leftrightarrow V^*, x)\}_{x \in L}.

| Type | Definition | Guarantee |
| --- | --- | --- |
| Perfect Zero-Knowledge | Real and simulated distributions are identical. | Holds even against computationally unbounded verifiers. |
| Statistical Zero-Knowledge | Real and simulated distributions are statistically close (negligible difference). | Holds against computationally unbounded verifiers. |
| Computational Zero-Knowledge | Real and simulated distributions are computationally indistinguishable for polynomial-time verifiers. | Holds only against computationally bounded verifiers. |

Interactive Zero-Knowledge Proofs

These involve a back-and-forth interaction between the prover and the verifier.

  • Graph Isomorphism:
    Prove that two graphs are isomorphic (structurally identical) without revealing the isomorphism itself. Alice proves to Bob that she knows a way to relabel the nodes of graph A to match graph B. (A minimal sketch of one round follows after this list.)
  • Hamiltonian Cycle Problem:
    Prove that a graph contains a Hamiltonian cycle (a path visiting every vertex exactly once) without revealing the actual cycle.
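
To make the interaction concrete, here is a minimal Python sketch of one round of the graph-isomorphism protocol. The names (permute, zk_round) and the toy instance are my own illustrations, and a real protocol would have the prover commit to H (e.g., with a hash commitment) before seeing the challenge; this honest-verifier sketch skips that step for brevity.

```python
import random

# One round of the graph-isomorphism ZK protocol (honest-verifier sketch).
# Graphs are sets of edges (u, v) over vertices 0..n-1, and pi is the
# prover's secret isomorphism: (u, v) in G1  <=>  (pi[u], pi[v]) in G2.

def permute(graph, sigma):
    """Relabel each edge's endpoints according to sigma (order-insensitive)."""
    return frozenset(frozenset((sigma[u], sigma[v])) for u, v in graph)

def zk_round(G1, G2, pi, n):
    # Prover: send H, a fresh random relabeling of G1. (A real protocol
    # would send a commitment to H before seeing the challenge.)
    sigma = list(range(n))
    random.shuffle(sigma)
    H = permute(G1, sigma)

    # Verifier: challenge with a random bit.
    b = random.randint(0, 1)

    # Prover: reveal a permutation tau mapping the challenged graph to H.
    if b == 0:
        tau = sigma                                 # G1 -> H directly
    else:
        inv_pi = [0] * n                            # invert pi: G2 -> G1
        for u in range(n):
            inv_pi[pi[u]] = u
        tau = [sigma[inv_pi[u]] for u in range(n)]  # G2 -> G1 -> H

    # Verifier: accept iff tau really maps the challenged graph to H.
    challenged = G1 if b == 0 else G2
    return permute(challenged, tau) == H

# Toy instance: G2 is G1 with vertices relabeled by pi.
G1 = {(0, 1), (1, 2), (2, 0), (2, 3)}
pi = [1, 2, 3, 0]
G2 = {(pi[u], pi[v]) for u, v in G1}
assert all(zk_round(G1, G2, pi, n=4) for _ in range(20))
```

A cheating prover who knows no isomorphism can answer at most one of the two possible challenges, so each round catches cheating with probability 1/2; repeating k independent rounds drives the soundness error down to 2^{-k}.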

Non-Interactive Zero-Knowledge Proofs (NIZKs)

These eliminate the need for interaction, enabling the prover to generate a single proof that can be verified multiple times.

  • zk-SNARKs (Succinct Non-Interactive Arguments of Knowledge):
    Widely used in blockchain systems like Zcash to validate transactions while keeping them private. Example: Prove that a transaction is valid (inputs equal outputs) without disclosing amounts or participants.
  • zk-STARKs (Scalable Transparent Arguments of Knowledge):
    A transparent alternative to zk-SNARKs that avoids the need for trusted setups and is more scalable. Example: Used in Ethereum Layer-2 solutions like StarkNet to bundle transaction proofs.
The Fiat-Shamir heuristic is a technique for converting interactive proofs into non-interactive ones by deriving the verifier’s challenges from a cryptographic hash of the transcript.
  • Schnorr Protocol:
    A proof that you know a discrete logarithm of a number without revealing the logarithm itself. Example: Prove ownership of a private key without exposing it (used in Schnorr signatures).
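
As a concrete illustration, here is a minimal sketch of Schnorr’s protocol made non-interactive with Fiat-Shamir. The group parameters below are tiny toy values chosen so the numbers stay readable, and the function names are mine; a real deployment would use a standardized group of roughly 256-bit order (e.g., an elliptic-curve group), not these constants.

```python
import hashlib
import secrets

# Toy parameters: g = 4 generates a subgroup of prime order q = 11 in Z_23^*.
# Illustration only -- real systems use groups of ~256-bit order.
p, q, g = 23, 11, 4

def fiat_shamir_challenge(*vals):
    """Derive the challenge by hashing the transcript (replaces the verifier)."""
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x):
    """Prove knowledge of x such that y = g^x mod p, without revealing x."""
    y = pow(g, x, p)
    r = secrets.randbelow(q)              # fresh secret nonce
    t = pow(g, r, p)                      # commitment
    e = fiat_shamir_challenge(g, y, t)    # non-interactive challenge
    s = (r + e * x) % q                   # response
    return y, (t, s)

def verify(y, proof):
    t, s = proof
    e = fiat_shamir_challenge(g, y, t)
    # g^s = g^(r + e*x) = t * y^e (mod p) exactly when the prover knew x.
    return pow(g, s, p) == (t * pow(y, e, p)) % p

x = secrets.randbelow(q)                  # the secret ("private key")
y, proof = prove(x)
assert verify(y, proof)
```

The hash call is what makes this non-interactive: instead of waiting for the verifier’s random challenge, the prover derives it from a hash of the commitment, and any verifier can recompute the same challenge later.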

Example Use Cases

Zero-knowledge proofs come in many forms and are applied across both theoretical and practical settings. Below are some notable examples:

1. Commit-and-Prove Protocols

These combine commitments, which bind the prover to data while hiding it (for example, Pedersen commitments), with zero-knowledge proofs: you prove that you committed to a number x without revealing x, and can later open the commitment so the verifier can check it.
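
As a deliberately toy illustration, here is a minimal Pedersen commitment sketch. It reuses the same toy group as the Schnorr example, and h is hard-coded, which is exactly what you must not do in practice: the binding property relies on nobody knowing log_g(h).

```python
import secrets

# Toy Pedersen commitment: g and h generate a subgroup of prime order q = 11
# in Z_23^*. For binding, log_g(h) must be unknown; h is fixed here for
# illustration only.
p, q, g, h = 23, 11, 4, 9

def commit(x):
    """Commit to x in Z_q: r blinds x (hiding), and C pins x down (binding)."""
    r = secrets.randbelow(q)
    C = (pow(g, x, p) * pow(h, r, p)) % p
    return C, r

def open_commitment(C, x, r):
    """The verifier recomputes the commitment from the claimed opening."""
    return C == (pow(g, x, p) * pow(h, r, p)) % p

C, r = commit(7)
assert open_commitment(C, 7, r)
```

Pedersen commitments are also additively homomorphic (the product of commitments to x1 and x2 is a commitment to x1 + x2), which is part of what makes them so pleasant to combine with zero-knowledge proofs.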

2. Bulletproofs

Efficient range proofs that demonstrate a value lies within a specific range without revealing the value. Example: Used in Monero to ensure transaction amounts are positive without disclosing the actual amounts.

3. Proofs in Cloud Computing

  • Proof of Retrievability:
    Prove a cloud provider stores your data without downloading it. Example: Used in decentralized storage systems like Filecoin.
  • Proof of Computation:
    Demonstrate the correctness of outsourced computation without revealing inputs or outputs.

4. Secure Voting Protocols

  • Homomorphic Encryption-Based Proofs:
    Prove a vote is valid (e.g., within a candidate set) without revealing the voter’s choice.

5. Knowledge of a Password

  • Example: Authenticate to a server by proving knowledge of a password without transmitting it. SRP Protocol (Secure Remote Password): Verifies a user knows a password without sending the password itself.

Perfect Zero-Knowledge

Perfect Zero-Knowledge is a stronger version of zero-knowledge where the verifier cannot distinguish between the interaction with the actual prover and the simulated interaction, even with unlimited computational power. In other words, the simulator’s output has exactly the same distribution as the real interaction transcript, not just a computationally indistinguishable one.

Formal Definition

Let (P, V) be a proof system for a language L. The proof system is perfect zero-knowledge if for every polynomial-time verifier V^*, there exists a probabilistic polynomial-time simulator S such that for every x \in L: \Pr[\text{Transcript}(P \leftrightarrow V^*, x) = t] = \Pr[S(x) = t] \quad \forall t,

where:

  • \text{Transcript}(P\leftrightarrow V^*,x) is the transcript of the interaction between P and V^* on input x,
  • S(x) is the simulated transcript generated by S for the same input x.

This implies that the probability distributions of the transcripts from the real interaction and the simulated interaction are exactly the same.

Key Features of Perfect Zero-Knowledge

  1. Identical Distributions:
    The simulator’s output is distributed exactly like the real transcript; the statistical distance between the two distributions is exactly zero.
  2. Stronger Privacy Guarantees:
    Since the guarantee holds even against verifiers with infinite computational power, it is stronger than computational zero-knowledge, where the indistinguishability only holds for polynomial-time adversaries.

Example of Perfect Zero-Knowledge

The classic Graph Isomorphism Zero-Knowledge Protocol is a perfect zero-knowledge protocol:

  • A prover shows two graphs are isomorphic without revealing the actual isomorphism.
  • The verifier cannot distinguish between a genuine interaction and a simulated one, even with infinite computational power, making it perfect zero-knowledge.
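
To see why the simulation works, here is a hedged honest-verifier sketch of the simulator for that protocol (permute is the same helper as in the earlier round sketch; handling arbitrary cheating verifiers additionally requires rewinding, which this sketch omits). By sampling the challenge bit before building H, the simulator never needs the secret isomorphism, yet its output has exactly the distribution of a real transcript.

```python
import random

# Honest-verifier simulator for the graph-isomorphism protocol. It picks
# the challenge bit first, so it never touches an isomorphism, yet
# (H, b, tau) is distributed exactly like a real transcript: H is a
# uniformly random relabeling of an isomorphic graph, b is a uniform bit,
# and tau is the permutation that opens H.

def permute(graph, sigma):
    return frozenset(frozenset((sigma[u], sigma[v])) for u, v in graph)

def simulate_transcript(G1, G2, n):
    b = random.randint(0, 1)            # guess the challenge first
    tau = list(range(n))
    random.shuffle(tau)                 # uniformly random relabeling
    challenged = G1 if b == 0 else G2
    H = permute(challenged, tau)        # same distribution as a real H
    return H, b, tau
```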

Computational Zero-Knowledge

Computational Zero-Knowledge is a type of zero-knowledge proof where the verifier cannot distinguish between the actual interaction with the prover and the output of a simulator, provided the verifier has limited (polynomial-time) computational power.

This means that the zero-knowledge property relies on the computational infeasibility of distinguishing between the two scenarios, often based on cryptographic hardness assumptions (e.g., the difficulty of factoring large numbers or solving discrete logarithms).

Formal Definition

Let (P,V) be a proof system for a language L. The system is computational zero-knowledge if for every probabilistic polynomial-time (PPT) verifier V^*, there exists a PPT simulator S such that for every x \in L, the distributions \{\text{Transcript}(P \leftrightarrow V^*, x)\}_{x \in L} and \{S(x)\}_{x \in L} are computationally indistinguishable. That is, no polynomial-time distinguisher can tell the real interaction apart from the simulated one with non-negligible advantage.

Key Features of Computational Zero-Knowledge

  1. Computational Indistinguishability:
    The zero-knowledge property holds against adversaries with limited computational power (polynomial-time distinguishers). If the verifier were computationally unbounded, they might be able to differentiate the two distributions.
  2. Cryptographic Assumptions:
    Computational zero-knowledge often relies on assumptions like:
    • The infeasibility of factoring large integers.
    • The hardness of the discrete logarithm problem.
    • Other complexity-theoretic assumptions.
  3. Relaxed Privacy Guarantees:
    Unlike perfect zero-knowledge, where the simulated and real distributions are statistically identical, computational zero-knowledge only guarantees privacy against computationally bounded adversaries.

Examples of Computational Zero-Knowledge

  1. zk-SNARKs:
    Used in blockchain protocols like Zcash to ensure transaction validity without revealing sensitive details. The zero-knowledge property here relies on computational assumptions.
  2. Interactive Proofs with Commitment Schemes:
    Many zero-knowledge protocols use cryptographic commitments (e.g., Pedersen commitments) to hide information during the proof, ensuring the verifier cannot extract more data computationally. (A hash-based commitment sketch follows after this list.)
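
To make the role of computational assumptions concrete, here is a hedged sketch of a hash-based commitment. Its hiding and binding properties hold only against computationally bounded adversaries (treating SHA-256 as hard to invert and collision-resistant), which is precisely the flavor of assumption computational zero-knowledge rests on.

```python
import hashlib
import secrets

# A hash-based commitment: hiding and binding hold only against bounded
# adversaries, under standard assumptions about SHA-256. An unbounded
# adversary could invert the hash or search for collisions.

def commit(message: bytes):
    r = secrets.token_bytes(32)                   # random opening key
    C = hashlib.sha256(r + message).digest()      # the commitment
    return C, r

def verify(C, message: bytes, r: bytes):
    """Check a claimed opening by recomputing the commitment."""
    return C == hashlib.sha256(r + message).digest()

C, r = commit(b"challenge answer")
assert verify(C, b"challenge answer", r)
```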

Real-World Importance

Computational zero-knowledge is widely used in practical applications, such as:

  • Cryptocurrencies (e.g., Zcash, zkRollups).
  • Authentication protocols.
  • Privacy-preserving identity verification.

It strikes a balance between strong privacy guarantees and computational efficiency, making it suitable for real-world cryptographic systems.

Statistical Zero-Knowledge

Statistical Zero-Knowledge (SZK) is a type of zero-knowledge proof where the verifier cannot distinguish between the real interaction with the prover and the output of a simulator, even with unlimited computational power. The key difference from perfect zero-knowledge is that the two distributions (real and simulated) are not identical but are statistically close, meaning the difference between them is negligible.

Formal Definition

Let (P,V) be a proof system for a language L. The system is statistical zero-knowledge if for every probabilistic polynomial-time (PPT) verifier V^*, there exists a PPT simulator S such that for every x \in L, the output distributions \{\text{Transcript}(P \leftrightarrow V^*, x)\}_{x \in L} and \{S(x)\}_{x \in L} are statistically indistinguishable. This means the statistical distance (or total variation distance) between the two distributions is negligible:

\Delta(\text{Transcript}(P \leftrightarrow V^*, x), S(x)) = \frac{1}{2} \sum_t \left| \Pr[\text{Transcript} = t] - \Pr[S(x) = t] \right| \leq \epsilon,

where \epsilon is a negligible function of the input size.
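
For intuition, the statistical distance of two discrete distributions is easy to compute directly when they are given explicitly. Here is a small illustrative helper (the names and the example numbers are mine):

```python
# Statistical (total variation) distance between two discrete distributions,
# each given as a dict mapping outcomes to probabilities.

def statistical_distance(P, Q):
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(t, 0.0) - Q.get(t, 0.0)) for t in support)

# Two transcript distributions differing by a negligible 2^-40 per outcome:
real = {"t1": 0.5, "t2": 0.5}
sim  = {"t1": 0.5 + 2**-40, "t2": 0.5 - 2**-40}
assert statistical_distance(real, sim) == 2**-40
```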

Key Features of Statistical Zero-Knowledge

  1. Statistical Indistinguishability:
    The difference between the real and simulated transcripts is negligibly small, even for verifiers with unlimited computational power.
  2. Weaker than Perfect Zero-Knowledge:
    Perfect zero-knowledge requires the distributions to be exactly identical, while statistical zero-knowledge allows for a negligible difference.
  3. Stronger than Computational Zero-Knowledge:
    Computational zero-knowledge only guarantees indistinguishability for polynomial-time adversaries, whereas statistical zero-knowledge holds against adversaries with unlimited computational power.
  4. No Dependence on Cryptographic Assumptions:
    SZK is typically not reliant on computational hardness assumptions, unlike computational zero-knowledge.

Examples of Statistical Zero-Knowledge

  1. Quadratic Residuosity Problem:
    Prove that a number x is a quadratic residue modulo N (a composite number) without revealing a square root of x or the factorization of N. The simulator can generate transcripts statistically indistinguishable from those produced during the real interaction. (A one-round sketch follows after this list.)
  2. Graph Isomorphism Problem:
    Prove that two graphs G_1 and G_2 are isomorphic without revealing the isomorphism. The verifier’s view of the interaction can be statistically simulated.
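
Returning to the first example, here is a hedged sketch of one honest run of a quadratic-residuosity protocol round, in the style of Goldwasser, Micali, and Rackoff. The modulus is a toy value and the names are mine; a real N would be a ~2048-bit product of two secret primes.

```python
import math
import secrets

# One round of the quadratic-residuosity ZK protocol: the prover knows w
# with w^2 = x (mod N) and convinces the verifier that x is a residue.

def qr_round(N, x, w):
    # Prover: send u = r^2 for a fresh random r coprime to N.
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    u = (r * r) % N

    # Verifier: random challenge bit.
    b = secrets.randbelow(2)

    # Prover: reveal z = r * w^b; for b = 0 this says nothing about w.
    z = (r * pow(w, b, N)) % N

    # Verifier: check z^2 = u * x^b (mod N).
    return (z * z) % N == (u * pow(x, b, N)) % N

N = 7 * 11          # toy modulus; a real N is ~2048 bits with secret factors
w = 9               # the prover's secret square root
x = (w * w) % N     # public claim: x is a quadratic residue mod N
assert all(qr_round(N, x, w) for _ in range(20))
```

As with graph isomorphism, each round has soundness error 1/2, so the protocol is repeated to amplify the verifier’s confidence.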

Real-World Applications

While SZK is less common in practical applications compared to computational zero-knowledge, it has theoretical importance in cryptographic protocol design and scenarios where absolute guarantees against powerful adversaries are required.

Some References

[1] Goldreich, Oded (2001). Foundations of Cryptography, Volume I: Basic Tools. Cambridge University Press.

[2] Murtagh, Jack. “Where’s Waldo? How to Prove You Found Him Without Revealing Where He Is.” Scientific American. https://www.scientificamerican.com/article/wheres-waldo-how-to-prove-you-found-him-without-revealing-where-he-is/

[3] Goldwasser, S.; Micali, S.; Rackoff, C. (1989). “The Knowledge Complexity of Interactive Proof Systems.” SIAM Journal on Computing 18 (1): 186–208. doi:10.1137/0218012. ISSN 1095-7111.