Watermarking Language Models

Lav Varshney and I recently released a IACR preprint on how to analyze unforgeable watermarking procedures for generative agents. Our approach relies on cryptographic techniques and computational entropy notions.

Abstract

In this work, we construct distortion-free and unforgeable watermarks for language models and generative agents. The watermarked output cannot be forged by an adversary nor removed by the adversary without significantly degrading model output quality. That is, the watermarked output is distortion-free: the watermarking algorithm does not noticeably change the quality of the model output and without the public detection key, no efficient adversary can distinguish output that is watermarked from outputs which are not. The core of the watermarking schemes involve embedding a message and publicly-verifiable digital signature in the generated model output. The message and signature can be extracted during the detection phase and verified by any authorized entity that has a public key. We show that, assuming the standard cryptographic assumption of one-way functions, we can construct distortion-free and unforgeable watermark schemes. Our framework relies on analyzing the inaccessible entropy of the watermarking schemes based on computational entropy notions derived from the existence of one-way functions.

The Errors in Our Way

This blog post is written for a general audience.

In today’s digital age, where information flows across networks at lightning speed, ensuring data integrity is crucial. Whether it’s a message sent over a noisy communication channel, data stored in memory, or even a barcode scanned at the supermarket, errors can occur during transmission or retrieval. This is where error-correcting codes (ECC) come into play, enabling systems to detect and correct errors automatically.

What Are Error-Correcting Codes?

Error-correcting codes are algorithms used to encode and decode data in such a way that errors introduced during transmission or storage can be detected and, in many cases, corrected (very important distinction!). These codes add redundancy to the original data, allowing the receiver to recognize and fix errors without requiring retransmission.

Types of Error-Correcting Codes

There are two broad categories of ECC:

1. Block Codes

Block codes work by encoding a fixed block of data at a time. Each block is transformed into a longer block that includes extra bits for error detection and correction. Examples include:

  • Hamming Codes: Introduced by Richard Hamming in the 1950s, these codes can detect and correct single-bit errors.
  • Reed-Solomon Codes: Widely used in CDs, DVDs, QR codes, and deep-space communication, they are effective at correcting burst errors.
  • Bose-Chaudhuri-Hocquenghem (BCH) Codes: Used in wireless communication and storage devices for robust error correction.

2. Convolutional Codes

Unlike block codes, convolutional codes process data continuously by encoding each bit in the context of previous bits. They are commonly used in real-time communication, such as satellite and mobile phone signals. The Viterbi Algorithm is often used to decode convolutional codes efficiently [1].

How Do Error-Correcting Codes Work?

At a high level, ECC techniques follow these steps:

  1. Encoding: The original data is transformed using an encoding algorithm that introduces redundancy.
  2. Transmission or Storage: The encoded data is sent over a channel or stored in a medium where errors might occur.
  3. Error Detection: At the receiver’s end, the received data is analyzed to identify errors using parity checks or syndrome decoding.
  4. Error Correction: If errors are detected, the redundancy bits help reconstruct the original message.

Specific Examples of Encoding and Decoding Functions

Hamming Code (7,4) Encoding:

A simple example of Hamming code encoding involves a 4-bit data input (D1, D2, D3, D4) and generating three parity bits (P1, P2, P3) using the following formulas:

  • P1 = D1 ⊕ D2 ⊕ D4
  • P2 = D1 ⊕ D3 ⊕ D4
  • P3 = D2 ⊕ D3 ⊕ D4

The final 7-bit encoded message is: P1 P2 D1 P3 D2 D3 D4. See [3].

Hamming Code Decoding:

On the receiving end, the parity bits are recomputed and compared with the received parity bits to detect errors. The syndrome vector determines the error location, and if necessary, the erroneous bit is flipped to correct the error.

Reed-Solomon Encoding and Decoding:

In Reed-Solomon coding, encoding is performed by treating data as coefficients of a polynomial over a finite field. Redundant parity symbols are generated using polynomial division. Decoding involves using error-locator polynomials (e.g., Berlekamp-Massey algorithm) to identify and correct errors [4].

Binary Golay Code Encoding and Decoding:

The (23,12) Binary Golay Code is a linear block code that encodes 12-bit data into a 23-bit codeword by adding 11 parity bits. The code is known for its ability to correct up to 3-bit errors and detect up to 7-bit errors.

  • Encoding: The 12-bit input message is multiplied by a generator matrix to produce the 23-bit codeword.
  • Noise Transmission: Errors may be introduced as the codeword travels through a noisy channel.
  • Decoding: The receiver computes a syndrome using the parity-check matrix and applies an efficient decoding algorithm to correct errors if they fall within the error correction capability.

The Golay code is used in deep-space communication and digital broadcasting [5].

Applications of Error-Correcting Codes

Error-correcting codes are used in a variety of fields, including:

  • Digital Communications: Used in Wi-Fi, mobile networks (e.g., 4G, 5G), and satellite communications to ensure reliable data transmission.
  • Storage Devices: Hard drives, SSDs, and RAM use ECC to prevent data corruption.
  • Deep Space Communication: NASA and other space agencies rely on ECC to communicate with spacecraft over vast distances.
  • Barcodes and QR Codes: Reed-Solomon codes allow damaged or partially obscured barcodes to be accurately scanned.
  • Financial and Banking Systems: Secure data transmission in transactions and cryptographic applications.

Conclusion

As technology advances, so does the need for more efficient error correction methods. Emerging fields such as quantum error correction are gaining traction, ensuring the reliability of quantum computing and communication [2]. AI-driven ECC algorithms are also being explored to improve real-time error detection and correction.

Error-correcting codes remain a fundamental component of modern computing and communications, safeguarding data integrity in an increasingly connected world. Whether you’re streaming a video, making a phone call, or sending a deep-space probe, ECC is (probably) silently working behind the scenes to keep your data accurate and reliable.

Some References

[1] https://en.wikipedia.org/wiki/Viterbi_decoder

[2] https://en.wikipedia.org/wiki/Quantum_error_correction

[3] https://en.wikipedia.org/wiki/Hamming(7,4)

[4] https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

[5] https://en.wikipedia.org/wiki/Binary_Golay_code

Commitment Schemes

Scenario: Coin Tossing Over the Phone

Imagine two people, Àdébísí and Bọ́lá, want to flip a coin to decide who should pay for drinks, but they are communicating over the phone and, thus, cannot physically toss a coin in the view of both of them. More concretely:

  1. Àdébísí and Bọ́lá want to flip a coin.
  2. Àdébísí secretly decides her guess for “heads” or “tails”.
  3. Bọ́lá flips the coin and announces the result.
  4. Àdébísí reveals her guess, proving she didn’t cheat by changing her guess after hearing the result.
  5. Àdébísí wins the game if her guess matches the announced result.

In this scenario, Àdébísí and Bọ́lá can use a commitment scheme that allows one party to “commit” to a chosen value while keeping it hidden to the other party until an appropriate time. [1]

This ensures neither party can cheat:

  • Àdébísí cannot change her guess after Bọ́lá announces the result (binding).
  • Bọ́lá cannot deduce Àdébísí’s guess before she reveals it (hiding).

A commitment scheme is a cryptographic primitive that allows one party (the sender) to commit to a chosen value while keeping it hidden from another party (the receiver), with the ability to reveal the committed value later. Commitment schemes are essential building blocks in various cryptographic protocols, such as zero-knowledge proofs and secure multi-party computations.

Properties of Commitment Schemes

A commitment scheme must satisfy two fundamental properties:

  1. Binding Property
    • Once the sender has committed to a value, they cannot change it.
    • This ensures that the commitment is “binding,” preventing the sender from lying about the value during the reveal phase.
  2. Hiding Property
    • The commitment does not reveal any information about the value to the receiver until the sender decides to reveal it.
    • This ensures that the value is “hidden” until the sender discloses it.

Phases of a Commitment Scheme

A commitment scheme typically involves two phases:

  1. Commit Phase
    • The sender chooses a value m and generates a commitment C using a commitment function, often involving random coins r.
    • C = \text{Commit}(m, r)
    • The commitment C is sent to the receiver.
  2. Reveal Phase
    • The sender reveals m and r to the receiver.
    • The receiver verifies the commitment C by checking if C = \text{Commit}(m, r).

Hiding

The hiding property of a commitment scheme ensures that the commitment does not reveal any information about the committed value m to the receiver until the sender decides to reveal it. In other words, the commitment conceals the value m while still binding the sender to that value.

Types of Hiding

  1. Perfect/Statistical Hiding: The commitment C provides no information about m at all, even with infinite computational power. This is achieved if the distribution of C is independent of m.
  2. Computational Hiding: It is computationally infeasible for the receiver to deduce m from C (in polynomial time in relevant parameters).

Binding

The binding property in a commitment scheme ensures that once a sender commits to a value, they cannot change it later. This property prevents the sender from “cheating” by revealing a different value during the reveal phase.

Types of Binding

  1. Perfect/Statistical Binding: It is mathematically impossible to find two different values m_1 and m_2 that produce the same commitment C.
  2. Computational Binding: Finding such a pair (m_1, r_1) and (m_2, r_2), that produces the same commitment C, is computationally infeasible for any adversary with limited computational resources.

Exercise

Prove that a commitment scheme cannot be both perfectly hiding and perfectly binding at the same time.

References

[1] Gilles Brassard, David Chaum, and Claude Crépeau, Minimum Disclosure Proofs of Knowledge, Journal of Computer and System Sciences, vol. 37, pp. 156–189, 1988.

[2] Manuel Blum, Coin Flipping by Telephone, Proceedings of CRYPTO 1981.

[3] Oded Goldreich, Silvio Micali, and Avi Wigderson, Proofs that yield nothing but their validity, or all languages in NP have zero-knowledge proof systems, Journal of the ACM, 38: 3, pp. 690–728, 1991


🎉 Happy New Year! 🎉

Here’s to a year filled with joy, success, and unforgettable memories. 🥂🎆

With love,
Daniel Alabi