When Influence Scores Betray Us: Efficiently Attacking Memorization Scores

tl;dr 👉 We just put out work on attacking influence-based estimators in data markets. The student lead (who did most of the work) is Tue Do! Check it out. Accurate models are not enough. If the auditing tools we rely on can be fooled, then the trustworthiness of machine learning is on shaky ground.

Modern machine learning models are no longer evaluated solely by their training or test accuracy. Increasingly, we ask:

  • Which training examples influenced a particular prediction?
  • How much does the model rely on each data point?
  • Which data are most valuable, or most dangerous, to keep?

Answering these questions requires influence measures, which are mathematical tools that assign each training example a score reflecting its importance or memorization within the model. These scores are already woven into practice: they guide data valuation (identifying key examples), dataset curation (removing mislabeled or harmful points), privacy auditing (tracking sensitive examples), and even data markets (pricing user contributions).

But here lies the problem: what if these influence measures themselves can be attacked? In our new paper, Efficiently Attacking Memorization Scores, we show that they can. Worse, the attacks are not only possible but efficient, targeted, and subtle.


Memorization Scores: A Primer

A memorization score quantifies the extent to which a training example is “remembered” by a model. Intuitively:

  • A point has a high memorization score if the model depends heavily on it (e.g., removing it would harm performance on similar examples).
  • A low score indicates the model has little reliance on the point.

Formally, scores are often estimated through:

  • Leave-one-out retraining (how accuracy changes when a point is removed).
  • Influence functions (approximating parameter sensitivity).
  • Gradient similarity measures (alignment between gradients of a point and test loss).

Because these estimators are computationally heavy, practical implementations rely on approximations, which (one could argue) introduces new fragilities.
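
To make the last family concrete, here is a minimal PyTorch-style sketch of a gradient-similarity score. The particular model, loss, and dot-product form are illustrative assumptions, not the exact estimator analyzed in the paper.

```python
import torch

def grad_similarity_influence(model, loss_fn, train_example, test_example):
    """Toy gradient-similarity influence score (assumed form): the dot product
    between the training-point gradient and the test-point gradient of the loss
    w.r.t. model parameters. Strong alignment suggests the training point pushes
    parameters in a direction that matters for the test point."""
    def flat_grad(x, y):
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
        return torch.cat([g.flatten() for g in grads])

    g_train = flat_grad(*train_example)
    g_test = flat_grad(*test_example)
    return torch.dot(g_train, g_test).item()

# Toy usage with a tiny linear classifier and random data.
model = torch.nn.Linear(10, 2)
loss_fn = torch.nn.CrossEntropyLoss()
x_train, y_train = torch.randn(1, 10), torch.tensor([1])
x_test, y_test = torch.randn(1, 10), torch.tensor([0])
print(grad_similarity_influence(model, loss_fn, (x_train, y_train), (x_test, y_test)))
```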


The Adversarial Setting

We consider an adversary whose goal is to perturb training data so as to shift memorization scores in their favor. Examples include:

  • Data market gaming (prime motivation): A seller inflates the memorization score of their data to earn higher compensation.
  • Audit evasion: A harmful or mislabeled point is disguised by lowering its score.
  • Curation disruption: An attacker perturbs examples so that automated cleaning pipelines misidentify them as low-influence.

Constraints:

An attack could satisfy several conditions; we focus on two:

  1. Efficiency: The method must scale to modern, large-scale datasets.
  2. Plausibility: Model accuracy should remain intact, so the manipulation is not caught by standard validation checks.

The Pseudoinverse Attack

Our core contribution is a general, efficient method we call the Pseudoinverse Attack. It rests on two observations:

  1. Memorization scores, though nonlinear in general, can be locally approximated as a linear function of input perturbations, mirroring how influence functions linearize parameter changes.
  2. This turns the attack into an inverse problem (specified in the paper): we compute approximate gradients that link input perturbations to score changes, use the pseudoinverse to find efficient perturbations, and apply them selectively to the target points.

This avoids full retraining for each perturbation and yields perturbations that are both targeted and efficient.
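
To give a feel for the mechanics, here is a toy NumPy sketch of a single pseudoinverse step under a local linear approximation. The function name, the way the score gradients are obtained, and the norm clipping are illustrative assumptions, not the exact algorithm in the paper.

```python
import numpy as np

def pseudoinverse_step(score_grads, desired_changes, max_norm=0.05):
    """Toy sketch of one pseudoinverse attack step (assumed setup).

    score_grads: (k, d) array; row i approximates d(score_i)/d(input) for the
        target point, obtained e.g. via finite differences or autodiff.
    desired_changes: (k,) vector of score shifts the adversary wants
        (e.g. inflate the target's score while holding neighbors' scores fixed).
    Returns the minimum-norm input perturbation whose linearized effect on the
    k scores matches desired_changes, clipped to stay small.
    """
    delta = np.linalg.pinv(score_grads) @ desired_changes
    norm = np.linalg.norm(delta)
    if norm > max_norm:          # keep the perturbation visually imperceptible
        delta *= max_norm / norm
    return delta

# Toy usage with random numbers standing in for real gradients.
rng = np.random.default_rng(0)
J = rng.normal(size=(3, 32 * 32 * 3))   # 3 score constraints, flattened CIFAR-sized image
goal = np.array([0.2, 0.0, 0.0])        # raise one score, hold two others
delta = pseudoinverse_step(J, goal)
print(np.round(J @ delta, 3))           # linearized predicted score changes
```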


Validation

We validate across image classification tasks (e.g., CIFAR benchmarks) with standard architectures (CNNs, ResNets).

Key Findings

  1. High success rate: Target scores can be reliably increased or decreased.
  2. Stable accuracy: Overall classification performance remains essentially unchanged.
  3. Scalability: The attack works even when applied to multiple examples at once.

Example (Score Inflation): A low-memorization image (e.g., a benign CIFAR airplane) is perturbed. After retraining, its memorization score jumps into the top decile, without degrading accuracy on other examples. This demonstrates a direct subversion of data valuation pipelines.


Why This Is Dangerous

The consequences ripple outward:

  • Data markets: Compensation schemes based on memorization become easily exploitable.
  • Dataset curation: Automated cleaning fails if adversaries suppress scores of mislabeled or harmful points.
  • Auditing & responsibility: Legal or ethical frameworks built on data attribution collapse under adversarial pressure.
  • Fairness & privacy: Influence-based fairness assessments are no longer trustworthy.

If influence estimators can be manipulated, the entire valuation-based ecosystem is at risk.

Conclusion

This work sits at the intersection of adversarial ML and interpretability:

  • First wave: Adversarial examples, i.e., perturbing inputs to fool predictions.
  • Second wave: Data poisoning and backdoor attacks, i.e., perturbing training sets to corrupt models.
  • Third wave (our focus): Attacks on the auditing layer, i.e., perturbing training sets to corrupt pricing and interpretability signals without harming predictive accuracy.

This third wave is subtle but potentially more damaging: if we cannot trust influence measures, then even “good” models become opaque and unaccountable. As machine learning moves toward explainability and responsible deployment, securing the interpretability layer is just as critical as securing models themselves.

Our paper reveals a new adversarial frontier: efficiently manipulating memorization scores.

  • We introduce the Pseudoinverse Attack, an efficient, targeted method for perturbing training points to distort influence measures.
  • We show, supported by theory and experiments, that memorization scores are highly vulnerable, even under small, imperceptible perturbations.
  • We argue that this undermines trust in data valuation, fairness, auditing, and accountability pipelines.

The Talking Drum as a Communication Channel

We just wrapped up Week 1 of my UIUC course, ECE598DA: Topics in Information-Theoretic Cryptography. The class introduces students to how tools from information theory can be used to design and analyze both privacy applications and foundational cryptographic protocols. Like many courses in privacy and security, we began with the classic one-time pad as our entry point into the fascinating world of secure communication.
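
For readers who haven't seen it, the one-time pad is just bitwise XOR of the message with a uniformly random key of the same length, used exactly once. Here is a tiny Python illustration (the names are mine, not from the course notes).

```python
import secrets

def otp_xor(data: bytes, key: bytes) -> bytes:
    """One-time pad: XOR each byte with a fresh, uniformly random key byte."""
    assert len(key) == len(data), "key must be as long as the message"
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))   # used once, never reused
ciphertext = otp_xor(message, key)
recovered = otp_xor(ciphertext, key)      # XOR with the same key decrypts
print(recovered == message)               # True
```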

We also explored another ‘tool’ for communication: the talking drum. This musical tradition offers a striking example of how information can be encoded, transmitted, and understood only by those familiar with the underlying code. In class, I played a video of a master drummer to bring this idea to life.

What Are Talking Drums?

Talking drums, especially those like the Yoruba dùndún, are traditional African hourglass‑shaped percussion instruments prized for their ability to mimic speech. Skilled drummers can vary pitch and rhythm to convey tonal patterns, effectively transmitting messages over short distances.

  • Speech surrogacy: The drum replicates the microstructure of tonal languages by adjusting pitch and rhythm, embodying what researchers call a “speech surrogate”.
  • Cultural ingenuity: Historically, these drums served as everyday communication tools, not merely for music or rituals but for sharing proverbs, announcements, secure messages, and more.

Here’s one of the exercises I gave students in Week 1:

Exercise: Talking drums. Chapter 1 of Gleick’s The Information highlights the talking drum as an early information technology: a medium that compresses, encodes, and transmits messages across distance. Through a communications theory lens, can you describe the talking drum as a medium that achieves a form of secure communication?

And here’s a possible solution:

African talking drums (e.g., the Yoruba dùndún) reproduce the pitch contours and tonal patterns of speech. Since many West African languages are tonal, the drum can convey much of a sentence’s structure even without the literal words.

  • Encoding: A spoken sentence is mapped into rhythmic and tonal patterns.
  • Compression: The drum strips away vowels and consonants, leaving tonal “skeletons.”
  • Security implication: To an outsider unfamiliar with the tonal code or local idioms, the message is incomprehensible. In effect, the drum acts as an encryption device where the key is cultural and linguistic context.

There are a few entities to model:

  • Source: Message in natural language (tonal West African language, e.g., Yoruba).
  • Encoder: Drummer maps source to a drummed signal using tonal contours and rhythmic patterns.
  • Channel: Physical propagation of drum beats across distance, subject to noise (wind, echo, competing sounds).
  • Legitimate receiver: Villager fluent in both the spoken language and cultural conventions.
  • Adversary: Outsider (colonial administrator, rival tribe, foreign merchant) who hears the same signal but lacks full knowledge of mapping or redundancy rules.

Let X denote a message in a tonal language (e.g., Yoruba). A drummer acts as an encoder E mapping X to a drummed signal S = E(X,K), where K denotes shared cultural/linguistic knowledge (idioms, proverbs, discourse templates) known to legitimate receivers but not to outsiders. The signal S traverses a physical channel C and is received as Y_R by insiders and as Y_A by an adversary (outsider). Decoders D_R and D_A attempt to reconstruct X: the legitimate receiver computes X̂_R = D_R(Y_R, K), while the adversary can only form X̂_A = D_A(Y_A); lacking K, the adversary’s reconstruction remains poor even though both hear essentially the same signal.
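
As a toy illustration of these roles (not part of the course materials), here is a small Python sketch in which the “key” K is a shared tone mapping: the insider can invert it, while the outsider hears the same contours but cannot.

```python
import random

# Toy shared knowledge K: each word maps to a tone contour (H = high, L = low).
# Real dùndún drumming is far richer; this only illustrates the roles.
TONE_KEY = {"come": "HL", "home": "LH", "now": "HH", "danger": "LL"}

def encode(message, key=TONE_KEY):
    """Drummer E: map words of X to tone contours, producing the signal S."""
    return [key[w] for w in message.split()]

def channel(signal, noise=0.1):
    """Channel C: occasionally a beat is garbled by wind, echo, or other sounds."""
    return [s if random.random() > noise else "??" for s in signal]

def decode_insider(received, key=TONE_KEY):
    """Legitimate receiver D_R: inverts the shared mapping where possible."""
    inverse = {v: k for k, v in key.items()}
    return " ".join(inverse.get(s, "[?]") for s in received)

def decode_outsider(received):
    """Adversary D_A: hears the same tones but has no key, only raw contours."""
    return " ".join(received)

msg = "danger come home now"
y = channel(encode(msg))
print("insider :", decode_insider(y))
print("outsider:", decode_outsider(y))
```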

NaijaCoder at the University of Lagos (UNILAG)

Last year, NaijaCoder started hosting its Lagos camp at the University of Lagos (UNILAG). The Abuja camp started in 2022.

The University of Lagos (UNILAG) is a leading public research university in Lagos, Nigeria. It is often celebrated as “the University of First Choice and the Nation’s Pride.” UNILAG was founded in 1962, shortly after Nigeria’s independence, as one of the country’s first-generation universities. Over the past six decades, it has grown into one of Nigeria’s most prestigious institutions. I’ll briefly discuss UNILAG’s rich history and highlight recent NaijaCoder camps at UNILAG’s Artificial Intelligence and Robotics Lab (AIRLab). The goal of this post is not to provide a comprehensive overview of UNILAG but to highlight NaijaCoder’s connections to the university.

A Brief History of UNILAG

UNILAG was established by an Act of Parliament in 1962 as an immediate response to the national need for a competent professional workforce to drive Nigeria’s social, economic, and political development. (At the time, I believe Lagos was still Nigeria’s federal capital.) UNILAG opened its doors on October 22, 1962, starting with just 131 students, but rapidly expanded in scope and enrollment. By 1964, additional faculties such as Arts, Education, Engineering, and Science had been added to the original three faculties (Business & Social Studies, Law, and Medicine). This early growth set the stage for UNILAG’s transformation into a comprehensive university. Today, the university enrolls tens of thousands of students and operates across three campuses in Lagos: the main campus at Akoka in Yaba, the College of Medicine at Idi-Araba, and a smaller campus at Yaba for radiography. (I’m currently writing this post at the Akoka campus.)

From its inception, UNILAG played a critical role in Nigeria’s development. During the decades when Lagos was the nation’s capital, the university became a key intellectual hub influencing politics and public policy. Its student body was notably diverse and cosmopolitan, attracting talent from different regions and economic backgrounds, which helped cultivate a generation of educated Nigerians poised to lead in various sectors. Over the years, UNILAG has weathered challenges, such as economic downturns in the 1980s that strained facilities and led to some brain drain, but it rebounded by expanding revenue streams, improving its academic reputation, and drawing in more students. By 2011, enrollment had grown to over 39,000, a far cry from the 131 pioneer students in 1962. In recent times, student population figures have exceeded 57,000 annually, reflecting UNILAG’s status as one of Nigeria’s largest and most in-demand universities. In fact, it is one of the country’s most competitive schools for admissions.

University leadership has made it clear that research and innovation are at the heart of UNILAG’s future trajectory. Professor Folasade Ogunsola, who became UNILAG’s first female Vice-Chancellor in 2022, has articulated a vision to make UNILAG a “future-ready, research-oriented and enterprise-driven hub.” She introduced a strategic framework with the acronym “FIRM” – focusing on Financial re-engineering, Infrastructural development, Reputation building through teaching/research/innovation, and Manpower development. A major part of this vision, it seems, is strengthening research output and global partnerships (through initiatives like NaijaCoder partnerships).

NaijaCoder Camps at UNILAG AIRLab (2024 & 2025) – Empowering the Next Generation

NaijaCoder is a non-profit organization dedicated to teaching algorithms to young Nigerians. In the summers of 2024 and 2025, UNILAG’s AIRLab (AI and Robotics Lab) partnered with NaijaCoder to run intensive camps in Lagos. Prof. Chika Yinka-Banjo is the director of the AIRLab; she has been instrumental in bringing the program to Lagos, from initial recruiting to daily logistics to TA recruitment.

In Summer 2024, the Lagos NaijaCoder camp was held right on UNILAG’s campus in collaboration with the AIRLab. For 14 days, about 50 students (mostly teens) immersed themselves in learning the basics of algorithms at the UNILAG AI & Robotics Lab. The curriculum introduced the participants to core concepts in an accessible way. The camp instructors covered everything from basic Python programming syntax and data types, to loops and recursion, searching and sorting algorithms, basic data structures, and use of Python libraries. By the final days, students were applying their knowledge to solve problems and took an exam/competition to cap off their learning. The hands-on sessions were facilitated by instructors from NaijaCoder alongside UNILAG volunteers. Following the success of the 2024 program, we just finished Week 1 of NaijaCoder at the UNILAG AIRLab this summer (Summer 2025).

At NaijaCoder, we look forward to continued collaboration with UNILAG to bring computing-related curricula to classrooms across Nigeria.