Factlen Explainer · Machine Unlearning · Jun 20, 2026, 9:04 PM · 6 min read

How Researchers Are Teaching AI to Forget: The Rise of Machine Unlearning

Researchers are rapidly advancing 'machine unlearning,' a technique that allows AI models to selectively forget copyrighted, private, or dangerous data without requiring a complete retraining. Recent breakthroughs in cryptographic evaluation and source-free unlearning are transforming this academic concept into a critical compliance tool for the AI industry.

By Factlen Editorial Team

AI Developers & Engineers 35% · Privacy & Copyright Advocates 25% · AI Safety Researchers 25% · Governance & Evaluation Experts 15%

AI Developers & Engineers: Focus on the computational feasibility of unlearning and preserving model performance without catastrophic forgetting.
Privacy & Copyright Advocates: Demand absolute removal of personal and copyrighted data to comply with laws and protect intellectual property.
AI Safety Researchers: View unlearning as a critical tool to surgically remove dangerous capabilities and toxic behaviors from deployed models.
Governance & Evaluation Experts: Emphasize the need for mathematical proofs and standardized metrics to verify that an AI has truly forgotten targeted data.

What's not represented

· Independent artists and authors whose work was scraped
· Open-source AI community developers

Why this matters

As artificial intelligence becomes deeply integrated into daily life, its inability to 'forget' personal data, copyrighted material, or dangerous instructions poses a massive legal and safety risk. Machine unlearning provides the technical mechanism to delete specific knowledge from an AI without destroying the entire system, ensuring these models can comply with privacy laws and remain safe for public use.

Key points

Machine unlearning allows AI models to forget specific data without the massive cost of retraining from scratch.
The technology is critical for complying with privacy laws like the GDPR and resolving copyright disputes.
The SISA framework achieves exact unlearning by splitting training data into isolated shards.
Recent breakthroughs include 'source-free unlearning,' which operates without needing the original dataset.
Cryptographic tests introduced in late 2025 now allow researchers to mathematically prove an AI has forgotten data.
Safety researchers use unlearning to remove dangerous capabilities, such as bioweapon instructions, from open-source models.

1/10th: Retraining cost using a 10-shard SISA framework
Article 17: GDPR clause granting the 'Right to be Forgotten'
Zero: Adversary advantage in the NeurIPS 2025 SWAP test

The artificial intelligence industry has a massive, multi-billion-dollar memory problem. Modern large language models are voracious readers, ingesting trillions of words from the public internet to build their remarkable reasoning capabilities. But this indiscriminate consumption creates a fundamental vulnerability: once an AI learns something it shouldn't have—whether it is a copyrighted novel, a private medical record, or instructions for synthesizing a bioweapon—it is extraordinarily difficult to make it forget.[8]

This technical limitation is currently colliding with a wall of legal and ethical mandates. In Europe, the General Data Protection Regulation (GDPR) guarantees individuals the "Right to be Forgotten," legally mandating that companies delete personal data upon request. Simultaneously, major publishers and authors are suing AI developers for copyright infringement, demanding that their intellectual property be excised from deployed models. The law assumes that data can simply be erased, but artificial intelligence does not work that way.[1][5]

To understand the difficulty of AI amnesia, one must understand how neural networks store information. An AI does not contain a hard drive full of text files or a spreadsheet where a developer can simply highlight a row and press delete. Instead, information is encoded probabilistically across billions of interconnected mathematical parameters. Removing the influence of one specific person's data requires altering millions of interdependent weights, many of which also help encode unrelated knowledge. As researchers note, you cannot simply "delete row 42" in a neural network.[4][8]

Unlike a traditional database where a file can simply be deleted, an AI model stores information probabilistically across billions of parameters.

Historically, the only guaranteed way to remove a specific piece of data from an AI was the brute-force approach: delete the offending file from the training corpus and retrain the entire model from scratch. For a frontier language model, this process can take months of continuous computing and cost tens of millions of dollars. If a company receives thousands of deletion requests a week, full retraining becomes economically and practically impossible.[5][6]

Enter "machine unlearning," one of the most critical and rapidly accelerating fields in AI safety. Machine unlearning encompasses a suite of algorithms and techniques designed to surgically remove the influence of specific training data from a deployed model without requiring a complete rebuild. The goal is to produce an updated model that behaves exactly as if the removed data point had never been seen, preserving the AI's general intelligence while excising the targeted knowledge.[5]

The field is broadly divided into two approaches: exact unlearning and approximate unlearning. Exact unlearning provides a mathematical guarantee that the data has been entirely removed. The most prominent method here is the SISA framework, which stands for Sharded, Isolated, Sliced, and Aggregated. Instead of training one massive model on a single giant dataset, developers carve the training data into multiple isolated "shards," train a separate, smaller sub-model on each one, and aggregate the sub-models' predictions at inference time.[6]

The SISA approach dramatically alters the economics of data deletion. If a dataset is split into ten equal shards and a user requests that their data be deleted, the developers only need to retrain the single sub-model that processed the shard containing that data. The computational cost of handling that request drops to roughly one-tenth of a full retraining. While highly effective, this method requires foresight: it must be built into the AI's training pipeline before training begins.[6]
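The sharding idea can be sketched in a few lines of Python. This is a minimal illustration, not the SISA authors' implementation: the `CentroidModel` and `ShardedEnsemble` classes, the round-robin shard assignment, and the toy data are all assumptions made for the example, and the "sliced" checkpointing step of SISA is left out entirely.

```python
from collections import Counter


class CentroidModel:
    """Toy 1-D nearest-centroid classifier standing in for a real sub-model."""

    def __init__(self, examples):
        # examples: list of (example_id, feature, label) tuples
        sums, counts = {}, Counter()
        for _id, x, y in examples:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] += 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}

    def predict(self, x):
        if not self.centroids:  # an emptied shard abstains from voting
            return None
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


class ShardedEnsemble:
    """Hypothetical sharded trainer: one isolated sub-model per shard."""

    def __init__(self, examples, num_shards):
        self.shards = [[] for _ in range(num_shards)]
        for i, ex in enumerate(examples):
            self.shards[i % num_shards].append(ex)  # round-robin assignment
        self.models = [CentroidModel(s) for s in self.shards]
        self.retrain_count = 0

    def predict(self, x):
        # Aggregate by majority vote over the sub-models that have an opinion.
        votes = Counter(
            v for v in (m.predict(x) for m in self.models) if v is not None
        )
        return votes.most_common(1)[0][0] if votes else None

    def forget(self, example_id):
        """Drop one example and retrain only the shard that contained it."""
        for k, shard in enumerate(self.shards):
            if any(ex[0] == example_id for ex in shard):
                self.shards[k] = [ex for ex in shard if ex[0] != example_id]
                self.models[k] = CentroidModel(self.shards[k])
                self.retrain_count += 1
                return k
        return None  # the example was never in the training data


# Made-up data: label "a" clusters near 0, label "b" clusters near 10.
examples = [
    (i, (i % 2) * 10.0 + (i % 5) * 0.1, "b" if i % 2 else "a")
    for i in range(20)
]
ensemble = ShardedEnsemble(examples, num_shards=5)
retrained_shard = ensemble.forget(7)
print("retrained shard:", retrained_shard, "of", len(ensemble.shards))
```

Because each sub-model only ever saw its own shard, rebuilding that one sub-model without the deleted example gives the same result as if the example had never been collected, which is where the exactness guarantee comes from. The other four sub-models are untouched.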

The SISA framework splits training data into isolated shards, allowing developers to retrain only a fraction of the model when a deletion request is made.

For models that have already been trained, researchers use approximate unlearning. This involves mathematically adjusting the model's internal weights to suppress the neural pathways associated with the unwanted data. By applying correction factors and gradient subtraction, engineers can effectively "dim" the model's memory of a specific concept. While much faster than retraining, approximate methods are complex and risk leaving microscopic residual traces of the data behind.[2][5]
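One simple variant of this idea is gradient ascent on the data to be forgotten, paired with ordinary gradient descent on the data to be kept. The sketch below applies it to a tiny logistic regression in plain Python. It is an illustration under stated assumptions, not the algorithm from any of the cited sources: the function names, the learning rates, the step counts, and the five-point dataset are all made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Cross-entropy loss of a 1-D logistic model on one example."""
    p = sigmoid(w * x + b)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mean_grad(w, b, data):
    """Average gradient of the loss with respect to (w, b) over a dataset."""
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    return gw / len(data), gb / len(data)


def train(data, steps=500, lr=0.1):
    w = b = 0.0
    for _ in range(steps):
        gw, gb = mean_grad(w, b, data)
        w, b = w - lr * gw, b - lr * gb
    return w, b


def unlearn(w, b, forget, retain, steps=20, lr=0.1):
    """Approximate unlearning: push loss up on `forget`, keep it low on `retain`."""
    for _ in range(steps):
        gw, gb = mean_grad(w, b, forget)
        w, b = w + lr * gw, b + lr * gb  # gradient ASCENT on the forget set
        gw, gb = mean_grad(w, b, retain)
        w, b = w - lr * gw, b - lr * gb  # gradient descent on the retain set
    return w, b


def accuracy(w, b, data):
    hits = sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


# Made-up data: (feature, label). The forget example sits close to the boundary.
retain = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
forget = [(0.5, 0)]

w0, b0 = train(retain + forget)
w1, b1 = unlearn(w0, b0, forget, retain)

loss_before = loss(w0, b0, *forget[0])
loss_after = loss(w1, b1, *forget[0])
retain_acc = accuracy(w1, b1, retain)
print(f"forget loss: {loss_before:.3f} -> {loss_after:.3f}, retain acc: {retain_acc}")
```

The retain-set step is what guards against damaging unrelated behavior: without it, repeated ascent steps would keep pushing the weights away from a good solution. A higher loss on the forgotten example is only a rough signal, and it does not prove that every trace of the example's influence is gone.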

The technology is advancing at a blistering pace. In September 2025, researchers at the University of California, Riverside, achieved a major breakthrough with the introduction of "source-free unlearning." Traditionally, unlearning algorithms required developers to have the original massive dataset on hand to calculate the necessary weight adjustments. The UC Riverside method allows an AI to unlearn specific data even when the original training corpus is no longer accessible, solving a major privacy and storage bottleneck for commercial developers.[4]

While copyright and privacy dominate the headlines, AI safety researchers view machine unlearning as a vital security mechanism. As open-source models become more powerful, there is a growing risk that they could be used to generate hazardous materials, such as malicious code or biological weapons. Machine unlearning provides a surgical tool to extract these dangerous capabilities from a model before it is released to the public, ensuring the AI remains helpful without acting as a vector for harm.[6]

Despite these breakthroughs, the field is currently grappling with an evaluation crisis: how do you definitively prove that an AI has forgotten something? Historically, researchers used Membership Inference Attacks (MIAs), which probe a model to determine whether a specific piece of data was part of its training set and therefore whether supposedly deleted data still leaves a detectable trace. However, recent studies have shown that MIAs are often poorly calibrated and unreliable, leaving companies unable to guarantee that a deletion request was truly successful.[7]
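A common textbook form of membership inference is a loss-threshold test: guess "member" whenever the model's loss on an example falls below a cutoff, then measure how much better than chance those guesses are. The sketch below is a generic illustration of that idea and of the notion of adversary "advantage". The function name, the threshold, and every loss value are made up for the example, and nothing here reproduces any specific published evaluation framework.

```python
def membership_advantage(member_losses, nonmember_losses, threshold):
    """Loss-threshold attack: guess 'member' when the loss is below threshold.

    Returns true-positive rate minus false-positive rate. A value of 0.0
    means the attacker does no better than guessing at random.
    """
    tpr = sum(l < threshold for l in member_losses) / len(member_losses)
    fpr = sum(l < threshold for l in nonmember_losses) / len(nonmember_losses)
    return tpr - fpr


# Made-up per-example losses for illustration only.
never_seen = [1.0, 0.9, 1.2, 0.25]           # data the model was never trained on
before_unlearning = [0.1, 0.2, 0.15, 0.3]    # target data, still memorized
after_unlearning = [1.1, 0.8, 0.95, 0.25]    # target data, idealized unlearning

leaky_adv = membership_advantage(before_unlearning, never_seen, threshold=0.5)
unlearned_adv = membership_advantage(after_unlearning, never_seen, threshold=0.5)
print("advantage before:", leaky_adv, "after:", unlearned_adv)
```

The result depends heavily on the chosen threshold and on how the comparison set is sampled, which is one reason a single attack of this kind can be badly calibrated: a low measured advantage may reflect a weak attacker, not a model that has truly forgotten.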

Machine unlearning techniques can reduce the computational cost of data removal to a fraction of what a full model retraining would require.

A solution to this crisis emerged at the NeurIPS 2025 conference, where researchers introduced a cryptographic framework known as the SWAP test. Inspired by security games, this evaluation method uses dataset splits to achieve a mathematical proof of forgetting. Under this framework, an adversary has exactly zero advantage in detecting the unlearned data, formally certifying the unlearning process as theoretically optimal. This provides the hard mathematical guarantees that corporate boards and regulators demand.[7]

Engineers must also navigate the risk of "catastrophic unlearning." Neural networks are highly entangled; the parameters that encode a copyrighted novel might also be crucial for the model's general grasp of grammar or creative writing. If an unlearning algorithm is too aggressive, it can accidentally lobotomize the model, degrading its performance on entirely unrelated tasks. Balancing effective forgetting with the preservation of general utility remains a delicate mathematical tightrope.[8]

Furthermore, the technology is not a silver bullet for all security threats. Recent findings from the Vector Institute's 2025 Machine Learning Security Workshop revealed that current unlearning methods often fail against sophisticated data poisoning attacks. If malicious actors intentionally inject corrupted data into a training set, standard unlearning techniques struggle to fully remove the poisoned influence, highlighting the need for more robust, adversarial-resistant forgetting mechanisms.[3]

Researchers are developing cryptographic frameworks to mathematically prove that an AI model has successfully forgotten targeted information.

There is also a lingering disconnect between technical capabilities and legal expectations. Policy experts at Harvard University warn that while machine unlearning can remove specific data points, it may not satisfy the broader intent of copyright law. If an AI unlearns a specific textbook but still generates outputs that are "substantially similar" to the author's style based on other data it absorbed, it may still run afoul of intellectual property regulations. The legal definition of deletion and the mathematical reality of unlearning are still being reconciled.[1]

Nevertheless, machine unlearning has moved from a niche academic curiosity to a foundational pillar of trustworthy AI infrastructure. As the technology matures, the ability to selectively edit an AI's memory will become just as important as the ability to train it in the first place. By bridging the gap between human rights law and technical feasibility, machine unlearning can help keep the artificial minds of the future accountable, adaptable, and safe.[7][8]

How we got here

2015: Early theoretical foundations of machine unlearning are proposed to address data privacy.
2019: The SISA framework is introduced, offering a practical method for exact unlearning through data sharding.
2023–2024: Major copyright lawsuits against AI developers accelerate the commercial urgency for unlearning technology.
Sept 2025: Researchers at UC Riverside publish a breakthrough in 'source-free unlearning,' eliminating the need for original datasets.
Dec 2025: The NeurIPS conference highlights the unlearning evaluation crisis and introduces cryptographic proofs for data removal.

Viewpoints in depth

Privacy & Copyright Advocates

Demand absolute removal of personal and copyrighted data to comply with laws and protect intellectual property.

For legal scholars, authors, and privacy advocates, the standard for data removal is absolute. They argue that if an individual invokes their GDPR 'Right to be Forgotten,' or if a publisher wins a copyright infringement claim, the offending data must be entirely excised from the model's parameters. This camp is often skeptical of 'approximate unlearning' methods, warning that if a model retains even microscopic traces of the original data, it remains legally non-compliant. They push for exact unlearning methods and strict regulatory oversight to ensure AI companies cannot simply hide behind technical complexity to avoid deleting stolen or private data.

AI Developers & Engineers

Focus on the computational feasibility of unlearning and preserving model performance without catastrophic forgetting.

The engineering community views machine unlearning as a complex optimization problem. Retraining a frontier model from scratch for every deletion request is economically impossible, costing tens of millions of dollars per run. Developers are focused on building scalable, efficient unlearning algorithms that can surgically adjust weights without triggering 'catastrophic unlearning'—a failure mode where the model forgets how to perform basic tasks because too many parameters were altered. For this camp, the goal is finding the mathematical sweet spot between compliance and preserving the model's general utility.

AI Safety Researchers

View unlearning as a critical tool to surgically remove dangerous capabilities and toxic behaviors from deployed models.

Beyond copyright and privacy, safety researchers see machine unlearning as a vital defense mechanism against AI misuse. As open-source models become more capable, there is a legitimate fear that they could be prompted to generate instructions for biological weapons, cyberattacks, or highly toxic content. This camp utilizes unlearning techniques to actively hunt down and lobotomize these specific hazardous capabilities before a model is released to the public, ensuring that the AI remains a helpful assistant rather than a vector for mass harm.

Governance & Evaluation Experts

Emphasize the need for mathematical proofs and standardized metrics to verify that an AI has truly forgotten targeted data.

This camp focuses on the 'evaluation crisis' within the field. It is not enough for a company to simply claim they have unlearned a piece of data; they must be able to prove it to regulators and corporate boards. Governance experts champion the development of cryptographic frameworks, such as the SWAP test introduced at NeurIPS 2025, which provide mathematical guarantees that data has been removed. They argue that without standardized, adversarial-resistant testing, machine unlearning is merely an illusion of compliance.

What we don't know

Whether courts will accept approximate machine unlearning as legally sufficient for copyright non-infringement.
How effectively unlearning algorithms will scale to future models with tens of trillions of parameters.
Whether sophisticated extraction attacks will eventually find ways to recover 'unlearned' data from residual model weights.

Key terms

Machine Unlearning: The process of removing the influence of specific training data from an AI model without retraining it from scratch.
Catastrophic Unlearning: A failure mode where an AI model forgets essential, unrelated knowledge while attempting to unlearn a specific piece of data.
SISA Framework: An unlearning method that splits data into isolated shards, allowing developers to retrain only the small fraction of the model affected by a deletion request.
Membership Inference Attack (MIA): A technique used to determine whether a specific piece of data was included in an AI model's training set, often used to test if unlearning was successful.
Source-Free Unlearning: An advanced technique that allows an AI to forget specific information even when the developers no longer have access to the original training dataset.

Frequently asked

Why can't developers just delete a file from the AI?

AI models do not store data like a traditional database. Information is encoded probabilistically across billions of mathematical parameters, meaning specific data cannot simply be deleted without altering the entire network.

Does machine unlearning make the AI less intelligent?

If done incorrectly, it can cause 'catastrophic unlearning,' where the model loses general capabilities. However, modern techniques are designed to surgically remove specific data while preserving the model's overall performance.

Is machine unlearning legally compliant with the GDPR?

It is currently the best technical approach to the 'Right to be Forgotten,' but legal experts warn that technical unlearning may not yet fully satisfy strict legal definitions of complete data erasure.

How do researchers prove an AI has forgotten something?

Historically, they used Membership Inference Attacks to test for residual data traces. Recently, researchers have developed cryptographic frameworks, like the SWAP test, to mathematically certify that data has been removed.

Sources

[1] Harvard University, "Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy" (perspective: Privacy & Copyright Advocates)
[2] University of Texas at Austin, "New Algorithm Helps AI 'Unlearn' Copyrighted and Violent Content" (perspective: AI Developers & Engineers)
[3] Vector Institute, "2025 Machine Learning Security & Privacy Workshop" (perspective: AI Safety Researchers)
[4] Cirrus Institute, "New research: Source-free unlearning" (perspective: AI Developers & Engineers)
[5] IEEE Computer Society, "Machine Unlearning for Large Language Models" (perspective: Privacy & Copyright Advocates)
[6] BlueDot Impact, "Machine unlearning: Removing dangerous capabilities" (perspective: AI Safety Researchers)
[7] AI Governance Substack, "The Path from Theoretical Breakthrough to Boardroom Consideration" (perspective: Governance & Evaluation Experts)
[8] Factlen Editorial Team, "Synthesis by Factlen editorial team" (perspective: Governance & Evaluation Experts)
