Companies of all kinds use machine learning to analyze people’s desires, dislikes, or faces. Some researchers are now asking a different question: How can we make machines forget?
A nascent area of computer science dubbed machine unlearning seeks ways to induce selective amnesia in artificial intelligence software. The goal is to remove all trace of a particular person or data point from a machine-learning system without affecting its performance.
If made practical, the concept could give people more control over their data and the value derived from it. Although users can already ask some companies to delete personal data, they are generally in the dark about what algorithms their information helped tune or train. Machine unlearning could make it possible for a person to withdraw both their data and a company’s ability to profit from it.
Although intuitive to anyone who has rued what they shared online, that notion of artificial amnesia requires some new ideas in computer science. Companies spend millions of dollars training machine-learning algorithms to recognize faces or rank social posts, because the algorithms can often solve a problem more quickly than human coders alone. But, once trained, a machine-learning system is not easily altered, or even understood. The conventional way to remove the influence of a particular data point is to rebuild a system from the beginning, a potentially costly exercise.
“This research aims to find some middle ground,” said Aaron Roth, a professor at the University of Pennsylvania who is working on machine unlearning. “Can we remove all influence of someone’s data when they ask to delete it but avoid the full cost of retraining from scratch?”
Work on machine unlearning is motivated in part by growing attention to the ways artificial intelligence can erode privacy. Data regulators around the world have long had the power to force companies to delete ill-gotten information. Citizens of some locales, such as the EU and California, even have the right to request that a company delete their data if they have a change of heart about what they disclosed. More recently, US and European regulators have said the owners of AI systems must sometimes go a step further and delete a system that was trained on sensitive data.