When we design a skyscraper, we expect it to perform to specification: that the tower will support a given load and withstand an earthquake of a certain strength.
But with one of the most important technologies of the modern world, we’re effectively building blind. We play with different designs and tinker with different setups, but until we take it out for a test run, we don’t really know what it can do or where it will fail.
This technology is the neural network, which underpins today’s most advanced artificial intelligence systems. Increasingly, neural networks are moving into the core areas of society: They determine what we learn of the world through our social media feeds, they help doctors diagnose illnesses, and they even influence whether a person convicted of a crime will spend time in jail.
Yet “the best approximation to what we know is that we know almost nothing about how neural networks actually work and what a really insightful theory would be,” said Boris Hanin, a mathematician at Texas A&M University and a visiting scientist at Facebook AI Research who studies neural networks.
He likens the situation to the development of another revolutionary technology: the steam engine. At first, steam engines weren’t good for much more than pumping water. Then they powered trains, which is maybe the level of sophistication neural networks have reached. Then scientists and mathematicians developed a theory of thermodynamics, which let them understand exactly what was going on inside engines of any kind. Eventually, that knowledge took us to the moon.
“First, you had great engineering, and you had some great trains, then you needed some theoretical understanding to go to rocket ships,” Hanin said.
Within the sprawling community of neural network development, there is a small group of mathematically minded researchers who are trying to build a theory of neural networks—one that would explain how they work and guarantee that, if you construct a neural network in a prescribed manner, it will be able to perform certain tasks.
This work is still in its very early stages, but in the past year researchers have produced several papers that elaborate on the relationship between form and function in neural networks. The work takes neural networks all the way down to their foundations. It shows that long before you can certify that neural networks can drive cars, you need to prove that they can multiply.
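To make that last point concrete, here is a minimal sketch, assuming PyTorch and hyperparameters chosen arbitrarily for illustration (it is not the construction from any of the papers): a small ReLU network trained to fit the product f(x, y) = x·y on the unit square. In practice the fit usually looks good, but nothing in the training loop guarantees how good, and that gap between observed and provable behavior is exactly what a theory of neural networks would close.

```python
# Illustrative sketch only: train a small ReLU network to approximate
# f(x, y) = x * y on [0, 1]^2. Assumes PyTorch is installed; the
# architecture and hyperparameters are arbitrary choices, not taken
# from the papers discussed in the article.
import torch
import torch.nn as nn

torch.manual_seed(0)

# A narrow network with two hidden ReLU layers.
net = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(2000):
    xy = torch.rand(256, 2)                      # random points in [0, 1]^2
    target = (xy[:, 0] * xy[:, 1]).unsqueeze(1)  # the true product x * y
    loss = loss_fn(net(xy), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The network typically fits well here, but the training loop itself
# offers no guarantee of accuracy; that is the kind of statement the
# theorists want to be able to prove from the network's form alone.
test = torch.tensor([[0.3, 0.5], [0.9, 0.7]])
print(net(test).detach().numpy())  # compare with the true products 0.15 and 0.63
```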