Artificial neural networks—algorithms inspired by biological brains—are at the center of modern artificial intelligence, behind both chatbots and image generators. But with their many neurons, they can be black boxes, their inner workings uninterpretable to users.
Researchers have now created a fundamentally new way to make neural networks that in some ways surpasses traditional systems. These new networks are more interpretable and also more accurate, proponents say, even when they’re smaller. Their developers say the way they learn to represent physics data concisely could help scientists uncover new laws of nature.
For the past decade or more, engineers have mostly tweaked neural-network designs through trial and error, says Brice Ménard, a physicist at Johns Hopkins University who studies how neural networks operate but was not involved in the new work, which was posted on arXiv in April. “It’s great to see that there is a new architecture on the table,” he says, especially one designed from first principles.
One way to think of neural networks is in terms of neurons, or nodes, and synapses, or the connections between those nodes. In traditional neural networks, called multi-layer perceptrons (MLPs), each synapse learns a weight: a number that determines how strong the connection is between those two neurons. The neurons are arranged in layers, such that a neuron in one layer takes input signals from the neurons in the previous layer, weighted by the strength of their synaptic connections. Each neuron then applies a simple function, called an activation function, to the sum of its inputs.
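To make the traditional design concrete, here is a minimal sketch of a single MLP layer in plain Python. The function name `mlp_layer`, the choice of `tanh` as the activation function, and the example numbers are illustrative assumptions, not details from the research; bias terms, which real networks usually include, are left out to match the description above.

```python
import math

def mlp_layer(inputs, weights, activation=math.tanh):
    """One MLP layer: weights[j][i] is the learned strength of the
    synapse from input i to neuron j (hypothetical layout)."""
    outputs = []
    for neuron_weights in weights:
        # Each neuron sums its inputs, weighted by synapse strength...
        total = sum(w * x for w, x in zip(neuron_weights, inputs))
        # ...then applies the same simple activation function.
        outputs.append(activation(total))
    return outputs

# Two inputs feeding three neurons (made-up numbers).
inputs = [0.5, -1.0]
weights = [[0.1, 0.2], [0.0, 1.0], [-0.3, 0.4]]
mlp_outputs = mlp_layer(inputs, weights)
print(mlp_outputs)
```

Everything the layer learns is in the `weights` table, one number per synapse; the activation function is fixed and shared by every neuron.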
In traditional neural networks, sometimes called multi-layer perceptrons (left), each synapse learns a number called a weight, and each neuron applies a simple function to the sum of its inputs. In the new Kolmogorov-Arnold architecture (right), each synapse learns a function, and the neurons sum the outputs of those functions.
In the new architecture, the synapses play a more complex role. Instead of simply learning how strong the connection between two neurons is, they learn the full nature of that connection—the function that maps input to output. Unlike the activation function used by neurons in the traditional architecture, this function could be more complex—in fact a “spline” or combination of several functions—and is different in each instance. Neurons, on the other hand, become simpler—they just sum the outputs of all their preceding synapses. The new networks are called Kolmogorov-Arnold Networks (KANs), after two mathematicians who studied how functions could be combined. The idea is that KANs would provide greater flexibility when learning to represent data, while using fewer learned parameters.
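The same idea can be sketched for a KAN layer. This is an illustration under stated assumptions, not the researchers' implementation: each synapse's function is modeled here as a piecewise-linear curve (the simplest kind of spline) defined by values at fixed knots, and the names `piecewise_linear` and `kan_layer` are hypothetical. In a trained network, the values at the knots would be the learned parameters.

```python
from bisect import bisect_right

def piecewise_linear(knots, values):
    """Return f(x) that linearly interpolates `values` at sorted `knots`,
    holding the end values constant outside the knot range."""
    def f(x):
        if x <= knots[0]:
            return values[0]
        if x >= knots[-1]:
            return values[-1]
        i = bisect_right(knots, x) - 1  # index of the knot just left of x
        t = (x - knots[i]) / (knots[i + 1] - knots[i])
        return values[i] + t * (values[i + 1] - values[i])
    return f

def kan_layer(inputs, edge_functions):
    """edge_functions[j][i] is the function on the synapse from input i
    to neuron j. Each neuron only sums what its synapses deliver."""
    return [sum(f(x) for f, x in zip(row, inputs)) for row in edge_functions]

# Two inputs feeding one neuron; each synapse has its own function.
knots = [-1.0, 0.0, 1.0]
edge_functions = [[
    piecewise_linear(knots, [1.0, 0.0, 1.0]),   # V-shaped, like |x|
    piecewise_linear(knots, [-1.0, 0.0, 1.0]),  # straight line, like x
]]
kan_outputs = kan_layer([0.5, -0.5], edge_functions)
print(kan_outputs)
```

Compared with the MLP picture, the learned quantities have moved from a single number per synapse to a small curve per synapse, while the neuron itself does nothing but add.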
“It’s like an alien life that looks at things from a different perspective but is also kind of…
The post “Kolmogorov-Arnold Neural Networks Shake Up How AI Is Done” by Matthew Hutson was published on 08/05/2024 by spectrum.ieee.org