Quantum field theory is the formalism used by physicists to describe the behavior of elementary particles, such as electrons, photons, or quarks. To date, this theory has proven resoundingly successful. To recall a famous example, the calculation of the magnetic moment of the electron has led to the most accurate experimental verification of a theoretical prediction in the entire history of science.
Since its inception about 90 years ago, quantum field theory has been used to account for nature at its most fundamental level: that of elementary particles and their corresponding quantum fields. But could this formalism be equally successful in describing “emergent” behavior, that is, the kind of collective phenomena that arise when many basic constituents interact with each other?
A typical example of this kind of behavior is provided by the human brain. Although it is made up of simple elements (neurons) that interact with each other using relatively simple rules, these interactions quickly give rise to genuinely new collective phenomena. These ultimately arise from interactions between neurons, but affect the brain as a whole. Therefore, we can ask whether it is possible to devise a physical theory that accurately describes this class of collective phenomena.
In recent work by our group at Swansea University, published last April in Physical Review D, we discovered a surprising relationship between certain quantum field theories and some machine learning algorithms. Specifically, we have shown that some neural networks and other artificial intelligence systems can be directly described by computational models of quantum field theories. This unexpected result paves the way for studying machine learning using the field theory formalism, while providing a promising new way to investigate the mathematical foundations of field theory itself.
Spacetime and supercomputers
The connection between field theory and artificial intelligence has its origin in a formalism used in particle physics and known as lattice field theory. This framework differs from the one most commonly used in particle physics, and its main characteristic is that it models space and time as if they were made up of finite-sized “pixels”. Treating spacetime in this way has the advantage of keeping the equations to be solved under control. However, it requires solving a gigantic number of them, which can only be done with the help of powerful supercomputers.
Although this computational technique dates back to the 1970s, it is only in recent years, thanks to the advent of modern supercomputers, that its precision has increased. In the search for methods to further accelerate these demanding calculations, machine learning has been proposed as one possible solution.
One of the most widely used machine learning methods in lattice field theory is the one based on neural networks. A neural network is nothing more than a set of nodes (“neurons”) interconnected by links. From a mathematical point of view, such networks can be described by means of a graph: the abstract representation of a set of vertices connected to each other by edges. If we identify each vertex with a variable of a certain type, the edges that connect different vertices introduce a form of dependency between the corresponding variables.
A key point is that such graph-based structures are also used in lattice field theory; specifically, to represent a quantum field in a discrete, or “pixelated,” spacetime. Thanks to this parallel, it is possible to find a common mathematical language to describe both machine learning and lattice field theory.
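To make the graph picture concrete, here is a minimal sketch in Python of a small two-dimensional “pixelated” spacetime written as a graph. The function name lattice_neighbors, the 4 x 4 size, and the periodic (wrap-around) boundaries are illustrative assumptions, not details taken from our work.

```python
def lattice_neighbors(n_rows, n_cols):
    """Map each site (r, c) of a periodic 2D lattice to its four nearest neighbors.

    Sites play the role of graph vertices; each (site, neighbor) pair is an edge.
    """
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[(r, c)] = [
                ((r + 1) % n_rows, c),  # next site along the first direction
                ((r - 1) % n_rows, c),  # previous site along the first direction
                (r, (c + 1) % n_cols),  # next site along the second direction
                (r, (c - 1) % n_cols),  # previous site along the second direction
            ]
    return neighbors


# A hypothetical 4 x 4 lattice: 16 vertices, each linked to 4 others.
graph = lattice_neighbors(4, 4)
print(graph[(0, 0)])
```

A field on this lattice is then just one variable attached to each key of the dictionary, and the edges record which variables are directly coupled.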
Locality and amnesia
This common mathematical language is known in technical terms as “Markov random fields”. In simplified terms, this formalism describes a set of random variables, each of which can be identified with a node of a graph. Such variables must satisfy the so-called “Markov property”, named after the 19th century Russian mathematician Andrei Markov. This property establishes that, once the state of the neighboring nodes is known, the events that occur in a certain region of the graph are independent of those that occur in remote areas.
The Markov property is related to a characteristic known as “memorylessness”: the fact that the state of a system depends only on what happened in the previous instant, and not on events that happened further in the past. In some machine learning algorithms, such as those used in image processing, the Markov property is used to discover local structures in images.
In the context of quantum fields, the Markov condition may appear due to the discretization of spacetime. In this case, the Markov property equates to the locality condition: the well-known physical principle that events that occur in a certain region of spacetime can only be affected by nearby events, but not by events that occur far away (that is, the same principle that prohibits the existence of instantaneous actions between distant points of spacetime).
All these parallels allow us to anticipate a relationship between quantum field theory and machine learning. In our work, we have established that this is indeed the case by showing that certain field theories satisfy the conditions of a result known as the “Hammersley-Clifford theorem.” This rigorously guarantees that these field theories satisfy the Markov condition and that, therefore, they can be reformulated in terms of machine learning algorithms.
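The link between locality and the Markov property can be checked numerically. The sketch below assumes a generic scalar field on a periodic two-dimensional lattice with a phi^4-type action; the action, the parameter values m2 and lam, and the function names are illustrative choices, not necessarily the theory studied in our work. It shows that the effect of changing the field at one site can be computed from its four neighbors alone.

```python
import numpy as np


def action(phi, m2=0.5, lam=0.1):
    """Total action of a scalar field on a periodic 2D lattice (illustrative phi^4 form)."""
    kinetic = 0.0
    for axis in (0, 1):
        # Each nearest-neighbor edge is counted exactly once.
        kinetic += 0.5 * np.sum((np.roll(phi, -1, axis=axis) - phi) ** 2)
    potential = np.sum(0.5 * m2 * phi**2 + lam * phi**4)
    return kinetic + potential


def local_action(phi, site, value, m2=0.5, lam=0.1):
    """Only the terms of the action that involve `site`, if the field there were `value`."""
    r, c = site
    n_rows, n_cols = phi.shape
    nbrs = [
        phi[(r + 1) % n_rows, c],
        phi[(r - 1) % n_rows, c],
        phi[r, (c + 1) % n_cols],
        phi[r, (c - 1) % n_cols],
    ]
    kinetic = 0.5 * sum((value - nb) ** 2 for nb in nbrs)
    return kinetic + 0.5 * m2 * value**2 + lam * value**4


rng = np.random.default_rng(0)
phi = rng.normal(size=(6, 6))
site, new_value = (2, 3), 1.7
old_value = phi[site]

# Change in the action computed from the four neighbors only...
delta_local = local_action(phi, site, new_value) - local_action(phi, site, old_value)

# ...and the same change computed from the whole lattice.
phi_new = phi.copy()
phi_new[site] = new_value
delta_full = action(phi_new) - action(phi)

print(np.isclose(delta_local, delta_full))
```

Because the probability of a configuration is proportional to exp(-action), the conditional distribution of the field at one site depends only on its neighbors, which is the Markov property in this setting.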
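In its standard textbook form, the Hammersley-Clifford theorem states that a strictly positive probability distribution satisfies the Markov property on a graph exactly when it factorizes over the cliques of that graph (the sets of mutually connected nodes). The notation below is generic and chosen for illustration: the symbols \psi_c, \mathcal{C}, and S_c are not taken from our paper, and the second line assumes a lattice action that is a sum of local terms.

```latex
% Factorization over the set of cliques \mathcal{C} of the graph
P(\phi) = \frac{1}{Z} \prod_{c \in \mathcal{C}} \psi_c(\phi_c),
\qquad
Z = \int \prod_{x} d\phi_x \, \prod_{c \in \mathcal{C}} \psi_c(\phi_c)

% Lattice field theory with a local action S(\phi) = \sum_{c \in \mathcal{C}} S_c(\phi_c)
P(\phi) = \frac{e^{-S(\phi)}}{Z}
        = \frac{1}{Z} \prod_{c \in \mathcal{C}} e^{-S_c(\phi_c)}
\quad \Longrightarrow \quad
\psi_c(\phi_c) = e^{-S_c(\phi_c)} > 0
```

Read this way, each local term of the action plays the role of one clique factor, and positivity holds automatically because the exponential of a finite number is never zero.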
Quantum fields that learn
Our result opens the possibility of using field theories to derive new machine learning algorithms for specific tasks, such as image segmentation (the process of dividing an image into different parts, like the light and dark areas of a black and white photo, for example). This is possible because the physical properties of quantum fields, such as their tendency to minimize energy and other quantities, correspond to the optimization processes used in machine learning.
For example, in our work we have demonstrated the equivalence between a certain quantum field theory and an algorithm that can be trained to learn an image. After training, if the algorithm is given a random set of pixels, it will rearrange them until it reaches an “equilibrium configuration” that faithfully reproduces the original image.
This equivalence between algorithms and field theories could lead to new ideas in artificial intelligence. For example, it has long been known that some algorithms can be related to the mathematical description of certain physical systems, such as the so-called “spin glasses” (a line of research started by the Italian physicist Giorgio Parisi, which was recognized this year with the Nobel Prize in Physics). Our results could shed light on how to interpret these machine learning algorithms, which can now be investigated with the tools of field theory.
In this regard, it would be interesting to explore the possible phase transitions in those algorithms that have an equivalent in quantum field theory. In physics, a phase transition describes an abrupt change in the properties of a system, such as when water boils and turns into steam. It has been known for years that, during a phase transition, some very different physical systems end up obeying the same laws, in which case we say they belong to the same “universality class”. This is a very powerful concept in physics that allows us to understand the properties of a system much better. Now, we can explore whether those algorithms that are equivalent to field theories undergo phase transitions during their training process and, if so, to which universality class they belong.
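The correspondence between energy minimization and optimization can be illustrated with a classic technique from image processing, iterated conditional modes on a Markov random field. This is a standard textbook method, not the algorithm derived in our work, and the energy function, the coupling value, and the synthetic test image below are all assumptions made for the sake of the example.

```python
import numpy as np


def energy(labels, image, coupling=1.0):
    """Data term (labels should match pixels) plus smoothness term (neighbors should agree)."""
    data = np.sum((labels - image) ** 2)
    smooth = 0
    for axis in (0, 1):
        # Count nearest-neighbor pairs with different labels (periodic boundaries).
        smooth += np.sum(labels != np.roll(labels, -1, axis=axis))
    return data + coupling * smooth


def segment(image, coupling=1.0, sweeps=5):
    """Iterated conditional modes: give each pixel the label with the lowest local energy."""
    labels = (image > 0.5).astype(int)  # start from a simple threshold
    n_rows, n_cols = image.shape
    for _ in range(sweeps):
        for r in range(n_rows):
            for c in range(n_cols):
                nbrs = [
                    labels[(r + 1) % n_rows, c],
                    labels[(r - 1) % n_rows, c],
                    labels[r, (c + 1) % n_cols],
                    labels[r, (c - 1) % n_cols],
                ]
                costs = []
                for candidate in (0, 1):
                    data = (candidate - image[r, c]) ** 2
                    smooth = sum(candidate != nb for nb in nbrs)
                    costs.append(data + coupling * smooth)
                labels[r, c] = int(np.argmin(costs))
    return labels


# Synthetic test image: dark left half, light right half, plus noise.
rng = np.random.default_rng(0)
image = np.zeros((12, 12))
image[:, 6:] = 1.0
noisy = np.clip(image + rng.normal(scale=0.4, size=image.shape), 0.0, 1.0)

initial = (noisy > 0.5).astype(int)
final = segment(noisy)
print(energy(initial, noisy), energy(final, noisy))
```

Each update looks only at a pixel and its four neighbors, yet it can never raise the total energy, so the labeling settles into a locally optimal division of the image into light and dark regions.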
Learning about quantum fields
In addition to using physics to better understand artificial intelligence, our results also open the door to using machine learning for physical applications; for example, to approximate intricate field theories by simpler ones. Today, applications of machine learning to field theory are still in their infancy, so this equivalence could inspire entirely new lines of research.
Finally, this connection between fundamental physics and machine learning can be related to a very deep and as yet unresolved question: what are the mathematical foundations of quantum field theory?
In recent decades, this question has been explored from multiple perspectives. One of them is known as “constructive quantum field theory”, a part of which studies the question from the point of view of Markov random fields. Therefore, this connection between field theory and machine learning creates an opportunity to initiate a dialogue at the intersection of physics, computation, and mathematics; one that could end up transforming all three disciplines.