ETH Zurich prof designs better antibody drugs with artificial intelligence

Antibodies are not only produced by our immune cells to fight viruses and other pathogens in the body. For a few decades now, medicine has also been using antibodies produced by biotechnology as drugs. This is because antibodies are extremely good at binding specifically to molecular structures according to the lock-and-key principle. Their use ranges from oncology to the treatment of autoimmune diseases and neurodegenerative conditions.

However, developing such antibody drugs is anything but simple. The basic requirement is for an antibody to bind to its target molecule optimally. At the same time, an antibody drug must fulfill a host of additional criteria. For example, it should not trigger an immune response in the body. It should be efficient to produce using biotechnology, and it should remain stable over a long time.

Once scientists have found an antibody that binds to the desired molecular target structure, the development process is far from over. Rather, this marks the start of a phase in which researchers use bioengineering to try to improve the antibody's properties. Scientists led by Sai Reddy, a professor at the Department of Biosystems Science and Engineering at ETH Zurich in Basel, have now developed a machine learning method that supports this optimization phase, helping to develop more effective antibody drugs.

Robots can't manage more than a few thousand


When researchers optimize an entire antibody molecule in its therapeutic form (i.e. not just a fragment of an antibody), the usual approach is to start with an antibody lead candidate that binds reasonably well to the desired target structure. Researchers then randomly mutate the gene that carries the blueprint for the antibody to produce a few thousand related antibody candidates in the lab. The next step is to search among them to find the ones that bind best to the target structure. "With automated processes, you can test a few thousand therapeutic candidates in a lab. But it is not really feasible to screen any more than that," Reddy says. Typically, the best dozen antibodies from this screening move on to the next step and are tested for how well they meet additional criteria. "Ultimately, this approach lets you identify the best antibody from a group of a few thousand," he says.
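The mutate-and-screen funnel described above can be illustrated with a toy in-silico sketch. Everything here is invented for illustration: the lead sequence is arbitrary, and real campaigns use error-prone lab mutagenesis focused on the antibody's binding regions rather than uniform random positions.

```python
import random

BASES = "ACGT"

def random_point_mutants(seq, n_variants, n_mutations=3, seed=0):
    """Generate distinct variants of seq, each carrying n_mutations
    random point mutations (a crude stand-in for lab mutagenesis)."""
    rng = random.Random(seed)
    variants = set()
    while len(variants) < n_variants:
        s = list(seq)
        for pos in rng.sample(range(len(s)), n_mutations):
            # replace the base at this position with a different one
            s[pos] = rng.choice(BASES.replace(s[pos], ""))
        variants.add("".join(s))
    return sorted(variants)

lead = "ATGGCTAGCTGA" * 3          # hypothetical lead-candidate DNA fragment
pool = random_point_mutants(lead, n_variants=1000)
print(len(pool), "variants ready for screening")
```

In a real campaign this pool would then go through the automated binding screen Reddy describes, and only the best dozen or so candidates would move on.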

Candidate pool massively increased by machine learning

Reddy and his colleagues are now using machine learning to increase the initial set of antibodies to be tested to several million. "The more candidates there are to choose from, the greater the chance of finding one that really meets all the criteria needed for drug development," Reddy says.

The ETH researchers provided the proof of concept for their new method using Roche's antibody cancer drug Herceptin, which has been on the market for 20 years. "But we weren't looking to make suggestions for how to improve it - you can't just retroactively change an approved drug," Reddy explains. "Our reason for choosing this antibody is because it is well known in the scientific community and because its structure is published in open-access databases."

Supercomputer predictions

Starting from the Herceptin antibody's DNA sequence, the ETH researchers created about 40,000 related antibodies using a CRISPR mutation method they developed a few years ago. Experiments showed that 10,000 of them bound well to the target protein in question, a specific cell surface protein. The scientists used the DNA sequences of these 40,000 antibodies to train a machine learning algorithm.

They then applied the trained algorithm to search a database of 70 million potential antibody DNA sequences. For these 70 million candidates, the algorithm predicted how well the corresponding antibodies would bind to the target protein, resulting in a list of millions of sequences expected to bind.

Using further supercomputer models, the scientists predicted how well these millions of sequences would meet the additional criteria for drug development (tolerance, production, physical properties). This reduced the number of candidate sequences to 8,000.
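The press release does not describe the model the ETH group trained, so the train-then-screen-then-filter pipeline can only be sketched with a deliberately simple stand-in: a per-position log-odds scorer learned from labeled toy sequences, used to rank a large random library, followed by a crude proxy for the "additional criteria" filter. All sequences, sizes, and thresholds below are invented.

```python
import numpy as np

BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}

def one_hot(seqs):
    """Encode equal-length DNA sequences as an (n, L, 4) one-hot array."""
    X = np.zeros((len(seqs), len(seqs[0]), 4))
    for r, s in enumerate(seqs):
        for c, b in enumerate(s):
            X[r, c, IDX[b]] = 1.0
    return X

def train_log_odds(binders, non_binders, pseudo=1.0):
    """Per-position log-odds of each base in binders vs. non-binders."""
    P = one_hot(binders).sum(axis=0) + pseudo
    Q = one_hot(non_binders).sum(axis=0) + pseudo
    return (np.log(P / P.sum(axis=1, keepdims=True))
            - np.log(Q / Q.sum(axis=1, keepdims=True)))

def score(seqs, W):
    """Higher score = predicted to bind better."""
    return (one_hot(seqs) * W).sum(axis=(1, 2))

rng = np.random.default_rng(0)
L = 12
def rand_seq():
    return "".join(rng.choice(list(BASES), L))

# invented training data: binders share a 4-base "binding motif"
binders = ["ACGT" + rand_seq()[4:] for _ in range(200)]
non_binders = [rand_seq() for _ in range(200)]
W = train_log_odds(binders, non_binders)

# screen a large in-silico library, keep the top-scoring predicted binders
library = [rand_seq() for _ in range(5000)] + binders[:5]
top = np.argsort(score(library, W))[::-1][:100]
predicted = [library[i] for i in top]

# a second, cruder filter standing in for the additional drug-development
# criteria (tolerance, production, physical properties); GC content is a toy proxy
final = [q for q in predicted
         if 0.2 <= (q.count("G") + q.count("C")) / L <= 0.8]
print(len(predicted), "predicted binders,", len(final), "after filtering")
```

The shape of the funnel mirrors the article: a modest labeled training set, a vastly larger computational library ranked by the model, then a second round of computed criteria that cuts the list down to a number small enough to make and test in the lab.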

Improved antibodies found

From the list of optimized candidate sequences on their computer, the scientists selected 55 sequences from which to produce antibodies in the lab and characterize their properties. Subsequent experiments showed that several of them bound even better to the target protein than Herceptin itself, and were also easier to produce and more stable than Herceptin. "One new variant may even be better tolerated in the body than Herceptin," says Reddy. "It is known that Herceptin triggers a weak immune response, but this is typically not a problem in this case." However, it is a problem for many other antibodies, for which such a response must be prevented during drug development.

The ETH scientists are now applying their artificial intelligence method to optimize antibody drugs that are in clinical development. To this end, they recently founded the ETH spin-off deepCDR Biologics, which partners with both early-stage and established biotech and pharmaceutical companies for antibody drug development.

University of Gothenburg's Laura Natali shows how machine learning can help slow down future pandemics

Artificial intelligence could be one of the keys to limiting the spread of infection in future pandemics. In a new study, researchers at the University of Gothenburg in Sweden have investigated how machine learning can be used to find effective testing methods during epidemic outbreaks, thereby helping to better control the outbreaks.

In the study, the researchers developed a method to improve testing strategies during epidemic outbreaks, using relatively limited information to predict which individuals are most worth testing.

“This can be a first step towards society gaining better control of future major outbreaks and reducing the need to shut down society,” says Laura Natali, a doctoral student in physics at the University of Gothenburg and the lead author of the published study.

Simulation shows rapid control over the outbreak
Machine learning is a type of artificial intelligence in which computers are trained on data sets to recognize connections and solve problems. The researchers used machine learning in a simulation of an epidemic outbreak, where information about the first confirmed cases was used to estimate infections in the rest of the population. The model drew on data about each infected individual’s network of contacts: who they had been in close contact with, where, and for how long.

“In the study, the outbreak can quickly be brought under control when the method is used, while random testing leads to the uncontrolled spread of the outbreak with many more infected individuals. Under real-world conditions, information can be added, such as demographic data, age, and health-related conditions, which can improve the method’s effectiveness even more. The same method can also be used to prevent reinfections in the population if immunity after the disease is only temporary," she said. 
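The comparison Natali describes, targeted testing versus random testing on a contact network, can be illustrated with a toy simulation. The study's actual model and features are not specified in this article, so the learned ranking is replaced here with a hand-coded heuristic that prioritizes contacts of confirmed cases; all parameters are invented.

```python
import random

def simulate(n=300, degree=8, k_tests=10, steps=30, p_inf=0.05,
             strategy="ranked", seed=1):
    """Toy epidemic on a random contact network, testing k_tests people
    per step. 'ranked' prioritizes people by how many confirmed cases
    they have been in contact with; 'random' tests at random."""
    rng = random.Random(seed)
    contacts = {i: set() for i in range(n)}
    for i in range(n):
        for j in rng.sample([x for x in range(n) if x != i], degree // 2):
            contacts[i].add(j)
            contacts[j].add(i)
    infected = set(rng.sample(range(n), 6))
    confirmed = set(list(infected)[:3])     # the first confirmed cases
    isolated = set(confirmed)               # confirmed cases are isolated
    tested = set(confirmed)
    for _ in range(steps):
        # spread from undetected infectious individuals
        for i in list(infected - isolated):
            for j in contacts[i]:
                if j not in infected and rng.random() < p_inf:
                    infected.add(j)
        # choose whom to test next (each person tested at most once here)
        pool = [i for i in range(n) if i not in tested]
        if strategy == "ranked":
            pool.sort(key=lambda i: -len(contacts[i] & confirmed))
        else:
            rng.shuffle(pool)
        for i in pool[:k_tests]:
            tested.add(i)
            if i in infected:
                confirmed.add(i)
                isolated.add(i)
    return len(infected)

targeted = sum(simulate(strategy="ranked", seed=s) for s in range(5)) / 5
untargeted = sum(simulate(strategy="random", seed=s) for s in range(5)) / 5
print(f"mean total infections: targeted {targeted:.0f}, random {untargeted:.0f}")
```

In this toy setup the targeted strategy typically confirms and isolates cases earlier, which is the mechanism behind the rapid control described in the study; the study itself learns the ranking from data rather than hard-coding it.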

More exact localization of the infection
She emphasizes that the study is a simulation and that testing with real data is needed to improve the method even more. Therefore, it is too early to use it in the ongoing coronavirus pandemic. At the same time, she sees the research as a first step in being able to implement more targeted initiatives to reduce the spread of infections, since the machine learning-based testing strategy automatically adapts to the specific characteristics of diseases. As an example, she mentions the potential to easily predict if a specific age group should be tested or if a limited geographic area is a risk zone, such as a school, a community, or a specific neighborhood.

“When a large outbreak has begun, it is important to quickly and effectively identify infectious individuals. In random testing, there is a significant risk of failing to achieve this, but with a more goal-oriented testing strategy, we can find more infected individuals and thereby also gain the necessary information to decrease the spread of infection. We show that machine learning can be used to develop this type of testing strategy,” she said.

More effective use of testing resources
There are few previous studies that have examined how machine learning can be used in cases of pandemics, particularly with a clear focus on finding the best testing strategies. “We show that it is possible to use relatively simple and limited information to make predictions of who would be most beneficial to test. This allows better use of available testing resources," she commented.

Paper: https://iopscience.iop.org/article/10.1088/2632-2153/abf0f7

Hebrew University researcher introduces new approach to three-body problem

The "three-body problem," the term coined for predicting the motion of three gravitating bodies in space, is essential for understanding a variety of astrophysical processes as well as a large class of mechanical problems and has occupied some of the world's best physicists, astronomers, and mathematicians for over three centuries. Their attempts have led to the discovery of several important fields of science, yet its solution remained a mystery.

At the end of the 17th century, Sir Isaac Newton succeeded in explaining the motion of the planets around the sun through a law of universal gravitation. He also sought to explain the motion of the moon. Since both the earth and the sun determine the motion of the moon, Newton became interested in the problem of predicting the motion of three bodies moving in space under the influence of their mutual gravitational attraction (see attached illustration), a problem that later became known as "the three-body problem".

However, unlike the two-body problem, Newton was unable to obtain a general mathematical solution for it. Indeed, the three-body problem proved easy to define, yet difficult to solve.

New research, led by Professor Barak Kol at Hebrew University of Jerusalem's Racah Institute of Physics, adds a step to this scientific journey that began with Newton, touching on the limits of scientific prediction and the role of chaos in it.

The theoretical study presents a novel and exact reduction of the problem, enabled by a re-examination of the basic concepts that underlie previous theories. It allows for a precise prediction of the probability for each of the three bodies to escape the system.

Following Newton and two centuries of fruitful research in the field, including by Euler, Lagrange, and Jacobi, the mathematician Poincaré discovered in the late 19th century that the problem exhibits extreme sensitivity to the bodies' initial positions and velocities. This sensitivity, which later became known as chaos, has far-reaching implications: it indicates that there is no closed-form deterministic solution to the three-body problem.

In the 20th century, the development of computers made it possible to re-examine the problem with the help of computerized simulations of the bodies' motion. The simulations showed that under some general assumptions, a three-body system experiences periods of chaotic, or random, motion alternating with periods of regular motion, until finally the system disintegrates into a pair of bodies orbiting their common center of mass and a third one moving away, or escaping, from them.
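The kind of simulation described here, numerically integrating Newton's equations for three mutually gravitating bodies, can be sketched in a few lines. This is a 2-D toy with softened gravity and invented initial conditions, not the production codes used in the study; the second run perturbs one initial position slightly to illustrate the sensitivity Poincaré identified.

```python
import math

def accelerations(pos, masses, G=1.0, soft=1e-4):
    """Pairwise Newtonian gravitational accelerations in 2-D (softened)."""
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(3):
        for j in range(3):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r2 = dx * dx + dy * dy + soft  # softening tames close encounters
            inv_r3 = r2 ** -1.5
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
    return acc

def integrate(pos, vel, masses, dt=1e-3, steps=20000):
    """Leapfrog (kick-drift-kick) integration; returns final state."""
    pos = [p[:] for p in pos]
    vel = [v[:] for v in vel]
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(3):
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
            pos[i][0] += dt * vel[i][0]
            pos[i][1] += dt * vel[i][1]
        acc = accelerations(pos, masses)
        for i in range(3):
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
    return pos, vel

masses = [1.0, 1.0, 1.0]
p0 = [[-1.0, 0.0], [1.0, 0.0], [0.0, 0.8]]
v0 = [[0.0, -0.3], [0.0, 0.3], [0.1, 0.0]]
pa, va = integrate(p0, v0, masses)

# rerun with the third body's position perturbed by one part in a million
p1 = [[-1.0, 0.0], [1.0, 0.0], [0.0, 0.8 + 1e-6]]
pb, vb = integrate(p1, v0, masses)
sep = math.dist(pa[2], pb[2])
print(f"third body's final positions differ by {sep:.4g} between the runs")
```

In chaotic configurations the separation between the two runs grows rapidly, which is exactly why long-term deterministic prediction fails and why the field turned to statistical descriptions of the outcome.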

The chaotic nature implies that not only is a closed-form solution impossible, but also supercomputer simulations cannot provide specific and reliable long-term predictions. However, the availability of large sets of simulations led in 1976 to the idea of seeking a statistical prediction of the system, and in particular, predicting the escape probability of each of the three bodies. In this sense, the original goal of finding a deterministic solution turned out to be misguided, and it was recognized that the right goal is to find a statistical solution.

Determining the statistical solution has proven to be no easy task due to three features of this problem: the system presents chaotic motion that alternates with regular motion; it is unbounded; and it is susceptible to disintegration. A year ago, Racah's Dr. Nicholas Stone and his colleagues used a new method of calculation and, for the first time, achieved a closed mathematical expression for the statistical solution. However, this method, like all statistical approaches before it, rests on certain assumptions. Inspired by these results, Kol initiated a re-examination of these assumptions.

The unbounded range of the gravitational force suggests the appearance of infinite probabilities through the so-called infinite phase-space volume. To avoid this pathology, and for other reasons, all previous attempts postulated a somewhat arbitrary "strong interaction region", and accounted only for configurations within it in the calculation of probabilities.

The new study, recently published in the scientific journal Celestial Mechanics and Dynamical Astronomy, focuses on the outgoing flux of phase-volume, rather than the phase-volume itself. Since the flux is finite even when the volume is infinite, this flux-based approach avoids the artificial problem of infinite probabilities, without ever introducing the artificial strong interaction region.

The flux-based theory predicts the escape probabilities of each body, under a certain assumption. The predictions are different from all previous frameworks, and Prof. Kol emphasizes that "tests by millions of computer simulations show strong agreement between theory and simulation." The simulations were carried out in collaboration with Viraj Manwadkar from the University of Chicago, Alessandro Trani from the Okinawa Institute in Japan, and Nathan Leigh from the University of Concepcion in Chile. This agreement proves that understanding the system requires a paradigm shift and that the new conceptual basis describes the system well. It turns out, then, that even for the foundations of such an old problem, innovation is possible.

The implications of this study are wide-ranging and are expected to influence both the solution of a variety of astrophysical problems and the understanding of an entire class of problems in mechanics. In astrophysics, it may have an application to the mechanism that creates pairs of compact bodies that are the source of gravitational waves, as well as to deepen the understanding of the dynamics within star clusters. In mechanics, the three-body problem is a prototype for a variety of chaotic problems, so progress in it is likely to reflect on additional problems in this important class.