Michigan State team decodes crop genetics with artificial intelligence, $1.4 million grant from the NSF

We live in a time when it's never been easier or less expensive to sequence a plant's complete genome. But knowing all of a plant's genes is not the same thing as knowing what all those genes do. By "decoding" plant genetics, the MSU researchers hope to help farmers grow crops with genes that give their plants the best chance to withstand threats like drought and disease.

Michigan State experts in plant biology and computer science plan to close that gap with the help of artificial intelligence and a new $1.4 million grant from the National Science Foundation. Ultimately, the goal is to help farmers grow crops with genes that give their plants the best chance to withstand threats such as drought and disease.

To get to that point, though, researchers still need to reveal the fundamental role of many of the genes found in plants.

"In terms of plant science, there are major questions that we're trying to answer: How does a particular gene sequence work? What's its molecular function? What does it do?" said lead investigator Shin-Han Shiu. Shiu is a professor in the College of Natural Science's Department of Plant Biology and Computational Mathematics, Science, and Engineering, a department jointly administered by the College of Natural Science and the College of Engineering.

"In the cases where we don't know, it's not because we haven't tried to find out," Shiu said. "It's because it becomes harder and harder to find the answers through experiments."

The researchers believe that AI can provide the assistance needed to crack those tough cases, which represent a sizable fraction of plant genes. They are using a type of AI known as machine learning, which involves computer algorithms that can essentially be taught.

To "teach" the AI, the team will program in available data, describing what scientists do know about plant genes and their functions. And there is a substantial bounty of this data, particularly from well-studied plants including corn, tomatoes, and a model plant known as Arabidopsis.

The algorithm can then start making informed predictions about what genes with unknown functions do and then help guide scientists in designing experiments to test those predictions.
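The train-then-predict workflow described above can be illustrated with a toy sketch. This is not the MSU team's model: it uses a nearest-centroid classifier, one of the simplest supervised learners, and the gene names, feature values, and function labels are all invented for illustration.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: gene -> (feature vector, known function).
# The two features stand in for measurements such as expression under
# drought and under pathogen attack; all values are invented.
LABELED_GENES = {
    "geneA": ((0.9, 0.1), "drought response"),
    "geneB": ((0.8, 0.2), "drought response"),
    "geneC": ((0.1, 0.9), "disease resistance"),
    "geneD": ((0.2, 0.7), "disease resistance"),
}


def train(labeled):
    """Return the mean feature vector (centroid) for each known function."""
    grouped = defaultdict(list)
    for features, function in labeled.values():
        grouped[function].append(features)
    return {
        function: tuple(sum(col) / len(col) for col in zip(*vectors))
        for function, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the function with the closest centroid, plus that distance."""
    function = min(centroids, key=lambda f: dist(centroids[f], features))
    return function, dist(centroids[function], features)


centroids = train(LABELED_GENES)
# A gene of unknown function, described by the same two invented features.
prediction, distance = predict(centroids, (0.85, 0.05))
print(prediction, round(distance, 3))
```

The returned distance is a crude stand-in for confidence: a prediction that sits far from every centroid is a weaker candidate for a follow-up experiment than one that sits close to a known group.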

But the researchers are facing more than just technical challenges with this project. Some scientific communities have been particularly cautious about embracing AI and machine learning algorithms that can seem like indecipherable codes that offer up answers without context, Shiu said.

That's why he and his colleagues knew they needed to have a group of experts from varied disciplines that could communicate effectively with each other. That way, the team could create machine learning systems where everyone could understand what the answers mean and where they came from.

"Humans need to know how the machine makes decisions, otherwise the AI is just a black box," said co-investigator Jiliang Tang. Tang is also an associate professor in the MSU College of Engineering's Department of Computer Science and Engineering.

"The goal of AI is to make predictions, but we also need humans who can audit these predictions," he said. "Our goal is to make explainable and responsible AI, which means we have to make it trustworthy and transparent."

That's why the team also recruited machine learning ace Yuying Xie, assistant professor in the Department of Computational Mathematics, Science and Engineering, and expert experimentalist Melissa Lehti-Shiu, a research assistant professor in the Department of Plant Biology.

"Typically, the development of a machine learning model and its applications can be disjointed," Xie said. "By working together as a group, we have a better understanding of the methodology top to bottom. It benefits us as a group, and it pushes the science forward."

"By understanding and interpreting the models, we can also gain insight into biological processes and why certain genes belong to certain biological pathways," Lehti-Shiu said, adding that AI can also help focus experiments. Right now, it looks like there could be thousands of genes that may be important to stress responses -- say, how a plant responds to heat stress," Lehti-Shiu said. "The model can help predict which genes are the most important."

In reaching its scientific goals -- using machine learning to predict gene functions -- the team is also aiming to demonstrate the power of AI as a research tool to the plant science community. As trailblazers often do, the Spartans have faced early challenges in getting others to accept their ideas.

"When I first came to MSU and became interested in machine learning, it took five years to publish our first paper using it. Plant and computer sciences have completely different cultures and use completely different languages," Shiu said. But, working as a team, the Spartans have persevered.

"This project would not have been possible without computer scientists who were patient and willing enough to teach and work with plant scientists."

So, although the team may be charting new territory with AI, it's working from a proven roadmap, one that relies on teamwork and Spartans' will. Add to that the support from the new NSF grant and the MSU team is primed to make waves in plant science.

"I'm hopeful this will help more researchers apply AI in this field," Tang said. "I'm excited to demonstrate its potential and promise."

Galactic gamma-ray bursts predicted last year by Berkeley's SSL astrophysicists show up right on schedule

Sherlock Holmes story gives clue to the successful prediction of bursts from a nearby magnetar

Figure: Since 2014, a magnetar in our galaxy (SGR1935+2154) has been emitting bursts of soft gamma rays (black stars). UC Berkeley scientists concluded that they occurred only within certain windows of time (green stripes) but were somehow blocked during intervening windows (red). They used this pattern to predict renewed bursts starting after June 1, 2021 (stripes outlined in blue at right), and since June 24, more than a dozen have been detected (blue stars): right on schedule. Credit: Mikhail Denissenya, Nazarbayev University, Kazakhstan

Magnetars are bizarre objects -- massive, spinning neutron stars with magnetic fields among the most powerful known, capable of shooting off brief bursts of radio waves so bright they're visible across the universe.

A team of astrophysicists has now found another peculiarity of magnetars: They can emit bursts of low-energy gamma rays in a pattern never before seen in any other astronomical object.

It's unclear why this should be, but magnetars themselves are poorly understood, with dozens of theories about how they produce radio and gamma-ray bursts. The recognition of this unusual pattern of gamma-ray activity could help theorists figure out the mechanisms involved.

"Magnetars, which are connected with fast radio bursts and soft gamma repeaters, have something periodic going on, on top of randomness," said astrophysicist Bruce Grossan, an astrophysicist at the University of California, Berkeley's Space Sciences Laboratory (SSL). "This is another mystery on top of the mystery of how the bursts are produced."

The researchers -- Grossan and theoretical physicist and cosmologist Eric Linder from UC Berkeley and postdoctoral fellow Mikhail Denissenya from Nazarbayev University in Kazakhstan -- discovered the pattern in bursts from the soft gamma repeater SGR1935+2154, a magnetar that is a prolific source of soft, or lower-energy, gamma-ray bursts and the only known source of fast radio bursts within our Milky Way galaxy. They found that the object emits bursts randomly, but only within regular four-month windows of time, each active window separated by three months of inactivity.

On March 19, the team uploaded a preprint claiming "periodic windowed behavior" in soft gamma bursts from SGR1935+2154 and predicted that these bursts would start up again after June 1 -- following a three-month hiatus -- and could occur throughout a four-month window ending Oct. 7.

On June 24, three weeks into the window of activity, the first new burst from SGR1935+2154 was observed after the predicted three-month gap, and nearly a dozen more bursts have been observed since, including one on July 6, the day the paper was published online in the journal Physical Review D.
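The dates quoted above can be checked with a few lines of date arithmetic. The sketch below uses only the window boundaries and burst dates given in the article, taking 2021 as the year; the function name is made up for illustration.

```python
from datetime import date

# Boundaries of the predicted 2021 activity window, as stated in the article.
WINDOW_START = date(2021, 6, 1)
WINDOW_END = date(2021, 10, 7)


def in_predicted_window(day):
    """True if `day` falls inside the predicted 2021 activity window."""
    return WINDOW_START <= day <= WINDOW_END


window_days = (WINDOW_END - WINDOW_START).days
first_burst = date(2021, 6, 24)
days_into_window = (first_burst - WINDOW_START).days

print(window_days)                            # length of the window in days
print(days_into_window)                       # days from window start to first burst
print(in_predicted_window(date(2021, 7, 6)))  # burst on the day of publication
```

The window spans 128 days, a little over four months, and the first burst arrived 23 days after it opened, consistent with "three weeks into the window".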

"These new bursts within this window means that our prediction is dead-on," said Grossan, who studies high-energy astronomical transients. "Probably more important is that no bursts were detected between the windows since we first published our preprint."

Linder likens the non-detection of bursts in three-month windows to a key clue -- the "curious incident" that a guard dog did not bark in the nighttime -- that allowed Sherlock Holmes to solve a murder in the short story "The Adventure of Silver Blaze".

"Missing or occasional data is a nightmare for any scientist," noted Denissenya, the first author of the paper and a member of the Energetic Cosmos Laboratory at Nazarbayev University that was founded several years ago by Grossan, Linder, and UC Berkeley cosmologist and Nobel laureate George Smoot. "In our case, it was crucial to realize that missing bursts or no bursts at all carry information."

The confirmation of their prediction startled and thrilled the researchers, who think this may be a novel example of a phenomenon -- periodic windowed behavior -- that could characterize emissions from other astronomical objects.

Mining data from 27-year-old satellite

Within the last year, researchers suggested that the emission of fast radio bursts -- which typically last a few thousandths of a second -- from distant galaxies might be clustered in a periodic windowed pattern. But the data were intermittent, and the statistical and computational tools to firmly establish such a claim with sparse data were not well developed.

Grossan convinced Linder to explore whether advanced techniques and tools could be used to demonstrate that periodically windowed -- but random, as well, within an activity window -- behavior was present in the soft gamma-ray burst data of the SGR1935+2154 magnetar. The Konus instrument aboard the WIND spacecraft, launched in 1994, has recorded soft gamma-ray bursts from that object -- which also exhibits fast radio bursts -- since 2014 and likely never missed a bright one.

Linder, a member of the Supernova Cosmology Project based at Lawrence Berkeley National Laboratory, had used advanced statistical techniques to study the clustering in space of galaxies in the universe, and he and Denissenya adapted these techniques to analyze the clustering of bursts in time. Their analysis, the first to use such techniques for repeated events, showed an unusual windowed periodicity distinct from the very precise repetition produced by bodies rotating or in orbit, which most astronomers think of when they think of periodic behavior.

"So far, we have observed bursts over 10 windowed periods since 2014, and the probability is 3 in 10,000 that while we think it is periodically windowed, it is actually random," he said, meaning there's a 99.97% chance they're right. He noted that a Monte Carlo simulation indicated that the chance they're seeing a pattern that isn't really there is likely well under 1 in a billion.

The recent observation of five bursts within their predicted window, seen by WIND and other spacecraft monitoring gamma-ray bursts, adds to their confidence. However, a single future burst observed outside the window would disprove the whole theory, or cause them to redo their analysis completely.
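The article does not describe how the team's Monte Carlo simulation was set up, so the following is only a simplified stand-in for the general idea: scatter bursts at random over the observing span many times and count how often they happen to line up with a periodic window. It assumes independent bursts, a fixed period, and a fixed active fraction. The burst count, the seven-year span, and all function names are illustrative assumptions, and the result is not expected to reproduce the probabilities quoted above.

```python
import random


def fits_one_window(times, period, active_fraction):
    """True if, for some phase offset, every time lies in an active window.

    Folding the times on the period gives phases in [0, 1). All phases fit
    inside one active window exactly when the largest empty arc on the
    phase circle is at least as long as the inactive fraction.
    """
    phases = sorted((t % period) / period for t in times)
    gaps = [b - a for a, b in zip(phases, phases[1:])]
    gaps.append(1.0 - phases[-1] + phases[0])  # wrap-around gap
    return max(gaps) >= 1.0 - active_fraction


def chance_probability(n_bursts, span, period, active_fraction, trials, seed=0):
    """Fraction of purely random burst sets that look periodically windowed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        times = [rng.uniform(0.0, span) for _ in range(n_bursts)]
        if fits_one_window(times, period, active_fraction):
            hits += 1
    return hits / trials


# Assumed inputs, loosely based on the article: a four-month window followed
# by a three-month gap (a period of about seven months), observed for roughly
# seven years. The burst count of 20 is invented.
PERIOD_DAYS = 7 * 30.4
ACTIVE_FRACTION = 4 / 7
SPAN_DAYS = 7 * 365.25

p = chance_probability(n_bursts=20, span=SPAN_DAYS, period=PERIOD_DAYS,
                       active_fraction=ACTIVE_FRACTION, trials=20000)
print(p)
```

The more bursts there are, the less likely it becomes that random timing alone keeps every one of them inside the active windows, which is why a long record with no out-of-window bursts carries so much weight.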

"The most intriguing and fun part for me was to make predictions that could be tested in the sky. We then ran simulations against real and random patterns and found it really did tell us about the bursts," Denissenya said.

As for what causes this pattern, Grossan and Linder can only guess. Soft gamma-ray bursts from magnetars are thought to involve starquakes, perhaps triggered by interactions between the neutron star's crust and its intense magnetic field. Magnetars rotate once every few seconds, and if the rotation is accompanied by a precession -- a wobble in the rotation -- that might make the source of burst emission point to Earth only within a certain window. Another possibility, Grossan said, is that a dense, rotating cloud of obscuring material surrounds the magnetar but has a hole that only periodically allows bursts to come out and reach Earth.

"At this stage of our knowledge of these sources, we can't really say which it is," Grossan said. "This is a rich phenomenon that will likely be studied for some time."

Linder agrees and points out that the advances were made by the cross-pollination of techniques from high-energy astrophysics observations and theoretical cosmology.

"UC Berkeley is a great place where diverse scientists can come together," he said. "They will continue to watch and learn and even 'listen' with their instruments for more dogs in the night."

Tokyo Tech's TSUBAME 3.0 supercomputer predicts cell-membrane permeability of cyclic peptides

Scientists at the Tokyo Institute of Technology have developed a computational method based on large-scale molecular dynamics simulations to predict the cell-membrane permeability of cyclic peptides using a supercomputer. Their protocol has exhibited promising accuracy and may become a useful tool for the design and discovery of cyclic peptide drugs, which could help us reach new therapeutic targets inside cells beyond the capabilities of conventional small-molecule drugs or antibody-based drugs. The simulations conducted in this study reveal important details of the mechanisms by which cyclic peptides diffuse into cells.

Figure: The scatter plot on the top left shows the correlation between the electrostatic interaction (horizontal axis) and the predicted value of membrane permeability (vertical axis). The scatter plot on the right shows the correlation between the experimental value of membrane permeability (horizontal axis) and the value predicted by the proposed method (vertical axis). Credit: 2021 Sugita M, et al. Published by American Chemical Society (licensed under CC BY 4.0)

Cyclic peptide drugs have attracted the attention of major pharmaceutical companies around the world as promising alternatives to conventional small molecule-based drugs. Through proper design, cyclic peptides can be tailored to reach specific targets inside cells, such as protein-protein interactions, which are beyond the scope of small molecules. Unfortunately, it has proven notoriously difficult to design cyclic peptides with high cell-membrane permeability--that is, cyclic peptides that can easily diffuse through the lipid bilayer that delimits the inside and outside of a cell.

To resolve this bottleneck, scientists at the Middle Molecule IT-based Drug Discovery Laboratory (MIDL) have been working on a computational method for predicting cell-membrane permeability. Established in September 2017, MIDL is one of the "Research Initiatives" at the Tokyo Institute of Technology (Tokyo Tech) that goes beyond the boundaries of departments. Under the support of the Program for Building Regional Innovation Ecosystems of the Ministry of Education, Culture, Sports, Science, and Technology (MEXT), MIDL has been working with the city of Kawasaki to industrialize a framework for discovering middle molecule-based drugs--cyclic peptide drugs and nucleic acid drugs larger than conventional small-molecule drugs but smaller than antibody-based drugs--by combining computational drug design and chemical synthesis technology.

In a recent study published in the Journal of Chemical Information and Modeling, Professor Yutaka Akiyama and colleagues from MIDL and Tokyo Tech developed a protocol for predicting the cell-membrane permeability of cyclic peptides using molecular dynamics simulations. Such simulations constitute a widely accepted computational approach for predicting and reproducing the dynamics of atoms and molecules by sequentially solving Newton's laws of motion at short time intervals. However, even a single simulation for predicting the permeability of a cyclic peptide with only eight amino acids takes a tremendous amount of time and resources. "Our study marks the first time comprehensive simulations were performed for as many as 156 different cyclic peptides," highlights Prof. Akiyama. "The simulation of each cyclic peptide using the protocol we developed took about 70 hours using 28 GPUs on the TSUBAME 3.0 supercomputer at Tokyo Tech."
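To give a feel for what "sequentially solving Newton's laws of motion at short time intervals" means, here is a minimal sketch of a velocity Verlet integrator, a scheme commonly used in molecular dynamics. It is not the team's protocol: it moves a single particle on an idealized spring rather than thousands of atoms in a membrane, and the force law, mass, and time step are arbitrary illustrative choices.

```python
def force(x, k=1.0):
    """Force on a particle attached to a spring (Hooke's law): F = -k * x."""
    return -k * x


def velocity_verlet(x, v, mass, dt, n_steps):
    """Advance position x and velocity v through n_steps steps of size dt."""
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt  # move using current velocity and acceleration
        a_new = force(x) / mass             # Newton's second law at the new position
        v = v + 0.5 * (a + a_new) * dt      # update velocity with the averaged acceleration
        a = a_new
    return x, v


def energy(x, v, mass, k=1.0):
    """Total energy: kinetic plus spring potential."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, mass=1.0, dt=0.01, n_steps=1000)
print(x1, v1)
print(energy(x0, v0, 1.0), energy(x1, v1, 1.0))
```

Every step needs a fresh force evaluation, and a real simulation repeats this for every atom over an enormous number of steps. Taking the quoted figures at face value, and assuming all 28 GPUs were busy for the full 70 hours, 156 peptides would correspond to roughly 156 x 70 x 28 = 305,760 GPU-hours.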

The researchers compared the predicted permeability values with experimentally derived ones and confirmed an acceptable correlation coefficient of R = 0.63 under the best conditions, showcasing the potential of their protocol. Moreover, after a detailed analysis of the peptide conformation and energy values obtained from the trajectory data, Prof. Akiyama's team found that the strength of the electrostatic interactions between the atoms constituting the cyclic peptide and the surrounding media, namely lipid membrane and water molecules, is strongly related to the membrane permeability value. The simulations also revealed how peptides permeate through the membrane by changing their orientation and conformation according to their surroundings (Figure). "Our results shed some light on the mechanisms of cell membrane permeability and provide a guideline for designing molecules that can get inside cells more efficiently. This will greatly contribute to the development of next-generation peptide drugs," remarks Prof. Masatake Sugita, the first author of the study.
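For readers unfamiliar with the correlation coefficient R reported above, the sketch below shows how a Pearson correlation between predicted and measured values is computed. The article does not say which variant of R the study used, so the Pearson form is an assumption, and the data points are invented rather than taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented values standing in for measured and predicted permeabilities.
experimental = [-7.1, -6.4, -5.9, -5.2, -4.8]
predicted = [-6.8, -6.9, -5.5, -5.6, -4.6]
print(round(pearson_r(experimental, predicted), 2))
```

A value of 1 means the predictions rise and fall in perfect step with the measurements, 0 means no linear relationship, and 0.63 sits in between: the predictions track the experiments, but with considerable scatter.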

The researchers are already working on a more advanced simulation protocol that will enable more accurate predictions. They are also trying to incorporate artificial intelligence into the picture by adopting deep learning techniques, which could increase both accuracy and speed. Considering that cyclic peptides could unlock many therapeutic targets for diseases that are difficult to treat, let us hope that scientists at MIDL and Tokyo Tech succeed in their endeavors!

This research achievement will be featured on the supplementary cover of the journal issue in which this manuscript will be published.