UCLA researchers use AI to speed critical information on drug overdose deaths

Faster data processing is crucial to devising a rapid public health response to curb overdose deaths

An automated process based on computer algorithms that can read text from medical examiners’ death certificates can substantially speed up data collection of overdose deaths – which in turn can ensure a more rapid public health response time than the system currently used, new UCLA research finds.

The analysis, to be published Aug. 8 in the peer-reviewed JAMA Network Open, used tools from artificial intelligence to rapidly identify substances that caused overdose deaths.

“The overdose crisis in America is the number one cause of death in young adults, but we don’t know the actual number of overdose deaths until months after the fact,” said study lead Dr. David Goodman-Meza, assistant professor of medicine in the division of infectious diseases at the David Geffen School of Medicine at UCLA. “We also don’t know the number of overdoses in our communities, as rapidly released data is only available at the state level, at best. We need systems that get this data out fast and at a local level so public health can respond. Machine learning and natural language processing can help bridge this gap.”

As it now stands, overdose data recording involves several steps, beginning with medical examiners and coroners, who determine a cause of death and record suspected drug overdoses on death certificates, including the drugs that caused the death. The certificates, which include unstructured text, are then sent to local jurisdictions or the Centers for Disease Control and Prevention (CDC), which code them according to the International Statistical Classification of Diseases and Related Health Problems, Tenth Edition (ICD-10). This coding process is time-consuming because it may be done manually. As a result, there is a substantial lag between the date of death and the reporting of those deaths, which slows the release of surveillance data. This in turn slows the public health response.

Further complicating matters, different drugs with different uses and effects are aggregated under the same code in this system. For instance, buprenorphine, a partial opioid agonist used to treat opioid use disorder, and the synthetic opioid fentanyl are listed under the same ICD-10 code.

For this study, the researchers used “natural language processing” (NLP) and machine learning to analyze nearly 35,500 death records for all of 2020 from Connecticut and from nine U.S. counties: Cook (Illinois); Jefferson (Alabama); Johnson, Denton, Tarrant and Parker (Texas); Milwaukee (Wisconsin); and Los Angeles and San Diego (California). They examined how combining NLP, which uses computer algorithms to understand text, and machine learning can automate the deciphering of large amounts of data with precision and accuracy.
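To make the idea concrete, here is a minimal sketch of how free-text cause-of-death entries can be turned into substance flags. It uses plain keyword matching as a stand-in, not the trained NLP and machine-learning models the study describes. The `SUBSTANCE_TERMS` lexicon, the `tag_substances` function, and the sample text are illustrative assumptions and do not come from the paper.

```python
import re

# Hypothetical lexicon: substance category -> terms that may appear in free text.
SUBSTANCE_TERMS = {
    "fentanyl": ["fentanyl"],
    "alcohol": ["alcohol", "ethanol"],
    "cocaine": ["cocaine"],
    "methamphetamine": ["methamphetamine"],
    "heroin": ["heroin"],
    "buprenorphine": ["buprenorphine"],
}


def tag_substances(text):
    """Return the sorted list of substance categories mentioned in the text."""
    lowered = text.lower()
    found = set()
    for category, terms in SUBSTANCE_TERMS.items():
        for term in terms:
            # Word boundaries avoid matching a term inside a longer word.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found.add(category)
    return sorted(found)


# Made-up example text, for illustration only.
example = "Acute intoxication due to the combined effects of fentanyl, cocaine and ethanol"
print(tag_substances(example))
```

Because the tagging works on the certificate text itself, fentanyl and buprenorphine remain separate categories rather than being merged under one shared code. A real system would also need to handle misspellings, drug analogs, and metabolites, which simple keyword matching does not.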

They found that of the 8,738 overdose deaths recorded that year, the most common specific substances were fentanyl (4,758, 54%), alcohol (2,866, 33%), cocaine (2,247, 26%), methamphetamine (1,876, 21%), heroin (1,613, 18%), prescription opioids (1,197, 14%), and any benzodiazepine (1,076, 12%). Of these, only the classification for benzodiazepines was suboptimal under this method; the others were perfect or near perfect.
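The reported shares can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the paragraph above; the helper name `percent_of_total` is an assumption made for illustration.

```python
TOTAL_OVERDOSE_DEATHS = 8738

# Counts as reported in the article text.
counts = {
    "fentanyl": 4758,
    "alcohol": 2866,
    "cocaine": 2247,
    "methamphetamine": 1876,
    "heroin": 1613,
    "prescription opioids": 1197,
    "any benzodiazepine": 1076,
}


def percent_of_total(count, total=TOTAL_OVERDOSE_DEATHS):
    """Share of all overdose deaths, rounded to the nearest whole percent."""
    return round(100 * count / total)


for substance, count in counts.items():
    print(f"{substance}: {count:,} ({percent_of_total(count)}%)")

# The per-substance counts sum to more than the number of deaths.
print(sum(counts.values()), ">", TOTAL_OVERDOSE_DEATHS)
```

The per-substance counts add up to 15,633, well above the 8,738 deaths, which is only possible if a single death can involve more than one substance. The percentages therefore should not be expected to sum to 100.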

The CDC’s most recent preliminary overdose data were released no sooner than four months after the deaths occurred, Goodman-Meza said.

“If these algorithms are embedded within medical examiner’s offices, the time could be reduced to as early as toxicology testing is completed, which could be about three weeks after the death,” he said.

The rest of the overdose deaths were due to other substances such as amphetamines, antidepressants, antipsychotics, antihistamines, anticonvulsants, barbiturates, muscle relaxants, and hallucinogens.

The researchers note some limitations to the study, the main one being that the system was not tested on less common substances such as anticonvulsants or other designer drugs, so it is unknown if it would work for these. Also, given that the models need to be trained on a large volume of data to make predictions, the system may be unable to detect emerging trends.

But rapid and accurate data are needed to develop and implement interventions to curb overdoses, and “NLP tools such as these should be integrated into data surveillance workflows to increase rapid dissemination of data to the public, researchers, and policymakers.”

NIH first to develop 3D structure of twinkle protein

Researchers hope discovery leads to potential treatments for mitochondrial diseases

Researchers from the National Institutes of Health have developed a three-dimensional structure that allows them to see how and where disease mutations on the twinkle protein can lead to mitochondrial diseases. The protein is involved in helping cells use energy our bodies convert from food. Prior to the development of this 3D structure, researchers only had models and were unable to determine how these mutations contribute to disease. Mitochondrial diseases are a group of inherited conditions that affect 1 in 5,000 people and have very few treatments.

Video: The rotating 3D structure of the twinkle protein created by NIEHS researchers, zooming to the protein interface where many of the disease mutations occur. Credit: Graphics and video courtesy of A.A. Riccio, NIEHS.

“For the first time, we can map the mutations that are causing a number of these devastating diseases,” said lead author Amanda A. Riccio, Ph.D., a researcher in the National Institute of Environmental Health Sciences (NIEHS) Mitochondrial DNA Replication Group, which is part of NIH. “Clinicians can now see where these mutations lie and can use this information to help pinpoint causes and help families make choices, including decisions about having more children.”

The new findings will be particularly relevant for developing targeted treatments for patients who suffer from mitochondrial diseases such as progressive external ophthalmoplegia, a condition that can lead to loss of muscle functions involved in eye and eyelid movement; Perrault syndrome, a rare genetic disorder that can cause hearing loss; infantile-onset spinocerebellar ataxia, a hereditary neurological disorder; and hepatocerebral mitochondrial DNA (mtDNA) depletion syndrome, a hereditary disease that can lead to liver failure and neurological complications during infancy.

The paper showcases how the NIEHS researchers were the first to accurately map clinically relevant variants in the twinkle helicase, the enzyme that unwinds the mitochondrial DNA double helix. The twinkle structure and all the coordinates are now available in the open-data Protein Data Bank, which is freely available to all researchers.

“The structure of twinkle has eluded researchers for many years. It’s a very difficult protein to work with,” noted William C. Copeland, Ph.D., who leads the Mitochondrial DNA Replication Group and is the corresponding author on the paper. “By stabilizing the protein and using the best equipment in the world we were able to build the last missing piece for the human mitochondrial DNA replisome.”

The researchers used cryo-electron microscopy (CryoEM), which allowed them to see inside the protein and the intricate structures of hundreds of amino acids or residues and how they interact.

Mitochondria, which are responsible for energy production, are especially vulnerable to mutations. Mutations in mtDNA can disrupt their ability to generate energy efficiently for the cell. Unlike other specialized structures in cells, mitochondria have their own DNA. In a cell’s nucleus, there are two copies of each chromosome; in the mitochondria, however, there can be thousands of copies of mtDNA. Having a high number of mitochondrial chromosomes allows the cell to tolerate a few mutations, but an accumulation of too many mutated copies leads to mitochondrial disease.
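The tolerance described above can be pictured as a simple threshold on the share of mutated mtDNA copies in a cell. The sketch below is only an illustration of that reasoning: the function names and the 0.6 cutoff are hypothetical placeholders, not values from the study, and real thresholds differ by mutation and tissue.

```python
def mutant_fraction(mutated_copies, total_copies):
    """Fraction of a cell's mtDNA copies that carry the mutation."""
    if total_copies <= 0:
        raise ValueError("total_copies must be positive")
    if not 0 <= mutated_copies <= total_copies:
        raise ValueError("mutated_copies must be between 0 and total_copies")
    return mutated_copies / total_copies


def exceeds_threshold(mutated_copies, total_copies, threshold=0.6):
    """True when the mutated share reaches the (hypothetical) disease threshold."""
    return mutant_fraction(mutated_copies, total_copies) >= threshold


# A few mutated copies among a thousand are tolerated in this toy model...
print(exceeds_threshold(50, 1000))
# ...but an accumulation of mutated copies crosses the placeholder cutoff.
print(exceeds_threshold(800, 1000))
```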

To conduct the study, the researchers used a clinical mutation, W315L, known to cause progressive external ophthalmoplegia, to solve the structure. Using CryoEM, they were able to observe thousands of protein particles appearing in different orientations. The final structure shows a multi-protein circular arrangement. They also used mass spectrometry to verify the structure and then did supercomputer simulations to understand why the mutation results in disease.

Within twinkle, they were able to map up to 25 disease-causing mutations. They found that many of these disease mutations map right at the junction of two protein subunits, suggesting that mutations in this region would weaken how the subunits interact and make the helicase unable to function.

“The arrangement of twinkle is a lot like a puzzle. A clinical mutation can change the shape of the twinkle pieces, and they may no longer fit together properly to carry out the intended function,” Riccio explained.

“What is so beautiful about Dr. Riccio and the team’s work is that the structure allows you to see so many of these disease mutations assembled in one place,” said Matthew J. Longley, Ph.D., an author and NIEHS researcher. “It is very unusual to see one paper that explains so many clinical mutations. Thanks to this work, we are one step closer to having information that can be used to develop treatments for these debilitating diseases.”

Stevens ballistics researchers build a model of the spin on the football spiral

Only a handful of researchers have studied why an American football flies in such a unique trajectory, rifling through the air with remarkable precision, but also swerving, wobbling, and even tumbling as it barrels downfield. Now, ballistics experts at Stevens Institute of Technology have, for the first time, applied their understanding of artillery shells to explain this unique movement, creating the most precise model to date of the flight of a spiraling football.

“When a quarterback makes a good spiral pass, the ball’s trajectory is remarkably similar to that of an artillery shell or a bullet, and the military has poured enormous resources into studying the way those projectiles fly,” explained John Dzielski, a Stevens research professor and mechanical engineer whose work is reported in The American Society of Mechanical Engineers’ Open Journal of Engineering. “Using well-understood ballistics equations, we’ve been able to model the flight of a football more precisely than ever before.”

In fact, Dzielski said, while the ballistics equations themselves are not terribly complex, the motions that they predict can be. The equations contain many terms that represent all of the ways that the air may affect a shell’s motion. The first challenge lay in considering each variable, in turn, to determine which ones are important when used in a new or different context.

Dzielski and co-author Mark Blackburn, a senior research scientist at Stevens, first took an exhaustive approach — modeling everything from a quarterback’s handedness to the effect of crosswinds, to the impact of the Earth’s rotation — then derived equations that stripped out factors that didn’t substantially influence a football’s flight path. For example, during a 60-yard pass, the Earth’s rotation changes the end point of the pass by only four inches. “It turns out the Earth’s rotation doesn’t have much effect on a football pass — but at least now we know that for sure,” Dzielski said.

Modeling a football’s flight sheds light on what separates good passes from bad ones. Dzielski and colleagues not only showed that a spiral pass can wobble at a slow rate or at a fast rate (or a combination of both), but also were the first to calculate what those frequencies are for a football. If the football wobbles slowly, then it was well thrown. If it wobbles quickly, then the quarterback twisted his wrist (like turning a screwdriver) or shoved sideways as the ball was released. The wrist might have twisted because the quarterback was being hit.

“Quarterbacks and coaches already know this intuitively, but we’ve been able to describe the physics at work,” Dzielski said.

Another, more surprising finding was that the Magnus effect, which causes a spinning baseball to slide or swerve due to changes in air pressure, has remarkably little effect on a spinning football. A football spins along the wrong axis to trigger the Magnus effect, so any deviations in flight path must come from a different source, such as the lift created as a ball angles through the air, Dzielski explained. “Many people believe that footballs swerve left or right because of the Magnus effect, but that’s not the case at all. The effect of the Magnus force is about double the effect of the Earth’s rotation,” he said. 

In addition, Dzielski and Blackburn showed, for the first time, that this swerving is intimately connected to why the ball ends up nose-down at the end of the pass when it is thrown with the nose up.

Although Dzielski and Blackburn’s work represents the most precise model of a football’s flight path to date, Dzielski cautioned that more work is still needed. Because a football spins and tumbles as it travels, it’s almost impossible to use wind tunnel studies to accurately record the aerodynamics of a football in motion. “That means we don’t yet have good data to feed into our model, so creating an accurate simulation is impossible,” he said.

In the coming months, Dzielski hopes to find funding for instruments that can capture aerodynamic data from a free-flying football in real-world settings, not only in wind tunnels. “That’s the only way we’ll be able to get the kind of data we need,” he said. “Until then, a truly precise – and accurate – way to model a football’s trajectory will remain out of reach.” 

CHOP helps develop platform to speed up drug development for kids with cancer

The NCI-backed “Molecular Targets Platform” will streamline and catalyze drug development by harmonizing data about pediatric cancer targets and pathways

Children’s Hospital of Philadelphia (CHOP) has helped launch a new computational platform that will harmonize pediatric cancer data, allowing researchers, pharmaceutical companies, and advocacy groups to accelerate the pace of drug development for pediatric cancer. With funding from the National Cancer Institute (NCI) via a subcontract with Leidos Biomedical Research, the current operator of the NCI’s Frederick National Laboratory for Cancer Research, CHOP researchers have created the Molecular Targets Platform to facilitate pediatric research in response to the Research to Accelerate Cures and Equity (RACE) for Children Act, which requires companies developing cancer drugs for adults to also test them in children when there is a shared molecular target.

“Through this project, we are using the power of integrated data to solve childhood cancer’s biggest challenges,” said co-principal investigator Deanne M. Taylor, Ph.D., Director of Bioinformatics in the Department of Biomedical and Health Informatics at Children’s Hospital of Philadelphia and Assistant Professor of Pediatrics at the University of Pennsylvania Perelman School of Medicine, who is leading the development of the new platform. “The Molecular Targets Platform will empower different communities to study new ways of understanding and treating pediatric cancer and will provide an invaluable resource for discovery and drug development. This platform will promote new hypotheses as people use this computational ecosystem to make new discoveries.”

Pediatric cancer research has long been stymied by drug developers’ hesitancy to test new treatments in children, due in part to the relatively small size of the affected population. Passed in 2017 and enacted in 2020, the RACE for Children Act requires pharmaceutical companies to develop targeted cancer drugs for children if a drug with the same molecular target is being tested in adults, even if the malignancy occurs in a different organ. For example, suppose a company is testing a targeted therapy for breast cancer, and that genetic target is also relevant in pediatric cancer. In that case, the company will be required to test the drug as a treatment for pediatric cancer as well, unless it receives a waiver from the Food and Drug Administration (FDA).

To facilitate the enactment of the law, the FDA published a list of molecular targets in adult cancer that are seen as substantially relevant to pediatric cancer. However, there was no organized way to adjudicate the list as data on pediatric cancer genetics was dispersed and uneven in its representation of the hundreds of childhood cancer types. 

Through an NCI subcontract with Leidos Biomed and enabled by the Childhood Cancer Data Initiative, CHOP researchers used their expertise in molecular medicine, computational approaches, and bioinformatics to harmonize data from six major data sources about pediatric cancer targets, genes, and pathways. The platform allows users to query multiple aspects of pediatric cancer, from scored lists of cancer targets to profiles of a gene’s relationship to other cancers and diseases. The interface is publicly available for strategic research into childhood cancer therapies, with an intent for it to be utilized by investigators in academia and industry, as well as the FDA and patient advocates.

“Those of us in the pediatric cancer research field were delighted when the RACE for Children Act passed, but for the legislation to truly have an impact, we knew we needed a computational ecosystem where all of these data could exist in a user-friendly interface,” said co-principal investigator John M. Maris, MD, Giulio D'Angio Chair in Neuroblastoma Research at Children’s Hospital of Philadelphia and Professor of Pediatrics at the University of Pennsylvania Perelman School of Medicine. “Through the hard work – and, importantly, the vision – of researchers in CHOP’s Cancer Center, Department of Biomedical and Health Informatics, and Center for Data-Driven Discovery in Biomedicine, along with our collaborators at Leidos Biomed and the NCI, this platform will reduce the time it takes to make important data connections about childhood cancer from a few days or weeks to a few clicks of the mouse.”

The project is funded through a subcontract with Leidos Biomed, which is providing more than $3 million per year to CHOP through the NCI to develop and help maintain the new platform. 

“We are grateful to those who recognized the need for this data platform and to advocacy groups like Kids v Cancer, who was critical in pushing for passage of the RACE for Children Act,” Dr. Maris said. “With the Molecular Targets Platform, we hope we can hasten the discovery of new and long-overdue pediatric cancer treatments.”

German prof builds AI that enables the design of novel proteins

A research team at the University of Bayreuth in Germany led by Prof. Dr. Birte Höcker has successfully applied a computer-based natural language processing model to protein research.

Principles and processes that govern computational natural language processing are now increasingly used in protein research. Image: UBT / Protein design group.

Artificial intelligence (AI) has created new possibilities for designing tailor-made proteins to solve everything from medical to ecological problems. A research team at the University of Bayreuth led by Prof. Dr. Birte Höcker has now successfully applied a computer-based natural language processing model to protein research. Completely independently, the ProtGPT2 model designs new proteins that are capable of stable folding and could take over defined functions in larger molecular contexts. 

Natural languages and proteins are similar in structure. Amino acids arrange themselves in a multitude of combinations to form structures that have specific functions in the living organism – similar to the way words form sentences in different combinations that express certain facts. In recent years, numerous approaches have therefore been developed to use principles and processes that control the computer-assisted processing of natural language in protein research. "Natural language processing has made extraordinary progress thanks to new AI technologies. Today, models of language processing enable machines not only to understand meaningful sentences but also to generate them themselves. Such a model was the starting point of our research. With detailed information concerning about 50 million sequences of natural proteins, my colleague Noelia Ferruz trained the model and enabled it to generate protein sequences on its own. It now understands the language of proteins and can use it creatively. We have found that these creative designs follow the basic principles of natural proteins," says Prof. Dr. Birte Höcker, Head of the Protein Design Group at the University of Bayreuth.
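The analogy between sentences and amino acid chains can be illustrated with a toy language model. ProtGPT2 is a far larger neural model trained on about 50 million natural sequences. The sketch below swaps in a bigram (first-order Markov) model that only learns which residue tends to follow which, then samples a new sequence one residue at a time. The training strings are made-up examples over the amino acid alphabet, and the function names are assumptions for illustration.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # Markers for sequence start and end.


def train_bigram(sequences):
    """Count which residue follows which across the training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_len=30):
    """Sample one sequence residue by residue, conditioning on the previous one."""
    out = []
    prev = START
    while len(out) < max_len:
        choices = counts[prev]
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Made-up toy strings in one-letter amino acid code, for illustration only.
training = ["MKTAYIAKQR", "MKLVAGGTQR", "MSTAYLAKGR"]
model = train_bigram(training)
print(generate(model, random.Random(0)))
```

A bigram model sees only one residue of context, so its output says nothing about folding or function. It shows the generation loop and nothing more, which is the part that carries over from text to protein sequences.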

The language processing model transferred to protein evolution is called "ProtGPT2". It can now be used to design proteins that adopt stable structures through folding and are permanently functional in this state. In addition, the Bayreuth biochemists have found out, through complex investigations, that the model can even create proteins that do not occur in nature and have possibly never existed in the history of evolution. These findings shed light on the immeasurable world of possible proteins and open a door to designing them in novel and unexplored ways. There is a further advantage: Most proteins that have been designed de novo so far have idealized structures. Before such structures can have a potential application, they usually must pass through an elaborate functionalization process – for example by inserting extensions and cavities – so that they can interact with their environment and take on precisely defined functions in larger system contexts. ProtGPT2, on the other hand, generates proteins that have such differentiated structures innately, and are thus already operational in their respective environments.

"Our new model is another impressive demonstration of the systemic affinity of protein design and natural language processing. Artificial intelligence opens up highly interesting and promising possibilities to use methods of language processing for the production of customized proteins. At the University of Bayreuth, we hope to contribute in this way to developing innovative solutions for biomedical, pharmaceutical, and ecological problems," says Prof. Dr. Birte Höcker.