French team uses supercomputer modeling to study how far plastic drifts from its starting point as it sinks into the sea

Discarded or drifting in the ocean, plastic debris can accumulate on the water’s surface, forming floating garbage islands. Although it’s harder to spot, researchers suspect a significant amount also sinks. In a new study in ACS’ Environmental Science & Technology, one team used supercomputer modeling to study how far bits of lightweight plastic travel when falling into the Mediterranean Sea. Their results suggest these particles can drift farther underwater than previously thought. 

From old shopping bags to water bottles, plastic pollution is besieging the oceans. Not only is this debris unsightly, but animals can also become trapped or mistakenly eat it. And if it remains in the water, plastic waste can release organic pollutants. The problem is most visible on the surface, where currents can aggregate this debris into massive garbage patches. However, plastic waste also collects much deeper. Even material that is less dense than water can sink as algae and other organisms glom onto it, and through other processes. Bits of this light plastic, typically measuring 5 millimeters or less, have turned up at least half a mile below the surface. Researchers don’t know much about what happens when plastic sinks, but they generally assume it falls straight down from the surface. However, Alberto Baudena and his colleagues suspected this light plastic might not follow such a direct route.

To test this assumption, they used an advanced supercomputer model developed to track plastic at sea and incorporated extensive data already collected on floating plastic pollution in the Mediterranean Sea. They then simulated nearly 7.7 million bits of plastic distributed across the sea and tracked their virtual paths to depths as great as about half a mile. Their results suggested that the slower the pieces sank, the farther currents carried them from their points of origin, with the slowest traveling an average of roughly 175 miles laterally. While observations of the distribution of plastic underwater are limited, the team found that their simulations agree with those available for the Mediterranean. Their simulations also suggested that currents may push plastic toward coastal areas and that only about 20% of pollution near coasts originates from the nearest country. According to the researchers, these particles’ long journeys mean this plastic has greater potential to interact with, and harm, marine life.
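The basic mechanism is simple to illustrate: the slower a particle sinks, the longer currents can push it sideways before it reaches depth. Below is a toy Lagrangian sketch of that effect in Python. It is not the study’s model, which used realistic, data-driven Mediterranean currents; the current profile, sinking speeds, and function names here are all invented for illustration.

```python
import numpy as np

# Toy Lagrangian particle tracking: particles sink at a fixed speed while a
# depth-dependent horizontal current (invented for illustration) pushes them
# sideways. The study used a far more realistic, data-driven ocean model.

def horizontal_current(depth_m):
    """Hypothetical eastward current (m/s) that weakens with depth."""
    return 0.15 * np.exp(-depth_m / 400.0)

def drift_distance(sink_speed_m_per_day, target_depth_m=800.0, dt_days=0.1):
    """Integrate lateral displacement until the particle reaches target depth."""
    depth, lateral = 0.0, 0.0
    while depth < target_depth_m:
        lateral += horizontal_current(depth) * dt_days * 86400.0  # meters
        depth += sink_speed_m_per_day * dt_days
    return lateral / 1000.0  # kilometers

for w in (5.0, 20.0, 100.0):  # slow, medium, fast sinkers (m/day)
    print(f"sink speed {w:5.1f} m/day -> lateral drift {drift_distance(w):7.1f} km")
```

Even in this cartoon, the slow sinkers travel an order of magnitude farther than the fast ones, which is the qualitative pattern the simulations reported.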

The authors acknowledge funding from the International Union for Conservation of Nature, the Tara Expeditions Foundation, and the Albert II Monaco Foundation.


UConn physicist Volkov shows how to manipulate quasiparticles in thin layers of ordinary superconductors to create topological superconductors by slightly twisting the stacked layers

"The twist is essentially determining the properties, and funnily enough, it gives you some very unexpected properties"

Transporting energy is costly. When a current runs through conductive materials, some of the energy is lost to resistance as particles within the material interact. This energy loss presents a hurdle to the advancement of many technologies, and scientists are searching for ways to make superconductors that eliminate resistance.

Superconductors can also provide a platform for fault-tolerant quantum computing if endowed with topological properties. An example of the latter is the quantum Hall effect, where the topology of electron states leads to a universal, “quantized” resistance, accurate to about one part in a billion, which finds use in metrology. Unfortunately, the quantum Hall effect requires extremely strong magnetic fields, typically detrimental to superconductivity. This makes the search for topological superconductors a challenging task.
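To make the metrology connection concrete: in the quantum Hall regime, the Hall resistance takes universal quantized values fixed entirely by fundamental constants, which is what makes it usable as a resistance standard.

```latex
% Quantized Hall resistance: nu is an integer (the Landau-level filling
% factor), h is Planck's constant, and e is the electron charge.
\[
  R_{xy} = \frac{h}{\nu e^{2}}, \qquad
  R_K = \frac{h}{e^{2}} \approx 25\,812.807\ \Omega
\]
% R_K, the von Klitzing constant, underpins the SI resistance standard.
```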

In two new papers in Physical Review Letters and Physical Review B, UConn physicist Pavel Volkov and his colleagues propose how to experimentally manipulate quantum particles, called quasiparticles, in very thin layers of ordinary superconductors to create topological superconductors by slightly twisting the stacked layers.

Volkov explains there is a lot of research being done on ways to engineer materials by stacking layers of two-dimensional materials together:

“Most famously, this has been done with graphene. Stacking two graphene layers in a particular way results in a lot of interesting new phenomena. Some parallel those in high-temperature superconductors, which was unexpected because, by itself, graphene is not superconducting.”

Superconductivity happens when a material conducts current without any resistance or energy loss. Since resistance is a challenge for many technologies, superconducting materials have the potential to revolutionize how we do things, from energy transmission to quantum computing to more efficient MRI machines.

However, endowing superconductors with topological properties is challenging, says Volkov, and as of now, there are no materials that can reliably perform as topological superconductors.

The researchers theorize that there is an intricate relation between what happens inside the twisted superconductor layers and a current applied between them. Volkov says the application of a current makes the quasiparticles in the superconductor behave as if they were in a topological superconductor.

“The twist is essentially determining the properties, and funnily enough, it gives you some very unexpected properties. We thought about applying twisting to materials that have a peculiar form of superconductivity called nodal superconductivity,” says Volkov. “Fortunately for us, such superconductors exist and, for example, the cuprate high-temperature superconductors are nodal superconductors. What we claim is that if you apply a current between two twisted layers of such superconductors, it becomes a topological superconductor.”

The proposal for current-induced topological superconductivity is, in principle, applicable at any twist angle, Volkov explains, and there is a wide range of angles that optimize the characteristics, which is unlike other materials studied so far.

“This is important because, for example, in twisted bilayer graphene, observation of interesting new phenomena requires aligning the two layers to 1.1 degrees, and deviations of 0.1 degrees are strongly detrimental. That means that one is required to make a lot of samples before finding one that works. For our proposal, this problem won’t be as bad. If you miss the angle even by a degree, it’s not going to destroy the effect we predict.”

Volkov expects that this topological superconductor has the potential to be better than anything else currently available. One caveat is that the researchers do not know exactly what the parameters of the resulting material will be, though they have estimates that may be useful for proof-of-principle experiments.

Figure: (a) Momentum-space schematic of a twisted nodal superconductor, exemplified by a d-wave superconductor with a sign-changing gap (from blue to red). Near the nodes (K_N and K̃_N), the BdG quasiparticles of the two layers have Dirac dispersions shifted by a vector Q_N (= θK_N) with respect to one another. (b) An interlayer current opens a bulk ℤ topological gap with gapless chiral edge modes.
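The caption’s relation Q_N = θK_N is just the small-angle geometry of the twist: rotating one layer by angle θ moves each gap node in momentum space by an amount proportional to θ. Schematically, with R(θ) a rotation by the twist angle:

```latex
% Small-twist geometry: a node at momentum K_N in one layer sits at
% R(theta) K_N in the other, so the mismatch between the two Dirac cones is
\[
  \mathbf{Q}_N = R(\theta)\,\mathbf{K}_N - \mathbf{K}_N
  \;\approx\; \theta\,\hat{\mathbf{z}} \times \mathbf{K}_N ,
  \qquad
  |\mathbf{Q}_N| \approx \theta\,|\mathbf{K}_N|
  \quad (\theta \ll 1).
\]
```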

The researchers also found unexpected behaviors for the special value of twist angle.

“We find a particular value of the angle, the so-called ‘magic angle,’ where a new state should appear – a form of magnetism. Typically, magnetism and superconductivity are antagonistic phenomena but here, superconductivity begets magnetism, and this happens precisely because of the twisted structure of the layers,” says Volkov.

Demonstrating these predictions experimentally will bring more challenges to overcome, including improving the atoms-thick layers themselves and determining difficult-to-measure parameters, but Volkov says there is a lot of motivation behind developing these highly complex materials.

“Basically, the main problem so far is that the candidate materials are tricky to work with. There are several groups around the world trying to do this. Monolayers of nodal superconductors, necessary for our proposal, have been realized, and experiments on twisted flakes are ongoing. Yet, the twisted bilayer of these materials has not been demonstrated. That’s work for the future.”

These materials hold promise for improving materials we use in everyday life, says Volkov. Things already in use that take advantage of topological states include devices used to set resistance standards with high accuracy. Topological superconductors are also potentially useful in quantum computing, as they serve as a necessary ingredient for proposals of fault-tolerant qubits, the units of information in quantum computing. Volkov also emphasizes the promise topological materials hold for precision physics:

“Topological states are useful because they allow us to do precision measurements with materials. A topological superconductor may allow us to perform such measurements with unprecedented precision for spin (magnetic moment of an electron) or thermal properties.”


CSHL prof Koo builds EvoAug for improving the interpretability of genomic deep neural networks with evolution-inspired data augmentations

Genes make up only a small fraction of the human genome. Between them are long stretches of DNA that direct cells when, where, and how much each gene should be used. These biological instruction manuals are known as regulatory motifs. If that sounds complex, well, it is.

The instructions for gene regulation are written in a complicated code, and scientists have turned to artificial intelligence to crack it. To learn the rules of DNA regulation, they’re using deep neural networks (DNNs), which excel at finding patterns in large datasets. DNNs are at the core of popular AI tools like ChatGPT. Thanks to a new tool developed by Cold Spring Harbor Laboratory Assistant Professor Peter Koo, genome-analyzing DNNs can now be trained with far more data than can be obtained through experiments alone. The name EvoAug stands for evolution-inspired augmentations. The Koo lab built its new AI-training model by feeding it augmented data based on the genetic mutations that have driven evolution.

“With DNNs, the mantra is the more data, the better,” Koo says. “We really need these models to see a diversity of genomes so they can learn robust motif signals. But in some situations, the biology itself is the limiting factor, because we can’t generate more data than exists inside the cell.”

If an AI learns from too few examples, it may misinterpret how a regulatory motif impacts gene function. The problem is that some motifs are uncommon. Very few examples are found in nature.

To overcome this limitation, Koo and his colleagues developed EvoAug—a new method of augmenting the data used to train DNNs. EvoAug was inspired by a dataset hiding in plain sight—evolution. The process begins by generating artificial DNA sequences that nearly match real sequences found in cells. The sequences are tweaked in the same way genetic mutations have naturally altered the genome during evolution.
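A toy version of this idea is easy to sketch. The code below applies two evolution-inspired tweaks, random point substitutions and small deletions, to a one-hot-encoded DNA sequence. It illustrates the general approach only; it is not the actual EvoAug API, and every name in it is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
BASES = np.eye(4)  # one-hot rows for A, C, G, T

def random_substitution(onehot, rate=0.05):
    """Mimic point mutations: replace a fraction of positions with random bases."""
    seq = onehot.copy()
    hits = rng.random(len(seq)) < rate
    seq[hits] = BASES[rng.integers(0, 4, hits.sum())]
    return seq

def random_deletion(onehot, max_del=5):
    """Mimic small deletions: cut a short chunk, pad to keep the length fixed."""
    n = rng.integers(1, max_del + 1)
    start = rng.integers(0, len(onehot) - n)
    kept = np.delete(onehot, slice(start, start + n), axis=0)
    pad = BASES[rng.integers(0, 4, n)]
    return np.vstack([kept, pad])

# Example: augment a random 20-bp "sequence" (rows = positions, cols = A/C/G/T).
seq = BASES[rng.integers(0, 4, 20)]
augmented = random_deletion(random_substitution(seq))
print(augmented.shape)  # (20, 4): same shape, slightly "evolved" content
```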

Next, the models are trained to recognize regulatory motifs using the new sequences, with one key assumption. It’s assumed the vast majority of tweaks will not disrupt the sequences’ function. Koo compares augmenting the data in this way to training image-recognition software with mirror images of the same cat. The computer learns that a backward cat pic is still a cat pic.

The reality, Koo says, is that some DNA changes do disrupt function. So, EvoAug includes a second training step using only real biological data. This guides the model “back to the biological reality of the dataset,” Koo explains.
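In training terms, this amounts to a two-stage schedule: pretrain on augmented sequences, then fine-tune briefly on unmodified data. Below is a minimal, hypothetical sketch of such a schedule in PyTorch; the tiny model, the random stand-in data, and the stage lengths are all invented for illustration and are not the paper’s actual setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Invented stand-ins: 256 one-hot-style DNA sequences (4 channels x 100 bp)
# with binary labels; a tiny CNN plays the role of the genomic DNN.
x_real = torch.randn(256, 4, 100)
y_real = torch.randint(0, 2, (256,)).float()

model = nn.Sequential(
    nn.Conv1d(4, 32, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1),
)
loss_fn = nn.BCEWithLogitsLoss()

def augment(x):
    # Placeholder augmentation: small noise; a real pipeline would apply
    # mutation-style edits like those sketched above.
    return x + 0.1 * torch.randn_like(x)

def train(epochs, use_aug, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        x = augment(x_real) if use_aug else x_real
        opt.zero_grad()
        loss = loss_fn(model(x).squeeze(-1), y_real)
        loss.backward()
        opt.step()

train(epochs=20, use_aug=True, lr=1e-3)   # stage 1: learn from augmented data
train(epochs=5, use_aug=False, lr=1e-4)   # stage 2: return to biological reality
```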

Koo’s team found that models trained with EvoAug perform better than those trained on biological data alone. As a result, scientists could soon get a better read of the regulatory DNA that writes the rules of life itself. Ultimately, this could someday provide a whole new understanding of human health.

Drexel researchers demo an ML approach for predicting Philadelphia's future energy use

As Philadelphia strives to meet greenhouse gas emissions goals established in its 2050 Plan, a better understanding of how zoning can play a role in managing building energy use could set the city up for success. Researchers in Drexel University’s College of Engineering are hoping a machine learning model they’ve developed can support these efforts by helping to predict how energy consumption will change as neighborhoods evolve.

In 2017, the city set a goal of becoming carbon neutral by 2050, led in large part by a reduction in greenhouse gas emissions from building energy use – which accounted for nearly three-quarters of Philadelphia’s carbon footprint at the time. But the key to meeting this mark lies not just in establishing sustainable energy use practices for current buildings, but also in incorporating energy use projections into zoning decisions that will direct future development.

And the challenge for Philadelphia, one of the oldest cities in the country, is that building types vary widely — as does their energy use. So planning for more efficient energy use at the City level is not a problem with a one-size-fits-all solution.

“For Philadelphia in particular, neighborhoods vary so much from place to place in the prevalence of certain housing features and zoning types that it’s important to customize energy programs for each neighborhood, rather than trying to enact blanket policies for carbon reduction across the entire city or county,” said Simi Hoque, Ph.D., a professor in the College of Engineering who led research into using machine learning for granular energy-use modeling.

Hoque’s team believes existing machine learning programs, properly deployed, can provide some clarity on how zoning decisions could affect future greenhouse gas emissions from buildings.

“Right now there is a huge volume of energy use data, but it’s often just too inconsistent and messy to be reasonably put to use. For example, one dataset corresponding to certain housing characteristics may have usable energy estimates, but another dataset corresponding to socioeconomic features is missing too many values to be usable,” she said. “Machine learning is well equipped to handle this challenge because these models can iteratively learn and improve through the training process to reduce bias and variance despite these data limitations.”

To glean information from the disjointed data, the team developed a process using two machine learning programs: one that can tease out patterns from massive tranches of data and use them to make projections about future energy use, and a second that can pinpoint the details in the model that likely had the greatest effect on the projections.

First, they trained a machine learning program called Extreme Gradient Boosting (XGBoost) with volumes of commercial and residential energy-use data for Philadelphia from the U.S. Energy Information Administration’s Residential Energy Consumption Survey and Commercial Buildings Energy Consumption Survey for 2015, as well as the city’s demographic and socioeconomic data from the U.S. Census Bureau’s American Community Survey for that period.

The program learned enough from the data that it could draw correlations between a laundry list of variables, such as the density of buildings, the population of a given area, building square footage, number of occupants, how many days heating or air conditioning was used, and energy use for each house or building.
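As an illustration of that setup (not the team’s actual code or data), a gradient-boosting regressor can be fit on building-level features to predict energy use. Everything below, from the feature choices to the synthetic numbers, is invented:

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins for the kinds of features the article lists.
n = 1000
X = np.column_stack([
    rng.uniform(500, 5000, n),   # square footage
    rng.integers(1, 8, n),       # number of occupants
    rng.uniform(0, 60, n),       # buildings per acre (density)
    rng.uniform(500, 6000, n),   # heating degree days
    rng.uniform(200, 2500, n),   # cooling degree days
])
# Invented "ground truth": energy use driven mostly by size and climate.
y = 0.02 * X[:, 0] + 5 * X[:, 1] + 0.01 * X[:, 3] + rng.normal(0, 5, n)

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.1)
model.fit(X, y)

# Project energy use for a hypothetical future building.
future = np.array([[3500, 4, 45.0, 4200, 1500]])
print(model.predict(future))
```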

While models like XGBoost are very useful for making informed forecasts from a large and inconsistent set of data, their methods can be obscured by the complexity of the operations they perform. But to be a useful tool for guiding planners, the team needed to unpack the so-called “black box” program enough to turn its projections into recommendations.

To do it, they employed a Shapley additive explanations (SHAP) analysis, an approach rooted in game theory that distributes credit among the factors contributing to an outcome. This allowed them to suss out how much a change in building density or square footage, for example, factored into the program’s projection.

“Machine learning models like XGBoost learn how to chug through datasets to fulfill a specific task — like generating a reliable forecast of a system — but they do not claim to really understand or represent the on-the-ground relationships that underlie a phenomenon,” Hoque said. “And while a Shapley analysis cannot tell us which features have the greatest impact on energy use, it can explain which features had the greatest impact on the model’s energy use prediction, which is still quite a useful piece of information.”
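A minimal sketch of how that looks in code, continuing the invented example above (the shap library’s TreeExplainer is real; the data and feature names remain made up):

```python
import numpy as np
import shap
from xgboost import XGBRegressor

rng = np.random.default_rng(0)

# Refit the same invented model as in the previous sketch.
n = 1000
X = np.column_stack([
    rng.uniform(500, 5000, n),   # square footage
    rng.integers(1, 8, n),       # number of occupants
    rng.uniform(0, 60, n),       # buildings per acre
    rng.uniform(500, 6000, n),   # heating degree days
    rng.uniform(200, 2500, n),   # cooling degree days
])
y = 0.02 * X[:, 0] + 5 * X[:, 1] + 0.01 * X[:, 3] + rng.normal(0, 5, n)
model = XGBRegressor(n_estimators=300, max_depth=4).fit(X, y)

# Shapley values: per-prediction credit assignment across features.
# Note: this explains the model's predictions, not real-world causation.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])

# Rank features by mean absolute contribution to the predictions.
importance = np.abs(shap_values).mean(axis=0)
for name, score in zip(["sqft", "occupants", "density", "hdd", "cdd"], importance):
    print(f"{name:10s} {score:8.3f}")
```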

Then the team put the model to the test by providing input data from a hypothetical scenario proposed by the Delaware Valley Regional Planning Commission that estimated continuing economic development in Philadelphia through the year 2045. The scenario suggested a 17% population increase with a commensurate increase in households, and it presented a number of different possibilities for employment and income by region throughout the city.

For each of these possibilities, the model projected how new residential and commercial development would change greenhouse gas emissions from building energy use throughout 11 different parts of the city and which variables played prominent roles in making the projections.

Looking specifically at residential energy use for the 2045 scenario, the program suggested that six of the 11 areas, mostly lower-income regions, would decrease their energy use, while mixed-income regions, like the northernmost part of the city, including Oak Lane, would likely see an increase.

According to the Shapley analysis, the presence of single-family attached (lower energy use) versus detached (higher energy use) dwellings played an important role in the projections, with high monthly electricity costs, lot sizes of less than one acre, and a lower number of rooms per building all contributing to lower energy use projections.

“Overall, the residential energy prediction model finds that features related to lower building intensity relate to lower energy consumption estimates in the model, for example, lower lot acreage, lower number of rooms per unit,” they wrote. “These results give reason to reinvestigate the effects of upzoning policies, commonly presented as an affordable housing solution in Philadelphia and other cities across the U.S., and subsequent changes in energy use for these areas.”

On the commercial side of the scenario, the machine learning model did not project much change in energy use under the 2045 conditions — energy use for the largest commercial buildings remained high. And while it was limited to looking at just six variables — square footage, number of employees, number of floors, heating degree days, cooling degree days, and the principal activity of the building — due to the available data in the training set, the Shapley analysis pointed to building square footage and number of employees as the most important predictors of energy use for most types of commercial buildings.

“With respect to the commercial sector, the study suggests that commercial buildings in the top quantiles of square footage and employee count should be the primary targets for energy reduction programs,” the authors wrote. “The research posits an approximate threshold of 10,000 square feet of total building area, with buildings over that marker being prioritized due to their disproportionate influence on the energy prediction of the model.”

While the researchers caution against assuming a direct link between variables and energy use changes in the model, they suggest that it is still quite useful because of its ability to give planners both a high-level and granular look at the interplay of zoning decisions and development and their effect on energy use.

“I see a lot of potential in using machine learning models like XGBoost to forecast energy use increases or decreases due to new construction projects or policy changes,” Hoque said. “For example, building a new rail line in a neighborhood may change the demographics and employment of a neighborhood, and our methods would be ideal for incorporating that information in the context of an energy prediction model.”

The team acknowledges that more testing is necessary and that the program will only improve as it is provided with additional data. They suggest that the next step for the research would be to focus on areas of the city with known high energy use and perform a Shapley analysis to discern some of the factors that could be contributing to it.

“We hope this will provide a resource for future researchers and policymakers so they don’t have to scope through the entire city of Philadelphia, but can home in on neighborhoods and variables which we have flagged as areas of potential importance,” Hoque said. “Ideally, future studies would use more interpretable methods to test whether these features really correspond to higher or lower energy estimates in a given area.”

St. Jude tool gets more out of multi-omics data

An upgraded computational tool from St. Jude Children’s Research Hospital, Memphis, TN, can find potentially druggable hidden drivers of cancer and other biological processes using multi-omics data. 

Despite the astounding advances made in understanding the biological underpinnings of cancer, many cancers are missing obvious genetic drivers. When scientists can’t pinpoint the factors that drive cancer, treating it can be much more difficult. Scientists at St. Jude Children’s Research Hospital hope to solve that problem with an updated way to analyze multi-omic (primarily transcriptomics and proteomics) data. The researchers created a next-generation computational tool to gain new insights from biological data and find hidden druggable targets.

The updated application, NetBID2, successfully uncovers difficult-to-identify proteins that drive biological processes (such as rapid cell growth) contributing to cancer. These hidden drivers present new therapeutic opportunities, either because existing drugs can already target them or because they might inspire drug developers to make new therapeutics.

“We made it easier to find hidden drivers,” said Jiyang Yu, Ph.D., St. Jude Department of Computational Biology. “Finding hidden drivers is important because many of these are potentially druggable targets. NetBID2 can find these drivers and potentially move them quickly into clinical trials. We may be able to re-purpose an already FDA-approved drug that targets an identified hidden driver to a completely different patient population that may benefit.”

A network approach to finding hidden drivers

Large sets of RNA sequencing data from specific cells or cancer types can contain valuable information necessary to find hidden drivers of disease; however, standard analysis tools struggle to find them. NetBID2 is a sequel to the original tool developed by Yu in 2018. He specifically designed these tools to find hidden drivers by squeezing out more from “big data.”

“NetBID2 enables us to maximize the data we have,” said Yu, “particularly RNA sequencing data. It goes beyond the traditional mutation or differential gene expression data to expose hidden events and information that may be functionally important.”

Hidden drivers cannot be discovered by conventional genomics or sequencing approaches because their activity depends on post-translational modifications and other mechanisms that are invisible to traditional sequencing but affect the expression of other genes.

Therefore, NetBID2 takes RNA sequencing data, then generates a gene-gene interactome. This interactome tracks the relationships between driver candidates and their downstream effector genes to determine which signaling proteins are most central to the key relationships that fuel disease. These “central hubs” directing the network are the hidden drivers.

“NetBID2 looks for a hidden driver like the FBI would look for a crime boss,” Yu said. “If you look at the suspect, there’s no direct evidence to connect them with any crime. The way to capture them is first to build a network of associates. We do the same when we build the biological network by collecting a lot of data on its members and their relationships. Then we look for the boss’s first neighbors in the network when we look at a hidden driver’s activity. That’s the only way to capture the boss — by inference from their activities — otherwise, there is no way to identify them. We find these hidden drivers’ guilt by association.”
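The “guilt by association” idea can be illustrated with a toy calculation: infer a candidate driver’s activity from the expression of its network neighbors rather than from the driver’s own expression. The sketch below is only a cartoon of that general approach, with made-up data; it is not the NetBID2 algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy expression matrix: 200 genes x 50 samples (z-scored, invented data).
n_genes, n_samples = 200, 50
expr = rng.standard_normal((n_genes, n_samples))

# Pretend gene 0 is a "hidden driver": its own expression stays flat, but it
# pushes its 15 network neighbors up in the last 25 samples ("disease" group).
neighbors = np.arange(1, 16)
expr[neighbors, 25:] += 1.5

# Inferred activity of a driver = mean expression of its network neighbors.
# (Only gene 0's regulon is defined here; a real interactome covers all genes.)
activity_gene0 = expr[neighbors].mean(axis=0)

# Compare groups: the driver's own expression shows nothing; its activity does.
own = expr[0]
print("own expression, disease minus control:",
      round(own[25:].mean() - own[:25].mean(), 2))
print("inferred activity, disease minus control:",
      round(activity_gene0[25:].mean() - activity_gene0[:25].mean(), 2))
```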

As proof of the tool’s capabilities, the St. Jude group showed it could find biologically meaningful hidden drivers in three unrelated datasets. Using NetBID2, the team found unappreciated roles for MYC in adult lung cancer and for NOTCH1 in difficult-to-treat pediatric leukemia, roles that standard differential expression analysis at the mRNA or protein level had not uncovered despite the genes’ previously established links to cancer. They also found an unappreciated role for Gabpa in normal immune cell function. In each case the gene’s importance was context-specific, highlighting the need for targeted analyses.

The software’s other capabilities, such as new visualization tools, are meant to facilitate further analysis and discovery of hidden drivers from complex networks of RNA-seq and, in some cases, proteomics data.

NetBID2 is freely available on a GitHub repository. The St. Jude Cloud, which includes a NetBID2 app and data from many multi-omics projects, is also freely available for other scientists to use for further discovery of hidden drivers of basic biology and disease.

The study’s first author is Xinran Dong, formerly of St. Jude. The other authors are Liang Ding, Andrew Thrasher, Xinge Wang (formerly), Jingjing Liu, Qingfei Pan, Jordan Rash, Yogesh Dhungana, Xu Yang, Isabel Risch (formerly), Yuxin Li (formerly), Lei Yan, Michael Rusch, Clay McLeod, Koon-Kiu Yan, Junmin Peng, Hongbo Chi, and Jinghui Zhang, all of St. Jude.

The study was supported by grants from the National Institutes of Health (R01GM134382, U01CA264610, and P30CA021765-403 41S3) and ALSAC, the fundraising and awareness organization of St. Jude.