Data mining system unearths US counties most at risk for COVID deaths

 The task of controlling the COVID-19 pandemic nationwide and predicting where cases will spike next and which areas may have high mortality rates remains daunting for scientists and public officials. A new machine learning tool developed by researchers at a startup company (Akai Kaeru LLC) affiliated with Stony Brook University's Department of Computer Science and the Institute for Advanced Computational Science (IACS) may help gauge areas most at risk for the virus and high death rates. The software they use analyzes a massive data set from all 3,007 U.S. counties. They found that combinations of factors such as poverty, rural settings, low education, low poverty but housing debt, and sleep deprivation are associated with higher death rates in counties.

The researchers use an automatic pattern mining engine and software to analyze a data set with approximately 500 attributes, which cover details related to demographics, economics, race and ethnicity, and infrastructure in all U.S. counties. After analyzing and assessing the data within counties they created nearly 300 sets of counties at a "high risk" for COVID-19 and related death rates.

These sets cover close to 1,000 counties, many of them, but not all, in Southern U.S. states. They include Hancock, Ga.; Attala, Miss.; Lee, S.C.; Swisher, Texas; Adams, Ohio; Torrance, N.M.; and Madison, Fla. Mississippi, Louisiana, and Georgia are the most at risk, with 80-90 percent of their counties covered by these sets.

"Our software algorithm identifies counties with specific conditions that appear to lead to higher than average U.S. death rates due to COVID-19," said Klaus Mueller, Ph.D., Professor of Computer Science, IACS faculty member, CEO of startup Akai Kaeru, LLC, and Principal Investigator of the company study. "We cannot say that a specific county will have a higher than usual death rate, but we can predict this for the sets of counties that fit certain conditions."

According to Mueller, the software and method used to analyze the data and identify high-risk counties can inform officials about important correlations related to COVID-19 death rates and help direct the allocation of resources, such as testing kits and stations. The method and findings may also help to target community-based information campaigns about COVID-19 and measures to contain the pandemic and potentially reduce cases.

The researchers found that several conditions must be present at the same time to expose a county to elevated risk. Some of these condition sets are:

  • Poor rural counties with aging residents.
  • Sleep-deprived, under-educated counties with low participation in health insurance.
  • Counties with low Asian but high minority populations where black children live in poverty.
  • Counties with high homeownership and low poverty. For this set of counties, there also exists a significant correlation between death rate and the amount of housing debt the county residents have.
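
To make the idea of a condition set concrete, here is a minimal sketch of how one such pattern could be checked against county data. It is not Akai Kaeru's pattern mining engine, which searches for patterns automatically; it only evaluates one hand-written pattern. The county records, field names, and thresholds are all hypothetical and invented for illustration.

```python
from statistics import mean

# Hypothetical toy records; every value here is invented for illustration.
counties = [
    {"name": "A", "poverty_pct": 28.0, "rural": True,  "median_age": 46.0, "deaths_per_100k": 61.0},
    {"name": "B", "poverty_pct": 25.0, "rural": True,  "median_age": 44.0, "deaths_per_100k": 55.0},
    {"name": "C", "poverty_pct": 9.0,  "rural": False, "median_age": 35.0, "deaths_per_100k": 20.0},
    {"name": "D", "poverty_pct": 27.0, "rural": False, "median_age": 33.0, "deaths_per_100k": 24.0},
    {"name": "E", "poverty_pct": 11.0, "rural": True,  "median_age": 45.0, "deaths_per_100k": 22.0},
]

# A pattern is a conjunction: every condition must hold at the same time.
# Thresholds are assumptions, standing in for "poor rural counties with aging residents".
poor_rural_aging = [
    lambda c: c["poverty_pct"] > 20.0,
    lambda c: c["rural"],
    lambda c: c["median_age"] > 40.0,
]

def matches(county, conditions):
    """Return True only if the county satisfies all conditions."""
    return all(cond(county) for cond in conditions)

county_set = [c for c in counties if matches(c, poor_rural_aging)]
set_rate = mean(c["deaths_per_100k"] for c in county_set)
overall_rate = mean(c["deaths_per_100k"] for c in counties)

print("Counties in set:", [c["name"] for c in county_set])
print(f"Mean death rate in set: {set_rate:.1f} vs. all counties: {overall_rate:.1f}")
```

In this toy data, counties D and E each meet only some of the conditions and fall outside the set, which illustrates why the conditions are evaluated together rather than one at a time.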

"Each of these sets of conditions tells a unique story and makes the Artificial Intelligence behind our algorithm explainable," Mueller says. "For instance, what we might conclude from the 'high homeownership and low poverty' pattern is that there are homeowners in these wealthy counties with high homeowners who cannot afford their homes and as a result run high housing debt. Then, as the percentage of these types of homeowners in a county grows, so does the risk of COVID-19 infection and potentially death."

"We also observe in a different county set that poor and aging counties with low population density are on average especially hard hit by COVID-19," explains Mueller. "While it is well known now that older residents are more vulnerable to COVID-19, the pattern tells us that this high risk seems to be amplified by two factors related to accessibility:

(1) The residents live in sparsely populated areas that offer fewer urgent care facilities and (2) the residents are mostly poor which hampers their ability to use and pay for these services."

Mueller emphasizes that any conclusions about conditions related to high death rates from COVID-19 in county sets or specific counties will continue to need further investigation because a pandemic is not static and factors contributing to disease and death are often complicated.

Akai Kaeru is a start-up company developed and located in the New York State Center of Excellence in Wireless and Information Technology (CEWIT). Created in 2003, CEWIT is the anchor building of Stony Brook University's Research and Development Park, where research is conducted and commercialized.

The entire high-risk county sets analysis can be viewed in more detail on this website.

UBC study shows six billion Earth-like planets in our galaxy

There may be as many as one Earth-like planet for every five Sun-like stars in the Milky Way Galaxy, according to new estimates by University of British Columbia astronomers.

To be considered Earth-like, a planet must be rocky, roughly Earth-sized, and orbiting a Sun-like (G-type) star. It also has to orbit in the habitable zone of its star, the range of distances from a star in which a rocky planet could host liquid water, and potentially life, on its surface.

"My calculations place an upper limit of 0.18 Earth-like planets per G-type star," says UBC researcher Michelle Kunimoto, co-author of the new study in The Astronomical Journal. "Estimating how common different kinds of planets are around different stars can provide important constraints on planet formation and evolution theories, and help optimize future missions dedicated to finding exoplanets." CAPTION Artist's conception of Kepler telescope observing planets transiting a distant star.  CREDIT NASA Ames/ W Stenzel.{module INSIDE STORY}

According to UBC astronomer Jaymie Matthews: "Our Milky Way has as many as 400 billion stars, with seven percent of them being G-type. That means less than six billion stars may have Earth-like planets in our Galaxy."
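
The arithmetic behind that figure can be reproduced from the numbers quoted in the article alone. The short sketch below multiplies them out; the variable names are chosen here for illustration.

```python
# Figures quoted in the article.
stars_in_milky_way = 400e9       # "as many as 400 billion stars"
g_type_fraction = 0.07           # "seven percent of them being G-type"
max_planets_per_g_star = 0.18    # Kunimoto's upper limit per G-type star

g_type_stars = stars_in_milky_way * g_type_fraction
earth_like_upper_limit = g_type_stars * max_planets_per_g_star

print(f"G-type stars: {g_type_stars / 1e9:.0f} billion")
print(f"Earth-like planets (upper limit): {earth_like_upper_limit / 1e9:.2f} billion")
```

That works out to about 28 billion G-type stars and an upper limit of roughly 5 billion Earth-like planets, consistent with the "less than six billion" in the quote. The rate of 0.18 per star also corresponds to about one planet for every five to six stars.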

Previous estimates of the frequency of Earth-like planets range from roughly 0.02 potentially habitable planets per Sun-like star to more than one per Sun-like star.

Typically, planets like Earth are more likely to be missed by a planet search than other types, as they are so small and orbit so far from their stars. That means that a planet catalog represents only a small subset of the planets that are actually in orbit around the stars searched. Kunimoto used a technique known as 'forward modeling' to overcome these challenges.

"I started by simulating the full population of exoplanets around the stars Kepler searched," she explained. "I marked each planet as 'detected' or 'missed' depending on how likely it was my planet search algorithm would have found them. Then, I compared the detected planets to my actual catalog of planets. If the simulation produced a close match, then the initial population was likely a good representation of the actual population of planets orbiting those stars."

Kunimoto's research also sheds more light on one of the most outstanding questions in exoplanet science today: the 'radius gap' of planets. The radius gap demonstrates that it is uncommon for planets with orbital periods less than 100 days to have a size between 1.5 and two times that of Earth. She found that the radius gap exists over a much narrower range of orbital periods than previously thought. Her observational results can provide constraints on planet evolution models that explain the radius gap's characteristics.

Previously, Kunimoto searched archival data from 200,000 stars of NASA's Kepler mission. She discovered 17 new planets outside of the Solar System, or exoplanets, in addition to recovering thousands of already known planets.

Paired with super telescopes, model Earths guide hunt for life

Cornell University astronomers have created five supercomputer models representing key points from our planet's evolution, like chemical snapshots through Earth's own geologic epochs.

The models will be spectral templates for astronomers to use in the approaching new era of powerful telescopes, and in the hunt for Earth-like planets in distant solar systems.

"These new generations of space- and ground-based telescopes coupled with our models will allow us to identify planets like our Earth out to about 50 to 100 light-years away," said Lisa Kaltenegger, associate professor of astronomy and director of the Carl Sagan Institute.

For the research and model development, Kaltenegger, doctoral student Jack Madden and Zifan Lin authored "High-Resolution Transmission Spectra of Earth through Geological Time," published in Astrophysical Journal Letters.

"Using our own Earth as the key, we modeled five distinct Earth epochs to provide a template for how we can characterize a potential exo-Earth - from a young, prebiotic Earth to our modern world," she said. "The models also allow us to explore at what point in Earth's evolution a distant observer could identify life on the universe's 'pale blue dots' and other worlds like them."

Kaltenegger and her team created atmospheric models that match the Earth of 3.9 billion years ago, a prebiotic Earth when carbon dioxide densely cloaked the young planet. A second throwback model chemically depicts a planet free of oxygen, an anoxic Earth, going back 3.5 billion years. Three other models reveal the rise of oxygen in the atmosphere from a 0.2% concentration to modern-day levels of 21%.

"Our Earth and the air we breathe have changed drastically since Earth formed 4.5 billion years ago," Kaltenegger said, "and for the first time, this paper addresses how astronomers trying to find worlds like ours, could spot young to modern Earth-like planets in transit, using our own Earth's history as a template."

In Earth's history, the timeline of the rise of oxygen and its abundance is not clear, Kaltenegger said. But if astronomers can find exoplanets with nearly 1% of Earth's current oxygen levels, those scientists will begin to find emerging biology, ozone, and methane, and can match them to the ages of the Earth templates.

"Our transmission spectra show atmospheric features, which would show a remote observer that Earth had a biosphere as early as about 2 billion years ago," Kaltenegger said.

Using forthcoming telescopes like NASA's James Webb Space Telescope, scheduled to launch in March 2021, or the Extremely Large Telescope in Antofagasta, Chile, scheduled for first light in 2025, astronomers could watch as an exoplanet transits in front of its host star, revealing the planet's atmosphere.

"Once the exoplanet transits and blocks out part of its host star, we can decipher its atmospheric spectral signatures," Kaltenegger said. "Using Earth's geologic history as a key, we can more easily spot the chemical signs of life on the distant exoplanets."