Harvard Medical School prof Kuo discusses ChatGPT answers to common patient questions about colonoscopy

Braden Kuo, MD, a neurogastroenterologist, director of the Center for Neurointestinal Health at Massachusetts General Hospital (MGH), and an associate professor of medicine at Harvard Medical School, and Tsung-Chun Lee, MD, PhD, of Taipei Medical University Shuang Ho Hospital in Taiwan, are co-authors of a recent research letter published in Gastroenterology, "ChatGPT Answers Common Patient Questions About Colonoscopy."

What Question Did You Set Out to Answer With This Study?
ChatGPT, a new language processing tool driven by artificial intelligence (AI), provides conversational text responses to questions and can generate valuable information for enquiring individuals, but the quality of ChatGPT-generated answers to medical questions is currently unclear.

What Methods or Approach Did You Use?
We retrieved eight common questions and answers about colonoscopy from the publicly available webpages of three randomly selected hospitals from the top-20 list of the US News & World Report Best Hospitals for Gastroenterology and Gastrointestinal Surgery.

We entered these questions as prompts into ChatGPT twice on the same day and recorded the answers it generated.

We then used plagiarism detection software to compare the text similarity among all answers. Finally, to objectively interpret the quality of ChatGPT-generated answers, four gastroenterologists rated 36 random pairs of questions and answers for the following quality indicators on a 7-point scale:

(1) ease of understanding
(2) scientific adequacy
(3) satisfaction with the answer
Raters were also asked to interpret whether the answers were AI-generated or not.

What Did You Find?
ChatGPT answers had extremely low text similarity compared with answers on hospital webpages, while the text similarity ranged from 28% to 77% between the two ChatGPT answers.
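As a rough illustration of what a pairwise text-similarity percentage measures, the sketch below compares two answers word by word using Python's standard difflib module. This is a stand-in, not the plagiarism detection software used in the study, and the function name and the two sample answers are made up for the example.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Return a word-level similarity score between 0 and 100."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # ratio() is 2 * matched words / total words, so identical texts score 1.0.
    return round(100 * SequenceMatcher(None, words_a, words_b).ratio(), 1)


# Hypothetical answers, invented for illustration only.
answer_1 = "A colonoscopy lets a doctor examine the lining of the colon."
answer_2 = "A colonoscopy lets a physician look at the lining of the colon."

print(similarity_percent(answer_1, answer_2))
```

A dedicated plagiarism checker will weigh matches differently, so scores from this sketch are not comparable to the 28% to 77% range reported in the letter.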

Gastroenterologists rated ChatGPT answers similarly to non-AI answers for ease of understanding, although the average AI scores were higher than the non-AI scores. Scores for scientific adequacy and satisfaction with the answers were also similar. The raters were only 48% accurate in identifying which answers were provided by ChatGPT.

This study is the first of its kind to demonstrate that a contemporary large language model–derived conversational AI program is able to provide easy-to-understand, scientifically adequate, and generally satisfactory answers to common questions about colonoscopy, as determined by gastroenterologists.

Such programs may help to optimize clinical communication with patients, especially for high-volume procedures like colonoscopy. Conversational AI empowered by large language models like ChatGPT has the potential to transform and benefit shared decision-making by patients and physicians.

What Are the Implications?
Future research should explore responses to a broader sample of patient questions and clinical conditions and include both patients and physicians as raters.

First AI open language model for Portuguese is now available

Albertina PT, the first large generative AI model for the Portuguese language that is free of charge, open source, and universally accessible, is now available. It can generate text on any topic in both variants of Portuguese, from Brazil and from Portugal.

This model was developed by researchers from the Faculty of Sciences of the University of Lisbon (Ciências ULisboa) and the Faculty of Engineering of the University of Porto (FEUP). It was made publicly available this May and has 900 million parameters. Albertina PT is aimed at researchers and organizations, public and private, large and small, from all economic sectors.

At the time of publication, its performance establishes the state of the art for Portuguese with respect to published and open neural language models. It is these kinds of language models that support the full range of AI applications that are all the rage, from chatbots to machine translation.

"This is a very important historical milestone in the technological preparation of the Portuguese language for the digital age", says António Branco, Professor at the Department of Informatics at Ciências ULisboa and coordinator of this project.

Further details about this work have been made available on arXiv, in the paper "Advancing Neural Encoding of Portuguese with Transformer Albertina PT-*".

ChatGPT passes the radiology board exam

The latest version of ChatGPT passed a radiology board-style exam, highlighting the potential of large language models but also revealing limitations that hinder reliability, according to two new research studies published in Radiology, a journal of the Radiological Society of North America (RSNA). 

ChatGPT is an artificial intelligence (AI) chatbot that uses a deep learning model to recognize patterns and relationships between words in its vast training data to generate human-like responses based on a prompt. But since there is no source of truth in its training data, the tool can generate responses that are factually incorrect.

“The use of large language models like ChatGPT is exploding and only going to increase,” said lead author Rajesh Bhayana, M.D., FRCPC, an abdominal radiologist and technology lead at University Medical Imaging Toronto, Toronto General Hospital in Toronto, Canada. “Our research provides insight into ChatGPT’s performance in a radiology context, highlighting the incredible potential of large language models, along with the current limitations that make it unreliable.”

ChatGPT was recently named the fastest-growing consumer application in history, and similar chatbots are being incorporated into popular search engines like Google and Bing that physicians and patients use to search for medical information, Dr. Bhayana noted.

To assess its performance on radiology board exam questions and explore strengths and limitations, Dr. Bhayana and colleagues first tested ChatGPT based on GPT-3.5, currently the most commonly used version. The researchers used 150 multiple-choice questions designed to match the style, content, and difficulty of the Canadian Royal College and American Board of Radiology exams.

The questions did not include images and were grouped by question type to gain insight into performance: lower-order (knowledge recall, basic understanding) and higher-order (apply, analyze, synthesize) thinking. The higher-order thinking questions were further subclassified by type (description of imaging findings, clinical management, calculation and classification, disease associations).

The performance of ChatGPT was evaluated overall and by question type and topic. The confidence of the language used in its responses was also assessed.

The researchers found that ChatGPT based on GPT-3.5 answered 69% of questions correctly (104 of 150), near the passing grade of 70% used by the Royal College in Canada. The model performed relatively well on questions requiring lower-order thinking (84%, 51 of 61), but struggled with questions involving higher-order thinking (60%, 53 of 89). More specifically, it struggled with higher-order questions involving the description of imaging findings (61%, 28 of 46), calculation and classification (25%, 2 of 8), and application of concepts (30%, 3 of 10). Its poor performance on higher-order thinking questions was not surprising given its lack of radiology-specific pretraining.
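The percentages above can be rechecked from the raw counts the article reports. The short sketch below does that arithmetic; the dictionary layout and the names GPT35_RESULTS, percent, and PASS_MARK are illustrative choices, and only the counts and the 70% threshold come from the text.

```python
# (correct, total) counts for ChatGPT based on GPT-3.5, as reported above.
GPT35_RESULTS = {
    "overall": (104, 150),
    "lower-order": (51, 61),
    "higher-order": (53, 89),
    "imaging findings": (28, 46),
    "calculation and classification": (2, 8),
    "application of concepts": (3, 10),
}
PASS_MARK = 70  # Royal College passing grade, in percent


def percent(correct: int, total: int) -> int:
    """Share of correct answers, rounded to the nearest whole percent."""
    return round(100 * correct / total)


for name, (correct, total) in GPT35_RESULTS.items():
    print(f"{name}: {percent(correct, total)}% ({correct} of {total})")

# The lower- and higher-order groups together account for every question.
assert 51 + 53 == 104 and 61 + 89 == 150

overall = percent(*GPT35_RESULTS["overall"])
print("passed" if overall >= PASS_MARK else "below the pass mark")
```

Rounding to whole percentages reproduces each figure quoted in the paragraph, and the overall score lands one point under the pass mark.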

GPT-4 was released in March 2023 in limited form to paid users and was specifically claimed to have improved advanced reasoning capabilities over GPT-3.5.

In a follow-up study, GPT-4 answered 81% (121 of 150) of the same questions correctly, outperforming GPT-3.5 and exceeding the passing threshold of 70%. GPT-4 performed much better than GPT-3.5 on higher-order thinking questions (81%), more specifically those involving the description of imaging findings (85%) and application of concepts (90%).

The findings suggest that GPT-4’s claimed improved advanced reasoning capabilities translate to enhanced performance in a radiology context. They also suggest an improved contextual understanding of radiology-specific terminology, including imaging descriptions, which is critical to enable future downstream applications.

“Our study demonstrates an impressive improvement in the performance of ChatGPT in radiology over a short time period, highlighting the growing potential of large language models in this context,” Dr. Bhayana said.

GPT-4 showed no improvement on lower-order thinking questions (80% vs 84%) and answered 12 questions incorrectly that GPT-3.5 answered correctly, raising questions related to its reliability for information gathering.

“We were initially surprised by ChatGPT’s accurate and confident answers to some challenging radiology questions, but then equally surprised by some very illogical and inaccurate assertions,” Dr. Bhayana said. “Of course, given how these models work, the inaccurate responses should not be particularly surprising.”

ChatGPT’s dangerous tendency to produce inaccurate responses, termed hallucinations, is less frequent in GPT-4 but still limits usability in medical education and practice at present.

Both studies showed that ChatGPT used confident language consistently, even when incorrect. This is particularly dangerous if solely relied on for information, Dr. Bhayana notes, especially for novices who may not recognize confident incorrect responses as inaccurate.

“To me, this is its biggest limitation. At present, ChatGPT is best used to spark ideas, help start the medical writing process, and in data summarization. If used for quick information recall, it always needs to be fact-checked,” Dr. Bhayana said.

Image of nanocrystal

UK scientists grow compound semiconductor nanocrystals in a solvent to shape their quantum dots

A new method of controlling the shape of tiny particles about one ten-thousandth of the width of a human hair could make the technology that powers our daily lives more stable and more efficient, UK scientists claim.

The process, which transforms the structure of microscopic semiconductor materials known as quantum dots, provides the industry with opportunities to optimize optoelectronics, energy harvesting, photonics, and biomedical imaging technologies, according to the Cardiff University-led team.

Their study, funded by the Engineering and Physical Sciences Research Council (EPSRC) and published in Nano Letters, used a process called nanofaceting – the formation of small, flat surfaces on nanoparticles – to manipulate the quantum dots into a variety of shapes called nanocrystals.

From cubes and olive-like structures to complex truncated octahedra, the international team of researchers says these nanocrystals have unique optical and electronic properties, which can be used in different types of technology.

Dr. Bo Hou, a Senior Lecturer at Cardiff University’s School of Physics and Astronomy who led the study, said: “Quantum dots have the potential to revolutionize a number of industries because of the theoretically limitless efficiencies they offer. Our study is a significant step forward in the adoption of quantum dots technology across a wide range of energy and lighting industry applications.

“With further development, we might imagine the truncated octahedra we manufactured being used for energy harvesting in solar cells, improving efficiencies beyond the capabilities of current technologies which sit at around 33%. Likewise, our nanocrystals might be used for biomedical imaging, where inefficiencies and instabilities are currently limiting their use in diagnoses and drug delivery.

“So, these technologies really are the future and for our work to play a part in accelerating their application is really exciting.”

Working out of the state-of-the-art labs at Cardiff University’s new Translational Research Hub (TRH), the team grew the compound semiconductor nanocrystals in solvent and monitored their development in real time using supercomputer simulations and powerful microscope technology.

Dr. Hou added: “Growing the semiconductors in solvent was our preferred choice because of its low carbon footprint, the potential for higher yield, and economic benefits when compared to the high temperatures and vacuum conditions needed in the traditional production.

“It also meant we were able to study the effect of solvent polarity on the shape of the nanocrystals, which could provide a means to stabilize polar surfaces with further research.”

The team is now developing image sensors and low-carbon-footprint LEDs, which will enable the industry to incorporate the quantum dot nanocrystals into its technologies to boost their resolution and energy efficiency.

Image 1: Scanning tunnelling microscopy topography of the honeycomb lattice of germanene

Dutch scientists create material that paves the way for more energy-efficient electronics

Researchers from the University of Twente, a public technical university located in Enschede, Netherlands, have proved that germanene, a two-dimensional material made of germanium atoms, behaves as a topological insulator. It is the first 2D topological insulator that consists of a single element. It also has the unique ability to switch between ‘on’ and ‘off’ states, comparable to transistors. This could lead to more energy-efficient electronics.

Topological insulators are materials with the unique property of insulating electricity in their interior while conducting electricity along their edges. The conductive edges allow electrical current to flow without energy loss. “At the moment, electronic devices lose a lot of energy in the form of heat, because defects in the material increase the resistance. As a result, your mobile phone can get uncomfortably hot”, explains UT researcher Pantelis Bampoulis. While scattering at defects is allowed in normal materials, at the edges of 2D topological insulators, the scattering of electrons at defects is forbidden due to the unique topological protection mechanism. Therefore, electrical current in 2D topological insulators flows without dissipating energy. This makes them more energy-efficient than current electronic materials.

CREATING GERMANENE

Germanene is such a 2D topological insulator. “Current topological insulators consist of complex structures from different types of elements. Germanene is unique in that it’s made from just a single element”, explains Bampoulis. To create this exciting material, the researchers melted germanium together with platinum. When the mixture cooled down, a tiny layer of germanium atoms was arranged into a honeycomb lattice on top of the germanium-platinum alloy. This 2D layer of atoms is called germanene.

Image 2: Artistic illustration of the dissipationless edge channels in germanene. Credits: Ella Marushchenko

TOPOLOGICAL TRANSISTORS

The researchers also discovered that the conducting properties of the material can be switched ‘off’ by applying an electric field. This property is unique for a topological insulator. “The possibility to switch between ‘on’ and ‘off’ states adds an exciting application case for germanene”, says Bampoulis. It paves the way for designing topological field-effect transistors. These transistors could replace traditional transistors in electronic devices, resulting in electronics that no longer heat up.