by Gyana Swain

Patients may suffer from hallucinations of AI medical transcription tools

Oct 29, 2024 · 4 mins

One transcription product that relies on an AI model deletes the original audio, leaving doctors no way to check the transcriptions.

An AI-powered transcription tool widely used in the medical field has been found to hallucinate text, posing potential risks to patient safety, according to a recent academic study.

That tool is being used in a commercial medical transcription product that, worryingly, deletes the underlying audio from which transcriptions are generated, leaving medical staff no way to verify their accuracy, the Associated Press (AP) reported on Saturday.

OpenAI’s Whisper, the underlying AI tool, is integrated into medical transcription services from Nabla, which the company says are used by over 30,000 clinicians at more than 70 organizations. Nabla told AP its product had been used to transcribe around 7 million medical visits.

Whisper is also embedded in Microsoft’s and Oracle’s cloud computing platforms and integrated with certain versions of ChatGPT. Despite its wide adoption, researchers are now raising serious concerns about its accuracy.

A study by researchers from Cornell University, the University of Washington, and others found that Whisper “hallucinated” in about 1.4% of its transcriptions, sometimes inventing entire sentences, nonsensical phrases, or even dangerous content, including violent and racially charged remarks.

The study, “Careless Whisper: Speech-to-Text Hallucination Harms,” found that Whisper often inserted phrases during moments of silence in medical conversations, particularly when transcribing patients with aphasia, a condition that affects language and speech patterns.

In these cases, the AI sometimes fabricated unrelated phrases, such as “Thank you for watching!” — likely due to its training on a large dataset of YouTube videos. In more concerning instances, it invented fictional medications like “hyperactivated antibiotics” and even injected racial commentary into transcripts, AP reported.

For example, Whisper correctly transcribed a speaker’s reference to “two other girls and one lady” but added “which were Black,” despite no such racial context in the original conversation.

Whisper is not the only AI model that generates such errors. In a separate study, researchers found that other AI models were also prone to hallucinations.

Harmful hallucinations

Whisper’s errors are a result of the AI model creating patterns based on its training data that do not exist in the samples, leading to nonsensical or fabricated outputs. This phenomenon, known as hallucination, has been documented across various AI models. According to the researchers, 40% of Whisper’s hallucinations could have harmful consequences, as the AI misinterpreted or misrepresented the speaker’s intent in several cases.

Although Whisper’s creators have touted the tool’s robustness and accuracy, multiple studies have shown otherwise.

In one study of public meetings cited by AP, a researcher from the University of Michigan found hallucinations in eight of every 10 audio transcriptions. Another machine learning engineer reported hallucinations in about half of over 100 hours of transcriptions inspected. A third study identified hallucinations in nearly every one of 26,000 transcripts generated using Whisper, AP said.

Microsoft, which offers Whisper as part of its cloud computing services, advises customers who build it into the solutions they offer to “obtain appropriate legal advice to review your solution, particularly if you will use it in sensitive or high-risk applications.”

Despite this, many healthcare providers are already adopting it for transcribing patient consultations.

Nabla, the company integrating Whisper into its medical transcription tools, has acknowledged the hallucination issue and is reportedly working to address it, the AP report said.

With over 4.2 million downloads on the open-source AI platform Hugging Face in the past month, Whisper has become one of the most popular speech recognition models. However, as its usage spreads, researchers are warning against its adoption in critical sectors like healthcare due to the serious implications of its errors.

While other AI transcription tools also make mistakes, the frequency and potential harm of Whisper’s hallucinations are raising red flags. Other AI products, such as Google’s AI Overviews, have faced criticism for comparably outlandish outputs, including a recommendation to use non-toxic glue to keep cheese from falling off pizza.

As the healthcare industry increasingly integrates AI solutions, the risks posed by such hallucinations demand immediate attention to avoid harmful consequences for patients.

by Gyana Swain

Gyana Swain is a seasoned technology journalist with over 20 years' experience covering the telecom and IT space. He is a consulting editor with VARINDIA and earlier in his career, he held editorial positions at CyberMedia, PTI, 9dot9 Media, and Dennis Publishing. A published author of two books, he combines industry insight with narrative depth. Outside of work, he’s a keen traveler and cricket enthusiast. He earned a B.S. degree from Utkal University.
