‘Tuned’ version of Gemini could change everything in medicine

Google Research and Google’s AI research laboratory, DeepMind, have detailed the scope of Med-Gemini, a “family” of artificial intelligence (AI) models specialized in medicine. And the results are promising.

Capabilities of Gemini models in medicine

  • Google Research and DeepMind detailed advances in Med-Gemini, a medically specialized version of the company’s Gemini multimodal AI models;
  • Researchers tested Med-Gemini’s capabilities under various conditions, including answering medical questions representative of the US Medical Licensing Examination (USMLE). The model also used information from the web to increase the accuracy of its responses through MedQA-RS, a dataset that combines reasoning and web search;
  • Med-Gemini was tested on 14 medical benchmarks, setting a new state of the art in ten of them and outperforming previous models such as Med-PaLM 2. This includes strong performance on multimodal tasks like the New England Journal of Medicine (NEJM) Image Challenge;
  • In a practical application, Med-Gemini correctly identified a skin lesion from an image provided by a user, demonstrating its ability to reach accurate diagnoses and offer recommendations. The researchers also plan to incorporate responsible AI principles into model development to protect privacy, promote fairness, and avoid amplifying historical biases;
  • Med-Gemini represents a significant innovation at the intersection of AI and medicine, promising to transform diagnostic and therapeutic capabilities in healthcare through its ability to process and understand large volumes of complex, multimodal data (text, image, audio).

Google’s Gemini models are a new generation of multimodal AI models: they can process information from different modalities (text, images, video, and audio). Med-Gemini keeps the advantages of the Gemini models but is tuned for medicine (hence the name).

Researchers detail Gemini’s reach in medicine

The researchers tested these medically focused adaptations and published their results in an article available on the arXiv preprint platform. Check out the most notable excerpts from the document (58 pages long, it must be said) below.

Self-training and web search

Reaching a diagnosis and formulating a treatment plan requires doctors to combine their own medical knowledge with a range of other relevant information (patient symptoms; medical, surgical, and social history; laboratory results).

So with Med-Gemini, Google has included access to web-based search to enable more advanced clinical reasoning.

Like many medically focused large language models (LLMs), Med-Gemini was trained on MedQA, a set of multiple-choice questions representative of the US Medical Licensing Examination (USMLE), designed to test medical knowledge and reasoning across diverse scenarios.

However, Google also developed two new datasets for the model. Check them out below:

  • MedQA-R (Reasoning): extends MedQA with synthetically generated reasoning explanations called “Chains-of-Thought” (CoTs);
  • MedQA-RS (Reasoning and Search): provides the model with instructions to use web search results as additional context to improve the accuracy of its responses.

In the case of MedQA-RS, it works like this: if a medical question leads to an uncertain answer, the model is asked to perform a web search for more information to resolve the uncertainty.
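To make that loop concrete, here is a minimal Python sketch of an uncertainty-driven search strategy. The `model.generate` and `web_search` helpers, the sampling-based uncertainty estimate, and the prompt wording are all illustrative assumptions; the paper does not publish Med-Gemini’s actual implementation or API.

```python
# Minimal sketch of an uncertainty-driven search loop (hypothetical helpers).
from collections import Counter

def answer_with_search(question, model, web_search, n_samples=5, threshold=0.8):
    """Answer a question; fall back to web search when the model seems unsure."""
    # Sample several answers and use their agreement as a crude uncertainty signal.
    samples = [model.generate(question) for _ in range(n_samples)]
    top_answer, count = Counter(samples).most_common(1)[0]
    if count / n_samples >= threshold:
        return top_answer  # answers are consistent enough; no search needed

    # Uncertain: retrieve web results and re-ask with them as extra context.
    results = web_search(question, top_k=3)
    context = "\n\n".join(r["snippet"] for r in results)
    prompt = f"Context from web search:\n{context}\n\nQuestion: {question}"
    return model.generate(prompt)
```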

Med-Gemini was tested on 14 medical benchmarks and established a new state of the art (SoTA) in ten of them, outperforming the GPT-4 family of models on all benchmarks.

On the MedQA (USMLE) benchmark, Med-Gemini achieved 91.1% accuracy using its uncertainty-driven search strategy, beating Google’s previous medical LLM, Med-PaLM 2, by 4.5%.

In seven multimodal benchmarks, including the New England Journal of Medicine (NEJM) Image Challenge (images of challenging clinical cases for which a diagnosis must be chosen from a list of ten options), Med-Gemini performed better than GPT-4 by an average relative margin of 44.5%.

“While the results are promising, significant additional research is needed,” the researchers wrote. “For example, we do not consider restricting search results to more authoritative medical sources, using multimodal search retrieval, or analyzing the accuracy and relevance of search results and the quality of citations. Furthermore, it remains to be seen whether smaller LLMs can also be taught to make use of web search. We leave these explorations for future work.”

Searching for information in extensive medical records

Electronic health records (EHRs) can be long, but doctors need to be aware of what they contain. These documents often include similar-looking text, spelling errors, acronyms, and synonyms; in other words, elements that can confuse an AI.

To test Med-Gemini’s ability to understand and reason over long-context medical information, the researchers performed a “needle in a haystack” task using a large public database, the Medical Information Mart for Intensive Care (MIMIC-III).

The goal was for the model to retrieve the relevant mention of a rare and subtle medical condition, symptom, or procedure (the “needle”) in a large collection of clinical notes in the EHR (the “haystack”).

In total, 200 examples were selected, each consisting of a collection of de-identified EHR notes drawn from 44 ICU patients with long medical histories.

The “needle in the haystack” task had two steps. First, Med-Gemini needed to identify mentions of the specified medical problem in the records. Then, the model had to evaluate the relevance of all mentions, categorize them and conclude whether the patient had a history of that problem, providing a clear rationale for its decision.
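As a rough illustration of that two-step workflow, the sketch below chains two prompts over the concatenated notes of one example. The `model.generate` helper and the prompt wording are assumptions made for illustration, not the prompts used in the paper.

```python
# Hedged sketch of the two-step "needle in a haystack" task described above.

def find_condition_history(model, ehr_notes, condition):
    """Retrieve candidate mentions, then judge them and reach a final decision."""
    record = "\n\n".join(ehr_notes)  # long-context: every note is passed at once

    # Step 1: surface every passage that mentions the condition (the "needle").
    mentions = model.generate(
        f"List every passage in the following clinical notes that mentions "
        f"'{condition}':\n\n{record}"
    )

    # Step 2: judge each mention's relevance, categorize it, and conclude
    # whether the patient has a history of the condition, with a rationale.
    verdict = model.generate(
        f"For each mention below, decide whether it shows the patient truly has "
        f"a history of {condition}. Then give a final yes/no answer with a "
        f"brief rationale.\n\nMentions:\n{mentions}"
    )
    return verdict
```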

Med-Gemini held up well against the SoTA method on this task: Google’s AI scored 0.77 in accuracy, compared with 0.85 for the SoTA method.

“Probably the most notable aspect of Med-Gemini is its long-context processing capabilities because they open new frontiers of performance and application possibilities that are unprecedented and previously unfeasible for medical AI systems,” the researchers wrote.

Conversations with Med-Gemini

In a real-world utility test, a user playing the role of a patient asked Med-Gemini about an itchy skin lump.

After requesting an image, the model asked appropriate follow-up questions and correctly diagnosed the rare lesion, recommending what the user should do next.

Med-Gemini was also asked to interpret a chest X-ray for a doctor who was waiting for the formal radiologist’s report, and to formulate a plain-English version of the report that could be shared with the patient.
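A hedged sketch of how such a request could be chained is shown below. The `model.generate` call, its `image` parameter, and the prompt text are assumptions for illustration; the article does not describe a public Med-Gemini API.

```python
# Illustrative two-step prompt chain for the chest X-ray use case above.

def draft_xray_summaries(model, xray_image):
    """Produce a clinician-facing read and a plain-English patient summary."""
    # Clinician-facing interpretation while the formal report is pending.
    clinician_view = model.generate(
        prompt=("Describe the key findings on this chest X-ray for a clinician "
                "awaiting the formal radiology report."),
        image=xray_image,
    )
    # Plain-English rewrite that could be shared with the patient.
    patient_view = model.generate(
        prompt=("Rewrite the following findings in plain English for the "
                f"patient, avoiding medical jargon:\n\n{clinician_view}")
    )
    return clinician_view, patient_view
```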

“The multimodal conversational capabilities of Med-Gemini-M 1.5 are promising given that they are achieved without any specific fine-tuning of medical dialogue,” the researchers wrote. “Such capabilities enable seamless and natural interactions between people, clinicians and AI systems.”

However, the researchers acknowledge that more work is needed.

Researchers explore next steps for Med-Gemini

The researchers acknowledge that there is much more work to be done. They plan to incorporate responsible AI principles, including privacy and equity, throughout the model development process.

“Privacy considerations, in particular, need to be rooted in existing healthcare policies and regulations that govern and protect patient information,” the researchers wrote.

They added: “Equity is another area that may require attention, as there is a risk that AI systems in healthcare may unintentionally reflect or amplify historical biases and inequities, potentially leading to inequitable model performance and harmful outcomes for marginalized groups.”

However, ultimately Med-Gemini is seen as a tool for good. “Large multimodal language models usher in a new era of possibilities for healthcare and medicine,” the researchers said.

