A Lay Introduction to the State of AI in Medicine, July 2025
Bradford Cornell is Emeritus Professor of Financial Economics, Anderson Graduate School of Management, UCLA. D.A. Wallach is co-founder and General Partner of Time BioVentures.
On November 30, 2022, when ChatGPT was first launched to the public, few predicted that within a year millions of patients would be using it to discuss their medical conditions, or that leading technologists would be forecasting a dramatic acceleration of drug discovery based on the application of AI. But that is what happened. Within five days of launch, ChatGPT crossed 1 million users, and within two months it had more than 100 million users, making it the fastest-growing consumer app in history.
The freakishly human-sounding chatbot intensified ongoing debates regarding exactly what AI can accomplish and how soon it will achieve so-called artificial general intelligence (AGI). Our purpose here is not to enter that debate but to argue that AI, as it exists today, can already make immense contributions to the two pillars of healthcare: 1) the clinical practice of medicine and 2) biotechnology research to develop new treatments. Any improvements in this technology, which are certain to be forthcoming, will be icing on the cake, only strengthening our arguments.
Much of the recent AI discourse is preoccupied with the contested “reasoning” capabilities of current models. In a widely discussed paper, Apple researchers Shojaee et al. (2025) concluded that “Our findings reveal fundamental limitations in current models: despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds.” We are happy to accept this conclusion but view it as a philosophical point. The nature of bona fide consciousness or generalizable reasoning is a deep scientific and metaphysical question, and AI need not meet either criterion to deliver life-changing utility to patients. To the extent that AI can simply mimic the reasoning behavior of trained human clinicians, it stands to democratize high-quality medical knowledge and judgment for every person on the planet. As with self-driving cars and other automation technologies, the benchmark for medical AI should be the error-prone human care we have today, not an idealized model of human care that we have never enjoyed and never will.
Consumer Medical AIs Today
Large language models (LLMs) are adept at analyzing the massive amounts of data in the “training sets” they are fed. A model like ChatGPT has devoured virtually all written content that is digitally available, which likely includes the entire medical school curriculum in every specialty, all published research in medical journals, online reports of experimental findings, and transcripts from scientific meetings. Just as a human physician’s command of English is the product of far more than the study of medicine, an LLM’s ability to generate language draws on everything it has encountered, of which the medical corpus is only a subset. LLMs are thus capable of generating domain-specific content by leveraging their familiarity with the statistical relationships among words across the language writ large.
In other words, as an LLM “speaks,” it is predicting what word is most likely to come next based upon the co-occurrence of words generally. That this approach produces cogent and seemingly intelligent output is surprising, to say the least. But because it builds upon everything humankind has ever written, it is as much a tribute to our civilization as to the AI.
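To make this concrete, the toy program below is a minimal sketch of next-word prediction from co-occurrence counts. Real LLMs use neural networks trained on vast corpora rather than raw bigram tallies, and the tiny “corpus” here is invented solely for illustration.

```python
# A toy next-word predictor built from co-occurrence counts. Real LLMs
# use neural networks over vast corpora; this sketch only illustrates
# the core idea of predicting the next word from how often words have
# followed one another in the training text.
from collections import Counter, defaultdict

# A deliberately tiny, invented "training corpus".
corpus = (
    "the patient reports chest pain . "
    "the patient reports fatigue . "
    "the doctor reports improvement ."
).split()

# Count how often each word follows each preceding word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the word most likely to follow `word` in the corpus."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))      # -> "patient" (seen twice vs. once for "doctor")
print(predict_next("patient"))  # -> "reports"
```

Scaled up by many orders of magnitude, and with far more context than a single preceding word, this statistical machinery is what lets a model answer a medical question in fluent prose.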
The result, so far, is a set of chat interfaces available 24/7 to respond to patient inquiries in an organized, succinct, and friendly fashion. They can also remember interactions with patients and thereby establish rapport based on an understanding of a patient’s medical history and personal concerns. What’s more, so-called multimodal models can increasingly interpret non-language inputs such as MRI images or ECG readings and seamlessly incorporate these data into their analyses.
Professor Cornell, one of the authors, is nearly 78 years old and faces a number of medical challenges. He reports that the models have been a tremendous aid in educating him about his medical issues and possible treatments. The medical information he needs is readily available to him through back-and-forth discussion with the model. Like other patients and physicians, Professor Cornell does not need AI to solve complex reasoning problems of the type described by the aforementioned Apple researchers. He simply needs the system to sift through the massive amount of available medical data, identify the information responsive to his prompts, and summarize that information in clear, understandable language. Current large language models such as ChatGPT already do a remarkable job of this.
In an op-ed published in the Wall Street Journal, Rosenblatt (2025) describes his ordeal with vague chronic pain. He writes, “My days blurred with crushing headaches, stomach spasms and overwhelming fatigue, even though my lab tests read ‘normal’ for a 30-year-old. Five months on a carnivore diet, medications, water-only fasts and Qigong did nothing to lift the fog.” In an attempt to find help, he cycled through countless specialists. He notes that “A neurologist focused on my head pain but not my diet; a gastroenterologist examined gut inflammation and ignored my migraines; an ear, nose and throat doctor probed sinus inflammation, missing other factors. Each offered partial help, but no one connected all the dots.” No one, that is, except for AI. Instead of being told, “Try this drug, come back in four months,” AI offered immediate insights into medicine and lifestyle tweaks that, taken together, made a big difference. As Rosenblatt observed, “AI can weave together oncology, endocrinology, nutrition and mental health, reflecting that the body is one interconnected system.”
Rosenblatt’s experience is hardly unique; the internet is flooded with similar stories. In a recent example, Kate Rouch (2025), who happens to be the chief marketing officer at OpenAI, reported that “I've also leaned on ChatGPT throughout treatment--to explain cancer in an age-appropriate way to my kids, to manage chemo side effects, to craft custom meditations during the many scary times.”
Expanding on this theme, Kate Pickert (2025) makes the claim that in some cases AI may be better than doctors at the most human part of medicine. She relates the story of Rachel Stoll, who suffered from a rare condition called cyclic Cushing’s syndrome. As Pickert recounts it, in her appointments with doctors, Stoll always tried to keep her questions short because of the time limits on her appointments. But with ChatGPT there was no time limit; she could keep asking questions until she felt she understood. In addition, the chatbot adopted a softer, more sympathetic tone than that of her doctors, whose impatience often upset Stoll. ChatGPT’s answers included phrases like “That must be so frustrating” and “I’m sorry,” giving Stoll the sense that it was responding to her feelings as well as her queries. It ended one session by saying, “I know it’s been a long battle, but you’re clearly persistent and informed – which is exactly what it takes to get the right care. If you ever need help brainstorming ways to advocate for yourself or interpret test results, I’m here. Hang in there!” Busy doctors simply do not have the time to compete with this always-on, unencumbered customer-service experience. That is undoubtedly a major component of what patients desire, and we expect them to continue voting with their smartphones for this new mode of care.
Clinical AIs
Kate Pickert’s article goes on to quote Dr. Bernard Chang, dean of medical education at Harvard Medical School, who argues that “AI is not going to replace physicians, but physicians who know how to use AI are going to be at the top of their game going forward.”
In a controlled study, Chen et al. (2025) found that physicians using an LLM scored significantly higher than those using conventional resources on a series of typical physician tasks. The authors concluded that LLM assistance can improve physician management reasoning in complex clinical vignettes. Critics of such studies point out that the highly curated cases used to test these models are unrealistic. In the real world, patients often show up in the ER or at their physician’s office with vague and imprecise reports of their symptoms. Skilled diagnosticians are masterful at eliciting relevant information through deliberate conversation and examination, a process that involves subtle emotional intelligence as well as highly trained, structured reasoning. Whereas an LLM may land on the correct questions and conclusions through its opaque approach, a physician is trained to proceed through a Bayesian inference process that, when faithfully followed, should yield accurate diagnoses with a high degree of reliability.
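To make that Bayesian process concrete, here is a minimal sketch of how a prior probability of disease is revised by a test result. All of the numbers (the 2% prior, 90% sensitivity, 95% specificity) are hypothetical, chosen purely for illustration.

```python
# A minimal sketch of Bayesian diagnostic updating. All numbers below
# are hypothetical and chosen only to illustrate the arithmetic.

def posterior(prior, sensitivity, specificity, test_positive=True):
    """Update P(disease) after a test result using Bayes' theorem."""
    if test_positive:
        p_result_given_disease = sensitivity        # P(+ | disease)
        p_result_given_healthy = 1.0 - specificity  # P(+ | healthy)
    else:
        p_result_given_disease = 1.0 - sensitivity  # P(- | disease)
        p_result_given_healthy = specificity        # P(- | healthy)
    numerator = p_result_given_disease * prior
    return numerator / (numerator + p_result_given_healthy * (1.0 - prior))

# A condition with a 2% prior in this clinic; a test with 90% sensitivity
# and 95% specificity comes back positive:
p = posterior(prior=0.02, sensitivity=0.90, specificity=0.95)
print(f"P(disease | positive test) = {p:.2f}")  # ~0.27, far from certain
```

The counterintuitive result, a positive test raising the probability only to about 27%, is exactly the kind of step that structured reasoning enforces and that both hurried clinicians and opaque models can fumble.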
Gerald Loeb (2021) has argued that a software co-pilot that enforces this process for clinicians has a greater likelihood of supporting high-quality care than do LLMs. This is reminiscent of the debate that has raged since the 1960s regarding the comparative advantages of so-called “expert systems” vs. “statistical” AIs like LLMs. In fact, Internist-I, a medical diagnostic program begun in the 1970s, was among the first and most elaborate aspirational AI systems and typified the expert-systems ethos (https://en.wikipedia.org/wiki/Internist-I). In our view, a synthesis of these conceptual approaches will ultimately deliver the best results. There is no doubt that error avoidance, transparency, and reliability are critical attributes of an AI doctor or doctor’s assistant. Achieving them may indeed require a marriage of LLMs and more structured systems like the one Loeb proposes.
That said, the human-centric care model we have inherited is in desperate need of the leverage that AI will offer. The American Medical Association (2024) reports that physicians—nearly half of whom report burnout—juggle thousands of patients each. These doctors don’t have time to adequately meet the demands of their patients today, and human minds alone will never be capable of parsing emerging forms of patient data like months of smart-watch logs. Furthermore, it is not simply that doctors don’t have enough hours in the day. We do not have enough doctors in the first place. The Association of American Medical Colleges (2021) forecasts a shortfall of 83,000 physicians by 2030. In the poorer parts of the globe, where doctors are a rarity, the situation is much more dire.
At least initially, AI may place an added burden on physicians. They will have to both understand their patient’s condition and deal with what the patient has learned from repeated interactions with AI, some of which the patient may not have correctly understood. It is easy to see how a physician could be aggravated by competing with a machine that is available to the patient 24/7.
An entire book could be written about the other challenges of integrating patient-facing AI into medical institutions that have been built up over centuries. Nonetheless, it is worth identifying what are likely to be some of the major obstacles to come:
Integration with Existing Systems: Hospitals and clinics often rely on complex legacy systems like Electronic Health Records (EHRs). Integrating new AI solutions with these existing systems smoothly and efficiently could be a significant technical hurdle.
Explainability and Transparency: Many advanced AI models, particularly deep learning systems, are considered "black boxes," making it difficult to understand how they arrive at their decisions. This lack of transparency can be a major barrier to adoption for clinicians and regulators who need to understand the reasoning behind AI recommendations to ensure patient safety and build trust.
Clinician Resistance and Adoption: Healthcare professionals may be hesitant to adopt AI due to concerns about job displacement, lack of trust in the technology, fear of errors, and the need for significant training and education. Involving clinicians in the design and implementation process and providing comprehensive training can help address these concerns.
Patient Acceptance and Trust: Patients may have varying attitudes towards AI in healthcare. Building trust requires transparency and clear communication about how AI will be used and how patient data will be protected. Patients should also be given the option to opt out of AI-driven care.
Accountability and Liability: Determining responsibility when an AI system makes an error can be challenging. Legal frameworks and clear guidelines on accountability are needed to address this issue.
Impact on the Doctor-Patient Relationship: While AI can assist with diagnoses and treatment planning, maintaining a strong human connection between patients and healthcare providers is crucial for effective care. AI should be seen as a tool to augment, not replace, human interaction. In addition, if applied inappropriately, the technology is capable of significant intrusions into privacy.
Despite these hurdles, usage of AI is growing rapidly among both physicians and patients. A summary prepared by Google Gemini (2025), itself an AI model, reports that nearly two-thirds (66%) of physicians surveyed in 2024 reported using AI, a substantial increase from 38% the previous year. Furthermore, 65% of US hospitals reported using AI or predictive models in 2023, often integrated with their electronic health records (EHRs). In particular, OpenEvidence, a physician-tailored AI trained on proprietary journals and other specialized sources, appears to be used by upwards of one-third of physicians. With respect to patients, about 17% of adults use AI chatbots at least once a month to find health information and advice, a figure that rises to 25% among adults under 30.
Early adopters of AI in healthcare are seeing promising results, and a majority of Americans (61%) agree that a main benefit of using AI in healthcare is to diagnose and detect health conditions. However, some remain skeptical, with 33% believing AI could lead to worse patient outcomes. It is incumbent upon medical AI innovators to continue generating hard evidence in controlled studies to confirm or, ideally, disprove these fears. Major corporations including Microsoft and Google are producing a rapid succession of papers in this spirit, but they of course have a commercial interest in this space, requiring robust confirmation of their findings from impartial academic researchers.
AI Cures
The other major area in which AI will have a transformative impact is biotechnology research. As Demis Hassabis (2025a) states in his introduction to Isomorphic Labs, the Alphabet-owned biotechnology startup he oversees, “At its most fundamental level, I think biology can be thought of as an information processing system, albeit an extraordinarily complex and dynamic one. Taking this perspective implies there may be a common underlying structure between biology and information science - an isomorphic mapping between the two - hence the name of the company. Biology is likely far too complex and messy to ever be encapsulated as a simple set of neat mathematical equations. But just as mathematics turned out to be the right description language for physics, biology may turn out to be the perfect type of regime for the application of AI.”
Any doubts about the potential of AI to transform biological research were laid to rest by AlphaFold 2, a model developed by Hassabis and John Jumper, who shared the 2024 Nobel Prize in Chemistry for this work. Whereas it had taken humanity more than fifty years to determine the structures of about 150,000 proteins, AlphaFold 2 predicted the structures of more than 200 million proteins, covering virtually all those known to science, in a matter of months. Google then made all of the data freely available to researchers worldwide. As with other gobsmacking AI proofs of concept, AlphaFold owed a great deal to the painstaking molecular biology experiments done by scientists over prior decades, which supplied its training data. Echoing Hassabis’s point above, the “regime” of AI was uniquely capable of learning regularities in experimental data that had theretofore yielded few of the parsimonious “laws” that dominate physics.
Determining the structure of proteins is only the beginning. Understanding health and disease requires demystifying the myriad of molecular interactions that constitute the life of the cell, and ultimately, the whole organism. As Isomorphic Labs (2025) puts it, “Inside every plant, animal, and human cell are billions of molecular machines. They’re made up of proteins, DNA, and other molecules, but no single piece works on its own. Only by seeing how they interact together, across millions of types of combinations, can we start to truly understand life’s processes. In a paper published in Nature, we introduced AlphaFold 3, a revolutionary model that can predict the structure and interactions of all life’s molecules with unprecedented accuracy. For the interactions of proteins with other molecule types we see at least a 50% improvement compared with existing prediction methods, and for some important categories of interaction we have doubled prediction accuracy.”
Priscilla Chan (2025), the head of the Chan Zuckerberg Initiative (CZI), has echoed a similar theme. She says, “Using AI to understand biology means teaching models how to speak the language of cells. . . We recently released our first AI model that can decipher that language. It’s called TranscriptFormer. If you feed it cell atlas data, TranscriptFormer can tell you what kinds of cells you’re looking at and whether they’re healthy or sick — and in cases where they are sick, it can tell you what those cells are doing to defend themselves.” As CZI puts it, TranscriptFormer represents a significant advancement in biological models to help discover how cells work — across different tissues, in different states like infection or disease, and especially across species. It is an important step towards virtual cell models, where virtual experiments, rather than time-intensive lab experiments, can help speed up the discovery process and give more immediate feedback on research questions.
Ms. Chan states that CZI’s mission in science is to help cure, prevent, and manage all disease by the end of this century. Hassabis (2025b) has made an even more dramatic claim, arguing during a 60 Minutes interview that AI could cure most major diseases within the next decade.
The points made by Hassabis and Chan illustrate the optimism and ambition driving this frontier of science. But their goals will not come cheaply. Training models to recapitulate life requires enormous amounts of data that we do not yet possess, along with enormous processing power to train on them. Even in their current forms, models like AlphaFold 3 and TranscriptFormer require large amounts of money and compute, and future enhancements will require more of both. Isomorphic Labs recently raised $600 million in external funding. The Chan Zuckerberg Initiative has received large grants from its founders and, as a result, has access to massive compute power.
AlphaFold 2 was fortunate in that it could be trained on a database of protein structures that already existed: the Protein Data Bank. To build models capturing other aspects of life, institutes and firms are having to develop their own training sets. CZI has taken multiple steps in this direction by funding projects like the Human Cell Atlas (HCA), which hosts petabytes of single-cell sequencing data covering millions of individual cells, exactly the kind of data on which models like TranscriptFormer depend.
A handful of start-ups, including Insitro and Xaira, have raised billions of dollars to invest in proprietary datasets. But most biotechnology companies do not have, and likely never will have, resources approaching these. It remains unclear to what extent “foundation models” of basic biology will be open-sourced for the benefit of the entire academic and industrial biotech sectors versus kept privileged as competitive tools for individual firms.
Fortunately, non-profit organizations like CZI are making much of their technology available to the scientific community. Recently, CZI (2025) announced that “We will demonstrate our commitment to openly sharing resources for modeling, evaluating and analysis of cellular data with the scientific community.”
The Unsolved Bottleneck
As exciting as cellular simulation may be, the biggest obstacle to the approval of new treatments is likely to remain clinical trials. Currently, the average clinical trial period runs 6 to 7 years: approximately 1 year in Phase I, 2 years in Phase II, and up to 4 years in Phase III. After Phase III, there is often an additional 6 to 12 months of regulatory review and approval. In short, it is exceedingly costly, in time and dollars, to find out whether drugs are safe and effective.
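A back-of-the-envelope tally of those figures is sketched below; the 2-year lower bound for Phase III is our own assumption, since only the upper bound is given above.

```python
# Tallying the development timeline cited above. Durations are in years;
# the Phase III lower bound (2 years) is an assumption, since only the
# "up to 4 years" upper bound is given in the text.
phase_i   = (1.0, 1.0)
phase_ii  = (2.0, 2.0)
phase_iii = (2.0, 4.0)   # assumed lower bound; "up to 4 years"
review    = (0.5, 1.0)   # 6-12 months of regulatory review

low  = sum(p[0] for p in (phase_i, phase_ii, phase_iii, review))
high = sum(p[1] for p in (phase_i, phase_ii, phase_iii, review))
print(f"First-in-human to approval: {low} to {high} years")
# -> First-in-human to approval: 5.5 to 8.0 years
```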
The question is whether this process could be significantly shortened by reliable cellular simulation. Given the immense complexity of the human body and immune system, a complete simulation of the body would seem to be required to assess the risks and efficacy of new drugs. That is a daunting task. As we have argued, models are highly dependent on the data on which they are trained. As long as there is no shortcut for generating millions of real human drug experiments and their corresponding outcomes, it is unclear to us how models capturing whole-organism biology can be trained. Thus, for the time being, it appears that traditional clinical trials will be required to assess new AI-developed treatments. If so, forecasts of curing most major diseases in the next decade, such as the one put forth by Hassabis, are likely to prove overly optimistic.
What we are witnessing, nonetheless, is a whole new toolkit for molecular biology: the dawn of an era in which cell-based experimentation is likely to be far faster, cheaper, and more scalable. This will likely give rise to an unprecedented “gusher” of new drug concepts rooted in virtual experiments. The challenge will be to efficiently allocate resources to the development of the most promising of these concepts. Today, biotech investors and pharmaceutical firms manage this resource allocation. Perhaps AI will increasingly play a role in this process as well, reducing the risk and enhancing the returns associated with drug discovery and development.
In conclusion, whatever the limitations of current AIs for creative reasoning and solving complex conceptual problems, they are already capable of making tremendous contributions to the provision of healthcare and biotechnology research. In care delivery, the impediments to taking full advantage of AI are primarily human, not technological. Our sclerotic medical institutions and practices, built up over centuries, hamstring the speed of innovation. But it is a big world, and we may see “leapfrogging” of the advanced industrial societies by poorer nations whose dire healthcare deficits engender agile and energetic uptake of medical AI. In biotechnology research, we are in the early innings of a revolution that hopefully leads to cures and life extension that we can barely imagine. But our sober assessment is that the decade ahead will primarily yield breakthroughs in basic molecular research, leaving drug development roughly as vexing and specialized as it is today until clinical trials are obviated by future invention.
References
American Medical Association, 2024, Physician Burnout Rate Drops Below 50% for First Time in 4 Years, https://www.ama-assn.org/practice-management/physician-health/physician-burnout-rate-drops-below-50-first-time-4-years.
Association of American Medical Colleges, 2021, The Complexities of Physician Supply and Demand: Projections from 2019 to 2034, IHS Markit Ltd.
Chan, Priscilla, 2025, Why We’re Going All In on Biology and AI, https://chanzuckerberg.com/blog/All-in-on-biology-and-ai/.
Chan Zuckerberg Initiative, 2025, Accelerating Biology with AI, https://virtualcellmodels.cziscience.com/.
Chen, Jonathan H., et al., 2025, GPT-4 Assistance for Improvement of Physician Performance on Patient Care Tasks: A Randomized Controlled Trial, Nature Medicine, February 5.
Google Gemini, 2025, Patient Interaction with AI.
Hassabis, Demis, 2025a, Introducing Isomorphic Labs, https://www.isomorphiclabs.com/articles/introducing-isomorphic-labs.
Hassabis, Demis, 2025b, AI Could Cure Most Major Diseases within a Decade, 60 Minutes interview, CBS News.
Isomorphic Labs, 2025, AlphaFold 3 Predicts the Structure and Interactions of All of Life’s Molecules, Isomorphic Labs website.
Pickert, Kate, 2025, Why AI Is Better than Doctors at the Most Human Part of Medicine, Bloomberg, April 11.
Rouch, Kate, 2025, Leaning on AI, https://x.com/kate_rouch/status/1931075872481743289.
Rosenblatt, Tom, 2025, AI Helped Heal My Chronic Pain, Wall Street Journal, May 13, op-ed page.
Shojaee, Parshin, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, and Mehrdad Farajtabar, 2025, The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity, Apple, 1-30.