Should an AI copy of you help decide if you live or die?

https://arstechnica.com/features/2025/10/should-an-ai-copy-of-you-help-decide-if-you-live-or-die/

Ashley Belanger | Oct 20, 2025

For more than a decade, researchers have wondered whether artificial intelligence could help predict what incapacitated patients might want when doctors must make life-or-death decisions on their behalf.

It remains one of the most high-stakes questions in health care AI today. But as AI improves, some experts increasingly see it as inevitable that digital “clones” of patients could one day aid family members, doctors, and ethics boards in making end-of-life decisions that are aligned with a patient’s values and goals.

Ars spoke with experts conducting or closely monitoring this research who confirmed that no hospital has yet deployed so-called “AI surrogates.” But AI researcher Muhammad Aurangzeb Ahmad is aiming to change that, taking the first steps toward piloting AI surrogates at a US medical facility.

“This is very brand new, so very few people are working on it,” Ahmad told Ars.

Ahmad is a resident fellow working with trauma department faculty at the University of Washington’s UW Medicine. His research is based at Harborview Medical Center in Seattle, a public hospital in the UW Medicine health system. UW Medicine is integrated with “one of the world’s largest medical research programs” to pursue its mission of improving public health outcomes, UW’s website says.

UW wasn’t specifically seeking a fellow to experiment with AI surrogates, Ahmad told Ars. But since his project proposal was accepted, he has spent most of this year “in the conceptual phase,” working toward testing the accuracy of AI models based on Harborview patient data.

The main limitation of this testing, Ahmad said, is that he can only verify the accuracy of his models if patients survive and can later confirm that the model made the right choice. But this is just the first step, he said. The accuracy testing could then expand to other facilities in the network, with the aim of developing AI surrogates that can accurately predict patient preferences about “two-thirds” of the time.

Currently, Ahmad’s models are focused on analyzing data that Harborview already collects, such as injury severity, medical history, prior medical choices, and demographic information.

“We use that information, feed it to a machine learning predictive model, and then in the retrospective data, we observe how well the model is doing,” Ahmad said.
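
To make that approach concrete, here is a minimal, hypothetical sketch of this kind of retrospective evaluation: a generic scikit-learn classifier trained on structured patient features and scored against preferences that were later confirmed. The file, column names, and model choice are invented assumptions for illustration, not Ahmad's actual pipeline or Harborview's data schema.

```python
# Minimal, hypothetical sketch of retrospective accuracy testing.
# Column names, the CSV file, and the classifier choice are illustrative
# assumptions, not Ahmad's actual pipeline or Harborview's data schema.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Retrospective records: structured features a hospital already collects,
# plus the treatment preference the patient later confirmed (all assumed
# to be numerically encoded).
df = pd.read_csv("retrospective_cases.csv")  # hypothetical extract

features = ["injury_severity_score", "age", "prior_dnr_order",
            "chronic_conditions_count"]       # hypothetical field names
X = df[features]
y = df["confirmed_preference"]  # e.g., 1 = wanted the intervention, 0 = declined

# Hold out a test set so accuracy is measured on cases the model never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

model = GradientBoostingClassifier().fit(X_train, y_train)
predictions = model.predict(X_test)

# "How well the model is doing" on retrospective data: agreement between
# the predicted preference and what the surviving patient later confirmed.
print(f"Retrospective agreement: {accuracy_score(y_test, predictions):.2f}")
```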

No patient has yet interacted with Ahmad’s models, he confirmed. UW Medicine spokesperson Susan Gregg told Ars there’s “considerable work to complete prior to launch,” and the system “would be approved only after a multiple-stage review process.”

“We have not enrolled any patients at Harborview,” Ahmad said. “We are still at the phase of defining the scope and what theoretical considerations to take into account. It will be some time before it gets off the ground, given the challenges involved.”

In the future, though, Ahmad envisions models that would also analyze textual data, perhaps from patient-approved recorded conversations with their doctors, to inform their AI copy’s predictions. In that world, trusted human surrogates, such as family members, could provide other textual data from chats or texts with the patient. In the technology’s most “ideal” form, Ahmad sees patients interacting with AI systems throughout their lives, providing feedback to refine models as the patients age through the health system.

“It takes time to get the relevant data,” Ahmad said.

Before patients could begin interacting with AI surrogates, any human subject testing would need to be approved by an institutional review board (IRB), Ahmad said.

Ultimately, he expects that AI surrogates won’t be a perfect model but rather a set of rigorously tested systems that doctors and loved ones can consult when assessing all the known information about what a patient would want in critical moments.

Whether hospitals would ever adopt such a system is unclear. “Within this space, practitioners are more conservative, and I would even argue that’s rightfully so,” Ahmad said.

Gregg told Ars that UW Medicine supports the “thoughtful exploration of innovative ideas, such as the potential responsible and transparent use of AI surrogates in end-of-life care,” as they reflect “our commitment to advancing both science and compassion in medicine.”

“While end-of-life decision-making represents a particularly complex area, we view these decisions as essential to addressing important questions, such as how to best honor patient wishes when they may be unable to communicate them directly or have no next of kin to do so on their behalf,” Gregg said.

Is AI a bad fit for patients with no human surrogates?

It has always been hard for doctors to determine what patients want when they can’t speak for themselves. A patient may refuse to be put on a ventilator or receive dialysis or cardiopulmonary resuscitation (CPR) if, for example, they’ve expressed that they want to avoid discomfort at the end of their life. Others may fear complications like infections or have no desire to rely on a machine for life support. Some patients, like young people involved in accidents, may have never expressed preferences.

Emily Moin, a physician in an intensive care unit in Pennsylvania, told Ars that time is a factor in these decisions, but it’s imperative that a human surrogate who may better understand the patient’s wishes be involved.

“When we’re in one of these fast-paced situations where we don’t know, but we have a patient in front of us who has died, we will err on the side of providing [CPR] until we are able to arrive at the clinical judgment that that effort is no longer indicated or until we’re able to engage with a surrogate decision maker,” Moin explained.

Reaching the surrogate, she said, is “an important part of taking care of someone.”

Ahmad hopes that AI could help alleviate stress in uncertain moments. For doctors and surrogates, these decisions can be “very emotionally taxing,” Ahmad told Ars, leading many people to second-guess what the patient would choose. Some studies have shown that surrogates often get it wrong, he said, and he believes AI could help improve the odds of success.

Seeking to nip this problem in the bud, health systems have historically pushed patients to complete “advance directives” to log their preferences. Over time, though, it has become clear that patients’ preferences tend to be unstable, sometimes changing within days.

Doctors must also consider that some patients have no stated preferences. Others, Moin said, have reported that their preferences changed after receiving lifesaving treatments because they now know what to expect. These factors, she said, point to further limitations of Ahmad’s planned testing, which would gauge accuracy by checking whether the AI’s prediction matches what a patient, once recovered, says they would have wanted.

“These decisions are dynamically constructed and context-dependent,” Moin said. “And if you’re assessing the performance of the model based on asking someone after they’ve recovered what they would have said before they recovered, that’s not going to provide you with an accurate representation.”

Moin said one of the big problems with medical AI is that people expect it to “provide better predictions than what we’re currently able to generate.” But the models are being trained on “convenient ground truths,” she said, that don’t “provide meaningful examples for models to learn about the situations” where the models would be employed.

“I imagine that they would actually want to deploy this model to help to make decisions for unrepresented patients, patients who can’t communicate, patients who don’t have a surrogate,” Moin said, “but those are exactly the patients where you’ll never be able to know what the so-called ground truth is, and then you’ll never be able to assess your bias, and you’ll never be able to assess your model’s performance.”

Family members may default to agreeing with AI

Culturally, the US has shifted from being “very focused on patient autonomy” to “more of a shared decision-making and, at times, family- and community-focused lens” as the standard for making these difficult decisions, Moin said.

The longer a doctor knows a patient, and the more conversations a patient’s health team has with family members, the more likely it is for health systems to be able to adapt to respect the patient’s wishes over time, Moin suggested.

That idea echoes Ahmad’s “ideal” AI surrogate model. But Moin said that if patients talk to an AI, it could actually discourage them from having important conversations with family members. Studies have found that once a patient fills out an advance directive, it can become harder to determine their preferences, Moin said, because patients may be less likely to discuss those preferences with loved ones.

Earlier this year, Moin urged human surrogates to remain closely involved in do-not-resuscitate orders, writing that doctors who unilaterally make these decisions have an ethical obligation to “ensure that patients and surrogate decision-makers are aware that the decision has been made” and face “the lowest of barriers” to expressing disagreement.

“Forgoing CPR is one of the most consequential treatment decisions a patient or surrogate can make because, if invoked, it will necessarily lead to death,” Moin wrote.

Moin told Ars she hopes an AI surrogate’s outputs would never be weighted more than a human surrogate’s opinion, which is based on lived experience with a patient. “But I do worry that there could be culture shifts and other pressures that would encourage clinicians and family members, for that matter, to lean on products like these more heavily,” she said.

“I can imagine a scenario where, say, a doctor is expected to round on 24 critically ill patients in one day, and the family member is resistant to sitting down for a conversation,” Moin said. “So yeah, maybe all parties involved would default to the shortcut of incorporating the information from this model.”

Moin called for more public awareness and debate on AI surrogates, noting that “people really hate” the use of algorithms to determine who gets care.

“I don’t think that it would be good for patients or clinicians or society for that matter,” Moin said.

She’s particularly worried that “patients who can’t speak for themselves and who don’t have a clear loved one” would be “the ones who would be most vulnerable to suffering harms” of AI surrogates making wrong calls. Too many such mistakes could further erode trust in health systems, Moin said.

AI surrogates may be redundant

These decisions are “psychosocially fraught” for everyone involved, Teva Brender, a hospitalist at a medical center for veterans in San Francisco, told Ars. That’s why testing like Ahmad’s is important, he said.

Last year, Brender co-authored an opinion piece noting “how difficult it can be for families to make decisions for incapacitated patients,” particularly in geriatrics, palliative, and critical care settings.

“For many, the notion of incorporating AI into goals-of-care conversations will conjure nightmarish visions of a dystopian future wherein we entrust deeply human decisions to algorithms,” Brender’s team wrote. “We share these apprehensions.”

But with doctors’ and surrogates’ predictions facing significant limitations, “it behooves us to consider how AI could be safely, ethically, and equitably deployed to help surrogates for individuals who are seriously ill,” Brender’s team concluded.

And it’s “equally important,” Brender told Ars, to help patients choose surrogates and to prepare those surrogates to make substituted judgments on the patient’s behalf.

Brender believes Ahmad’s research is worthwhile since there are “lots of questions” requiring scientific research. But he’s “glad to hear” that AI surrogates are “not actually being used among patients” at Harborview yet. “I can’t imagine that an IRB would approve such a thing this early,” he told Ars.

And AI surrogates may end up playing a redundant role, leading this potential use for AI to fall out of favor, Brender said.

“The devil’s advocate perspective,” Brender said, is that AI surrogates would just be doing “what a good clinician does anyway,” which is to ask surrogates, “Hey, who was this person? What did they enjoy doing? What brought meaning to their life?”

“Do you need an AI to do that?” Brender asked. “I’m not so sure.”

AI can’t replace human surrogates, doctors warn

Last month, bioethics expert Robert Truog joined R. Sean Morrison, a doctor dedicated to advancing palliative care aimed at improving the quality of life for people suffering life-threatening illnesses, in emphasizing that AI should never replace human surrogates in resuscitation decisions.

“Decisions about hypothetical scenarios do not correlate with decisions that need to be made in real time,” Morrison told Ars. “AI cannot fix this fundamental issue—it is not a matter of better prediction. Patients’ preferences often represent a snapshot in time that are simply not predictive of the future.”

The warning came after Georg Starke, a doctor and senior research associate at the Chair for Ethics of AI and Neuroscience at the Technical University of Munich, co-authored a proof-of-concept showing that three AI models, on average, performed better than human surrogates in predicting patient preferences.

Starke’s study relied on existing data from Swiss respondents to a European survey that tracked population health trends of individuals over 50 years old. The dataset offered “comprehensive information on participants’ end-of-life preferences, including questions concerning” CPR. That allowed the team to build three models: a simple model, a model based on commonly available electronic health records, and a more “personalized” model. Each model successfully predicted whether a patient experiencing cardiac arrest would want CPR, with an accuracy of up to 70 percent.
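
As a rough, hypothetical illustration of that study design, the sketch below compares three nested feature sets, a simple one, an EHR-like one, and a more personalized one, on a binary CPR-preference label using cross-validated accuracy. The file and column names are invented; this is not the published study’s code or the underlying survey’s schema.

```python
# Hypothetical sketch comparing three nested feature sets ("simple",
# "EHR-like", "personalized") on a binary CPR-preference label.
# File and column names are invented; this is not the study's actual code
# or the underlying survey's schema.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("survey_respondents.csv")  # hypothetical survey extract
y = df["wants_cpr"]                         # 1 = would want CPR, 0 = would not

feature_sets = {
    "simple":       ["age", "is_female"],
    "ehr_like":     ["age", "is_female", "num_chronic_conditions",
                     "hospitalized_last_year"],
    "personalized": ["age", "is_female", "num_chronic_conditions",
                     "hospitalized_last_year", "self_rated_health",
                     "religiosity_score", "has_advance_directive"],
}

for name, cols in feature_sets.items():
    model = LogisticRegression(max_iter=1000)
    # Five-fold cross-validated accuracy: the fraction of respondents whose
    # stated CPR preference the model predicts correctly.
    acc = cross_val_score(model, df[cols], y, cv=5, scoring="accuracy").mean()
    print(f"{name:>12}: mean accuracy {acc:.2f}")
```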

His team’s research was intended to “ground a long-standing ethical debate in empirical data,” Starke told Ars.

“For over a decade, people have speculated about using algorithms to improve clinical decision-making for incapacitated patients, but no one had shown whether such a program could actually be designed,” Starke said. “Our study was meant to test if it’s feasible, explore how well it performs, identify which factors influence the models’ decisions, and spark a broader debate about the technology.”

A key limitation of AI models depending on “‘accuracy’ alone”—especially if that “accuracy” is “achieved by chance or by pattern-matching purely demographic data outside an individual’s control”—is that the outputs don’t “necessarily reflect an autonomous choice,” Starke said.

Like Truog and Morrison, Starke’s team emphasized that “human surrogates will remain essential sources for the contextual aspects of specific situations,” particularly for patients with dementia, and agreed that AI models “should not replace surrogate decision-making.”

Chatbot surrogates could be bad

Human surrogates may grow to trust AI systems in the future, but “it’s all about how the information is presented,” Brender, the hospitalist, told Ars.

He thinks that AI systems could best serve as a “launchpad” for discussions, giving surrogates a way to consider what data may be significant to the patient.

But he agreed with Moin that without transparency about how AI surrogates arrive at decisions, AI could sow distrust.

Imagine, for example, if an AI system didn’t know about a new treatment for cancer that could completely change a patient’s prognosis. Patients might be better served, Brender suggested, if hospitals invested in AI to improve prognosis instead of “literally predicting what a patient would want.” Truog and Morrison also suggested that AI research like Ahmad’s could help hospitals determine what kinds of patients tend to have more stable preferences over time.

Brender suggested that a nightmare scenario could arise if an AI surrogate, presented in a chatbot interface, leads doctors and family members to put “too much trust” in an algorithm. That’s why transparency and rigorous testing will be critical if this technology is ever deployed, he said.

“If a black-box algorithm says that grandmother would not want resuscitation, I don’t know that that’s helpful,” Brender said. “You need it to be explainable.”

Research on bias of AI surrogates doesn’t exist

Ahmad agreed that a human should always be in the loop. He emphasized that he’s not rushing to deploy his AI models, which remain in the conceptual phase. Complicating his work, there’s currently little research exploring bias and fairness in the use of AI surrogates.

Ahmad aims to begin filling that gap with a preprint paper, set for release this week, that maps out various notions of fairness and then examines fairness across moral traditions. Ultimately, Ahmad suggests, fairness in using AI surrogates “extends beyond parity of outcomes to encompass moral representation, fidelity to the patient’s values, relationships, and worldview.”

“The central question becomes not only, ‘Is the model unbiased?’ but ‘Whose moral universe does the model inhabit?'” Ahmad wrote, providing an example:

Consider the following: Two patients of similar clinical profiles may differ in moral reasoning, one guided by autonomy, another by family or religious duty. Treating them “similarly” in algorithmic terms would constitute moral erasure. Individual fairness requires incorporating value-sensitive features, such as recorded spiritual preferences or statements about comfort, without violating privacy.

It could be more than a decade before the technology is deployed to patients, if it ever happens, Ahmad suggested, because of how challenging it is for AI models to be trained to calculate something as complex as a person’s values and beliefs.

“That’s where things become really complicated,” Ahmad told Ars, noting “there’s societal norms, and then there’s norms within a particular religious group.”

Consider an “extreme example,” Ahmad said. Imagine the puzzle doctors might face if they’re trying to decide if a pregnant woman involved in an accident should be taken off a ventilator because outdated records show she once marked that as her preference. A human surrogate, like her partner or a family member, might be able to advocate on her behalf to stay on the ventilator, particularly if the woman holds pro-life views, he said.

Without a human surrogate, doctors could turn to AI to help them make a decision, but only if the AI system is able to capture the patient’s values and beliefs based on “patterns learned from data, clinical variables, demographic information, linguistic markers in clinical notes, and possibly the patient’s digital footprint,” Ahmad’s paper explains.

Then there’s the issue of AI models being “somewhat brittle,” Ahmad said, perhaps giving “a very different answer” if a question is worded slightly differently or in a “clever” way the model doesn’t understand.

Ahmad is not shying away from what he calls “the problem of engineering values.” To better understand how other researchers are approaching the issue and what expectations patients may have for AI surrogates, Ahmad recently attended an evangelical Christian conference on AI in Dallas, Texas. There, it seemed clear that in a future where AI surrogates are integrated into hospitals, some patients may have high expectations about how well large language models (LLMs) can replicate their inner truths.

“One thing that really stood out was that people—especially when it comes to LLMs—there was a lot of discussions around having versions of LLMs which reflected their values,” Ahmad said.

Starke told Ars he thinks it would be ideal to build models based on the most accessible electronic health records, at least from a clinical perspective. To best serve patients, though, he agreed with Ahmad and thinks that “an ideal dataset would be large, diverse, longitudinal, and purpose-built.”

“It would combine demographic and clinical variables, documented advance-care-planning data, patient-recorded values and goals, and contextual information about specific decisions,” he said.

“Including textual and conversational data could further increase a model’s ability to learn why preferences arise and change, not just what a patient’s preference was at a single point in time,” Starke said.

Ahmad suggested that future research could focus on validating fairness frameworks in clinical trials, evaluating moral trade-offs through simulations, and exploring how cross-cultural bioethics can be combined with AI designs.

Only then might AI surrogates be ready to be deployed, but only as “decision aids,” Ahmad wrote. Any “contested outputs” should automatically “trigger [an] ethics review,” Ahmad wrote, concluding that “the fairest AI surrogate is one that invites conversation, admits doubt, and leaves room for care.”

“AI will not absolve us”

Ahmad is hoping to test his conceptual models at various UW sites over the next five years, which would offer “some way to quantify how good this technology is,” he said.

“After that, I think there’s a collective decision regarding how as a society we decide to integrate or not integrate something like this,” Ahmad said.

In his paper, he warned against chatbot AI surrogates that could be interpreted as a simulation of the patient, predicting that future models may even speak in patients’ voices and suggesting that the “comfort and familiarity” of such tools might blur “the boundary between assistance and emotional manipulation.”

Starke agreed that more research and “richer conversations” between patients and doctors are needed.

“We should be cautious not to apply AI indiscriminately as a solution in search of a problem,” Starke said. “AI will not absolve us from making difficult ethical decisions, especially decisions concerning life and death.”

Truog, the bioethics expert, told Ars he “could imagine that AI could” one day “provide a surrogate decision maker with some interesting information, and it would be helpful.”

But a “problem with all of these pathways… is that they frame the decision of whether to perform CPR as a binary choice, regardless of context or the circumstances of the cardiac arrest,” Truog’s editorial said. “In the real world, the answer to the question of whether the patient would want to have CPR” when they’ve lost consciousness, “in almost all cases,” is “it depends.”

When Truog thinks about the kinds of situations he could end up in, he knows he wouldn’t just be considering his own values, health, and quality of life. His choice “might depend on what my children thought” or “what the financial consequences would be on the details of what my prognosis would be,” he told Ars.

“I would want my wife or another person that knew me well to be making those decisions,” Truog said. “I wouldn’t want somebody to say, ‘Well, here’s what AI told us about it.'”