Vendor Transparency and Governance Bottlenecks with Morgan Jeffries of Geisinger
This is part of our executive insights series where Elion CEO Bobby Guelich speaks with healthcare leaders about their tech priorities and learnings. For more, become a member and sign up for our email list.
Title: Neurologist; Medical Director for AI
Organization: Geisinger
We caught up with Morgan Jeffries for a follow-up to our earlier conversation on technology innovation at Geisinger, particularly as it pertains to AI governance. A year in, his team has been deeply engaged in operationalizing oversight. Here’s how that’s gone, what they’ve learned, and why scaling governance may be even harder than expected.
Let’s start with where things stand. What’s happened since we last talked?
A lot. Around the time of our last conversation, our AVP for AI, Mike Draugelis, had just joined us, and shortly afterward the Pennsylvania Insurance Department issued binding AI-related regulations. That created more pressure. We couldn’t just talk about best practices anymore. We needed a formal governance policy that would ensure we met regulatory requirements and would guide us internally.
So we pulled together a group, worked with HAIP, and drafted a system-wide AI policy. It defines AI and outlines governance at a high level: executive steering committee oversight, a formal review process, and baseline expectations for any AI-enabled solution. We intentionally left room to evolve—we didn’t want to paint ourselves into a corner with a rigid tiered risk model before we had any implementation experience.
For anyone proposing an AI solution, we generally expect the following:
A defined process for identifying potential problems
An escalation plan outlining steps to take if a problem is detected
An evaluation of the project’s impact on equity
What’s been hardest about putting that policy into practice?
Volume and ambiguity. Once the policy was in place, we needed to actually catch AI-enabled projects before they went live. We worked with our IT PMO and contracts team to add flags and routing processes, so anything with a hint of AI would come to our subcommittee.
Once projects started coming through, we also had trouble figuring out how to track them all, which is something we’re still working on.
But then we became the bottleneck. We review projects every other week, but we spend a ton of time just trying to understand what’s being proposed. We give stakeholders a questionnaire with prompts like “What are the potential failure modes?” and we often get answers like “N/A,” which obviously doesn’t help us assess risk. So the meetings turn into fact-finding sessions rather than actual decision-making.
We’ve started assigning enablement leads—people from our team who spend an hour a week helping these project teams get their ducks in a row. But honestly, it’s not enough. The equity assessments alone are incredibly time-intensive. You’re not just reviewing the model; you’re thinking about how it functions in your system, how bias might show up across workflows, access, outcomes. And we don’t always have clear answers or clear data to work with.
What kinds of support or information are you typically missing from vendors?
Transparency. Even basic questions like “How was the model trained?” or “How will you let us monitor performance locally?” can be hard to get answers to. It’s not that vendors are bad actors or that they don’t care. But they’re balancing concerns around trade secrets, and maybe they haven’t figured out yet exactly how to give us the information we need. It’s not clear that there’s been much market pressure yet for them to help customers in that way.
Another part of this challenge is that the vendor’s model alone doesn’t tell us how that model behaves in our context—with our population, workflows, and data inputs.
Some vendors are trying. Epic, for instance, has been relatively proactive with documentation for predictive models. But on the generative side, we’ve run into gaps. One Epic use case we’re exploring is auto-drafting responses to patient messages. We started asking about potential dialect bias, and we didn’t find any existing assessments. So now we’re doing a manual review: identifying patients with limited English proficiency, feeding in messages, and looking for issues ourselves.
Any early lessons you’d share with others building out governance?
Nothing revolutionary, just that if you think it’s going to be easy, it’s not. If you want to do this well, it’s going to be a lot of work at this point. I don’t know that that’s a tremendous help to anyone who’s undertaking this. It’s going to be really hard. That’s my pro tip.