Security & Compliance 2026-06-17 10 min read

AI Governance for Medical Groups: A Practical Policy and Engineering Checklist

A practical guide for medical group leaders who need to approve AI scribes, denial tools, scheduling assistants, and internal AI workflows without creating HIPAA or patient-safety risk.

The Short Answer

AI governance for a medical group is the set of rules, approvals, technical controls, and review habits that decide how AI can be used around patients, staff, claims, and PHI. It is not just a policy PDF. It is how you decide which tools are allowed, what data they can touch, who reviews their output, what gets logged, and who is accountable when the model is wrong.

That sounds heavy, but the alternative is worse: clinicians testing unapproved scribes, staff pasting patient complaints into consumer chatbots, billing teams using black-box denial tools, and nobody knowing where PHI went.

MGMA's January 2026 poll found that 42% of medical group leaders either have AI governance or are developing it, while 56% still do not. That gap matters because AI is already showing up in documentation, prior authorizations, patient messages, scheduling scripts, SOP writing, and revenue-cycle work. The tools arrived before the operating model.

Why This Matters Now

Most practices do not adopt AI through one clean enterprise rollout. They adopt it sideways.

A physician tries an ambient scribe because notes are crushing their day. A front desk lead asks ChatGPT to rewrite a phone script. A revenue-cycle manager tests a denial predictor. Someone uses an AI meeting-notes app on a call where patient operations are discussed. None of these decisions feels like a major platform launch, so they often bypass the vendor review, business associate agreement (BAA) review, security review, and workflow design that a conventional healthcare software purchase would require.

That is the real risk. Not "AI" as a vague category. The risk is unmanaged AI touching real operating work without clear boundaries.

ONC's HTI-1 rule also made algorithm transparency a healthcare IT topic, not just a data-science topic. Certified health IT is being pushed toward clearer information about predictive tools, fairness, validity, safety, and appropriate use. Medical groups do not need to become academic AI labs, but they do need a practical way to say: this use case is allowed, this one needs review, and this one is prohibited.

The Shadow AI Problem

Shadow AI is any AI use happening outside the practice's approved tools and controls.

In medical groups, the common examples are painfully ordinary:

  • A clinician uses an unapproved scribe and stores visit audio in a vendor account nobody reviewed.
  • A staff member pastes a patient complaint into a public chatbot to make it sound more professional.
  • A billing team uses AI to draft appeals but does not verify the facts against the chart.
  • A manager uploads call-center transcripts to summarize patient complaints.
  • A team uses an AI notetaker in meetings where patient names, dates of birth, or clinical details are mentioned.
  • A vendor quietly adds "AI summaries" to a system already in use, but the practice never updates its policy or training.

None of this means the practice should ban AI. A blanket ban usually fails because staff are already under pressure to get through documentation and administrative work, so the use simply continues out of sight. The better approach is to give people a safe, explicit path.

Start With A Use-Case Register

Before writing a giant policy, create a simple AI use-case register. Every proposed AI tool or workflow gets one row.

For each use case, capture:

  • What problem it solves
  • Which department owns it
  • Whether PHI is involved
  • Whether output affects clinical care, patient communication, billing, or internal operations
  • Which vendor or model is used
  • Whether a BAA exists
  • What data is retained and for how long
  • Who reviews the output before it is acted on
  • What logs or evidence are kept

This register becomes the source of truth. It also prevents a common failure: approving "AI" in general instead of approving specific workflows.
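To make the register concrete, here is a minimal sketch of one row as a Python dataclass. The class name, field names, and the `open_questions` helper are illustrative assumptions, not a standard schema; a spreadsheet with the same columns works just as well. The point is that each field from the list above is explicit and that missing answers are easy to spot.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Impact(Enum):
    CLINICAL_CARE = "clinical care"
    PATIENT_COMMUNICATION = "patient communication"
    BILLING = "billing"
    INTERNAL_OPERATIONS = "internal operations"


@dataclass
class UseCase:
    """One row in the AI use-case register (field names are hypothetical)."""
    problem: str
    owner_department: str
    involves_phi: bool
    impacts: set[Impact]
    vendor_or_model: str
    baa_in_place: bool
    data_retained: str
    retention_days: Optional[int]  # None means "not documented yet"
    output_reviewer: str           # who reviews output before it is acted on
    evidence_kept: str             # logs or other evidence that is retained

    def open_questions(self) -> list[str]:
        """Return the gaps that should block approval of this row."""
        gaps = []
        if self.involves_phi and not self.baa_in_place:
            gaps.append("PHI is involved but no BAA exists")
        if self.retention_days is None:
            gaps.append("retention period is not documented")
        if not self.output_reviewer.strip():
            gaps.append("no reviewer is named")
        if not self.evidence_kept.strip():
            gaps.append("no logs or evidence are defined")
        return gaps


# Example row with made-up values.
scribe = UseCase(
    problem="Reduce after-hours note writing",
    owner_department="Primary care",
    involves_phi=True,
    impacts={Impact.CLINICAL_CARE},
    vendor_or_model="Example Scribe Vendor",
    baa_in_place=False,
    data_retained="Visit audio and draft notes",
    retention_days=None,
    output_reviewer="Treating clinician",
    evidence_kept="",
)
for gap in scribe.open_questions():
    print(gap)
```

A row with open questions is not rejected outright. It just cannot move to "approved" until someone answers them.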

Risk Levels That Actually Work

Medical groups need a risk model that humans can use. I like four levels.

Low risk: No PHI, no patient-specific decision, no external communication. Example: drafting an internal SOP from a de-identified process description.

Moderate risk: Operational use with limited PHI or patient context, but no direct clinical decision. Example: summarizing de-identified denial trends or drafting a scheduling script from approved policy.

High risk: Patient-specific output that affects documentation, billing, patient communication, triage, prior authorization, or care coordination. Example: AI-drafted patient messages, AI scribe notes, denial appeals, referral routing, or prior-auth evidence packets.

Prohibited without formal review: Clinical diagnosis, treatment recommendations, autonomous triage, unreviewed EHR write-back, use of consumer AI tools with PHI, or any workflow where staff cannot explain how output will be checked.

The governance work is not the same for each level. Low-risk uses may only need training and a standard tool. High-risk uses need vendor review, BAA review, workflow design, human review, audit logging, and rollback plans.
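One way to keep the levels consistent across reviewers is to turn them into a small decision function. The sketch below is an assumption about how the four levels could be encoded, not a validated risk model: the `Profile` fields and the order of the checks are my own simplification, and anything with external communication is treated as at least moderate because the low-risk definition excludes it.

```python
from dataclasses import dataclass
from enum import Enum


class RiskLevel(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"
    PROHIBITED_WITHOUT_REVIEW = "prohibited without formal review"


@dataclass(frozen=True)
class Profile:
    """Hypothetical yes/no answers collected for a proposed use case."""
    involves_phi: bool
    patient_specific_output: bool
    external_communication: bool
    makes_clinical_recommendation: bool = False  # diagnosis, treatment, autonomous triage
    acts_without_human_review: bool = False      # e.g. unreviewed EHR write-back
    consumer_tool: bool = False
    review_method_documented: bool = True        # can staff explain how output is checked?


def classify(p: Profile) -> RiskLevel:
    # Check the prohibited category first so nothing slips through as "high".
    if (
        p.makes_clinical_recommendation
        or p.acts_without_human_review
        or (p.consumer_tool and p.involves_phi)
        or not p.review_method_documented
    ):
        return RiskLevel.PROHIBITED_WITHOUT_REVIEW
    if p.patient_specific_output:
        return RiskLevel.HIGH
    if p.involves_phi or p.external_communication:
        return RiskLevel.MODERATE
    return RiskLevel.LOW


# Controls the article names per level; moderate is left for the practice to define.
REQUIRED_CONTROLS = {
    RiskLevel.LOW: ["training", "standard tool"],
    RiskLevel.HIGH: [
        "vendor review", "BAA review", "workflow design",
        "human review", "audit logging", "rollback plan",
    ],
}

internal_sop = Profile(involves_phi=False, patient_specific_output=False,
                       external_communication=False)
print(classify(internal_sop).value)
```

The function is deliberately boring. Its value is that two managers answering the same questions get the same level.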

The Minimum Policy

A useful AI policy should be short enough that staff can remember it. The first version can be built around six rules:

  1. Do not enter PHI into any AI tool unless the tool is approved for PHI and covered by the required agreements.
  2. AI output that affects patients, charts, claims, prior authorizations, or messages must be reviewed by an accountable human before use.
  3. AI-generated documentation is a draft until a clinician signs it.
  4. AI-generated billing, coding, denial, or appeal content must be checked against source records.
  5. New AI tools and AI features from existing vendors must be registered before use.
  6. Staff must be able to report AI errors, unsafe outputs, or privacy concerns without blame.

This is not enough for a mature program, but it is enough to stop the worst behavior while the practice builds a deeper process.

Engineering Controls Matter More Than Intent

Good intentions do not protect PHI. Systems do.

For any AI workflow touching patient or operational data, the technical layer should handle:

  • Role-based access control so only the right staff can use the workflow
  • PHI minimization so the model receives only the data needed for the task
  • Audit logs showing who generated, reviewed, edited, approved, or rejected output
  • Vendor data-retention settings documented before go-live
  • BAA coverage for every vendor that creates, receives, maintains, or transmits PHI
  • Output review queues for high-risk tasks
  • Version history for generated notes, messages, appeals, and summaries
  • Monitoring for failed automations, unusual use, or repeated model errors

This is where AI governance becomes software architecture. A policy says "human review is required." The product must make review unavoidable.
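As a rough illustration of three of these controls working together, the sketch below combines a role check, PHI minimization through a per-task field allow-list, and an audit entry for every attempt. Everything here is hypothetical: the task name, roles, field names, and the in-memory log stand in for whatever identity system and log store the practice actually uses, and the record is synthetic.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical allow-lists: which roles may run a task, and which fields
# the model is allowed to receive for it. Unknown tasks get nothing.
ALLOWED_ROLES = {"denial_appeal_draft": {"billing_specialist", "rcm_manager"}}
ALLOWED_FIELDS = {
    "denial_appeal_draft": {"claim_id", "payer", "denial_code", "service_date"},
}


@dataclass
class AuditLog:
    """In-memory stand-in for a real append-only audit store."""
    entries: list = field(default_factory=list)

    def record(self, actor: str, action: str, task: str, detail: str = "") -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "task": task,
            "detail": detail,  # field names only, never field values
        })


def build_model_input(task: str, role: str, actor: str,
                      record: dict, log: AuditLog) -> dict:
    """Return only the fields the task is approved to send to a model."""
    if role not in ALLOWED_ROLES.get(task, set()):
        log.record(actor, "access_denied", task, f"role {role!r} not permitted")
        raise PermissionError(f"{role!r} may not run {task!r}")
    allowed = ALLOWED_FIELDS.get(task, set())
    payload = {k: v for k, v in record.items() if k in allowed}
    dropped = sorted(set(record) - allowed)
    log.record(actor, "input_minimized", task, f"dropped fields: {dropped}")
    return payload


# Synthetic example record; no real patient data.
log = AuditLog()
claim = {
    "claim_id": "C-1", "payer": "Example Payer", "denial_code": "CO-50",
    "service_date": "2026-01-05", "patient_name": "Test Patient",
    "dob": "1970-01-01",
}
payload = build_model_input("denial_appeal_draft", "billing_specialist",
                            "user-17", claim, log)
print(sorted(payload))
```

The allow-list is the important design choice. New fields added to the source record stay out of the model input until someone deliberately approves them.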

Human Review Should Be Designed, Not Assumed

"The clinician will review it" is not a governance plan.

A proper review flow answers these questions:

  • What exactly must be reviewed?
  • What source record should the reviewer compare against?
  • Can the reviewer edit before approval?
  • Is the original AI output retained?
  • Does the system record who approved it?
  • Can the output be sent or written back before approval?
  • How are repeated errors reported back to the owner?

For an AI scribe, review might focus on medications, allergies, diagnoses, plan, and omitted context. For prior authorization, review might focus on whether the evidence packet cites the right chart notes, dates, and payer criteria. For patient messaging, review might focus on tone, accuracy, escalation language, and whether the message creates unintended clinical advice.

The human reviewer should not be decorative. They need a UI and workflow that make review faster than redoing the work from scratch.
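Several of the questions above can be enforced by the data model itself. The following is a minimal sketch, assuming a simple three-state review item; the class and method names are hypothetical and a real system would persist this in a database. It retains the original AI output, records who approved or rejected it, allows edits at approval time, and refuses to release anything that has not been approved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ReviewItem:
    ai_output: str           # original model output, never overwritten
    source_record_ref: str   # what the reviewer should compare against
    status: Status = Status.PENDING
    final_text: Optional[str] = None
    reviewed_by: Optional[str] = None
    rejection_reason: Optional[str] = None

    def _require_pending(self) -> None:
        if self.status is not Status.PENDING:
            raise ValueError(f"item is already {self.status.value}")

    def approve(self, reviewer: str, edited_text: Optional[str] = None) -> None:
        """Approve as-is, or approve the reviewer's edited version."""
        self._require_pending()
        self.final_text = edited_text if edited_text is not None else self.ai_output
        self.reviewed_by = reviewer
        self.status = Status.APPROVED

    def reject(self, reviewer: str, reason: str) -> None:
        """Reject and keep the reason so repeated errors can reach the owner."""
        self._require_pending()
        self.reviewed_by = reviewer
        self.rejection_reason = reason
        self.status = Status.REJECTED

    def release(self) -> str:
        """The only path to sending or write-back; blocked until approved."""
        if self.status is not Status.APPROVED or self.final_text is None:
            raise PermissionError("output cannot be released before approval")
        return self.final_text


# Synthetic example.
item = ReviewItem(ai_output="Draft message to patient.",
                  source_record_ref="chart-note-123")
item.approve(reviewer="dr-lee", edited_text="Edited message to patient.")
print(item.release())
```

Because `release` is the only way out, "human review is required" stops being a policy statement and becomes something the workflow cannot skip.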

A 30-Day Rollout Plan

If a practice has no AI governance today, do not start with a six-month committee.

Week 1: Inventory known AI use. Ask each department what tools they are using, including embedded vendor features. Do not frame it as an investigation. Frame it as risk cleanup.

Week 2: Publish the minimum policy and define risk levels. Create an approved tools list and a prohibited-use list.

Week 3: Review the highest-risk workflows first: scribes, patient messages, prior auth, denials, coding, call center, and EHR-connected features.

Week 4: Add technical controls for the first approved high-risk workflow: access control, audit logs, retention settings, human approval, and incident reporting.

After 30 days, the practice should know what is being used, which use cases are risky, and which AI workflows are approved. That is a real governance baseline.

Where Custom Software Fits

Most AI governance failures happen because the tool is bolted onto a messy workflow. The vendor solves the model part, but not the practice-specific operating part.

For example, an AI denial tool might draft appeal language, but your team still needs payer-specific evidence collection, chart references, staff assignment, approval routing, audit logs, and outcome tracking. An AI scribe might draft a beautiful note, but your practice still needs consent capture, specialty templates, clinician sign-off, and safe EHR write-back.

That is where custom healthcare software matters. The valuable system is often not "an AI model." It is the workflow around the model.

Opexia builds those workflow layers: approval queues, PHI-safe automations, audit logs, EHR middleware, and governance-aware internal tools. If you are experimenting with AI across documentation, revenue cycle, or operations, start with the governance system before the next tool becomes another unmanaged risk.

Related Opexia reading: HIPAA Business Associate Agreements, HIPAA audit logging requirements, and Zero Trust architecture for healthcare cloud systems.

External sources: MGMA on AI governance in medical practices, ONC HTI-1 final rule, and Google's people-first content guidance.

Written by Sheharyar Amin

Founder & Lead Engineer, Opexia