Custom Software

HIPAA AI Compliance Checklist and Audit Service

Generic HIPAA compliance checklists were written for a world of databases, file systems, web applications, and email. They tell you to encrypt data at rest, enforce MFA, audit access, and sign BAAs with your subprocessors. All of that still applies. None of it tells you what to do when your application sends a 4,000-token prompt containing the patient’s history to a model API, gets back a structured JSON object, and writes it back to the EHR seventeen times a day per patient.

This page is for healthcare AI engineering leads, hospital privacy officers, and digital health CTOs who already have HIPAA-compliant infrastructure for everything except the AI layer — and need a checklist, audit, or full architecture review that actually covers the AI-specific controls. For the generic software-development side, our downloadable HIPAA compliance checklist tool, the HIPAA-compliant software development checklist, and the HIPAA mobile app checklist cover the foundations. This page is the AI layer on top.

Calculate My Project Cost Connect With Experts

Tell Us Your Requirements

Our experts are ready to understand your business goals.

Trusted Partners

Trusted by Industry Leaders Worldwide

Recognition

Awards & Recognitions

Why AI Breaks Generic HIPAA Checklists

HIPAA was written in 1996. The Security Rule was published in 2003 and amended materially in 2024–2025 for the 2026 update. None of it explicitly addresses large language models, retrieval-augmented generation, or agent frameworks. What that means in practice: a healthcare AI team can build a system that is technically HIPAA-compliant on paper while creating PHI exposure paths that no traditional HIPAA control covers.

The three differences AI introduces:

PHI enters places PHI normally does not enter. A traditional HIPAA system processes PHI in databases, file systems, application memory, and audit logs. An AI system additionally processes PHI inside prompt strings, embedding vectors, model context windows, retrieval results, intermediate agent state, tool-call parameters, and inference response logs. Each of those is a new PHI surface that generic HIPAA controls were not designed for.

Audit logging becomes inference-level. Traditional HIPAA audit logs capture who accessed what record when. AI audit logs additionally need to capture which prompt produced which output for which decision, with reproducibility intact for forensic review. This is a different log shape, a different retention policy, and a different storage cost profile.

Risk analysis becomes ongoing, not annual. Generic HIPAA risk analysis is annual. AI systems drift, models update, training data shifts, and prompt templates change. The 2026 HIPAA Security Rule’s continuous monitoring requirement is felt most acutely in the AI layer.

The 14 AI-Specific Controls That Generic Checklists Miss

This is the checklist we run against every healthcare AI engagement. It is layered on top of the standard HIPAA Security Rule controls, not in place of them.

1. PHI redaction at the inference boundary. Regex plus clinical NER (Named Entity Recognition) redaction of PHI before any prompt leaves your boundary. Reversible tokenization for PHI that needs to round-trip back into the response. Not a configuration toggle — a deliberate architectural layer.

2. Prompt logging policy. Inference providers log prompts and responses for debugging and abuse detection by default. Even under BAA-covered configurations, the logging itself is a PHI flow that needs documentation. Zero-data-retention configurations are non-default and have to be explicitly enabled per endpoint.

3. Embedding model BAA coverage. RAG pipelines call the embedding model with PHI to vectorize chart context. The embedding model needs BAA coverage separate from the inference model. Frequently missed in AI deployments where the team focuses on the chat-completions endpoint and forgets the embeddings endpoint.

4. Fine-tuning and training data exposure. Any PHI used to fine-tune a model becomes part of the model weights and may be recoverable by adversarial extraction. Use synthetic data or properly de-identified data for fine-tuning. Document the de-identification methodology under HIPAA Safe Harbor or Expert Determination.

5. Agent framework and tool-use exposure. Agent frameworks (tool calls, multi-step reasoning chains) expand the surface area where PHI flows through intermediate model calls, often to additional model providers. Each tool call is a PHI processing event.

6. Retrieval store PHI handling. Vector databases storing embeddings of PHI-containing documents are PHI repositories. They require the same controls as any other PHI storage — encryption at rest, access controls, audit logging, deletion-on-request capability.

7. Inference response storage classification. A clinical decision-support AI response is itself PHI the moment it is associated with a patient. Treating it as application logging or telemetry creates a HIPAA violation.

8. Drift monitoring tied to §164.308 risk analysis. Model drift detection is not just a quality metric — under the 2026 ongoing risk analysis requirement, material drift is a risk-analysis trigger that requires documented review.

9. Multi-tenant model isolation. If a single AI service serves multiple covered entities, tenant isolation at the prompt, retrieval store, audit log, and model-output layers has to be enforced. Each tenant’s data cannot influence another tenant’s outputs.

10. Synthetic data lineage for development. Production-look-alike synthetic data (e.g., from Synthea) is used to keep PHI out of development environments. The lineage and traceability of the synthetic data needs to be documented to prove non-overlap with real PHI.

11. Model output validation for safety, not just accuracy. Eval harnesses tracking clinical safety, not just task accuracy. Required for any AI feature that drives a clinical decision. Tied to FDA SaMD if applicable. See our eval harness build add-on.

12. Patient access and right-to-explanation handling. Under HIPAA’s right of access, patients can request records that include AI-influenced decisions. The reproducibility of the AI output for that patient at that point in time needs to be supportable.

13. Break-glass and override audit trail. Clinician overrides of AI recommendations need their own audit trail layer separate from the inference log. Override patterns are critical signal for both safety and HIPAA documentation.

14. AI-specific incident response runbook. Standard HIPAA incident response covers data breaches. AI incidents include model hallucination harming a patient, prompt injection leading to PHI leakage, embedding store reidentification attacks, and tool-call escape patterns. Each needs a documented response path.

What Changes Under the 2026 HIPAA Security Rule for AI

The 2026 update tightens several controls that hit AI deployments specifically:

Encryption is mandatory, not addressable. AES-256 at rest, TLS 1.3 in transit. Applies to vector stores, embedding caches, model output stores, and inference logs.

MFA required for all users accessing ePHI. Not just admin users. AI engineering teams running notebooks against PHI need MFA on the development environment too.

Continuous monitoring supplements annual risk assessments. Material model behavior change becomes an ongoing risk-analysis event.

Patch management timelines documented and enforced. Model version changes, dependency updates, and provider SDK updates all fall under this.

Real-time intrusion detection. Anomalous prompt patterns (potential prompt injection attempts) need to be detected and alerted in real time.

How Our HIPAA AI Audit Service Works

HIPAA AI Readiness Audit — $25K, 4 weeks. We audit your existing or planned AI stack against all 14 AI-specific controls above, plus the underlying HIPAA Security Rule §164.308, §164.310, and §164.312 controls. Output is a written gap report, remediation plan with effort estimates, and a documentation package ready for SOC 2, HITRUST, or hospital security review.

Embedded in a Sprint. When the HIPAA work is part of a larger AI build, it folds into the Discovery Sprint, MVP Sprint, or Pilot-Ready Sprint. Not billed separately.

Companion services. Most HIPAA AI engagements pair with the BAA with AI providers architecture and the productized BAA Network Setup add-on at $80K over 6 weeks.

Dedicated HIPAA compliance engineers. When ongoing compliance engineering is needed across multiple deployments, hire HIPAA compliance engineers at $8K per engineer per month.

For broader compliance context, see the HIPAA compliance consulting page and the certifications and compliance overview.

When to Use This Service (and When the Generic HIPAA Tool Is Enough)

Use the HIPAA AI Readiness Audit when:

Your application uses LLMs, RAG, embedding models, or agent frameworks

PHI flows through any model API (cloud or on-prem)

You are preparing for hospital security review of an AI feature

You are preparing for SOC 2 Type II or HITRUST certification with AI workloads in scope

You suspect existing AI deployments are out of compliance and need a gap assessment

FAQs

Frequently Asked Questions About HIPAA AI Compliance

No. HIPAA and its rules were written before modern AI existed. HIPAA applies to AI because AI processes PHI — the AI itself does not need to be named in the regulation for the regulation’s controls to apply. OCR has provided guidance in 2023–2025 that explicitly acknowledges AI is covered.

For HIPAA purposes, yes. Both are creation, receipt, maintenance, or transmission of ePHI under §164.306. The prompt processing is a PHI processing event, the model API call sends PHI to a third party (requiring a BAA), and the response that references the patient is itself PHI.

Yes, when the inference touches PHI. Provider-side logging of prompts and responses is a PHI flow that requires either BAA coverage of that logging or explicit zero-data-retention configuration to prevent the logging from happening.

If the data is properly de-identified under HIPAA Safe Harbor or Expert Determination methods, it is no longer PHI and falls outside HIPAA for the training process. Document the de-identification methodology and validation rigorously — re-identification risk in fine-tuned models is a current area of research and a likely future enforcement target.

When the output is associated with an identifiable patient (or contains identifying information), yes. Treat AI-generated clinical content, decision support, summarization output, and structured extractions the same as any other PHI.

Yes. The HIPAA controls apply to the PHI, not to the deployment topology. On-prem deployment changes the BAA conversation (no third-party BAA needed for the model itself) but does not eliminate the underlying HIPAA Security Rule and Privacy Rule controls. The advantage is a simpler subprocessor chain and a cleaner compliance posture.

Agent frameworks expand the PHI surface area to every tool call, every intermediate model call, and every external service the agent reaches. Each of those is a separate compliance evaluation. Multi-step agent reasoning where the model decides what to call next is particularly hard to bound — many healthcare AI engagements explicitly scope down agent autonomy for this reason.