BAA With AI Providers: HIPAA-Eligible AI Architecture for Healthcare

A Business Associate Agreement is a contract. It is not architecture. The contract gives you the legal cover to process PHI through a third party. The architecture decides whether you actually do it correctly, in a configuration that the contract covers, with audit logging that proves it. Most healthcare AI projects that fail hospital security review do not fail because the BAA is missing. They fail because the BAA covers a different endpoint than the one the engineering team actually called, or covers the default configuration but not the beta feature the team turned on, or covers the primary model but not the embedding model used for retrieval.

This page is for healthcare AI engineering leads, hospital IT security teams, and digital health CTOs who need to get a production AI stack past compliance review without losing time and money to three failed attempts. For the underlying technical detail on what each provider covers, our BAA with OpenAI, Anthropic, and AWS Bedrock deep-dive blog post is the field-tested reference we use internally.

Signing a BAA Is the Easy Part — Coverage Is the Hard Part

Five practical realities make AI provider BAAs a different problem from BAAs with traditional healthcare subprocessors:

Feature surfaces move faster than contracts. A BAA signed in Q1 covers the features that existed in Q1. New features released later in the year may or may not fall within the existing BAA scope. The engineering team building against the new feature can be technically out of compliance even though the contract is in force.

Endpoint-specific scope. A BAA may cover a provider’s core chat-completions endpoint while excluding its fine-tuning API, embeddings endpoint, agent framework, or beta features. The same vendor, the same contract, different coverage.

Configuration-dependent compliance. Many BAA-covered endpoints require specific configurations to remain in scope — zero-data-retention, customer-managed keys, specific regions, opt-out of training data use. The default configuration is often not the BAA-covered configuration. Engineers reading the API docs do not always know that.

Subprocessor chains matter. AWS Bedrock is BAA-eligible, but the underlying foundation models hosted on Bedrock have their own scope. Azure OpenAI sits on Microsoft’s HIPAA-eligible services list, but the underlying OpenAI models are accessed through Azure’s contractual layer, not OpenAI’s. Hospital security review will ask about this chain. Engineers who do not know the answer fail the review.

BAA-eligibility versus BAA-execution. Many provider tiers are BAA-eligible but require enterprise contracting to actually execute the BAA. The free tier and even the basic paid tier are usually not BAA-covered, regardless of what the provider’s marketing implies.
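One way to keep these realities from surfacing late is to transcribe the scope of the executed BAA into a small registry and check every planned provider call against it before deployment. The Python sketch below is a minimal illustration under stated assumptions: the provider name, endpoint labels, setting names (`zero_data_retention`, `training_opt_out`), and region are hypothetical placeholders, not any provider's actual terms. The real values have to come from your own signed contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaaScope:
    """Scope of one endpoint as written in the executed BAA (hypothetical values)."""
    required_settings: frozenset  # settings that must be on to stay in scope
    allowed_regions: frozenset    # regions the contract names


# Hypothetical registry, transcribed by hand from the signed contract.
# Anything absent from this dict is treated as NOT covered.
BAA_REGISTRY = {
    ("example-provider", "chat"): BaaScope(
        required_settings=frozenset({"zero_data_retention", "training_opt_out"}),
        allowed_regions=frozenset({"us-east"}),
    ),
}


def coverage_problems(provider: str, endpoint: str, settings: dict, region: str) -> list:
    """Return human-readable reasons the planned call falls outside BAA scope."""
    scope = BAA_REGISTRY.get((provider, endpoint))
    if scope is None:
        return [f"{provider}/{endpoint} is not listed in the executed BAA"]
    problems = [
        f"setting '{name}' must be enabled"
        for name in sorted(scope.required_settings)
        if not settings.get(name, False)
    ]
    if region not in scope.allowed_regions:
        problems.append(f"region '{region}' is not a BAA-covered region")
    return problems


# Default-style configuration: covered endpoint, but the retention setting is left off.
print(coverage_problems("example-provider", "chat", {"training_opt_out": True}, "us-east"))
# An embeddings endpoint that the contract never mentions.
print(coverage_problems("example-provider", "embeddings", {}, "us-east"))
```

Running a check like this in CI or at service startup means a call to an uncovered endpoint, or a covered endpoint in its default configuration, shows up as a failed build instead of a failed security review.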

What Each Major AI Provider Actually Covers

The short version, current as of mid-2026. For the deep version, see our BAAs with OpenAI, Anthropic, and AWS Bedrock guide.

OpenAI (direct). BAAs available at enterprise tier. Coverage extends to specific endpoints under specific configurations — chat completions, embeddings, structured outputs at enterprise tier with zero-data-retention. Beta features and agent endpoints typically require explicit scope review.

Azure OpenAI. BAA via Microsoft’s standard HIPAA business associate agreement, which extends to Azure OpenAI as a covered service. Generally the cleanest path for OpenAI models in healthcare AI.

Anthropic (direct). Enterprise-tier BAA for the Claude family. Same configuration-dependent caveats apply.

Anthropic via AWS Bedrock. BAA via AWS’s standard HIPAA-eligible services framework. Bedrock’s BAA covers Claude (and other Bedrock-hosted models) when deployed in BAA-eligible regions with appropriate configuration.

Google Vertex AI. Google Cloud BAA covers Vertex AI as a HIPAA-eligible service. Gemini family models accessed through Vertex inherit this coverage.

On-prem open models (Llama, Mistral, Mixtral). No third-party BAA needed because no PHI leaves the hospital boundary. The hospital, as the covered entity, does all of the processing itself. The trade-off is operating cost (GPU infrastructure, model lifecycle management) versus a clean compliance posture.

For when on-prem is the right call versus cloud, see our on-prem vs cloud LLM decision framework and on-prem LLM hardware analysis.

The Five BAA Gaps That Fail Hospital Security Review

Across hundreds of compliance reviews we have either led or supported, these are the failure modes that appear repeatedly:

Gap 1: BAA on the primary model, not the embedding model. A RAG pipeline often uses two model providers: one for inference, one for embeddings. The team signs a BAA with the inference provider and forgets the embeddings provider. Both touch PHI.

Gap 2: Default configuration is not the BAA configuration. Engineers test with default API settings during development. Production launches without flipping zero-data-retention on. PHI gets logged to the provider’s standard logging pipeline outside the BAA scope.

Gap 3: Beta features in production. A team builds against a new feature that is technically still in beta and not yet covered by the BAA. The feature works fine; the compliance posture is broken.

Gap 4: No PHI redaction at the boundary. Even with a clean BAA, sending unredacted PHI to a cloud model puts that PHI in the provider’s logs. Even when those logs are BAA-covered, they are one more copy of the data that has to be accounted for in an audit. Best practice is redaction at the inference boundary regardless of BAA status.

Gap 5: No BAA chain visibility. Hospital security review asks for the full subprocessor chain — every party that touches the PHI from your application to the underlying model and back. If the chain is not documented, the review is rejected pending documentation.
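Gaps 1 and 5 are both chain problems, and a chain is easy to represent as data. The sketch below is one possible shape for a BAA chain document kept alongside the code, assuming a simple list of hops; every vendor name, role, and field name in it is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One party in the path PHI takes from the application to the model and back."""
    name: str
    role: str
    touches_phi: bool
    baa_covered_by: str = ""  # the executed agreement that covers this hop, if any


def uncovered_hops(chain: list) -> list:
    """Names of hops that touch PHI but have no executed BAA recorded."""
    return [hop.name for hop in chain if hop.touches_phi and not hop.baa_covered_by]


# Hypothetical RAG pipeline; all names below are placeholders.
rag_chain = [
    Hop("app-backend", "application", True, "BAA between hospital and application vendor"),
    Hop("inference-vendor", "LLM inference", True, "inference-vendor enterprise BAA"),
    Hop("embeddings-vendor", "embeddings for retrieval", True),  # the Gap 1 case
    Hop("status-page", "uptime monitoring", False),
]

# Lists every hop that would be flagged in a security review.
print(uncovered_hops(rag_chain))
```

Keeping the chain in a reviewable file has a side benefit: adding a new vendor to the pipeline forces an explicit answer to whether it touches PHI and which agreement covers it.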

The Architecture Pattern That Works

The pattern below has cleared compliance review at academic medical centers, federal-adjacent VA-connected systems, and digital health customers audited against SOC 2 Type II and HITRUST CSF.

At the inference boundary: PHI redaction layer using rules + clinical NER. Tokenization for any PHI that needs to round-trip back into the response. Audit log entry per inference call capturing user, model, prompt template hash, output hash, override action.

At the provider integration layer: BAA-eligible endpoint only, BAA-covered configuration only (zero-data-retention on, customer-managed keys where supported, BAA-eligible region). Provider call wrapped in an internal abstraction so the provider can be swapped without rewriting the application.

At the model selection layer: Multi-provider routing so the application can fall back from one BAA-eligible provider to another if one is down. Anthropic via Bedrock, OpenAI via Azure, Gemini via Vertex, and an on-prem Llama fallback for cases where the request is too sensitive for cloud.

At the audit layer: Append-only audit log with named-user attribution per inference, queryable by patient encounter, exportable for HIPAA audit. SOC 2 Type II audit-friendly retention.

At the documentation layer: PHI flow diagram, BAA chain document, model card with limitations and intended use, regulatory readiness package if FDA SaMD applies.
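The inference-boundary and audit layers above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production redactor: the three regex patterns are deliberately simplistic stand-ins for the rules half of "rules + clinical NER" (the NER half is omitted), and chaining each audit entry to the previous entry's hash is one assumed way to make an append-only log tamper-evident. Function and field names are hypothetical, with the audit fields following the list above.

```python
import hashlib
import json
import re
from datetime import datetime, timezone

# Illustrative patterns only; real PHI detection needs far broader rules plus clinical NER.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str):
    """Replace matches with tokens; return the redacted text and a token vault."""
    vault = {}

    def make_sub(label):
        def _sub(match):
            token = f"[{label}_{len(vault) + 1}]"
            vault[token] = match.group(0)
            return token
        return _sub

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_sub(label), text)
    return text, vault


def restore(text: str, vault: dict) -> str:
    """Put original values back into a model response that echoes tokens."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


def sha256_hex(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def audit_entry(user, model, prompt_template, output, override_action, prev_hash):
    """Build one audit record; entry_hash covers every field, including prev_hash."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "model": model,
        "prompt_template_hash": sha256_hex(prompt_template),
        "output_hash": sha256_hex(output),
        "override_action": override_action,
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = sha256_hex(json.dumps(entry, sort_keys=True))
    return entry


note = "MRN: 12345678, SSN 123-45-6789, phone 555-123-4567"
redacted, vault = redact(note)
print(redacted)                  # only this text would be sent to the provider
print(restore(redacted, vault))  # the vault never leaves the hospital boundary
```

The vault stays inside the boundary, so the provider only ever sees tokens, and the audit record stores hashes instead of the prompt and output text, so the log itself does not become another PHI store.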

When On-Prem Is the Right Call

Cloud AI is the default, but a meaningful minority of healthcare AI workloads should run on-prem inside the hospital data center:

Highest-sensitivity PHI workloads (behavioral health with 42 CFR Part 2, federal-adjacent VA/DoD)

Workloads where the hospital security team has flat-out blocked cloud AI

Workloads where latency must be sub-100ms (rare, but real in some imaging and ED pathways)

Workloads where the volume makes cloud inference cost prohibitive — see the LLM inference cost calculator

How We Help You Get to BAA-Eligible Production

Three engagement models depending on where you are.

Architecture Review — $25K, 2 weeks. Audit your current or planned AI architecture against BAA coverage requirements, document the BAA chain, identify the gaps, produce a remediation plan. Useful when you have an AI feature in development and need a security-review-ready compliance posture before launch.

BAA Network Setup — $80K, 6 weeks. Productized offering that sets up the BAA-eligible AI stack from scratch: BAA execution support with selected providers, PHI redaction layer, audit logging, configuration verification, documentation package. See the BAA Network Setup add-on page for the full scope.

Embedded in a sprint. When the BAA architecture is part of a larger AI build, it is folded into the Discovery Sprint, MVP Sprint, or Pilot-Ready Sprint and not billed separately.

For dedicated engineering work, hire HIPAA compliance engineers at $8K per engineer per month.

For broader compliance context, see our HIPAA compliance consulting page, the HIPAA-AI compliance checklist, and the certifications and compliance overview.

Frequently Asked Questions About BAA With AI Providers

Can we sign a BAA with OpenAI directly, or do we have to go through Azure OpenAI?

Both options exist. OpenAI offers direct enterprise-tier BAAs as of 2024–2025. Most healthcare buyers still route through Azure OpenAI because Microsoft’s broader healthcare cloud framework, support model, and regional availability are a better fit for hospital procurement. The model performance is identical.

What is the difference between BAA-eligible and BAA-executed?

BAA-eligible means the provider is willing and able to sign a BAA on appropriate plan tiers and configurations. BAA-executed means the actual contract has been signed for your organization. A BAA-eligible provider is not protecting your PHI until the BAA is in force.

Does a BAA cover every feature the provider offers?

No. Coverage is feature-specific and often configuration-specific. Beta features, fine-tuning APIs, agent frameworks, and new endpoints may be excluded from existing BAAs. Always verify per feature.

What happens if PHI is sent to an endpoint the BAA does not cover?

You have a HIPAA breach exposure. PHI sent to a non-BAA-covered endpoint sits outside the contractual protections. Depending on volume, you may need to report to OCR under the 60-day breach notification rule. This is the gap that PHI redaction at the inference boundary is designed to prevent.

How long does it take to get a BAA in place?

For Azure OpenAI: usually inside an existing Microsoft enterprise agreement, days to weeks. For AWS Bedrock: usually inside an existing AWS enterprise agreement, days to weeks. For Google Vertex AI: usually inside an existing GCP relationship, days to weeks. For OpenAI direct and Anthropic direct: enterprise contracting timelines, typically 4–12 weeks for new customers.

Does the embedding model need BAA coverage?

Yes, if the embedding model sees PHI. RAG pipelines typically send PHI to the embedding model to vectorize chart context. That is a PHI processing event and requires BAA coverage. Most major providers cover embeddings under their primary BAA, but verify per provider.

Does running the model on-prem remove the need for a BAA?

For the model layer, yes. If the model runs on hospital-owned hardware and PHI does not leave the hospital boundary, no third-party BAA is needed for the inference itself. You still need BAAs with any other subprocessors (audit logging vendors, observability tools, etc.) that touch the PHI.