The 5-Layer AI Compliance Stack: A Practical Framework for Getting Your AI House in Order
Last updated: April 2026
Effective AI compliance requires five layers — Inventory, Classification, Guardrails, Documentation, and Testing — and 90% of companies have only completed the first one, leaving them exposed to regulatory enforcement, bias liability, and operational failures they cannot detect or defend against.
Here’s the full stack.
Layer 1: Inventory
You can’t govern what you can’t see. The inventory layer answers one question: what AI systems are operating in your business?
This sounds simple. It’s not. Shadow AI is the real problem. Employees are using ChatGPT, Claude, Gemini, Perplexity, and a dozen other tools without IT’s knowledge or approval. A 2024 Microsoft survey found that 78% of AI users at work brought their own AI tools rather than using company-provided ones. Salesforce found similar numbers. Your AI inventory is incomplete if it only covers approved tools.
A proper inventory includes every AI tool that employees access (approved or not), every AI feature embedded in existing software (your CRM’s AI scoring, your email platform’s smart compose, your project management tool’s AI task suggestions), every vendor that uses AI to process your data, and every internally developed model or automation.
The output is a catalog with system names, vendors, data inputs, data outputs, decision types, and user counts. Update it quarterly. Treat it like your software asset inventory, because that’s what it is.
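To make that catalog something you can query and audit rather than a spreadsheet that goes stale, it helps to give each system a structured record. Here is a minimal sketch in Python; the field names and the CSV export are illustrative choices, not a prescribed schema:

```python
from dataclasses import dataclass, asdict
from datetime import date
import csv

@dataclass
class AISystemRecord:
    """One row in the AI inventory catalog."""
    name: str                 # e.g. "CRM lead scoring"
    vendor: str               # vendor name, or "internal" for homegrown models
    data_inputs: list[str]    # categories of data the system ingests
    data_outputs: list[str]   # what it produces: scores, text, decisions
    decision_type: str        # "informs", "recommends", or "decides"
    user_count: int           # approximate number of users
    approved: bool            # whether governance has signed off
    last_reviewed: date       # supports the quarterly update cycle

inventory = [
    AISystemRecord(
        name="Email smart compose",
        vendor="ExampleMail Inc.",
        data_inputs=["employee email drafts"],
        data_outputs=["suggested text"],
        decision_type="informs",
        user_count=240,
        approved=True,
        last_reviewed=date(2026, 1, 15),
    ),
]

# Export the catalog so it can be reviewed like any other asset register.
with open("ai_inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(asdict(inventory[0])))
    writer.writeheader()
    for record in inventory:
        writer.writerow(asdict(record))
```

The exact tooling matters less than the discipline: every system gets a record, every record gets a review date.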
Layer 2: Classification
Not every AI system carries the same risk. The classification layer assigns a risk tier to each system in your inventory.
The EU AI Act uses four tiers: unacceptable (banned), high-risk (heavy regulation), limited (transparency obligations), and minimal (no specific requirements). The AI regulatory patchwork across U.S. states adds another layer of complexity, because different jurisdictions define risk differently. And when companies skip classification entirely, the results are predictable: real-world AI failures that cost millions and make headlines. You don’t need to adopt the EU’s exact framework, but you need a risk classification scheme that maps to your regulatory exposure and business context.
I use a three-tier model for most clients:
Tier 1 (High Risk): AI systems that make or substantially influence decisions about people (hiring, lending, insurance underwriting, healthcare treatment, law enforcement). These systems face the most regulatory scrutiny and create the most liability exposure. They need the most governance.
Tier 2 (Medium Risk): AI systems that interact with customers or process sensitive data but don’t make consequential decisions autonomously. Customer service chatbots, content recommendation engines, marketing personalization, fraud detection with human review. These need documented policies and regular monitoring.
Tier 3 (Low Risk): AI systems used for internal productivity with no external impact and no sensitive data exposure. Code completion tools, internal summarization, meeting transcription for internal use. These need acceptable use policies but minimal governance overhead.
Classification determines how much governance each system gets. Without classification, you either over-govern everything (expensive and slow) or under-govern high-risk systems (dangerous and liable).
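If you want tier assignments to be repeatable rather than ad hoc, write the rules down as a small decision function. A sketch of the three-tier logic above, with attribute names I made up for illustration:

```python
from enum import Enum

class RiskTier(Enum):
    TIER_1_HIGH = 1    # makes or substantially influences decisions about people
    TIER_2_MEDIUM = 2  # customer-facing or sensitive data, human stays in the loop
    TIER_3_LOW = 3     # internal productivity, no external impact

def classify(decides_about_people: bool,
             customer_facing: bool,
             handles_sensitive_data: bool) -> RiskTier:
    """Assign a risk tier using the three-tier model described in this section."""
    if decides_about_people:
        return RiskTier.TIER_1_HIGH
    if customer_facing or handles_sensitive_data:
        return RiskTier.TIER_2_MEDIUM
    return RiskTier.TIER_3_LOW

# A resume-screening tool lands in Tier 1; internal meeting transcription lands in Tier 3.
assert classify(True, False, True) is RiskTier.TIER_1_HIGH
assert classify(False, False, False) is RiskTier.TIER_3_LOW
```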
Layer 3: Guardrails
Guardrails are the rules that constrain AI behavior. This layer answers: what is each AI system allowed to do, and what is it prohibited from doing?
Technical guardrails include input validation (what data can the system ingest), output filtering (what responses or decisions can it generate), rate limiting (how many decisions per hour/day), confidence thresholds (below what confidence level does the system escalate to a human), and data retention limits (how long is AI-processed data stored).
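Most of these technical guardrails reduce to checks that run before an AI output reaches a user. A minimal sketch of two of them, a confidence floor with human escalation and a daily rate limit; the threshold values and names are placeholders, not recommendations:

```python
DAILY_DECISION_LIMIT = 500   # rate limit: max automated decisions per day
CONFIDENCE_FLOOR = 0.85      # below this, escalate to a human reviewer

_decisions_today = 0

def apply_guardrails(prediction: str, confidence: float) -> tuple[str, str]:
    """Return (prediction, route): route is 'auto' if guardrails pass, 'human' otherwise."""
    global _decisions_today
    if _decisions_today >= DAILY_DECISION_LIMIT:
        return prediction, "human"   # rate limit reached: stop auto-deciding
    if confidence < CONFIDENCE_FLOOR:
        return prediction, "human"   # low confidence: escalate
    _decisions_today += 1
    return prediction, "auto"

decision, route = apply_guardrails("approve", confidence=0.62)
print(route)  # -> "human", because 0.62 is below the 0.85 floor
```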
Policy guardrails include acceptable use policies (what employees can and can’t use AI for), prohibited use cases (areas where AI must never be used), escalation protocols (when AI decisions must be reviewed by a human), and incident response procedures (what happens when an AI system produces a harmful output).
The most common guardrail failure I see: companies define what AI should do but not what it shouldn’t. You need both. Your chatbot should answer customer questions. Your chatbot should not make promises about returns, pricing, or service levels that aren’t in your actual policies. Write down the “should nots.” Air Canada learned this lesson publicly.
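Writing down the "should nots" also makes them enforceable. A toy sketch of a post-generation check that flags replies touching on commitments the bot must never make; the prohibited list is illustrative, and a real deployment would pair a classifier with human review rather than rely on keywords:

```python
# Topics the chatbot must never make commitments about, maintained alongside
# the written policy rather than hidden in code.
PROHIBITED_COMMITMENTS = ["refund", "discount", "price match", "guaranteed delivery"]

def violates_should_nots(reply: str) -> bool:
    """Crude keyword screen for replies that make unauthorized promises."""
    text = reply.lower()
    return any(topic in text for topic in PROHIBITED_COMMITMENTS)

reply = "Yes, we can guarantee a full refund if you book today."
if violates_should_nots(reply):
    reply = "Let me connect you with an agent who can confirm our policy on that."
```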
Not sure where your AI systems fall on this stack? Take the ACRA to map your exposure in 5 minutes.
Layer 4: Documentation
Documentation is the evidence layer. It proves you’re doing what you claim to be doing. When a regulator asks “how do you govern AI?”, your documentation is the answer.
Required documentation includes: your AI policy (who’s responsible for AI governance, what’s the approval process, what are the rules), risk assessments for Tier 1 and Tier 2 systems, bias audit results (if applicable under state law), data processing records for AI systems that handle personal data, training records (proof that employees understand AI policies), incident logs (every AI failure, near-miss, and unexpected output), and vendor due diligence records (how you evaluated AI vendors before procurement).
The documentation layer is where most compliance programs fall apart. Companies write an AI policy, circulate it once, and never update it. The policy says “annual bias audits” but no audit has been conducted. The training materials reference tools the company stopped using six months ago. Documentation only works if it’s maintained.
Build documentation into your workflow, not as a separate activity. When you procure a new AI tool, the documentation is part of the procurement process. When an AI system produces an unexpected result, the incident log entry is part of the incident response process. If documentation requires a separate effort, it won’t happen.
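One concrete way to make the incident log part of the response process rather than a separate chore: the same handler that deals with an unexpected output also appends the log entry. A rough sketch, with the file name and fields as assumptions:

```python
import json
from datetime import datetime, timezone

INCIDENT_LOG = "ai_incident_log.jsonl"   # append-only, one JSON object per line

def record_incident(system: str, description: str, action_taken: str) -> None:
    """Append an incident entry as part of handling it, not after the fact."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "system": system,
        "description": description,
        "action_taken": action_taken,
    }
    with open(INCIDENT_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_incident(
    system="Support chatbot",
    description="Quoted a return window not found in the published policy",
    action_taken="Response suppressed; conversation routed to a human agent",
)
```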
Layer 5: Testing
Testing is the validation layer. It answers: is the AI system actually performing as documented and governed?
Testing includes bias testing (run demographic analysis on system outputs to check for disparate impact), accuracy testing (compare AI outputs against known-good results at regular intervals), adversarial testing (try to make the AI produce harmful, incorrect, or unauthorized outputs), drift testing (compare current performance against baseline to detect model degradation), and compliance testing (verify that guardrails are functioning and documentation is current).
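For bias testing specifically, a common first-pass check is the four-fifths (80%) rule used in U.S. employment contexts: compare selection rates across groups and flag any group whose rate falls below 80% of the best-treated group's. A minimal sketch with made-up numbers:

```python
def disparate_impact_check(selection_rates: dict[str, float],
                           threshold: float = 0.8) -> dict[str, bool]:
    """Flag groups whose selection rate falls below `threshold` of the best-treated group."""
    best = max(selection_rates.values())
    return {group: (rate / best) < threshold for group, rate in selection_rates.items()}

# Hypothetical quarterly output of a resume-screening model.
rates = {"group_a": 0.42, "group_b": 0.30, "group_c": 0.41}
flags = disparate_impact_check(rates)
print(flags)  # group_b: 0.30 / 0.42 is about 0.71, below 0.8, so it gets flagged
```

A flag is not a verdict, but it tells you which systems need a closer statistical look and a documented explanation.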
Testing frequency depends on classification. Tier 1 systems need quarterly bias and accuracy testing at minimum. Tier 2 systems need semi-annual testing. Tier 3 systems need annual review.
The biggest mistake: testing only at deployment. AI systems change. The underlying models get updated. The data distribution shifts. User behavior evolves. A system that was bias-free at deployment can develop disparate impact over time as the input data changes. States are already using AI in their own compliance enforcement, which means they’ll find your untested systems faster than you think. Testing must be ongoing.
The Gap Between Layer 1 and Layer 5
The distance between “we made a spreadsheet” and “we test our AI systems quarterly for bias and accuracy” is the distance between AI awareness and AI governance. Most companies are at Layer 1. The ones that will survive the regulatory wave are building through Layer 5.
You don’t need to build the entire stack overnight. But you do need to know where you are and where you’re going. If you’re at Layer 1, get to Layer 2 this quarter. If you’re at Layer 2, start building Layer 3. Each layer builds on the one below it. The DIE Progress Unit framework gives you a way to measure that progress without pretending compliance is a binary switch.
What to Do Now
Assess where you are. Which layer is your organization currently operating at? Be honest. Having an AI policy (Layer 3) without an inventory (Layer 1) means your policy doesn’t cover systems you don’t know about.
Build in order. Don’t skip layers. Inventory first, then classify, then set guardrails, then document, then test. Each layer depends on the one beneath it.
Assign ownership. AI compliance needs an owner. Not a committee. A person. That person needs authority, budget, and a direct reporting line to leadership. Without ownership, the stack doesn't get built.
Set a timeline. Layer 1 should take 30 days. Layer 2 takes another 30 days. Layers 3-5 take 60-90 days each, and much of that work can overlap, so a complete compliance stack takes 6-9 months for a mid-size company. Start now, because the regulatory deadlines aren't waiting for you to finish.
Book a diagnostic to walk through your stack with our team and identify which layers need the most work.
A spreadsheet of your AI tools isn’t compliance. It’s a starting line. Kaizen AI Lab builds the other four layers so your AI governance actually holds up when regulators come knocking. Talk to us.