For most of the past decade, AI developers operated in a regulatory grey zone. They could deploy systems that decided who got a loan, who passed a job screening, or who received a medical referral — with no external check on whether those systems actually worked as claimed. That era is ending. Governments on multiple continents are now mandating that certain AI systems be independently verified before deployment, and that the verification continue throughout a system’s operational life.

The shift is not merely symbolic. It creates legal liability, compliance infrastructure, an emerging audit profession, and real costs for companies that build or buy AI. Understanding how these frameworks work is essential for anyone operating in — or selling into — regulated markets.

What Mandatory AI Audits Actually Mean

An AI audit, in regulatory terms, is a structured independent evaluation of an AI system against defined criteria: Does it do what its documentation claims? Does it perform consistently across demographic groups? Does it maintain adequate logs to support post-incident investigation? Are human override mechanisms present and functional?

The word “mandatory” matters. Self-certification — where a company simply declares its system safe — has been the default. Mandatory third-party auditing replaces or supplements self-declaration with independent verification by accredited bodies that have no commercial relationship with the developer.

The analogy is financial auditing. Publicly listed companies cannot simply assert their accounts are accurate; they must have them verified by an independent auditor. Regulators are applying the same logic to high-stakes AI.

The EU AI Act: The World’s First Binding Framework

The European Union’s AI Act, which entered into force in August 2024 and applies in stages through 2027, is the most comprehensive binding AI regulation currently in effect. It classifies AI systems on a risk ladder, with the most consequential requirements falling on “high-risk” applications.

High-risk categories include AI used in critical infrastructure, education, employment screening, credit scoring, law enforcement, border control, and administration of justice. Systems in these categories face a battery of obligations before they can legally be placed on the EU market.

Central to the framework is the conformity assessment, a structured evaluation that must be completed before deployment. For most high-risk AI systems, developers can conduct this assessment themselves against harmonized technical standards, then file an EU declaration of conformity and register the system in the EU’s database of high-risk AI systems. For particularly sensitive domains such as biometric identification, however, third-party assessment by an accredited “notified body” is required unless the developer has fully applied the relevant harmonized standards.

Notified bodies are organizations formally designated by EU member states and cross-recognized across the bloc. Becoming a notified body requires demonstrating technical competence, organizational independence from the manufacturers they assess, and ongoing compliance with accreditation standards. As of early 2026, the accreditation pipeline for AI notified bodies is still developing, with national accreditation bodies working to establish AI-specific competency criteria.

Post-market monitoring is a second major obligation. Providers of high-risk AI must actively collect data on system performance after deployment, report serious incidents to national market surveillance authorities, and trigger re-assessment when a system undergoes substantial modification. This is not a one-time certification; it is ongoing regulatory surveillance.

What Auditors Actually Check

Whether the assessment is internal with external review or fully third-party, the checklist converges on a common set of dimensions:

Training data quality and provenance. Auditors examine whether training datasets were representative of the deployment population, whether they contained known biases, and whether data lineage is documented well enough to support investigation if the system produces discriminatory outputs.
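To make “documented well enough” concrete, here is a minimal sketch of a machine-readable lineage record, loosely in the spirit of datasheets for datasets. The field names are illustrative assumptions, not a schema the AI Act prescribes.

```python
# Sketch of a dataset lineage record an auditor could inspect.
# Fields are illustrative assumptions, not a regulatory schema.
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    name: str
    source: str                    # where the data came from
    collection_period: str
    license: str
    known_gaps: list[str] = field(default_factory=list)     # documented coverage gaps
    preprocessing: list[str] = field(default_factory=list)  # every transform applied

record = DatasetRecord(
    name="loan_applications_2019_2023",
    source="internal CRM export, branches in 12 regions",
    collection_period="2019-01 to 2023-06",
    license="internal",
    known_gaps=["rural applicants underrepresented (~8% of rows vs ~20% of population)"],
    preprocessing=["dropped rows with missing income", "normalized currency to EUR"],
)
```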

Bias and fairness testing. Statistical parity tests across demographic groups — gender, age, ethnicity, disability — are standard. The AI Act does not prescribe a single fairness metric (an impossible task given that different metrics are mathematically incompatible), but auditors look for evidence that the developer considered multiple definitions and made reasoned, documented choices.
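As an illustration, the sketch below computes per-group selection rates and the kind of impact ratio New York City’s Local Law 144 reports. The column names, data, and the four-fifths screening threshold mentioned in the comments are illustrative assumptions; a real audit would apply several fairness definitions, not one.

```python
# Minimal sketch of a statistical parity check an auditor might run.
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Positive-outcome rate per demographic group."""
    return df.groupby(group_col)[outcome_col].mean()

def impact_ratios(rates: pd.Series) -> pd.Series:
    """Each group's rate divided by the most-favored group's rate.

    NYC Local Law 144 reports a similar "impact ratio"; values far below 1.0
    flag potential adverse impact (the informal four-fifths rule treats 0.8
    as a screening threshold, not a legal bright line).
    """
    return rates / rates.max()

# Hypothetical audit sample: 1 = selected, 0 = rejected.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "B"],
    "selected": [1,   1,   0,   1,   0,   0,   0],
})
rates = selection_rates(df, "group", "selected")
print(impact_ratios(rates))  # group A: 1.0, group B: ~0.375 -> flag for review
```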

Accuracy and performance metrics. Claimed performance figures must be reproducible on held-out test sets that the auditor controls, not cherry-picked benchmarks from the developer. Auditors will often supply their own evaluation datasets to check whether performance degrades in distribution-shifted conditions.
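A simplified sketch of that workflow follows, with the model object and datasets standing in for whatever access the provider grants; the tolerance value is an assumption, not a regulatory figure.

```python
# Sketch: reproduce a claimed metric on an auditor-controlled holdout,
# then re-check it under distribution shift.
import numpy as np

def accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean(y_true == y_pred))

def audit_performance(model, holdout, shifted, claimed: float,
                      tolerance: float = 0.02) -> dict:
    X, y = holdout
    measured = accuracy(y, model.predict(X))         # auditor-controlled test set
    X_s, y_s = shifted
    shifted_acc = accuracy(y_s, model.predict(X_s))  # e.g. newer or regional data
    return {
        "claimed": claimed,
        "measured": measured,
        "claim_reproducible": abs(measured - claimed) <= tolerance,
        "accuracy_under_shift": shifted_acc,
        "degradation": measured - shifted_acc,
    }
```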

Human oversight mechanisms. High-risk AI under the EU AI Act must be designed to allow human intervention — a “human in the loop” or “human on the loop” depending on context. Auditors verify that override controls exist, are accessible to operators, and are tested in the technical documentation.
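A minimal sketch of such a gate, assuming a confidence threshold and a review queue that are design choices rather than anything the Act prescribes:

```python
# Sketch of a "human in the loop" gate: the model proposes, but decisions
# below a confidence threshold are escalated, and an operator override is
# always available. Threshold and queue are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Decision:
    subject_id: str
    model_output: str
    confidence: float
    decided_by: str        # "model" or "human"
    final_output: str

def decide(subject_id: str, model_output: str, confidence: float,
           review_queue: list, threshold: float = 0.90) -> Decision:
    if confidence < threshold:
        review_queue.append(subject_id)  # escalate to a human reviewer
        return Decision(subject_id, model_output, confidence, "human", "PENDING_REVIEW")
    return Decision(subject_id, model_output, confidence, "model", model_output)

def override(decision: Decision, operator_output: str) -> Decision:
    """Operator override: available regardless of model confidence."""
    return Decision(decision.subject_id, decision.model_output,
                    decision.confidence, "human", operator_output)
```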

Logging and auditability. Systems must generate logs sufficient to reconstruct what inputs produced what outputs over the relevant operational period. This requirement directly supports post-incident forensics and regulators’ right of access.
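A minimal sketch of what such a record could look like, assuming a JSON-lines log; the field names are illustrative, and in practice retention, redaction, and tamper-evidence each need their own design decisions.

```python
# Sketch of record-keeping that links each input to its output, model
# version, and timestamp so a specific decision can be reconstructed.
import datetime
import hashlib
import json

def log_inference(path: str, model_version: str, inputs: dict, output: dict) -> None:
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        # Hash the raw inputs so the log proves which input produced the
        # output without necessarily storing sensitive data in plaintext.
        "input_hash": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()
        ).hexdigest(),
        "inputs": inputs,  # or redact, depending on data-protection rules
        "output": output,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_inference("inference_log.jsonl", "credit-model-2.3.1",
              {"applicant_id": "A-1042", "features": {"income": 52000}},
              {"decision": "refer_to_human", "score": 0.41})
```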

An Industry Being Born

Third-party AI auditing was barely a profession three years ago. Today it is a fast-growing market segment. The Big Four accounting firms — Deloitte, PwC, EY, and KPMG — have all launched AI audit practices. KPMG, for example, has published frameworks for AI system auditing that adapt financial audit methodology to algorithmic systems, covering model governance, data integrity, and change management.

Alongside the generalist firms, specialized AI audit organizations have emerged. The AI Now Institute conducts research-based audits with a particular focus on civil rights impacts. O’Neil Risk Consulting & Algorithmic Auditing (ORCAA), founded by mathematician Cathy O’Neil, offers quantitative bias auditing. Monitaur provides software-based audit infrastructure for model registries and evidence collection.

The market is not yet standardized. Auditors use different methodologies, different fairness definitions, and different evidence requirements, making audit results difficult to compare across firms. Standards bodies — including ISO and IEEE — are actively developing AI audit standards, but harmonization will take years. In the interim, buyers of AI audit services should scrutinize the specific methodology being applied, not simply the audit firm’s brand.


US State-Level Algorithmic Accountability

The United States has not passed federal AI legislation comparable to the EU AI Act. However, at the state level, algorithmic accountability legislation is accumulating. New York City’s Local Law 144, which requires bias audits for automated employment decision tools, has been in force since 2023 and has produced the first generation of published algorithmic audit reports from employers in a major jurisdiction.

Illinois, Maryland, and California have enacted or are advancing legislation requiring impact assessments or audits for AI used in hiring. Colorado’s insurance regulators, implementing a 2021 statute, finalized rules in 2023 requiring insurers to test AI underwriting models for unfair discrimination. Several states are considering broader algorithmic accountability bills modeled on the EU framework.

The state-level patchwork creates compliance complexity for companies operating nationally, which face overlapping and sometimes inconsistent requirements across jurisdictions. That burden has, somewhat paradoxically, become one of the main arguments proponents of federal legislation make for preemptive national standards.

The Proprietary Model Problem

AI auditing faces a structural tension that the financial auditing analogy does not fully capture: model opacity. A financial auditor can examine transactions, ledgers, and contracts. An AI auditor seeking to assess a large language model or a proprietary neural network confronts a system whose internal logic is not human-interpretable and whose developers may resist full source access on grounds of trade secrecy.

The EU AI Act attempts to resolve this by requiring technical documentation that is comprehensive but not necessarily public, and by granting market surveillance authorities the right to access documentation and source code on request. But the practical limits of auditing a black-box system remain a live research and policy problem. Behavioral testing — feeding controlled inputs and observing outputs — is the primary workaround, but it can only probe a fraction of a complex model’s behavior space.
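A stripped-down example of what behavioral probing looks like in practice, with a hypothetical query function standing in for whatever interface access the auditor negotiates:

```python
# Sketch of black-box behavioral testing: with no source access, the
# auditor queries the system, flips a single attribute, and diffs the
# outputs. `query_model` is a placeholder, not a real endpoint.
def query_model(applicant: dict) -> dict:
    raise NotImplementedError("replace with the audited system's API call")

def counterfactual_probe(applicant: dict, attribute: str, alt_value) -> dict:
    baseline = query_model(applicant)
    variant = query_model({**applicant, attribute: alt_value})
    return {
        "attribute": attribute,
        "baseline": baseline,
        "variant": variant,
        "output_changed": baseline != variant,  # a flag to investigate, not proof of bias
    }
```

Even systematic probing of this kind samples only a sliver of a large model’s input space, which is why it supplements, rather than replaces, documentation review.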

For foundation models, the AI Act introduces a separate “general-purpose AI” tier with its own transparency obligations, but the conformity assessment regime applies to the downstream deployer who integrates the model into a high-risk application, not necessarily to the foundation model provider directly. This creates questions about liability allocation that courts and regulators will be answering for years.

What Compliance Costs Look Like

Industry estimates for EU AI Act conformity assessments for high-risk systems range from €50,000 to €300,000 per system for initial assessment, depending on complexity and the depth of documentation already in place. Ongoing post-market monitoring adds recurring costs. For large enterprises with multiple AI deployments, aggregate compliance budgets in the millions of euros are realistic.

For startups and small AI developers, these costs are potentially prohibitive, raising concerns about regulatory barriers to entry that entrench large incumbents. The EU AI Act includes some proportionality provisions for SMEs, including reduced documentation requirements and sandbox access, but critics argue these are insufficient to offset the compliance burden.

Companies that invested early in model documentation, data governance, and internal testing infrastructure find that compliance costs are significantly lower — because the documentation required by regulators mirrors good engineering practice. The lesson for companies building AI today: treating auditability as a design requirement from day one is substantially cheaper than retrofitting it under regulatory deadline.


Decision Radar (Algeria Lens)

Relevance for Algeria: High. Algerian companies supplying AI to EU markets must comply; Algeria can leapfrog by establishing audit requirements early.
Infrastructure Ready? Partial. IT audit capacity exists; AI audit methodology is absent.
Skills Available? Partial. General auditors are available; AI-specific audit expertise is absent.
Action Timeline: 6-12 months
Key Stakeholders: Cour des Comptes, IGF, Ministry of Finance, ARPCE, AI startups targeting the EU
Decision Type: Strategic

Quick Take: Algerian companies selling AI to European clients face mandatory third-party audits — building compliance capacity now is cheaper than retrofitting later, and positions Algeria-based AI firms as market-ready for the world’s strictest regulatory environment.
