The Authenticity Crisis Is No Longer Hypothetical
In January 2024, AI-generated robocalls impersonating President Biden urged New Hampshire voters to stay home during the primary. The political consultant behind the scheme faced a $6 million FCC fine and 13 felony counts of voter suppression. That same year, deepfake video of engineering firm Arup’s CFO and other executives was used in a video conference that tricked a Hong Kong finance employee into making 15 transfers totaling $25.6 million to fraudulent accounts. By 2026, AI-generated images, audio, and video have reached a quality threshold where human detection is functionally unreliable — a University of Waterloo study found that participants correctly identified AI-generated faces only 61% of the time, far below the 85% accuracy researchers expected.
The problem is no longer confined to high-profile incidents. AI-generated product reviews flood e-commerce platforms. Synthetic academic papers pollute research databases. AI-voiced phishing calls replicate the voices of family members. The volume of AI-generated content on the internet is growing exponentially — some widely cited estimates suggest that by 2026, up to 90% of online content could be synthetically generated or manipulated, though such projections remain contested.
Watermarking — embedding imperceptible signals into AI-generated content that identify its synthetic origin — has emerged as the primary technical response. But watermarking faces fundamental challenges: it must be robust enough to survive editing, compression, and adversarial attacks, yet imperceptible enough not to degrade content quality. The race to build reliable watermarking is now one of the most consequential technical challenges in AI governance.
The Technical Landscape: SynthID, C2PA, and Beyond
Google DeepMind’s SynthID is the most prominent AI watermarking system currently deployed at scale. Initially launched for images generated by Google’s Imagen model, SynthID has expanded to text (embedding the watermark by subtly adjusting token probabilities during generation), audio, and video — integrated across Gemini for text, Imagen for images, Lyria for audio, and Veo for video. The watermark is embedded during generation, modifying the output in ways that are statistically detectable by a verification algorithm but imperceptible to humans. By 2025, over 10 billion pieces of content had been watermarked with SynthID. Google open-sourced SynthID Text in October 2024 through Hugging Face and its Responsible GenAI Toolkit, with an accompanying peer-reviewed paper in Nature. In May 2025, a unified SynthID Detector was released for verifying watermark signals across media types.
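SynthID Text’s actual mechanism (tournament sampling, per the Nature paper) is more involved, but the core idea of token-probability watermarking can be sketched with a simplified “green list” scheme in the spirit of academic proposals like Kirchenbauer et al. Everything below is illustrative: the toy vocabulary and the uniform “model” are stand-ins, not real components of any deployed system.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # toy vocabulary (illustrative only)

def green_list(prev_token: str, key: str = "secret") -> set:
    """Seed a PRNG from the previous token plus a secret key, then mark
    half the vocabulary as 'green'. Detector and generator share the key."""
    seed = int.from_bytes(hashlib.sha256((key + prev_token).encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, len(VOCAB) // 2))

def generate(n_tokens: int, bias: float = 0.9) -> list:
    """Toy 'language model': samples uniformly, but steers toward green
    tokens with probability `bias`, leaving a statistical fingerprint."""
    out, prev = [], "<s>"
    rng = random.Random(0)
    for _ in range(n_tokens):
        pool = list(green_list(prev)) if rng.random() < bias else VOCAB
        tok = rng.choice(pool)
        out.append(tok)
        prev = tok
    return out

def green_fraction(tokens: list) -> float:
    """Detector: recompute each green list and measure how often the
    observed token falls in it. ~0.5 for unwatermarked text, higher if marked."""
    hits, prev = 0, "<s>"
    for tok in tokens:
        if tok in green_list(prev):
            hits += 1
        prev = tok
    return hits / len(tokens)
```

A watermarked sample of a few hundred tokens scores well above the 50% baseline expected of unmarked text, which is exactly the signal a verification algorithm tests for.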
The Coalition for Content Provenance and Authenticity (C2PA), co-founded in 2021 by Adobe, Arm, BBC, Intel, Microsoft, and Truepic, takes a different approach. Rather than embedding invisible watermarks, C2PA attaches cryptographically signed metadata — a “content credential” — that records how a piece of content was created and modified. Think of it as a digital chain of custody: a photo taken with a C2PA-enabled camera carries a signed record of capture. If it is edited in Photoshop (which now supports C2PA), the edits are appended to the credential. If an AI model generates or modifies the image, that action is recorded. The coalition has expanded rapidly — Google joined in February 2024, OpenAI in May 2024, Amazon in September 2024, and Meta joined the steering committee. The broader Content Authenticity Initiative reached 6,000 members by 2025, and C2PA launched its Conformance Program and official Trust List in mid-2025 to ensure interoperability across platforms and devices.
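The chain-of-custody concept can be sketched in a few lines. This is not the real C2PA manifest format (which uses certificate-based signatures and a standardized assertion structure); an HMAC with a demo key stands in for real signing, and the record fields are hypothetical.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # stand-in for a real signing certificate (assumption)

def sign(payload: dict) -> dict:
    """Attach an HMAC 'signature' over the canonical JSON of the payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {"payload": payload,
            "sig": hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()}

def verify(entry: dict) -> bool:
    """Recompute the signature and compare in constant time."""
    body = json.dumps(entry["payload"], sort_keys=True).encode()
    return hmac.compare_digest(
        entry["sig"], hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest())

def record_action(chain: list, action: str, content: bytes) -> list:
    """Append a signed record binding an action to the current content hash
    and to the previous record, forming a tamper-evident chain."""
    entry = sign({
        "action": action,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "prev_sig": chain[-1]["sig"] if chain else None,
    })
    return chain + [entry]

# Build a chain: capture, then an edit
img = b"raw pixels"
chain = record_action([], "captured:camera-model-x", img)
img = b"raw pixels, color graded"
chain = record_action(chain, "edited:photoshop", img)
assert all(verify(e) for e in chain)
```

Any alteration of a recorded action or content hash breaks verification, which is what makes the credential trustworthy; its weakness, as noted below, is that the whole chain can simply be discarded.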
These approaches are complementary, not competing. Watermarking works even when metadata is stripped (as happens when content is uploaded to most social media platforms, which discard EXIF and metadata). C2PA metadata provides richer provenance information but is fragile — anyone can strip it by simply re-saving the file. The most robust systems combine both: an embedded watermark as a fallback signal and C2PA metadata as a detailed provenance record when available.
Other approaches include fingerprinting (creating a hash of known AI-generated content and maintaining a database for lookup) and classifier-based detection (training AI to detect AI-generated content). OpenAI’s text classifier, launched in January 2023 and withdrawn six months later due to a true positive rate of just 26% and a 9% false positive rate, illustrated the difficulty of the classifier approach. More recent detectors, like GPTZero and Originality.ai, have improved but still produce false positives at rates that make them unsuitable for high-stakes decisions.
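The fingerprinting pattern, and its brittleness, can be shown in miniature. Production systems use perceptual hashes that tolerate re-encoding; this sketch deliberately uses an exact cryptographic hash to make the failure mode visible: change one byte and the lookup misses.

```python
import hashlib

class FingerprintDB:
    """Toy registry of known AI-generated content, keyed by content hash.
    Real systems use perceptual hashes robust to resizing and re-compression;
    with exact hashing (used here for simplicity) any modification defeats
    the lookup."""

    def __init__(self):
        self._known = {}

    def register(self, content: bytes, origin: str):
        self._known[hashlib.sha256(content).hexdigest()] = origin

    def lookup(self, content: bytes):
        return self._known.get(hashlib.sha256(content).hexdigest())

db = FingerprintDB()
db.register(b"generated image bytes", "demo-model")
print(db.lookup(b"generated image bytes"))   # exact copy: match
print(db.lookup(b"generated image bytes!"))  # one byte changed: no match (None)
```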
The Legislative Push: Mandates Are Coming — Unevenly
The EU AI Act includes explicit transparency obligations for AI-generated content under Article 50. Providers of AI systems that generate synthetic audio, image, video, or text must ensure outputs are marked in a machine-readable format. Deployers who publish AI-generated content that could be mistaken for authentic must disclose its artificial origin. These transparency provisions take effect on August 2, 2026 — making the EU the first jurisdiction with binding enforcement of AI content labeling mandates. To support implementation, the EU is developing a Code of Practice on transparency for AI-generated content, with a first draft released in late 2025 and finalization expected in mid-2026. However, a March 2025 study of 50 widely used AI image generators found that only 38% currently implement adequate watermarking and just 18% implement proper deepfake labeling — a gap the industry must close rapidly.
China moved earlier. Since January 2023, China’s Deep Synthesis Provisions — jointly issued by the Cyberspace Administration of China (CAC), the Ministry of Industry and Information Technology, and the Ministry of Public Security — require providers of AI-generated content to add identifiable watermarks and labels. The CAC has continued tightening requirements, releasing draft Measures for Labeling AI-generated Synthetic Content in September 2024 that would standardize labeling across all generative AI services.
In the United States, the legislative landscape is fragmented and shifting. The NO FAKES Act (Nurture Originals, Foster Art, and Keep Entertainment Safe), reintroduced in the 119th Congress in April 2025 as S.1367 with bipartisan Senate support, would create a federal right to control the use of one’s likeness in AI-generated content. It remains in the Senate Judiciary Committee.

At the state level, California’s AB 2655, signed into law in September 2024 to require large platforms to label or remove deceptive AI-generated election content, was subsequently struck down by a federal judge on Section 230 and First Amendment grounds, illustrating the legal fragility of state-level deepfake regulation.

At the federal level, Executive Order 14110 on AI Safety, signed in October 2023, had directed NIST to develop standards for AI content authentication, resulting in the NIST AI 100-4 report on reducing risks posed by synthetic content (finalized November 2024). But EO 14110 was revoked by President Trump in January 2025 and replaced by EO 14179 (“Removing Barriers to American Leadership in Artificial Intelligence”), which contains no comparable content authentication mandates. The NIST report remains a valuable technical reference, but the federal policy framework that commissioned it no longer exists. Notably, in January 2025, NSA and CISA published joint guidance on Content Credentials for multimedia integrity, a signal that national security agencies recognize the threat even as regulatory momentum has stalled.
The challenge is that watermarking mandates create asymmetric incentives. Legitimate AI providers (OpenAI, Google, Meta, Anthropic) will comply with watermarking requirements. Malicious actors using open-source models or custom-trained systems will not. A determined adversary can fine-tune an open-source model to generate content without watermarks, or use adversarial techniques to remove watermarks from generated content. Watermarking raises the floor — making casual misuse harder — but does not eliminate sophisticated threats.
Robustness, Adversarial Attacks, and the Limits of Watermarking
The fundamental technical challenge is robustness. A watermark must survive common transformations: JPEG compression, resizing, cropping, color adjustment, screenshot capture, and format conversion. For audio, it must survive re-recording, speed changes, and added background noise. For text, it must survive paraphrasing, translation, and partial rewriting. Research from the University of Maryland, led by Soheil Feizi, demonstrated that many current watermarking schemes can be defeated by relatively simple transformations. The UMD team reported breaking all “low perturbation” watermarks tested and identified a fundamental trade-off: a scheme can be hard to evade (adversaries cannot strip the watermark, keeping false negatives low) or hard to spoof (adversaries cannot falsely imprint it on authentic content, keeping false positives low), but strengthening one property weakens the other.
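Why watermarks survive some transformations but not others comes down to how much of the signal a correlation-style detector can recover. A minimal sketch, assuming a classic spread-spectrum scheme (a pseudorandom pattern added to a signal, detected by correlating against the keyed pattern), shows the watermark surviving heavy additive noise, the kind of degradation re-recording introduces, while remaining invisible without the key. The parameters here are arbitrary demo values.

```python
import random
import statistics

N, DELTA = 8192, 0.1  # signal length and embedding strength (demo values)

def make_pattern(key: int) -> list:
    """Keyed pseudorandom +/-1 pattern; only the key holder can regenerate it."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(N)]

def embed(signal: list, pattern: list) -> list:
    """Add a faint copy of the pattern to the host signal."""
    return [s + DELTA * p for s, p in zip(signal, pattern)]

def correlate(signal: list, pattern: list) -> float:
    """Detection statistic: mean(signal * pattern). Concentrates near DELTA
    if the watermark is present, near 0 otherwise (std ~ 1/sqrt(N))."""
    return statistics.fmean(s * p for s, p in zip(signal, pattern))

rng = random.Random(42)
host = [rng.gauss(0, 1) for _ in range(N)]   # unwatermarked host signal
pattern = make_pattern(key=7)
marked = embed(host, pattern)

# Simulate a lossy transformation: strong additive noise (e.g. re-recording)
noisy = [s + rng.gauss(0, 0.5) for s in marked]
```

The watermark correlation stays high even after the noise, while the unmarked host, or a detector using the wrong key, stays near zero. Heavier transformations (cropping away most samples, regeneration) shrink the recoverable correlation toward the noise floor, which is exactly where the robustness trade-offs described above bite.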
SynthID’s approach to text watermarking — biasing token selection probabilities during generation — is theoretically detectable but faces challenges with short texts (where the statistical signal is too weak) and with outputs that are subsequently edited by humans. A document that is 80% AI-generated and 20% human-edited may fall below the detection threshold. Adversarial attacks specifically designed to remove watermarks (by adding noise, regenerating content through a non-watermarked model, or using style transfer) further erode reliability.
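The short-text weakness is a plain statistics problem. Assuming a simplified green-list detector that tests the observed fraction of “green” tokens against the 50% expected by chance, the same observed bias yields wildly different confidence depending on length:

```python
import math

def z_score(green_hits: int, n_tokens: int) -> float:
    """Standard score of the observed green-token count against the null
    hypothesis that ~50% of tokens land in the green list by chance."""
    expected = n_tokens / 2
    std = math.sqrt(n_tokens * 0.25)  # binomial std under the null
    return (green_hits - expected) / std

# Same observed green fraction (60%), very different statistical confidence:
print(round(z_score(12, 20), 2))    # 20-token text:  z = 0.89 -> not significant
print(round(z_score(300, 500), 2))  # 500-token text: z = 4.47 -> strong signal
```

Human edits have the same diluting effect: replacing watermarked tokens with unbiased human text pulls the green fraction back toward 50%, so a heavily edited document can sink below any reasonable detection threshold.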
The emerging research consensus is that no single watermarking technique will be universally robust. Instead, the field is moving toward multi-layered approaches: combining embedded watermarks with metadata standards (C2PA), content fingerprint databases, and statistical detection methods. Hybrid architectures using Vision Transformers and diffusion models show promise for improved resistance to adversarial attacks. Provenance infrastructure — the systems that verify, store, and query authenticity information — may ultimately matter more than any individual watermark technology. The internet was built without an authenticity layer. Retrofitting one is among the most difficult technical and institutional challenges of the AI era.
🧭 Decision Radar
| Dimension | Assessment |
|---|---|
| Relevance for Algeria | Medium — Deepfake risks apply to Algerian elections, media, and commerce; content platforms operating in Algeria will need to comply with emerging standards |
| Infrastructure Ready? | Partial — Technical standards (C2PA, SynthID) are open and adoptable, but Algerian media and platforms have not yet implemented them |
| Skills Available? | Partial — Computer science programs cover the underlying cryptography and ML; specialized content provenance expertise is rare |
| Action Timeline | 12-18 months to adopt C2PA standards in Algerian media organizations; 2-3 years for regulatory frameworks |
| Key Stakeholders | Algerian media organizations, ARAV (broadcasting regulatory authority), Ministry of Communication, social media platforms operating in Algeria, election authorities |
| Decision Type | Monitor |
Quick Take: Algerian media organizations should begin evaluating C2PA integration for their published content now, before a crisis forces reactive adoption. The broadcasting regulator (ARAV) should track EU AI Act transparency provisions as a template for eventual Algerian requirements. Election authorities should develop deepfake incident response protocols before the next electoral cycle.
Sources & Further Reading
- SynthID: Watermarking AI-Generated Content — Google DeepMind
- C2PA: Providing Origins of Media Content — Coalition for Content Provenance and Authenticity
- Article 50: Transparency Obligations for AI Systems — EU AI Act
- NO FAKES Act of 2025 (S.1367) — US Congress
- NIST AI 100-4: Reducing Risks Posed by Synthetic Content — NIST
- Researchers Tested AI Watermarks and Broke All of Them — UMD TRAILS Institute
- Can You Tell AI-Generated People from Real Ones? — University of Waterloo
- Finance Worker Pays Out $25 Million After Video Call with Deepfake CFO — CNN
- Biden Deepfake Robocall: Charges and Fines — NPR
- Executive Order 14179: Removing Barriers to American Leadership in AI — Federal Register
- Missing the Mark: Adoption of Watermarking for Generative AI Systems — Rijsbosch et al.
- Content Credentials Momentum Across Social Media and AI Companies — Adobe Blog
- China: Provisions on Deep Synthesis Technology Enter into Effect — Library of Congress
- Strengthening Multimedia Integrity in the Generative AI Era — NSA/CISA