Independent AI Detection Intelligence

The definitive benchmark for AI detection accuracy

Systematic, reproducible testing of every major AI text detector against human-written and AI-generated corpora across 5 content categories.

2,400

Text Samples Tested

Tools Benchmarked

93%

Best Accuracy Found

Vendor Relationships

Latest Benchmark Results

Updated March 2026 · 2,400 text samples · human + AI-generated

Tool	Domain	Accuracy	False Positive Rate	False Negative Rate	Latency	API
Proofademic AI	proofademic.ai	93%	5%	10%	390ms	✓
Originality.ai	originality.ai	91%	7%	11%	420ms	✓
Hive Moderation	thehive.ai	88%	9%	12%	340ms	✓
GPTZero	gptzero.me	87%	10%	15%	380ms	✓
ZeroGPT	zerogpt.com	83%	11%	19%	430ms	✓
Writer.com	writer.com	84%	8%	18%	290ms	✓
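
For readers who want to relate the three rate columns to raw counts, here is a minimal sketch of how such figures are commonly derived from a confusion matrix. It assumes "AI-generated" is the positive class, so a false positive is human text flagged as AI. The `Confusion` class, the `rates` function, and the counts in the usage lines are illustrative inventions, not the benchmark's own code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for one detector, with AI-generated text as the positive class."""
    true_pos: int   # AI text flagged as AI
    false_pos: int  # human text flagged as AI
    true_neg: int   # human text passed as human
    false_neg: int  # AI text passed as human


def rates(c: Confusion) -> dict:
    """Return accuracy, false positive rate, and false negative rate."""
    positives = c.true_pos + c.false_neg   # all AI-generated samples
    negatives = c.true_neg + c.false_pos   # all human-written samples
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one sample of each class")
    return {
        "accuracy": (c.true_pos + c.true_neg) / (positives + negatives),
        "false_positive_rate": c.false_pos / negatives,
        "false_negative_rate": c.false_neg / positives,
    }


# Made-up counts for illustration only: 100 AI samples, 100 human samples.
example = rates(Confusion(true_pos=90, false_pos=5, true_neg=95, false_neg=10))
print(example)
```

Note that accuracy depends on the mix of human and AI samples in the corpus, so it cannot be recomputed from the two error rates alone.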

How Detection Works

Core methodology signals

Perplexity

Statistical predictability of each token under a language model. AI text is characteristically low-perplexity because it is sampled from the same kind of probability distributions that detectors measure.
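
As a rough illustration of the perplexity signal, the sketch below scores text as the exponential of the average negative log-probability per token. It assumes a toy unigram model with add-one smoothing as a stand-in for the neural language model a real detector would use; `train_unigram` and `perplexity` are hypothetical names chosen for this example.

```python
import math
from collections import Counter
from typing import Callable, Sequence


def train_unigram(reference_tokens: Sequence[str]) -> Callable[[str], float]:
    """Build a smoothed unigram probability function from reference tokens.

    Assumption: a unigram model is only a stand-in; it ignores context.
    """
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens

    def prob(token: str) -> float:
        # Add-one smoothing keeps unseen tokens at a small nonzero probability.
        return (counts.get(token, 0) + 1) / (total + vocab_size)

    return prob


def perplexity(tokens: Sequence[str], prob: Callable[[str], float]) -> float:
    """exp of the mean negative log-probability of the tokens."""
    if not tokens:
        raise ValueError("cannot score an empty token sequence")
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


prob = train_unigram("the cat sat on the mat the cat sat".split())
predictable = perplexity(["the", "cat"], prob)
surprising = perplexity(["zebra", "quantum"], prob)
print(predictable < surprising)
```

Lower values mean the model found the text more predictable, which is the direction the text above associates with AI output.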

Burstiness

Variance in sentence-level perplexity. Human writing alternates between predictable and surprising passages; AI text has unnaturally uniform sentence perplexity.
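
One way to sketch burstiness is as the variance of per-sentence scores. The example below assumes the caller supplies a sentence-level perplexity function; the regex sentence splitter and the word-count scorer in the usage lines are crude, hypothetical stand-ins used only to make the example runnable.

```python
import re
import statistics
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def burstiness(text: str, sentence_perplexity: Callable[[str], float]) -> float:
    """Population variance of sentence-level scores; 0.0 if under two sentences."""
    scores = [sentence_perplexity(s) for s in split_sentences(text)]
    if len(scores) < 2:
        return 0.0
    return statistics.pvariance(scores)


def word_count_score(sentence: str) -> float:
    # Stand-in scorer (assumption): word count instead of a real perplexity model.
    return float(len(sentence.split()))


uniform = "One two three. Four five six. Seven eight nine."
varied = "Hi. This is a much longer sentence here."
print(burstiness(uniform, word_count_score), burstiness(varied, word_count_score))
```

With a real perplexity scorer plugged in, uniformly scored text would land near zero, while text that alternates between predictable and surprising sentences would score higher.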

Vocabulary

Type-token ratios, hapax legomenon rates, and characteristic overuse of transition phrases (“furthermore,” “it is worth noting”) are measurable AI signals.
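
The vocabulary signals named above can be computed with simple counting. This is a minimal sketch under stated assumptions: tokens are lowercase alphabetic runs, the hapax rate is taken per token (some definitions divide by distinct types instead), and the phrase list contains only the two examples quoted in the text. `vocabulary_signals` and `TRANSITION_PHRASES` are hypothetical names.

```python
import re
from collections import Counter

# Only the two phrases quoted in the text; a real list would be longer (assumption).
TRANSITION_PHRASES = ("furthermore", "it is worth noting")


def vocabulary_signals(text: str) -> dict:
    """Type-token ratio, hapax rate, and transition-phrase density for a text."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    if not tokens:
        raise ValueError("text contains no word tokens")
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)  # words used exactly once
    transition_hits = sum(lowered.count(p) for p in TRANSITION_PHRASES)
    return {
        "type_token_ratio": len(counts) / len(tokens),
        "hapax_rate": hapaxes / len(tokens),
        "transitions_per_100_tokens": 100 * transition_hits / len(tokens),
    }


print(vocabulary_signals("Furthermore, the cat sat. Furthermore, the dog ran."))
```

Type-token ratio is sensitive to text length, so comparisons are only meaningful between samples of similar size.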

Fingerprinting

Advanced detectors maintain per-model classifiers. GPT-4o, Claude, and Gemini each have characteristic structural patterns that model-specific detection can exploit.

Recent Research

Original studies & analysis

AI Humanizer Bypass Rates: 2025 Annual Survey

14 humanizer tools tested against 6 detectors. Bypass rates ranged from 23% to 91% depending on the humanizer-detector pairing. Average accuracy drop: 31 percentage points on humanized text.

March 2026 · 4,200 samples

Domain-Specific False Positive Rates

STEM academic writing produced a false positive rate (FPR) of 14–31% across all tested detectors. Legal writing: 11–26%. News journalism was lowest at 4–9%.

February 2026 · 2,400 samples

Voice Deepfake Detection Benchmark 2025

600 audio clips across 8 TTS systems. Hive Moderation led at 88% accuracy. All tools degraded significantly on expressive/emotional synthetic voice.

January 2026 · 600 clips

poignantguide.net is the original domain of Why’s (Poignant) Guide to Ruby (2003–2009), by _why the lucky stiff. The guide is preserved in full under CC BY-SA 2.5. | AI detection hub added 2024.
