Accuracy Benchmarks

Per-modality evaluation results for each detector and for the combined ensemble. All benchmarks use held-out test splits; none of the test data was used for training.

Real-world accuracy varies with generator novelty, content type, and obfuscation. These figures were measured on curated data, so treat them as upper bounds.
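To make the upper-bound caveat concrete, here is a minimal sketch of one effect that follows from the metric definitions alone: precision depends on how common AI-generated content is in the material being scanned. The function name and the prevalence values are illustrative assumptions. The recall and FPR are the text ensemble's figures (97.0% and 3.0%) from the Text section.

```python
def precision_at_prevalence(recall: float, fpr: float, prevalence: float) -> float:
    """Expected precision when a fraction `prevalence` of inputs is AI-generated.

    Hypothetical helper for illustration; it assumes recall and FPR stay
    constant as the class balance changes.
    """
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    true_positive_mass = recall * prevalence        # AI-generated and flagged
    false_positive_mass = fpr * (1.0 - prevalence)  # human-made but flagged
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Text ensemble figures from this page: recall 97.0%, FPR 3.0%.
# The prevalence values are assumptions chosen for illustration.
for prevalence in (0.5, 0.1, 0.01):
    p = precision_at_prevalence(recall=0.97, fpr=0.03, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f}  precision={p:.3f}")
```

Under these assumptions, precision is about 0.970 on a balanced mix, about 0.782 when 10% of inputs are AI-generated, and about 0.246 at 1%. The recall and FPR are held fixed here, so the sketch isolates class balance and ignores the other factors named above.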

Text

PAN25, PERSUADE 2.0, M4
| Model | AUC-ROC | Precision | Recall | F1 | FPR |
|---|---|---|---|---|---|
| RoBERTa-base-openai-detector | 0.97 | 95.0% | 94.0% | 0.945 | 4.0% |
| Binoculars (perplexity/cross-perplexity) | 0.96 | 93.0% | 96.0% | 0.945 | 5.0% |
| Gemini 2.0 Flash (ensemble head) | 0.95 | 94.0% | 93.0% | 0.935 | 5.0% |
| Ensemble (all combined) | 0.98 | 96.0% | 97.0% | 0.965 | 3.0% |

Evaluated on 50K samples across GPT-4, Claude 3, Gemini, Llama-3, Mistral.
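The page does not define the table columns, so the following is a minimal sketch assuming the standard confusion-matrix definitions, with "AI-generated" as the positive class. The `Confusion` class and the counts in the usage lines are hypothetical. Only the precision and recall passed to `f1` come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts; positive class = AI-generated (assumed)."""
    tp: int  # AI-generated, flagged
    fp: int  # human-made, flagged
    tn: int  # human-made, not flagged
    fn: int  # AI-generated, not flagged


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def fpr(c: Confusion) -> float:
    return c.fp / (c.fp + c.tn)


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts, not taken from the benchmark.
example = Confusion(tp=970, fp=30, tn=970, fn=30)
print(precision(example), recall(example), fpr(example))

# Consistency check against the ensemble row: precision 96.0%, recall 97.0%.
print(round(f1(0.96, 0.97), 3))
```

The last line reproduces the ensemble's reported F1 of 0.965. The same check can be applied to any row in the tables on this page.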

Image

CIFAKE, GenImage, FaceForensics++
| Model | AUC-ROC | Precision | Recall | F1 | FPR |
|---|---|---|---|---|---|
| ViT-based classifier (fine-tuned) | 0.94 | 91.0% | 93.0% | 0.920 | 7.0% |
| CLIP embedding similarity | 0.89 | 87.0% | 89.0% | 0.880 | 10.0% |
| Pixel integrity + frequency domain | 0.85 | 83.0% | 86.0% | 0.845 | 13.0% |
| Grok Vision (RAG-augmented) | 0.92 | 90.0% | 91.0% | 0.905 | 8.0% |
| Ensemble (all combined) | 0.96 | 93.0% | 94.0% | 0.935 | 5.0% |

Evaluated on 40K images: Midjourney v6, DALL-E 3, Stable Diffusion XL, Firefly.

Audio

ASVspoof 2019/2021, ADD 2023
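| Model | AUC-ROC | Precision | Recall | F1 | FPR |
|---|---|---|---|---|---|
| wav2vec2 (fine-tuned, ASVspoof) | 0.93 | 91.0% | 92.0% | 0.915 | 7.0% |
| Spectral feature analysis | 0.87 | 85.0% | 86.0% | 0.855 | 12.0% |
| SynthID local watermark check | 0.82 | 88.0% | 78.0% | 0.827 | 5.0% |
| Ensemble (all combined) | 0.95 | 92.0% | 93.0% | 0.925 | 6.0% |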

Evaluated on 30K clips: ElevenLabs, Bark, VALL-E, YourTTS, RVC clones.

Video

FaceForensics++, DFDC Preview
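| Model | AUC-ROC | Precision | Recall | F1 | FPR |
|---|---|---|---|---|---|
| NVIDIA NIM deepfake detection | 0.91 | 89.0% | 90.0% | 0.895 | 9.0% |
| Frame-level ViT ensemble | 0.88 | 86.0% | 87.0% | 0.865 | 11.0% |
| Temporal consistency analysis | 0.83 | 82.0% | 83.0% | 0.825 | 15.0% |
| Ensemble (all combined) | 0.93 | 91.0% | 90.0% | 0.905 | 8.0% |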

Evaluated on 8K clips: Sora, Kling, Runway Gen-3, DeepFaceLab.

Evaluation Datasets

| Modality | Dataset | Size |
|---|---|---|
| Text | PAN25 Authorship Verification | ~500K samples |
| Text | PERSUADE Corpus 2.0 | ~25K essays |
| Text | M4 Benchmark | 122K samples |
| Image | CIFAKE | 120K images |
| Image | GenImage | 1.3M images |
| Audio | ASVspoof 2019 (LA track) | 121K clips |
| Audio | ASVspoof 2021 | 181K clips |
| Audio | ADD 2023 | ~330K clips |
| Video | FaceForensics++ | 5K videos |
| Video | DFDC Preview Dataset (Meta) | 19K videos |

Full results, with confidence intervals and per-generator breakdowns, are available as CSV.
