Deepfake Detection Platforms Evaluation

First cross-paradigm evaluation of six publicly accessible deepfake detection tools, covering both forensic analysis tools and AI-based classifiers

Python · active

A comparative evaluation framework for publicly accessible deepfake detection tools, assessing both forensic analysis tools (InVID & WeVerify, FotoForensics, Forensically) and AI-based classifiers (DecopyAI, FaceOnLive, Bitmind). The evaluation was conducted by professional investigators with law enforcement experience using blinded protocols across DF40, CelebDF, and CASIA-v2 datasets.

Key findings:

  • Forensic tools exhibit high recall but poor specificity
  • AI classifiers demonstrate the inverse pattern (high specificity, lower recall)
  • Human evaluators substantially outperform all automated tools
  • Human-AI disagreement is asymmetric, with human judgment prevailing in most discordant cases
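
The recall/specificity contrast in the findings above can be made concrete with a small sketch. It assumes a binary labelling in which 1 means "manipulated" (the positive class) and 0 means "authentic". The function name and the toy label vectors are hypothetical and chosen only to illustrate the two patterns; they are not results from the evaluation and not part of the framework's code.

```python
from typing import Sequence, Tuple


def recall_and_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (recall, specificity) for binary labels.

    Assumed convention (not taken from the framework): 1 = manipulated, 0 = authentic.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fakes correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fakes missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # authentic items cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # authentic items wrongly flagged

    # Guard against empty classes so the sketch never divides by zero.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return recall, specificity


# Hypothetical toy data: four manipulated items followed by four authentic ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# A tool that flags almost everything: high recall, poor specificity.
flags_everything = [1, 1, 1, 1, 1, 1, 1, 0]

# A tool that flags very little: high specificity, lower recall.
flags_little = [1, 0, 0, 0, 0, 0, 0, 0]

print(recall_and_specificity(y_true, flags_everything))
print(recall_and_specificity(y_true, flags_little))
```

Reporting both numbers side by side matters here because a single accuracy figure would hide the inverse error profiles of the two tool families.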