The largest collection of (mis)information datasets
A curated collection of 75 misinformation datasets, together with a
unified setup for working with the 36 claim and statement datasets, available here.
Dataset Quality Assessment
We evaluated the quality of 36 datasets, identifying potential flaws such as insufficient label quality,
spurious correlations, and political bias. This helps researchers select datasets that are suitable for
their work.
Evaluation of Detection Models
Our paper provides state-of-the-art baselines for misinformation detection on
these datasets, demonstrating the limitations of categorical labels and suggesting alternative
evaluation methods.