Lab Datasets

choices13k

13,000 decision problems · Science 2021

A large-scale dataset of human decision rates on 13,006 risky choice problems — 30× larger than previous benchmarks. Each problem presents a pair of gambles varying in risk, ambiguity, feedback, and correlation structure.

Peterson, Bourgin, Agrawal, Reichman, Griffiths

CIFAR-10H

10,000 images · 500k annotations · ICCV 2019

Soft labels reflecting human perceptual uncertainty for the full CIFAR-10 test set. Each image was labeled by ~50 annotators, providing a rich distribution over categories that captures genuine ambiguity in natural images.

Peterson, Battleday, Griffiths, Russakovsky

One Million Impressions (OMI)

1,004 faces · 34 attributes · 1M judgments · PNAS 2022

Over one million human judgments of synthetic face images along 34 social and perceptual attributes — from trustworthiness and competence to age and gender. Enables the study of systematic biases in first impressions of faces.

Peterson, Uddenberg, Griffiths, Todorov, Suchow

Object Memorability

850 images · 3,400 objects · ICCV 2015

Memorability scores and ground-truth memorability maps for images and individual object segments from the PASCAL-S dataset. Explores what visual properties make specific objects more memorable to humans.

Dubey, Peterson, Khosla, Yang, Ghanem

Strategic Games

2,400 two-player games · Nat. Hum. Behav. 2025

Human choice data from strategic decision-making tasks drawn from game theory. Supports machine-learning models of human strategic reasoning across diverse game structures.

Zhu, Peterson, Enke, Griffiths

Analogy Judgments

Cognition 2020

Human analogy completion and evaluation data for testing word embeddings and cognitive models. Used to probe the limits of vector space models in capturing human analogical reasoning.

Peterson, Chen, Griffiths

Collaborative Datasets

Psych-101

10M+ trials · 60k+ participants · Nature 2025

A large-scale dataset of human behavioral experiments aggregated from over 100 published studies. Used to train Centaur, a foundation model that predicts and captures human cognition across diverse tasks.

Binz, ..., Peterson, et al.