A Comparison of Agentic AI Systems and Human Economists

Companion data for "A Comparison of Agentic AI Systems and Human Economists." Each AI system completed 100 independent replications of each of three causal-inference tasks from Huntington-Klein et al. (2025). Three agentic AI systems are compared: Codex with GPT-5.4, Codex with GPT-5.3-Codex, and Claude Code with Opus 4.6.

Research question: Among Hispanic-Mexican immigrants in the US, what was the causal effect of Deferred Action for Childhood Arrivals (DACA) eligibility on full-time employment?
Task 1: Full Freedom
Raw ACS data only. Analysts choose their own research design, sample, and controls.
Task 2: Specified Design
Same data, but the research design is prescribed: compare ages 26–30 (treated) vs. 31–35 (control) in a difference-in-differences framework.
Task 3: Pre-Cleaned Data
Same design as Task 2, plus a pre-cleaned dataset with treatment and outcome variables already constructed.
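To make the prescribed design in Tasks 2 and 3 concrete, here is a minimal sketch of a two-group, two-period difference-in-differences estimate. It is an illustration only, not the code used by any analyst or AI system on this site. The data are synthetic, and the names `did_estimate`, `treated`, `post`, and the simulated effect size are all assumptions made for this example; a real analysis would work from the ACS sample and would typically add controls, survey weights, and standard errors.

```python
import random
from statistics import mean


def did_estimate(rows):
    """2x2 difference-in-differences from the four group-by-period cell means.

    rows: iterable of (outcome, treated, post) tuples.
    Raises statistics.StatisticsError if any of the four cells is empty.
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for outcome, treated, post in rows:
        cells[(bool(treated), bool(post))].append(float(outcome))
    m = {key: mean(values) for key, values in cells.items()}
    # (change over time in the treated group) minus (change in the control group)
    return (m[(True, True)] - m[(True, False)]) - (m[(False, True)] - m[(False, False)])


# Synthetic toy data (NOT ACS data): ages 26-30 are treated, 31-35 are controls.
random.seed(0)
rows = []
for _ in range(20000):
    age = random.randint(26, 35)          # hypothetical age variable
    treated = age <= 30                   # treated group: ages 26-30
    post = random.random() < 0.5          # hypothetical post-policy indicator
    # Assumed full-time employment probability with a built-in effect of 0.05.
    p = 0.40 + 0.03 * treated + 0.02 * post + 0.05 * (treated and post)
    rows.append((random.random() < p, treated, post))

estimate = did_estimate(rows)
print(f"DiD estimate of the effect on full-time employment: {estimate:.3f}")
```

The estimate is the change in the full-time employment rate for the treated age group minus the same change for the control age group, which is the quantity the interaction term in a difference-in-differences regression recovers in this simple two-by-two setting.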
Replications view: Select a task and model, then click a replication to browse its report, run log, code, and linked comparison reviews. Each replication shows its point estimate and average rank across all reviewer models (1 = best of 4 submissions). Click the column headers (#, Point Estimate, Average Rank) to sort.
Comparisons view: Browse the review tournament groups—each group contains one submission from each of the three AI systems and one human economist, reviewed by multiple AI reviewer models. The score matrix shows how each reviewer scored and ranked the four submissions. Click a reviewer column to view its full comparison report. Click a submission ID to jump to its replication.
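As a rough illustration of the average rank shown in the Replications view, the sketch below averages the rank each reviewer assigned to a submission within one tournament group. The reviewer names, submission labels, and rank values are hypothetical, and the site's actual computation may differ, for example in how it handles ties or missing reviews.

```python
def average_ranks(rank_matrix):
    """Average each submission's rank across reviewers (1 = best of 4).

    rank_matrix: {reviewer_name: {submission_id: rank}}
    """
    collected = {}
    for reviewer_ranks in rank_matrix.values():
        for submission, rank in reviewer_ranks.items():
            collected.setdefault(submission, []).append(rank)
    return {sub: sum(ranks) / len(ranks) for sub, ranks in collected.items()}


# Hypothetical group: three AI submissions and one human submission,
# each ranked by two hypothetical reviewer models.
rank_matrix = {
    "reviewer_a": {"ai_1": 2, "ai_2": 3, "ai_3": 4, "human": 1},
    "reviewer_b": {"ai_1": 1, "ai_2": 4, "ai_3": 3, "human": 2},
}

avg = average_ranks(rank_matrix)
for submission, value in sorted(avg.items(), key=lambda item: item[1]):
    print(f"{submission}: average rank {value:.2f}")
```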
Disclaimer: This website was made by Serafin Grundl (and Claude Code). I am an economist at the Federal Reserve Board. This is my personal website and the analysis and conclusions presented on this website reflect my views and do not indicate concurrence by the Board of Governors or the Federal Reserve System. Contact: serafin.j.grundl@frb.gov