SurvAI Hub

About SurvAI Hub

What is this database?

SurvAI Hub is a repository of empirical studies that use large language models (LLMs) in one or more phases of the survey and public opinion research process. Each entry is coded with standardized metadata so that research designs can be compared systematically across studies. The categories are described under Coding Scheme below. Entries were identified through keyword searches in three scientific databases and comprise English-language empirical studies from 2019 to February 4, 2025, that use generative LLMs in the research process to gain an understanding of human attitudes or behaviors. For details on the search strategy and eligibility criteria, as well as a systematic review of the studies, see the accompanying paper:

von der Heyde, L., Keusch, F., Buskirk, T., & Eck, A. (2026). AI in the Loop?! A Systematic Review of the Use of Large Language Models in Survey and Public Opinion Research. https://doi.org/10.31235/osf.io/eubj4_v1

How to use this database

Use the search bar to find papers by author, title, abstract, or LLM used. Use the filters to narrow the results by publication type and year, research phase, task, LLM family, prompting approach, and more. Multiple values can be selected within each filter. Click "Show Details" on any card to see the abstract, exact LLM(s) used, domain, citation, and DOI. Clicking the paper title opens the DOI link.
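For readers who want to reproduce this kind of search and filtering on their own copy of the metadata, the following is a minimal Python sketch, not the site's actual implementation. It assumes that the search is a case-insensitive substring match over author, title, abstract, and LLM, that selected values within one filter are combined with OR, and that different filters are combined with AND. The record keys (`authors`, `title`, `abstract`, `llm`, `llm_family`, `phase`, `year`), the `matches` function, and the two example records are all hypothetical.

```python
from typing import Optional

# Fields covered by the free-text search (assumed key names).
SEARCH_FIELDS = ("authors", "title", "abstract", "llm")


def matches(record: dict, query: str = "", filters: Optional[dict] = None) -> bool:
    """Return True if a record passes the search query and all active filters."""
    # Build one lowercase string from the searchable fields.
    parts = []
    for name in SEARCH_FIELDS:
        value = record.get(name, "")
        parts.append(" ".join(value) if isinstance(value, list) else str(value))
    text = " ".join(parts).lower()

    if query and query.lower() not in text:
        return False

    # Assumption: OR within a filter, AND across filters.
    for field_name, selected in (filters or {}).items():
        if not selected:
            continue  # an empty selection means the filter is inactive
        values = record.get(field_name, [])
        if not isinstance(values, list):
            values = [values]
        if not set(values) & set(selected):
            return False
    return True


# Invented placeholder records, for illustration only.
papers = [
    {"title": "Example study A", "authors": "Doe, J.", "abstract": "Placeholder.",
     "llm": ["GPT-4"], "llm_family": ["GPT"], "phase": ["after"], "year": 2024},
    {"title": "Example study B", "authors": "Roe, R.", "abstract": "Placeholder.",
     "llm": ["Llama-3.1"], "llm_family": ["Llama"], "phase": ["before", "during"],
     "year": 2023},
]

hits = [p["title"] for p in papers
        if matches(p, query="gpt", filters={"phase": ["after", "during"]})]
print(hits)
```

With these placeholder records, only the first one passes both the query and the phase filter. Whether the real interface combines filters in exactly this way is not stated on this page.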

Design inspired by Jens Rupprecht (database forthcoming) at the University of Mannheim.

Coding Scheme

Every entry uses the following standardized coding scheme:

Field	Type	Description
Publication type	categorical	Type of publication: journal article, preprint, or conference paper
Phase	categorical (multi)	Research phase(s) in which the LLM was used: before, during, or after data collection
Task	categorical (multi)	Specific research task(s) the LLM was used for
Domain	categorical (multi)	Academic domain the study was conducted in (may be missing for some entries)
LLM family	categorical (multi)	LLM family/families used (e.g., GPT, Llama, Claude, etc.)
LLM	text (multi)	Exact model(s) used (e.g., GPT-4, Llama-3.1, etc.)
Interaction approach	categorical (multi)	How the LLM was prompted or adapted: zero-shot, one-/few-shot, or fine-tuning
Language	categorical (multi)	Language(s) of the prompt or LLM input data
Population	categorical (multi)	Population(s) the input data came from or the output refers to. Note that this may be only a subpopulation of the named country.
Data type	categorical (multi)	Type(s) of input data used: survey, social media, open-ended (e.g., interviews, free-text responses), online reviews
Silicon sampling	boolean	Whether the study uses silicon sampling (coded only for studies with a data generation task)
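The coding scheme above can be pictured as a typed record. The following Python sketch is one possible representation, not the database's actual schema: the class name `StudyEntry`, the attribute names, the `validate` helper, the task label "data generation", and the example values are assumptions made for illustration. Only the field list, the field types, and the category values for publication type and phase are taken from the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Category values as listed in the coding scheme table.
PUBLICATION_TYPES = {"journal article", "preprint", "conference paper"}
PHASES = {"before", "during", "after"}


@dataclass
class StudyEntry:
    """Hypothetical record with one attribute per field of the coding scheme."""
    publication_type: str
    phase: List[str]
    task: List[str]
    llm_family: List[str]
    llm: List[str]
    interaction_approach: List[str]
    language: List[str]
    population: List[str]
    data_type: List[str]
    domain: List[str] = field(default_factory=list)  # may be missing
    silicon_sampling: Optional[bool] = None  # coded for data generation studies only


def validate(entry: StudyEntry) -> List[str]:
    """Return a list of problems found in an entry (empty if none)."""
    problems = []
    if entry.publication_type not in PUBLICATION_TYPES:
        problems.append(f"unknown publication type: {entry.publication_type}")
    unknown_phases = set(entry.phase) - PHASES
    if unknown_phases:
        problems.append(f"unknown phase(s): {sorted(unknown_phases)}")
    # Assumption: the relevant task is labeled "data generation".
    if entry.silicon_sampling is not None and "data generation" not in entry.task:
        problems.append("silicon sampling coded without a data generation task")
    return problems


# Invented example values, not an actual database entry.
example = StudyEntry(
    publication_type="preprint",
    phase=["during"],
    task=["data generation"],
    llm_family=["GPT"],
    llm=["GPT-4"],
    interaction_approach=["zero-shot"],
    language=["English"],
    population=["United States"],
    data_type=["survey"],
    silicon_sampling=True,
)
print(validate(example))
```

Fields marked "multi" in the table are modeled as lists, and the silicon sampling flag is optional because it applies only to a subset of studies.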

Missing a study?

The database currently features only the studies included in the accompanying systematic review. We aim to update it soon and to accept submissions that meet the same eligibility criteria and are coded with the same metadata.
