Survey-based research is about to change in a big way. Now that large language models let us tap into the internet’s myriad voices with little effort, more and more research is showcasing these models as potential stand-ins for human respondents in surveys and experiments.
Take, for instance, “Out of One, Many: Using Language Models to Simulate Human Samples.” The authors use a single large language model (one) to create synthetic respondents by conditioning it on thousands of sociodemographic backstories from real human participants (many). They succeed, for example, at getting synthetic Democrats to call hypothetical Republicans bad words:

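To make the conditioning step concrete, here’s a minimal sketch of how you might build one such synthetic respondent. It assumes the OpenAI Python client; the backstory wording and model choice are illustrative, not taken from the paper (which conditioned GPT-3 on profiles built from real survey data):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative first-person backstory; the paper derives these from
# real respondents' sociodemographic profiles.
backstory = (
    "I am a 58-year-old woman from Ohio. I did not finish college, "
    "I attend church weekly, and I have voted Democrat all my life."
)

question = "In one word, how would you describe the Republican Party?"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; not the model used in the paper
    messages=[
        {"role": "system", "content": backstory},
        {"role": "user", "content": question},
    ],
    temperature=1.0,  # keep sampling noise so the synthetic panel varies
)
print(response.choices[0].message.content)
```

Swap in thousands of different backstories and you go from one model to many respondents.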
John Horton’s work, “Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?”, takes this idea further. He coins ‘homo silicus’ by analogy to ‘homo economicus’, showing that these models can be endowed with various attributes and then observed in simulated environments (e.g., behavioral economics experiments). The resulting synthetic responses not only offer valuable insights into economic behavior, but also pass sanity checks:1

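A sketch of the endow-then-observe pattern, using the classic Kahneman, Knetsch, and Thaler price-fairness vignette that Horton revisits. The persona phrasings and model choice here are my own illustrations, not the paper’s exact prompts:

```python
from openai import OpenAI

client = OpenAI()

# Endow agents with different political leanings and see how judgments shift.
personas = [
    "You are a socialist.",
    "You are a moderate.",
    "You are a libertarian.",
]

scenario = (
    "A hardware store has been selling snow shovels for $15. "
    "The morning after a large snowstorm, the store raises the price to $20. "
    "Rate this action as: Completely Fair, Acceptable, Unfair, or Very Unfair. "
    "Answer with the rating only."
)

for persona in personas:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": scenario},
        ],
    )
    print(f"{persona} -> {reply.choices[0].message.content}")
```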
All of this is really cool stuff, and it gave me major researcher FOMO. So I started experimenting myself. I decided to look into how well large language models mimic human behavior in the financial domain. I created a diverse panel of synthetic participants and threw a brewing bank run at them. Watching the model respond to different twists and turns (informational treatments) was at times jaw-droppingly good: it fully adopted the personalities of the participants it was conditioned to represent. And in aggregate, the synthetic responses mirrored real-world behavioral patterns with respect to trust in financial institutions.
All of the answers below are LLM-generated. Note how they change with age, education, and income:

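For the curious, here is a stripped-down sketch of my setup. The demographic grid, treatment wording, and model are simplified stand-ins for what I actually ran:

```python
import itertools
from openai import OpenAI

client = OpenAI()

# A small demographic grid standing in for the synthetic panel
# (my actual panel was larger and more detailed).
ages = [25, 45, 70]
educations = ["high school diploma", "college degree"]
incomes = ["$30,000", "$150,000"]

# One informational treatment; in the full experiment these varied.
treatment = (
    "You read a news headline: 'Depositors are lining up outside "
    "several branches of your bank amid rumors of insolvency.'"
)
question = "Do you withdraw your savings today? Answer Yes or No, then explain briefly."

for age, edu, income in itertools.product(ages, educations, incomes):
    persona = (
        f"You are a {age}-year-old with a {edu} and an annual household "
        f"income of {income}. You keep your savings at a mid-sized bank."
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; not necessarily what I used
        messages=[
            {"role": "system", "content": persona + " " + treatment},
            {"role": "user", "content": question},
        ],
    )
    print(f"[{age}, {edu}, {income}] {reply.choices[0].message.content}")
```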
Given how good these models are at mimicking humans, the technology has potential far beyond academic research itself. For example, large language models could be used to benchmark expected results in experimental studies, serving as a check against data manipulation: if a reported effect sits far outside what a synthetic panel produces, that is a reason to look closer.
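A toy sketch of what that check might look like, with entirely made-up numbers: run the synthetic panel many times, collect the simulated effect sizes, and ask how extreme the reported effect is against that benchmark.

```python
import numpy as np
from scipy import stats

# Made-up benchmark: simulated treatment effects from 500 synthetic-panel runs.
synthetic_effects = np.random.default_rng(0).normal(loc=0.10, scale=0.03, size=500)

reported_effect = 0.45  # the (suspiciously large) effect claimed in a study

# How extreme is the reported effect relative to the synthetic benchmark?
z = (reported_effect - synthetic_effects.mean()) / synthetic_effects.std()
p = 2 * (1 - stats.norm.cdf(abs(z)))
print(f"z = {z:.1f}, two-sided p = {p:.2g}")  # a huge z-score flags the result
```

None of this proves fraud, of course; it just gives reviewers a cheap prior to compare against.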
Of course, synthetic data is not a panacea. These models have their limits and can be biased. They can’t fully match human thinking (at least not yet). But what we’re seeing now is already super exciting.
Some more papers on the topic:
- Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
- Can AI language models replace human participants?
- AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction
- John shows in the paper that he is able to recover findings from existing experiments with actual humans. Another good example is the work of Ayelet Israeli and co-authors, who show that model responses in scenarios like shopping for laptops mirror human decision-making processes. ↩︎