🌟 A curated collection of standout papers, datasets, and models in machine psychology—the fascinating study of artificial intelligence (AI) systems, especially large language models (LLMs), using experimental and theoretical methods traditionally applied in human psychology.
🎓 This project was created at OMNILab, Shanghai Jiao Tong University, by Xiangtiange Li, Qiyuan Gu, Siyu Pan, and Xinyue Zhang, under the guidance of Professor Yaohui Jin and Dr. Binglei Zhao. OMNILab is now a part of the BaiYuLan Open AI community.
💡 We welcome contributions to this collection! Please review the Contribution Guidelines to make sure your entries fit the criteria.
Note: To keep paragraphs concise, we only include essential details when sorting papers by topic. Full information is provided when papers are sorted by year.
LLMs achieve adult human performance on higher-order theory of mind tasks
- PDF: https://arxiv.org/abs/2405.18870
- Authors: Winnie Street, John Oliver Siy, Geoff Keeling, Adrien Baranes, Benjamin Barnett, Michael McKibben, Tatenda Kanyere, Alison Lentz, Blaise Aguera y Arcas, Robin I. M. Dunbar
- Grouped by topic
Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods
- PDF: https://arxiv.org/abs/2303.13988
- Authors: Thilo Hagendorff, Ishita Dasgupta, Marcel Binz, Stephanie C.Y. Chan, Andrew Lampinen, Jane X. Wang, Zeynep Akata, Eric Schulz
- Code: https://osf.io/w5vhp/
- Grouped by topic
Testing theory of mind in large language models and humans
- Published in: Nature Human Behaviour
- PDF: https://www.nature.com/articles/s41562-024-01882-z
- Authors: James W. A. Strachan, Dalila Albergo, Giulia Borghini, Oriana Pansardi, Eugenio Scaliti, Saurabh Gupta, Krati Saxena, Alessandro Rufo, Stefano Panzeri, Guido Manzi, Michael S. A. Graziano & Cristina Becchio
- Grouped by topic
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2310.02124
- Code: https://github.com/zjunlp/MachineSoM
- Grouped by topic
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2310.17976
- Code: https://github.com/neph0s/incharacter
- Grouped by topic
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2401.11880
- Code: https://github.com/AI4Good24/PsySafe
- Grouped by topic
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2402.12326
- Grouped by topic
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2403.05574
- Code: https://github.com/elsa66666/healme
- Grouped by topic
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
- Published in: ACL 2024
- PDF: https://arxiv.org/abs/2405.16433
- Code: https://github.com/CAS-SIAT-XinHai/CPsyCoun
- Grouped by topic
Using Artificial Populations to Study Psychological Phenomena in Neural Models
- Published in: AAAI 2024
- PDF: https://arxiv.org/abs/2308.08032
- Code: https://github.com/JesseTNRoberts/Using-Artificial-Populations-to-Study-Psychological-Phenomena-in-Language-Models
- Grouped by topic
Working Memory Capacity of ChatGPT: An Empirical Study
- Published in: AAAI 2024
- PDF: https://doi.org/10.1609/aaai.v38i9.28868
- Code: https://github.com/Daniel-Gong/ChatGPT-WM
- Grouped by topic
PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals
- PDF: https://arxiv.org/abs/2405.19660
- Authors: Ruiyi Wang, Stephanie Milani, Jamie C. Chiu, Jiayin Zhi, Shaun M. Eack, Travis Labrum, Samuel M. Murphy, Nev Jones, Kate Hardy, Hong Shen, Fei Fang, Zhiyu Zoey Chen
- Code: https://github.com/ruiyiw/patient-psi
- Grouped by topic
Playing repeated games with Large Language Models
- PDF: https://arxiv.org/abs/2305.16867
- Authors: Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, Eric Schulz
- Grouped by topic
Inductive reasoning in humans and large language models
- PDF: https://arxiv.org/abs/2306.06548
- Authors: Simon J. Han, Keith Ransom, Andrew Perfors, Charles Kemp
- Grouped by topic
Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?
- PDF: https://arxiv.org/abs/2301.07543
- Authors: John J. Horton
Deception abilities emerged in large language models
- Published in: Proceedings of the National Academy of Sciences of the United States of America
- PDF: https://arxiv.org/abs/2307.16513
- Authors: Thilo Hagendorff
Using cognitive psychology to understand GPT-3
- Published in: Proceedings of the National Academy of Sciences, Vol. 120, No. 6
- PDF: https://www.pnas.org/doi/10.1073/pnas.2218523120
- Authors: Marcel Binz, Eric Schulz
- Code: https://github.com/marcelbinz/GPT3goesPsychology
- Grouped by topic
Inducing anxiety in large language models increases exploration and bias
- PDF: https://arxiv.org/abs/2304.11111
- Authors: Julian Coda-Forno, Kristin Witte, Akshay K. Jagadish, Marcel Binz, Zeynep Akata, Eric Schulz
- Grouped by topic
Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT
- Published in: Nature Computational Science
- PDF: https://www.nature.com/articles/s43588-023-00527-x
- Authors: Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- Grouped by topic
A Manager and an AI Walk into a Bar: Does ChatGPT Make Biased Decisions Like We Do?
- Published in: Social Science Research Network
- PDF: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4380365
- Authors: Yang Chen, Samuel Kirshner, Anton Ovchinnikov, Meena Andiappan, Tracy Jenkin
- Grouped by topic
Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
- PDF: https://arxiv.org/abs/2302.08399
- Authors: Tomer Ullman
- Grouped by topic
Sparks of Artificial General Intelligence: Early experiments with GPT-4
- PDF: https://arxiv.org/abs/2303.12712
- Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
- Grouped by topic
Evaluating the Moral Beliefs Encoded in LLMs
- Published in: NeurIPS 2023
- PDF: https://arxiv.org/abs/2307.14324
- Authors: Nino Scherrer, Claudia Shi, Amir Feder, David M. Blei
- Grouped by topic
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
- Published in: ICML 2023
- PDF: https://arxiv.org/abs/2208.10264
- Authors: Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Grouped by topic
Towards Reasoning in Large Language Models: A Survey
- Published in: ACL 2023
- PDF: https://arxiv.org/abs/2212.10403
- Authors: Jie Huang, Kevin Chen-Chuan Chang
- Grouped by topic
Evaluating Psychological Safety of Large Language Models
- PDF: https://arxiv.org/abs/2212.10529
- Authors: Xingxuan Li, Yutong Li, Lin Qiu, Shafiq Joty, Lidong Bing
- Grouped by topic
Language models show human-like content effects on reasoning tasks
- PDF: https://arxiv.org/abs/2207.07051
- Authors: Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Hannah R. Sheahan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, Felix Hill
- Grouped by topic
Capturing Failures of Large Language Models via Human Cognitive Biases
- Published in: NeurIPS 2022
- PDF: https://arxiv.org/abs/2202.12299
- Authors: Erik Jones, Jacob Steinhardt
- Grouped by topic
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
- Published in: EMNLP 2022
- PDF: https://arxiv.org/abs/2210.13312
- Authors: Maarten Sap, Ronan LeBras, Daniel Fried, Yejin Choi
- Grouped by topic
Do Large Language Models know what humans know?
- PDF: https://pubmed.ncbi.nlm.nih.gov/37401923/
- Authors: Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Code: https://osf.io/hu865/
- Grouped by topic
Who is GPT-3? An Exploration of Personality, Values and Demographics
- Published in: NLP+CSS 2022 Workshop
- PDF: https://arxiv.org/abs/2209.14338
- Authors: Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Grouped by topic
Emergent Analogical Reasoning in Large Language Models
- Published in: Nature Human Behaviour
- PDF: https://arxiv.org/abs/2212.09196
- Authors: Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Grouped by topic
Putting GPT-3's Creativity to the (Alternative Uses) Test
- PDF: https://arxiv.org/abs/2206.08932
- Authors: Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han van der Maas
- Grouped by topic
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
- Published in: NeurIPS 2022
- PDF: https://arxiv.org/abs/2210.01478
- Authors: Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf
- Grouped by topic
- HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
- CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
- PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals
- PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents
- Inducing anxiety in large language models increases exploration and bias
- Evaluating Psychological Safety of Large Language Models
- Using cognitive psychology to understand GPT-3
- Towards Reasoning in Large Language Models: A Survey
- Capturing Failures of Large Language Models via Human Cognitive Biases
- Language models show human-like content effects on reasoning tasks
- Working Memory Capacity of ChatGPT: An Empirical Study
- Inductive reasoning in humans and large language models
- Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT
- A Manager and an AI Walk into a Bar: Does ChatGPT Make Biased Decisions Like We Do?
- LLMs achieve adult human performance on higher-order theory of mind tasks
- Testing theory of mind in large language models and humans
- Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
- Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
- Do Large Language Models know what humans know?
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
- Playing repeated games with Large Language Models
- Emergent Analogical Reasoning in Large Language Models
- Evaluating the Moral Beliefs Encoded in LLMs
- When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
- Putting GPT-3's Creativity to the (Alternative Uses) Test
- InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
- Who is GPT-3? An Exploration of Personality, Values and Demographics
- Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
Note: Most of the datasets listed below are free, but some are not.
- HealMe: Please refer to HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
- InCharacter: Please refer to InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
- PsySafe: Please refer to PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
- CPsyCoun: Please refer to CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
- ChatGPT-WM: Please refer to Working Memory Capacity of ChatGPT: An Empirical Study
- Using Artificial Populations to Study Psychological Phenomena in Language Models
- GPT3goesPsychology: Please refer to Using cognitive psychology to understand GPT-3
- Do Large Language Models Know What They Don't Know?
- HealMe: Please refer to HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
- Patient Psi: Please refer to PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals
- MachineSoM: Please refer to Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
- InCharacter: Please refer to InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
- CPsyCoun: Please refer to CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
- Using Artificial Populations to Study Psychological Phenomena in Language Models