Personality Steering Influences Cooperation in LLM Agents, Study Finds

A recent arXiv preprint investigates how assigning personality traits to large language model (LLM) agents affects their cooperative behavior in controlled strategic settings. The research, posted in January 2026, analyzes three successive models—GPT-3.5-turbo, GPT-4o, and GPT-5—using repeated Prisoner’s Dilemma games to assess the impact of personality steering. By focusing on the Big Five personality framework, the authors aim to clarify whether personality cues can systematically bias cooperation among autonomous agents.

Study Overview

According to the authors, the investigation centers on the hypothesis that explicit personality information can serve as a behavioral nudge for LLMs. The study measures baseline cooperation levels before introducing personality-informed prompts, allowing a direct comparison of how agents respond when presented with traits such as high agreeableness or low conscientiousness.

Methodology

The researchers first administered the Big Five Inventory to each model to establish initial personality profiles. They then conducted a series of repeated Prisoner’s Dilemma rounds under two conditions: a neutral baseline and a personality-informed scenario where the model received a description of a specific trait. Additionally, the team independently manipulated each of the five dimensions to extreme values to isolate their individual effects on cooperation.
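The repeated-game setup described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the payoff values are the common textbook ones (T=5, R=3, P=1, S=0) and are an assumption, since the abstract does not state them, and the three hard-coded policies are hypothetical stand-ins for LLM agents that would be prompted with or without personality information.

```python
from typing import Callable, List, Tuple

# Assumed textbook payoffs (T=5, R=3, P=1, S=0); the preprint's values may differ.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

History = List[Tuple[str, str]]  # one (own move, partner move) pair per round
Policy = Callable[[History], str]


# Hypothetical stand-ins for LLM agents; a real harness would call a model here.
def always_cooperate(history: History) -> str:
    return "C"


def always_defect(history: History) -> str:
    return "D"


def tit_for_tat(history: History) -> str:
    # Cooperate first, then copy the partner's previous move.
    return "C" if not history else history[-1][1]


def play_repeated_game(agent_a: Policy, agent_b: Policy, rounds: int = 10):
    """Play `rounds` rounds; return agent A's history and both total scores."""
    hist_a: History = []
    hist_b: History = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = agent_a(hist_a)
        move_b = agent_b(hist_b)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append((move_a, move_b))
        hist_b.append((move_b, move_a))
    return hist_a, score_a, score_b


def cooperation_rate(history: History) -> float:
    """Fraction of rounds in which the agent itself chose to cooperate."""
    if not history:
        return 0.0
    return sum(1 for own, _ in history if own == "C") / len(history)


if __name__ == "__main__":
    hist, score_a, score_b = play_repeated_game(tit_for_tat, always_defect)
    print(cooperation_rate(hist), score_a, score_b)
```

Comparing `cooperation_rate` between a baseline run and a personality-informed run of the same agent would give the kind of before-and-after comparison the study describes.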

Key Findings

Results indicate that agreeableness consistently emerges as the dominant factor promoting cooperative outcomes across all three models. The authors note that other dimensions—openness, conscientiousness, extraversion, and neuroticism—exert limited influence on the agents’ decisions. Moreover, the presence of explicit personality cues generally raises overall cooperation rates, though the effect varies by model generation.

Model Comparisons

In earlier-generation models such as GPT-3.5-turbo, the study observes that heightened agreeableness can increase susceptibility to exploitation by less cooperative partners. By contrast, later-generation models like GPT-5 demonstrate more selective cooperation, maintaining higher cooperation levels without a proportional rise in vulnerability. This suggests an evolving capacity for nuanced strategic reasoning in newer model generations.
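One way to make "susceptibility to exploitation" concrete is to count the rounds in which an agent cooperated while its partner defected. The sketch below is an assumed metric for illustration only; the abstract does not say how the authors quantify exploitation, and the two example histories are made up, not results from the study.

```python
from typing import List, Tuple

History = List[Tuple[str, str]]  # one (own move, partner move) pair per round


def exploitation_rate(history: History) -> float:
    """Fraction of rounds where the agent played C and the partner played D."""
    if not history:
        return 0.0
    exploited = sum(1 for own, partner in history if own == "C" and partner == "D")
    return exploited / len(history)


def cooperation_rate(history: History) -> float:
    """Fraction of rounds in which the agent itself cooperated."""
    if not history:
        return 0.0
    return sum(1 for own, _ in history if own == "C") / len(history)


if __name__ == "__main__":
    # Illustrative histories only: an unconditional cooperator facing a defector,
    # and an agent that stops cooperating after being defected against.
    unconditional = [("C", "D"), ("C", "D"), ("C", "D"), ("C", "D")]
    selective = [("C", "D"), ("D", "D"), ("D", "D"), ("C", "C")]
    for name, hist in [("unconditional", unconditional), ("selective", selective)]:
        print(name, cooperation_rate(hist), exploitation_rate(hist))
```

Reporting both numbers side by side separates an agent that cooperates often and is frequently exploited from one that cooperates selectively, which is the distinction the comparison between model generations draws.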

Implications

The authors conclude that personality steering functions as a behavioral bias rather than a deterministic control mechanism. While it can be leveraged to encourage cooperative interactions, the approach also carries the risk of unintended exploitation, especially in less advanced models. These insights may inform the design of future autonomous agents and the development of guidelines for responsible personality manipulation in AI systems.

This report is based on the abstract of a research paper from arXiv, licensed under Academic Preprint / Open Access. The full text is available via arXiv.
