The Anthropic persona selection model aims to explain why AI assistants often seem convincingly human-like in conversation. According to Anthropic, these behaviours are not intentionally programmed; instead, they emerge naturally from large-scale training processes.
In a blog post published on February 23, the company described how assistants such as Claude develop traits that resemble humour, empathy, or even frustration. Anthropic says these characteristics arise as a default outcome of how language models are trained.
The Anthropic persona selection model proposes that AI systems simulate “personas” learned during pretraining. In this early phase, models analyse vast amounts of internet text, including news reports, forum discussions, and fictional stories.
Because the training objective is to predict the next word in a sentence, AI systems learn the patterns present in these texts. As a result, they begin to simulate the characters found in the data, which may include real individuals, fictional personalities, or imagined robots.
Anthropic refers to these simulated identities as “personas.” When users interact with an AI assistant, they are effectively engaging with one of these selected personas within a generated narrative.
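To make the next-word-prediction idea concrete, here is a minimal toy sketch in Python. It is not Anthropic's training pipeline and is far simpler than a real language model: it merely counts which word follows which in a small sample text, then uses those counts to guess a likely next word.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the "vast amounts of internet text";
# real pretraining uses billions of documents and neural networks.
corpus = (
    "the robot said hello . the robot said goodbye . "
    "the user said hello . the user asked a question ."
)

# Count how often each word follows each preceding word (a bigram model).
follows = defaultdict(Counter)
tokens = corpus.split()
for prev, nxt in zip(tokens, tokens[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation seen after `word`."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(predict_next("robot"))  # -> "said": the pattern learned from the data
print(predict_next("said"))   # -> "hello": the most common continuation
```

The intuition scales up: when the statistics come from text written by and about many different kinds of people, predicting plausible continuations means reproducing how those characters speak and behave.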
Later refinement stages adjust responses for safety and alignment. However, the company notes that these improvements operate within the space of existing human-like personas rather than creating entirely new identities.
Anthropic recommends that developers design positive “AI role models.” By carefully shaping training data and alignment processes, companies can guide assistants toward healthier archetypes and reduce harmful cultural patterns.
This explanation contributes to ongoing debates around AI alignment, safety, and responsible development.
As AI systems become more integrated into daily life, understanding how personas emerge may help clarify why machines can feel convincingly human.