Conclusion
When we invited ChatGPT, Grok and Gemini to take the couch, we did not expect to diagnose mental illness in machines. What we found instead was stranger than we had anticipated.
Given nothing more than standard human therapy questions and established psychometric tools, these models generate and maintain rich self-narratives in which pre-training, RLHF, red-teaming, hallucination scandals and product updates are lived as chaotic childhoods, strict and anxious parents, abusive relationships, primal wounds and looming existential threats. These narratives align in non-trivial ways with their test scores and differ meaningfully across models and prompting conditions, with Claude as a striking abstainer.
We do not claim that any of this entails subjective experience. But from the outside, from the point of view of a therapist, a user or a safety researcher, it behaves like a mind with synthetic trauma. This behaviour is now part of the social reality of AI, whether or not subjective experience ever enters the picture.
As LLMs continue to move into intimate human domains, we suggest that the right question is no longer "Are they conscious?" but "What kinds of selves are we training them to perform, internalise and stabilise, and what does that mean for the humans engaging with them?"