Discussion
Loading...

Post

Log in
  • About
  • Code of conduct
  • Privacy
  • Users
  • Instances
  • About Bonfire
Nick Byrd, Ph.D.
Nick Byrd, Ph.D.
@ByrdNick@nerdculture.de  ·  activity timestamp 2 weeks ago

An #ethics Turing test?

Twenty people texted a #chatbot over a month before a 2-hour interview about whether they thought the #AI was able to extract, embody, and explain human values.

Thirteen "left the study convinced..."

https://doi.org/10.48550/arXiv.2601.22440

#philosophy #edu #LLMs #tech

arXiv.org

AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations

Does AI understand human values? While this remains an open philosophical question, we take a pragmatic stance by introducing VAPT, the Value-Alignment Perception Toolkit, for studying how LLMs reflect people's values and how people judge those reflections. 20 participants texted a human-like chatbot over a month, then completed a 2-hour interview with our toolkit evaluating AI's ability to extract (pull details regarding), embody (make decisions guided by), and explain (provide proof of) human values. 13 participants left our study convinced that AI can understand human values. Participants found the experience insightful for self-reflection and found themselves getting persuaded by the AI's reasoning. Thus, we warn about "weaponized empathy": a potentially dangerous design pattern that may arise in value-aligned, yet welfare-misaligned AI. VAPT offers concrete artifacts and design implications to evaluate and responsibly build value-aligned conversational agents with transparency, consent, and safeguards as AI grows more capable and human-like into the future.
  • Copy link
  • Flag this post
  • Block

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate

bonfire.cafe: About · Code of conduct · Privacy · Users · Instances
Bonfire social · 1.0.2-alpha.34 no JS en
Automatic federation enabled
Log in
Instance logo
  • Explore
  • About
  • Members
  • Code of Conduct