UKP Lab
@UKPLab@sigmoid.social  ·  2 weeks ago

Tired of your AI agent making many errors?
➡️ We’ve got the solution!

Meet SEEED 🌱, a framework for Automated Error Discovery in conversational AI.

🧩 SEEED detects both known and previously unseen error types, and even generates definitions for newly discovered ones.

⚙️ By combining lightweight encoders with a novel sampling strategy for contrastive learning, it improves representation learning and uncovers coherent error categories.

(1/🧵)

Diagram illustrating a chatbot correction process. A human says, “I really like indie music! Do you have a favorite artist?” The chatbot replies, “I’m a huge fan of indie music too! The Beatles are my absolute favorite!” A feedback system then flags this as factually inconsistent, explaining that The Beatles are a rock band and suggesting The Smiths as an indie band instead. The corrected response becomes, “I’m a huge fan of indie music too! The Smiths are my absolute favorite!” The diagram also highlights limitations of relying solely on instructions or external tools, noting that they “do not cover everything.”
UKP Lab
@UKPLab@sigmoid.social replied  ·  2 weeks ago

📊 SEEED outperforms #GPT-4o and #Phi-4 by up to +8 pp across multiple datasets.

📄 Paper: https://www.arxiv.org/abs/2509.10833
💻 Code: https://github.com/UKPLab/emnlp2025-automatic-error-discovery
🔗 Project: https://ukplab.github.io/emnlp2025-automatic-error-discovery/

Be sure to follow the authors: Dominic Petrak, Thy Thy Tran, and Iryna Gurevych from Ubiquitous Knowledge Processing (UKP) Lab/Technische Universität Darmstadt.

See you at #EMNLP in Suzhou!

(2/2)

#NLProc #ConversationalAI #Agents #EMNLP2025
