Discussion
Hendrik Weimer
@hweimer@fediscience.org · 7 days ago

Large language models are essentially solutions to nonlinear optimization problems to predict the next token in the output text. Mathematically, these problems are quite similar to problems we routinely face in physics, like finding the ground state of a quantum system.
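To make the "optimization problem" framing concrete, here is a minimal sketch (a hypothetical toy, nothing like a real LLM): a bigram next-token predictor over a three-token vocabulary, whose logit matrix is fit by gradient descent on the cross-entropy of the next token.

```python
import numpy as np

# Toy "text" over a 3-token vocabulary; next-token prediction reduces to
# fitting logits W so that softmax(W[prev]) matches the observed next token.
tokens = np.array([0, 1, 2] * 67)
V = 3
W = np.zeros((V, V))  # W[prev, next]: the model parameters (bigram logits)
prev, nxt = tokens[:-1], tokens[1:]

def loss(W):
    """Average cross-entropy of predicting the next token from the previous."""
    logits = W[prev]
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(prev)), nxt].mean()

# Plain gradient descent on the (nonlinear, non-convex in general) objective.
for _ in range(500):
    probs = np.exp(W[prev])
    probs /= probs.sum(axis=1, keepdims=True)
    grad = np.zeros_like(W)
    np.add.at(grad, prev, probs)        # gradient of the log-normalizer
    np.add.at(grad, (prev, nxt), -1.0)  # gradient of the target logit
    W -= 0.5 * grad / len(prev)
```

A transformer replaces the lookup table `W[prev]` with a far richer parametrized function of the whole context, but the objective being minimized is the same kind of thing.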

In these physics problems, we observe a recurring pattern: first, someone comes up with a new class of solutions (we call this a "variational ansatz") and there is tremendous progress, making it possible to solve previously hard problems in a near-miraculous way. However, once the low-hanging fruit has been picked, the remaining problems stay hard. Throwing vastly more computing power at them helps a little, but the returns diminish quickly.
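The variational idea can be sketched in a few lines (a toy example, with a hypothetical single-spin Hamiltonian chosen so the answer is known exactly): pick a parametrized trial state, compute the energy expectation value, and minimize over the parameters, which yields an upper bound on the ground-state energy.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy Hamiltonian: a single spin-1/2 in a transverse field, H = -Z - 0.5 X.
# Its exact ground-state energy is -sqrt(1 + 0.25).
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
H = -Z - 0.5 * X

def energy(theta):
    # Variational ansatz: |psi(theta)> = (cos(theta/2), sin(theta/2)),
    # a normalized trial state with a single parameter.
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return psi @ H @ psi  # expectation value <psi|H|psi>

# Minimizing over the ansatz parameter gives an upper bound on the true
# ground-state energy; here the ansatz happens to be exact.
res = minimize_scalar(energy, bounds=(0, np.pi), method="bounded")
```

The catch, as described above, is that for hard problems no known ansatz comes close, and enlarging the ansatz (more parameters, more compute) buys less and less.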

I'm pretty sure that's exactly what has happened with the advent of transformers for natural language processing.

#ai #llm #physics #quantum

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate
