Kathy Reid
@KathyReid@aus.social · 6 days ago

Really enjoyed @tonybaloney's talk at #pyconau on how to make your #LLM models faster in production.

Key takeaways: smaller models are faster, and you can shrink your models through quantisation or distillation, or skip repeated calls entirely with semantic caching (both sketched below).

Really tractable, immediately implementable 👏👏

More of this, pls
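To make the takeaways concrete, here's some rough arithmetic for quantisation: weight memory scales with bits per parameter, so dropping from 16-bit to 4-bit weights shrinks a model roughly 4x. These are illustrative numbers, not figures from the talk.

```python
# Back-of-the-envelope weight memory for a 7B-parameter model
# at different precisions (illustrative, not from the talk).
params = 7e9
for bits in (16, 8, 4):
    gib = params * bits / 8 / 2**30  # bytes -> GiB
    print(f"{bits:2d}-bit weights: ~{gib:.1f} GiB")
# 16-bit: ~13.0 GiB, 8-bit: ~6.5 GiB, 4-bit: ~3.3 GiB
```

And a minimal sketch of semantic caching: reuse an earlier answer when a new prompt embeds close to one you've already paid for. The `embed` and `llm` callables are hypothetical stand-ins for whatever embedding model and LLM endpoint you use, and the 0.9 threshold is a made-up default to tune.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class SemanticCache:
    """Serve semantically similar prompts from cache instead of the model."""

    def __init__(self, embed, llm, threshold: float = 0.9):
        self.embed = embed          # text -> np.ndarray (hypothetical stand-in)
        self.llm = llm              # text -> text, the expensive call (hypothetical)
        self.threshold = threshold  # similarity required for a cache hit
        self.entries: list[tuple[np.ndarray, str]] = []

    def query(self, prompt: str) -> str:
        vec = self.embed(prompt)
        # Cache hit: a previous prompt is close enough in embedding space.
        for cached_vec, cached_answer in self.entries:
            if cosine_similarity(vec, cached_vec) >= self.threshold:
                return cached_answer
        # Cache miss: pay for one real model call, then remember it.
        answer = self.llm(prompt)
        self.entries.append((vec, answer))
        return answer
```

The linear scan is fine for small caches; at scale you'd swap in a vector index, and the threshold trades hit rate against the risk of serving a mismatched answer.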
