Kathy Reid
@KathyReid@aus.social · 4 months ago

Really enjoyed @tonybaloney's talk at #pyconau on how to make your #LLM models faster in production.

Key takeaways are that smaller models are faster, and you need to make your models smaller through quantisation, distillation or semantic caching.

Really tractable, immediately implementable 👏👏

More of this, pls
