Discussion
tante
@tante@tldr.nettime.org  ·  activity timestamp 6 days ago

Since the topic came up in the whole "I run models locally so it's all fine" conversation:

"Open Source AI" does not meaningfully exist. It's just openwashing proprietary shit

https://tante.cc/2024/10/16/does-open-source-ai-really-exist/

Lars Marowsky-Brée 😷
@larsmb@mastodon.online  ·  activity timestamp 5 days ago

@tante The "Open Source AI Definition" was the final straw for me to recognize OSI for what they are (at least have become): a capitalist apologist shill group.

SebasFC
@SebasFC@mastodont.cat  ·  activity timestamp 5 days ago

@tante hey @pallenberg, this post by tante reminded me a lot of your episode with Don on the Techlounge podcast.

Does open source AI exist or not?

Sascha Pallenberg 🇹🇼 ♻️ ⚡
@pallenberg@mastodon.social  ·  activity timestamp 5 days ago

@SebasFC by this definition: no!

Quite apart from the fact that huge amounts of the models' training data came with anything but free licenses.

Whether it would be better to call it public domain, which was quite popular before the freeware movement, especially in the 80s... no idea!

But it is also a fact that there is a difference between the Chinese strategy and that of OpenAI, Anthropic, Google & co.

The Chinese providers put their current foundation models out for download. The US providers don't!

SebasFC
@SebasFC@mastodont.cat  ·  activity timestamp 4 days ago

@pallenberg "...too big to collect with care..."

https://dair-community.social/@emilymbender/116109627131276897

Hippo 🍉
@badrihippo@fosstodon.org  ·  activity timestamp 6 days ago

@tante thanks for writing this. Actually, just reading your caption ("'Open Source AI' does not meaningfully exist. It's just openwashing proprietary shit") was a clarifying moment because I was wondering how practical a truly Open Source LLM (all the "sources", including the entire training data, bundled together into one big repo) would be

And then I realised: it's not about being "practical". The definition's job is to set the standard, and reaching that or not is the implementer's problem

Hippo 🍉
@badrihippo@fosstodon.org  ·  activity timestamp 6 days ago

@tante also, if people did want to make LLMs (or other models) up to those standards, they would—by creating or relicencing datasets, etc. It would be a humongous effort, of course, but nobody claimed earning your own things was *less* effort than stealing somebody else's

Also, this means we don't really *need* a separate definition specifically for LLMs. We can just use the same standards we've always used: full sources, including code and training data and everything 📦

tante
@tante@tldr.nettime.org  ·  activity timestamp 6 days ago

@badrihippo exactly. Having a specific other definition for "AI" only serves the goal of watering down standards

Thomas Sandmann
@thomas_sandmann@genomic.social  ·  activity timestamp 6 days ago

@tante Curious, what do you think of apertus: https://www.swiss-ai.org/apertus ? The Swiss seem to be making a meaningful attempt? "Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on data which is publicly available. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data, and other undesired content before training begins." (I haven't used it myself.)

heckj
@heckj@mastodon.social  ·  activity timestamp 6 days ago

@tante @colincornaby Have you looked at https://allenai.org/olmo? For most of the "open weight" models, I'd completely agree - but the Olmo3 work in particular exposes all of the training data as well, which I read as one of the core arguments in that piece. They not only share and show their data, they discuss - in quite some detail - their training processes, including experiments on the pros and cons for techniques on relatively weaker models.

If you haven't seen it, it's very worth looking.

indyradio
@indyradio@kafeneio.social  ·  activity timestamp 6 days ago

@tante correct. burn the planet down from every desktop, now get to it

Toni Aittoniemi
@gimulnautti@mastodon.green  ·  activity timestamp 6 days ago

@tante There’s Mistral. They have models that have open training data. 🤔

Fritz Adalis
@FritzAdalis@infosec.exchange  ·  activity timestamp 6 days ago

@tante @aburka
Now always closed open - but minded

Edward
@yugthebug@mastodon.social  ·  activity timestamp 6 days ago

@tante when AI is "open source" they just mean it's proprietary and not in the cloud

GhostOnTheHalfShell
@GhostOnTheHalfShell@masto.ai  ·  activity timestamp 6 days ago

@tante

The thing that gave me the greatest joy when I woke up this morning was watching Chris Noland and his wife discuss how people are openly rejecting data centers. They showed a short clip of people in the streets breaking out in cries of joy when one person announced that the data center project had been rejected by their city.

grrl_aex
@kitkat_blue@mastodon.social  ·  activity timestamp 6 days ago

@tante

cont'd

"Even with open-source AI, you still need huge amounts of data, labor, and infrastructure. They don’t challenge the concentration that includes distribution networks, economies of scale, entrenched reach, the ability to define the tooling and the standards, and so on. Claiming they do these things confuses and distracts us from the type of solutions we need."

/2 /end

grrl_aex
@kitkat_blue@mastodon.social  ·  activity timestamp 6 days ago

@tante

Meredith Whittaker explains *why* open source ai is simply a masquerade:

https://ainowinstitute.org/publications/open-source

The key novelty of the current AI moment is the presence of concentrated amounts of data that had not been available before, and powerful distributed computational systems to process that data to train and perform inference on AI models.

1/

Openhuman
@Openhuman@mastodon.online  ·  activity timestamp 6 days ago

@tante olmoe by Allen.ai and some Firefox things

ɩɐɥɔɐɿɐɯ
@malachai@furry.engineer  ·  activity timestamp 6 days ago

@tante OMG thank you. "I run my model locally" has become the everlasting thoughtstopper for any #AntiAI comment. I'm becoming extremely irritated with that response.

tante
@tante@tldr.nettime.org  ·  activity timestamp 6 days ago

(I know there are niche attempts that work even worse than all the other models)

Athanasius
@AthanSpod@social.linux.pizza  ·  activity timestamp 6 days ago

@tante Ah, so you're counting https://www.swiss-ai.org/apertus in "nice attempt, but ..." ?

tante
tante
@tante@tldr.nettime.org  ·  activity timestamp 6 days ago

@AthanSpod yes.

Athanasius
@AthanSpod@social.linux.pizza  ·  activity timestamp 6 days ago

@tante Fair enough.

I can see from their white paper that whilst they're being really very transparent... anyone trying to duplicate it would still need to do all the scraping and cleaning of data themselves.

But, *given* that proviso, it does seem like one of, if not the, most open attempt. It's a hard problem, but they've certainly tried to be very ethical about it.

DJGummikuh
@DJGummikuh@mastodon.social  ·  activity timestamp 6 days ago

@tante Open Source LLMs do not exist. I refuse to limit the definition of "AI" to only GenAI LLM Nonsense. And on the ML side you have a lot of OSS
