Discussion
Christine Lemmer-Webber
@cwebber@social.coop · 15 hours ago

This blogpost makes an astoundingly good case about LLMs I hadn't considered before. The collapse of public forums (like Stack Overflow) for programming answers coincides directly with the rise of programmers asking for answers from chatbots *directly*. Those debugging sessions become part of a training set that now *only private LLM corporations have access to*. This is something that "open models" seemingly can't easily fight. https://michiel.buddingh.eu/enclosure-feedback-loop

elilla&: back to Néojaponisme: 改
@elilla@transmom.love replied · 1 hour ago

@cwebber I wonder what will be the solution to that.

for a while now I've been thinking of a return to a more "cathedral" way of doing things, specifically as a reaction against LLMs. software developed by small teams of trusted devs in private repos, communication structure that resembles BBSes and geocities webrings more than web 2.0 social media. obscure blogs in gemini capsules, any shadowed enough corner to escape the tendrils of the USA miltech complex. I don't see any solutions to slop that don't involve guarded human curation and networks of trust.

for documentation, I remember how it was before StackExchange; I used to learn everything from /usr/share/doc/HOWTO, from info libc, from carefully written technical documentation that was distributed offline and that you could read cover to cover. the occasional o'reilly book for more complex topics. maybe it's too much to expect a return to this mode of knowledge sharing, but it would make sense against the LLMs, I think—in the same way that I now have to care whether my supposedly open-source software accepts slop code or not, if I have decided I trust the developers of a given piece of software, presumably I also trust the documentation they provide.

it would be an even better result if the tsunami of slop led people to an increased appreciation of the work provided by tech writers, translators, designers and other non-programming labour, though even I am not optimistic enough to hope for that, sensible though it might be.

a return to a /usr/share/doc/HOWTO approach of learning would also entail a certain shift in how we deal with the code itself, prizing human intelligibility and right-sizedness over raw productivity, quantity-over-quality output, and ever-increasing levels of abstraction/virtualisation/frameworkification. you can't really know that the code isn't LLM trash unless the code is intelligible in the first place, and for that you want it to be the exact opposite of vibe-slop: concise, logical, readable, maintainable above all else. stability and tried-and-tested lines prioritised over chasing trendy features. an attitude less like linux, more like netbsd.

Mikołaj Hołysz
@miki@dragonscave.space replied · 2 hours ago

@cwebber This goes much, much wider than programming and LLMs.

In general, the open source world looks with disdain at all kinds of automated feedback collection mechanisms, which the Silicon Valley venture capital tech ecosystem has wholeheartedly embraced. OSS is still stuck in the 1990s mindset of "if there's a problem, somebody will report it to us", and that... just isn't true.

What we're stuck with is OSS solutions with inferior user experiences that nobody wants to use, instead of a compromise where OSS software collects more data than some people would like, but actually has users and makes a difference in the world.

To be fair, there are some good arguments against this (it's much easier to protect user privacy if the only contributors to your code are employees with background checks), but that doesn't make this less of a problem.

Trammell Hudson
@th@social.v.st replied · 3 hours ago

@cwebber yet another externality for the bot lickers to ignore when they say "ethical and environmental issues aside..." and praise the occasionally useful slop that the stochastic slot machine gives them as they burn billions of tokens in gas town.

Twobiscuits🚴‍♂️ :graz:
@twobiscuits@graz.social replied · 4 hours ago

@cwebber as in many other fields, we have to have real communities who care about stuff.

datarama
@datarama@hachyderm.io replied · 4 hours ago

@cwebber I've been saying this for a while. Bubble or not, our profession (and/or vocation, if you prefer) is screwed.

Christine Lemmer-Webber
@cwebber@social.coop replied · 4 hours ago

@datarama Possibly, though I worry less about professions/vocations than I do about user empowerment. I have long assumed that some day programmer salaries would be unsustainable.

Of course the irony is that many people are shilling LLM services as being empowerment systems. I see them as the opposite. Open, community developed LLMs could be, but LLM-as-a-service corporations are definitively not.

Cassandra is only carbon now
@xgranade@wandering.shop replied · 4 hours ago

@cwebber And here I was thinking "docs in Discord" was bad enough.

CM Thiede
@cmthiede@social.vivaldi.net replied · 4 hours ago

@cwebber I'm so glad I stumbled upon this thread, pointing out the fascist nature of the global AI race so many are calling the great democratizer. With enough critical thinkers, maybe civilization will come to its senses before hyperscale data centers become this era's pyramids to explore in the future.

Travis F W
@travisfw@fosstodon.org replied · 6 hours ago

yup.
Competition for dominance necessitates enclosure of the commons, limiting the use of extremely valuable common human dimensions to just the aggressive (or aggressively funded), precluding creative potentialities except only when championed by insiders in line with corporate financial models.
Over and over, humanity suffers profound losses.

Pino Carafa
@rozeboosje@masto.ai replied · 7 hours ago

@cwebber Maybe it's because I'm a bit long in the tooth. In 35 years of programming I have never hesitated to turn to others, including online forums.

But I will never turn to LLMs. LLMs are machines that regurgitate answers out of huge amounts of data. What LLMs lack is understanding. So they cannot justify their answers, pick the best answer for you out of their data, or meaningfully engage with you to help you adapt answers to your needs. You know... like humans can.

Leeloo
@leeloo@chaosfem.tw replied · 8 hours ago

@cwebber
Wait, if AI caused the collapse of wrong-answers-only sites like stackoverflow, doesn't that mean they have positive uses?

Michael Hartle
@mhartle@mastodon.online replied · 8 hours ago

@cwebber Well, people went to StackOverflow with a question and looked forward to answers based on the experience of others. While one can still ask an LLM and give a rubber-duck training session for its provider, I still fail to see the influx of answers based on experience.

René Schwietzke
@reneschwietzke@foojay.social replied · 8 hours ago

@cwebber I agree. There is less public information available for future training for anyone, and more look-alike code as AI-written software spreads into the open space as well. I expect an input standstill until someone invents non-LLM AI for coding.

michiel
@michiel@social.tchncs.de replied · 9 hours ago

@cwebber you calling it an 'astoundingly good case' makes me feel insightful in a way no LLM has been able to accomplish. I'm going to be insufferably smug for the rest of the day :)

Christine Lemmer-Webber
@cwebber@social.coop replied · 4 hours ago

@michiel Haha, you deserve it! An angle I hadn't considered; it really shook me up, and I've spent a ton of time thinking about it since.

Francis Cook
@dianshuo@mstdn.io replied · 9 hours ago

@cwebber this isn’t really new. For all the things on StackOverflow there was a huge domain of knowledge that was just not on there.

For most of my corporate developer life the knowledge/bug fixes would not be found on public forums but in internal collective knowledge, documentation or simply knowing a person in the same field. Most of this was not public domain.

The biggest issue now is that those firms commit their group knowledge to LLMs and we will not get it back.

mahadevank
@mahadevank@mastodon.social replied · 10 hours ago

@cwebber of course guys - it was never about the LLM, it was about crowd-sourcing intelligence at an epic scale. Every piece of code a developer writes and fixes becomes training data. Same with every conversation. I'm surprised people don't see the danger in having one single overlord and gatekeeper of all information in the world. It's crazy.

People seem to have forgotten what democracy and multilateralism really mean.

Allan Svelmøe Hansen
@svelmoe@hachyderm.io replied · 10 hours ago

@cwebber
One thing I've noticed along the same lines is that I talk much less about code with colleagues now, and have far fewer interactions with them for working through problems, which limits my exposure to alternate problem-solving.

Whenever I want to discuss a problem, it's more often than not boiled down to some LLM answer, meaning I might as well 'cut out' the middleman and ask the LLM itself if all I get are LLM answers anyway.
That truly sucks.
"Have you asked Claude/Copilot/ChatGPT?"...

Jens Finkhäuser
@jens@social.finkhaeuser.de replied · 10 hours ago

@cwebber I keep repeating this in such contexts, apologies if I sound like a broken record: AI is a fascist project.

The purpose isn't merely enclosure of the commons. Making public stuff private is more of a means to an end.

There are centuries of historical precedent showing that when a state has natural resources, it needs fewer people to extract wealth from them, and so to pay for what keeps rulers in power.

If a state has fewer resources, it has to rely on a large, healthy population's...

social elephant in the room
@tseitr@mastodon.sdf.org replied · 10 hours ago

@cwebber this article has some good predictions about the knowledge silos LLMs might be able to create, but it does not address the fact that the business model is not profitable. When the price hikes hit to make LLMs generate profits, people will have to balance the price of the subscription against the real value provided... just like any purchase. We might lose a few years of knowledge down the LLM silos before the collapse, but personally I think that is OK, we have plenty of good documented 1/2

Rachael L
@r343l@freeradical.zone replied · 12 hours ago

@cwebber Extra painful: bigger tech companies can afford to pay for plans that limit use of context data to train future versions of an LLM service's models so THEIR work is "protected" while their employees consume the commons. But smaller companies and individual users will be giving up their data.

May Likes Toronto
@mayintoronto@beige.party replied · 13 hours ago

@cwebber Sure didn't help that Stack Overflow was killing itself too.

Mita
@copystar@social.coop replied · 13 hours ago

@cwebber Thanks for sharing. To my ears, it rhymes with this response to the Microsoft-copilot-github encroachment: https://githubcopilotinvestigation.com/

Alex@rtnVFRmedia Suffolk UK
@vfrmedia@social.tchncs.de replied · 14 hours ago

@cwebber Already the forums for VOIP software and embedded stuff (Arduino etc.) are fully enshittified (they were toxic enough pre-AI), and folk have simply stopped contributing there. I think this happened just *before* LLMs became popular, so the quality of whatever does go into the training sets isn't going to be much good.

I suspect another factor is when people are getting paid for their work *and* depending on their employers upselling SaaS or other commercial services, they are less inclined to share stuff with the competition (I had to figure out my PJSIP trunk and securing it for myself, most of my findings are tooted here on Fedi as I'm not even sure where else to put them)

bonfire.cafe · A space for Bonfire maintainers and contributors to communicate
Bonfire social · 1.0.2-alpha.7