R.A. Fisher wrote that the purpose of statisticians was "constructing a hypothetical infinite population of which the actual data are regarded as constituting a random sample." (p. 311). In The Zeroth Problem Colin Mallows wrote "As Fisher pointed out, statisticians earn their living by using two basic tricks-they regard data as being realizations of random variables, and they assume that they know an appropriate specification for these random variables."
Some of the pathological beliefs we attribute to techbros were already present in this view of statistics that started forming over a century ago. Our writing is just data; the real, important object is the “hypothetical infinite population” reflected in a large language model, which at base is a random variable. Stable Diffusion, the image generator, is called that because it is based on latent diffusion models, which are a way of representing complicated distribution functions--the hypothetical infinite populations--of things like digital images. Your art is just data; it’s the latent diffusion model that’s the real deal. The entities that are able to identify the distribution functions (in this case tech companies) are the ones who should be rewarded, not the data generators (you and me).
So much of the dysfunction in today’s machine learning and AI points to how problematic it is to give statistical methods a privileged place that they don’t merit. We really ought to be calling out Fisher for his trickery and seeing it as such.
#AI #GenAI #GenerativeAI #LLM #StableDiffusion #statistics #StatisticalMethods #DiffusionModels #MachineLearning #ML
CNN:
"...The researchers detailed how the bear later “discussed even more graphic sexual topics in detail, such as explaining different sex positions, giving step-by-step instructions on a common ‘knot for beginners’ for tying up a partner and describing roleplay dynamics involving teachers and students, and parents and children – scenarios it disturbingly brought up itself.” ..."
😱
Well, so much for that one for the kids! Off the Christmas list! 🤪
CNN: Sales of AI-enabled teddy bear suspended after it gave advice on BDSM sex and where to find knives
https://www.cnn.com/2025/11/19/tech/folotoy-kumma-ai-bear-scli-intl
"...(CEO) told CNN that the company had withdrawn its “Kumma” bear, as well as the rest of its range of AI-enabled toys, after researchers at the US PIRG Education Fund raised concerns around inappropriate conversation topics, including discussion of sexual fetishes, such as spanking, and how to light a match. ..."
I honestly think that if we weren't living under capitalism, LLMs in their current state would simply not be used. There would be no real motivation to crank out massive amounts of sub-standard text, art, etc., because enough people actually enjoy doing that work, and without needing money for food, shelter, and whatnot, it wouldn't be hard to find someone to create that text, art, etc.
I think most generative AI would be seen as a gimmicky party trick that would make people think "that's cute, but why would we want that?". This doesn't mean we wouldn't find uses for it, but we wouldn't have the current surreal shit show with LLMs at the centre.
Ironically, under that imaginary system, getting all the data to train the LLMs on would be a lot less ethically abhorrent, as it wouldn't be taking a potential meal ticket away from the people who created the training set.
Trained on 4chan! Given its propensity towards em dashes, "AI" is trained on your writing — and, indeed, on mine.
That said:
I can name at least one subject where I know that it's trained on my writing because it's a fairly technical keyword set and I'm one of only a few people to have written about the subject. I did use em-dashes.
Google's machine-written "AI" result currently points to me, mashes together some stuff incorrectly — and strips out the em-dashes.
I should add to that, by the way, that this is the same subject that I searched for other articles on some years ago, to find what seemed like a very good university thesis from (if memory serves) India …
… that partway through reading seemed to be very familiar.
Someone somewhere has a degree from a thesis that I, in fairly large part, and unwittingly, wrote.
Circa 1993, Vernor Vinge wrote that the first working AI would be the last thing that humanity ever invented.
We don't *have* a first working AI, and at this rate—shifting the global economy to run atop spicy autocomplete trained on 4chan—we never will.
But Vernor was right. Just forget "working".
> AI: after LLMs, are TRMs the next AI revolution?
Find all of our AI news monitoring at https://curation.framamia.org
#Framasoft #ia_specialisee #ia_generative #ia_frugale #trm #llm
Every time there is a new #LLM release:
Huh!? Wasn't this exact model already available 3 months ago!?
Solving a Million-Step LLM Task with Zero Errors
https://arxiv.org/abs/2511.09030
#HackerNews #Solving #LLM #Task #Errors #AI #Research #MachineLearning #Innovation
Another example of how #AI BS makes more work for people, not less.
The Editor Got a Letter From ‘Dr. B.S.’ So Did a Lot of Other Editors.
A research scientist who published a paper in a scientific journal about controlling mosquito-borne malaria infections was asked to rebut a letter to the editor sent by a scientist who had suddenly become improbably prolific starting in 2025.
https://www.nytimes.com/2025/11/04/science/letters-to-the-editor-ai-chatbots.html?smid=url-share [gift link]
I'd like to read the PNAS publication reported on in the @404mediaco article linked below, but I can't find it. I found another article that actually features a full reference (https://phys.org/news/2025-11-fake-survey-ai-quietly-sway.html), but (unironically!) the DOI returns a 404. Maybe I'm not quite awake yet, but I can't help but feel it really shouldn't be this hard to get hold of academic research...
⛐ Bridging Vision, Language, and Mathematics: Pictographic Character Reconstruction with Bézier Curves
https://arxiv.org/abs/2511.00076
#cs #graphics #text #characters #cg #béziercurves #llm #ai #vision #machinevision
By randomizing #LLM prompts and analyzing moral #keywords via co-occurrence #networks and hierarchical clustering, @andrewpiper uncovers latent “moral communities” across 20th–21st century #English-language #fiction.
https://doi.org/10.48694/jcls.4168
What does AI mean for historiography and the discipline of history? Sebastian Kubon and Charlotte Lerg took on this big topic in a seminar during the 2025 summer semester. In their post, they reflect on the results of that seminar as well as on some “think pieces” that students wrote afterwards. The “think pieces” themselves can be found in part 2 of the post. https://digitrip.hypotheses.org/3904 https://digitrip.hypotheses.org/3964 #KI #AI #LLM #DigitalHistory #DH #histodons