Now the lightning talk on datasheets & data-envelopes presented by @sclaeyssens and Antoine Isaac at #FF2025
The slides of the presentation are available at
https://zenodo.org/records/17725565
#datasheets #data-envelopes #ML #collectionsasdata
AI research is an "eating your own dog food" kind of field. It's both amusing and predictable.
Major AI conference flooded with peer reviews written fully by AI https://www.nature.com/articles/d41586-025-03506-6
"Controversy has erupted after 21% of manuscript reviews for an international AI conference were found to be generated by artificial intelligence."
How did they find out, you ask? With the help of AI models, of course. "EditLens: Quantifying the Extent of AI Editing in Text." https://arxiv.org/abs/2510.03154
If you want a specific example of why many researchers in machine learning and natural language processing find the idea that LLMs like ChatGPT or Claude are "intelligent" or "conscious" laughable, this article describes one:
https://news.mit.edu/2025/shortcoming-makes-llms-less-reliable-1126
#LLM
#ChatGPT
#Claude
#MachineLearning
#NaturalLanguageProcessing
#ML
#AI
#NLP
Hi Fedi, I need your miracles!
My sister-in-law, who is just about the smartest person I have ever had the chance to know, is looking for a job as an R&D ML engineer / data scientist, ideally fully remote or in the #grenoble area. Her previous contract in machine learning has just ended.
She was previously a nanoscience researcher at the #cnrs. She has several published papers. In short, a brilliant mind!
I can share her CV. She is struggling quite a bit financially, so it's rather urgent 😊!
Could you help spread this little call for help? Thanks!
R.A. Fisher wrote that the purpose of statisticians was "constructing a hypothetical infinite population of which the actual data are regarded as constituting a random sample" (p. 311). In "The Zeroth Problem," Colin Mallows wrote "As Fisher pointed out, statisticians earn their living by using two basic tricks-they regard data as being realizations of random variables, and they assume that they know an appropriate specification for these random variables."
Some of the pathological beliefs we attribute to techbros were already present in this view of statistics that started forming over a century ago. Our writing is just data; the real, important object is the “hypothetical infinite population” reflected in a large language model, which at base is a random variable. Stable Diffusion, the image generator, is called that because it is based on latent diffusion models, which are a way of representing complicated distribution functions--the hypothetical infinite populations--of things like digital images. Your art is just data; it’s the latent diffusion model that’s the real deal. The entities that are able to identify the distribution functions (in this case tech companies) are the ones who should be rewarded, not the data generators (you and me).
So much of the dysfunction in today’s machine learning and AI points to how problematic it is to give statistical methods a privileged place that they don’t merit. We really ought to be calling out Fisher for his trickery and seeing it as such.
#AI #GenAI #GenerativeAI #LLM #StableDiffusion #statistics #StatisticalMethods #DiffusionModels #MachineLearning #ML
Honda: 2 years of ML vs 1 month of prompting - here's what we learned
https://www.levs.fyi/blog/2-years-of-ml-vs-1-month-of-prompting/
#HackerNews #Honda #ml #prompting #machinelearning #AI #insights
💡3-Month FULLY FUNDED Summer Internship in Germany!
🇩🇪 Apply for the Internship (Jul-Sep 2026) in Tübingen. Work on cutting-edge research in #ML, #Neuroscience & #DataAnalysis at MPI. Open to BSc./MSc. students.
🗓️Deadline: Nov 20, 2025
🔗 Apply: https://cactus-internship.tuebingen.mpg.de
Opening soon (mid-November):
https://statml.peercommunityin.org/
To keep in mind.
#ML #statistics #OpenScience
🎤 Upcoming at SeaGL 2025:
📍 02:00 PM on November 08
🗣️ "Hidden in Plain Sight: Addressing Data Bias in AI-Driven Systems"
👥 Speaker(s): Autumn Nash
📍 Room: Room 145
🏷️ Track: Open source AI and Data Science
📝 As AI increasingly powers critical systems across industries, the quality and neutrality of training...
#SeaGL2025 #ai #ml #performance #automation #data
🔗 https://pretalx.seagl.org/2025/talk/ETZQ8V/
Super excited that Dr. Keir Winesmith is one of the keynotes for @everythingopen #EverythingOpen #EO2026 in Canberra in January.
The #NFSA are doing incredible things with #transcription of speech archives with their Bowerbird project, and their commitment to #AI and #ML practices is sector-leading. Interested in what he has to say.