Discussion
Giacomo Tesio
@giacomo@snac.tesio.it  ·  activity timestamp last week

One of the best articles on #AI, #GenAI, and #ML I've read this week, from a company that develops videogames and has even published a couple of books on the topic in #gamedev: https://yarnspinner.dev/blog/why-we-dont-use-ai/

TL;DR: AI companies make tools for hurting people and we don’t want to support that.
and
AI is now a tool for firing people, in a time when getting re-employed is especially difficult and being unemployed can be life-threatening. We don’t want to be part of that. Until this is fixed we won’t use AI in our work, nor integrate it into Yarn Spinner for others to use.

We don’t want to support the companies making these tools or normalise their behaviour. So we don’t.
#Gaming #Work

Why We Don't Use AI | Yarn Spinner

We get asked about AI a lot. Whether we’re going to add it to Yarn Spinner, whether we use it ourselves, what we think about it. Fair questions. Time to write it all down. Yarn Spinner doesn’t use the technology that’s currently being called AI. We don’t have generative AI features in the product, and we don’t use code generation tools to build it, and we don’t accept contributions we know contain generated material. Let’s talk about why.
Lukas Fuchsgruber boosted
Jörg Lehmann
@jrglmn@mastodon.social  ·  activity timestamp last month

Two further results from the project "Human.Machine.Culture" (https://mmk.sbb.berlin/?lang=en) at @stabi_berlin published in Open Access

Guidelines for the Documentation of Ethical, Legal and Social Issues (ELSI) in Cultural Data

https://doi.org/10.5281/zenodo.16418345

Guidelines for the Publication of Cultural Data for AI Research

https://doi.org/10.5281/zenodo.15878097

Feedback to these publications is most welcome!

#bigdata #ML #culturalheritage #ELSI #digitalculturalheritage

Zenodo

Handreichung für die Veröffentlichung von Kulturdaten für die KI-Forschung – Guidelines for the Publication of Cultural Data for AI Research

These guidelines provide comprehensive guidance for the provision of cultural heritage datasets for AI research. They present a checklist for the publication of cultural data, emphasise the importance of transparent documentation through datasheets and compliance with the FAIR principles, and refer to practical tools and examples for implementing these. Key aspects of dataset creation such as structuring, curation and quality assurance are presented, as are common repositories and the selection of suitable formats and standards.
Zenodo

Leitfaden für die Dokumentation von ethischen, rechtlichen und sozialen Aspekten (ELSA) in Kulturdaten – Guidelines for the Documentation of Ethical, Legal and Social Issues (ELSI) in Cultural Data

The machine learning models currently used under the heading ‘artificial intelligence’ (AI) have been trained almost exclusively on data from the 21st century. They are therefore highly unsuitable for use in historical contexts or contexts that differ culturally from the Western world. The provision of data from the cultural heritage sector does not simply provide a remedy here. Such datasets are usually of high quality and are characterised by their historical depth, cultural richness and diversity. However, they often contain problematic content that stems from the worldview of bygone times and therefore require comprehensive documentation in order to make machine learning models more precise, powerful and suitable for use in different cultural contexts, and to enable their use for the common good. This guide focuses on ethical, social and legal aspects of documenting cultural heritage data used for training machine learning models. In particular, it focuses on how these models may perpetuate historical or statistical biases.
The analysis moves along the different phases of the entire machine learning workflow and identifies a number of neuralgic points where biases can arise. In addition, the role of cultural heritage institutions is emphasised. These institutions have both extensive experience in establishing documentation procedures and valuable datasets. They are therefore particularly qualified to publish datasets accompanied by exemplary documentation. The provision of cultural heritage data, taking ethical considerations into account, can help to prepare critical content for society in a way that stimulates the development of AI applications and avoids socially detrimental effects. The guide concludes with a plea for an interdisciplinary approach to address the issues identified and emphasises the need for proactive measures by cultural heritage institutions to document existing stereotypes and biases in the data in order to make a positive contribution to AI ethics. In doing so, this guide not only opens up the possibility of contributing to the development of small-scale models that are highly suitable for specific tasks in the cultural heritage sector with a high cost-benefit ratio, but also of making the existing large-scale multipurpose models more robust, efficient, context-sensitive, accurate and sustainable. The publication of high-quality cultural heritage datasets, including documentation, sharpens the profile of the cultural heritage institution, makes it attractive as a partner for research and thus opens up the possibility of participating in the acquisition of research funding.
Hacker News
@h4ckernews@mastodon.social  ·  activity timestamp last month

Sharp

https://apple.github.io/ml-sharp/

#HackerNews #Sharp #ML #MachineLearning #Apple #OpenSource #GitHub #TechNews

Tuta
@Tutanota@mastodon.social  ·  activity timestamp last month

🚨 #Microsoft 365 price shock: From July 2026, prices rise by up to 16.7%.

Now is the best time to break free from vendor lock-in.

What are your favorite #MS365 alternatives? 🔓🇪🇺

➡️ https://tuta.com/blog/microsoft-365-price-increase

#DigitalSovereignty #deMicrosoft #OneClickAway

#deMicrosoft your life: Image with lots of logos to replace Microsoft apps and tools.
Bob Machintruc
@turbobob@mamot.fr replied  ·  activity timestamp last month

@Tutanota Thanks for sharing those "Translate" AGPLv3 tools blobcatheart, interesting!

What will be the license for Tuta Drive by the way?

That being said, #Microsoft tools are beyond awful. How come this company is still a thing? It's ridiculous 😄

#FreeAndOpenSource #Privacy #Alternative #Internet #Linux #Translation #Deepl #ML

JTI
@jti42@infosec.exchange  ·  activity timestamp last month

I see a lot of blanket, outright rejection of #AI, of LLMs in general and of coding LLMs like #ClaudeCode in particular, here on the Fediverse.
Often, the actual impact of the AI / #LLM in use is not even understood by those criticizing it, at times leading to tantrums about AI where there is... no AI involved.

The technology (LLM et al) in itself is not likely to go away for a few more years. The smaller #ML variations that aren't being yapped about as much are going to remain here as they have been for the past decades.
I assume that what will indeed happen is a move from centralized cloud models to on-prem hardware as the hardware becomes more powerful and the models more efficient. Think migration from the large mainframes to the desktop PCs. We're seeing a start of this with devices such as the ASUS Ascent #GX10 / #Nvidia #GB10.

Imagine having the power of #Claude under your desk, powered for free by #solar cells on your roof with some nice solar powered AC to go with it.

Would it not be wise to accept the reality of the existence of this technology and find out how this can be used in a good way that would improve lives? And how smart, small regulation can be built and enforced that balances innovation and risks to get closer to #startrek(tm)?

Low-key reminds me of the Maschinenstürmer of past times...

Lukas Fuchsgruber boosted
Jörg Lehmann
@jrglmn@mastodon.social  ·  activity timestamp 2 months ago

Now the lightning talk on datasheets & data-envelopes presented by @sclaeyssens and Antoine Isaac at #FF2025

The slides of the presentation are available at

https://zenodo.org/records/17725565

#datasheets #data-envelopes #ML #collectionsasdata

Fabrizio Musacchio boosted
ma𝕏pool
@maxpool@mathstodon.xyz  ·  activity timestamp 2 months ago

AI research is an "eating your own dog food" kind of field. It's both amusing and predictable.

Major AI conference flooded with peer reviews written fully by AI https://www.nature.com/articles/d41586-025-03506-6
"Controversy has erupted after 21% of manuscript reviews for an international AI conference were found to be generated by artificial intelligence. "

How did they find out, you ask? With the help of AI models, of course. "EditLens: Quantifying the Extent of AI Editing in Text." https://arxiv.org/abs/2510.03154

#ml #science #research #AIResearch #ai

Esther Payne :bisexual_flag: boosted
Aaron
@hosford42@techhub.social  ·  activity timestamp 2 months ago

If you want a specific example of why many researchers in machine learning and natural language processing find the idea that LLMs like ChatGPT or Claude are "intelligent" or "conscious" laughable, this article describes one:

https://news.mit.edu/2025/shortcoming-makes-llms-less-reliable-1126

#LLM
#ChatGPT
#Claude
#MachineLearning
#NaturalLanguageProcessing
#ML
#AI
#NLP

Natouille 🍷 🥃 🍾 boosted
Yrrusajywo
@Yrrussaj@piaille.fr  ·  activity timestamp 2 months ago

Hi Fedi, I need your miracles!

My sister-in-law, who is just about the smartest person I have ever had the chance to know, is looking for a job as an R&D ML engineer / data scientist, ideally fully remote or near #grenoble. Her previous contract in machine learning has just ended.

She was previously a nanoscience researcher at the #cnrs. She has several published papers. In short, a brilliant mind!

I can share her CV. She is in a fairly tight spot financially, so it's rather urgent 😊!

Could you help spread this little call for help? Thanks!

#ML #datascience #emploi #jechercheunjob #recrutement

Bohyun Kim
@bohyunkim@code4lib.social  ·  activity timestamp 2 months ago

New blog post! “What I Think About AI When I Hear About AI: A Slightly Unconventional View” - https://www.bohyunkim.net/blog/archives/4443 (or read on Substack: https://bohyunkima2.substack.com/p/what-i-think-about-ai-when-i-hear) #libraries #AI #ML

Ross Gayler boosted
Anthony
@abucci@buc.ci  ·  activity timestamp 2 months ago

R.A. Fisher wrote that the purpose of statisticians was "constructing a hypothetical infinite population of which the actual data are regarded as constituting a random sample" (p. 311). In "The Zeroth Problem", Colin Mallows wrote: "As Fisher pointed out, statisticians earn their living by using two basic tricks - they regard data as being realizations of random variables, and they assume that they know an appropriate specification for these random variables."

Some of the pathological beliefs we attribute to techbros were already present in this view of statistics that started forming over a century ago. Our writing is just data; the real, important object is the “hypothetical infinite population” reflected in a large language model, which at base is a random variable. Stable Diffusion, the image generator, is called that because it is based on latent diffusion models, which are a way of representing complicated distribution functions--the hypothetical infinite populations--of things like digital images. Your art is just data; it’s the latent diffusion model that’s the real deal. The entities that are able to identify the distribution functions (in this case tech companies) are the ones who should be rewarded, not the data generators (you and me).

So much of the dysfunction in today’s machine learning and AI points to how problematic it is to give statistical methods a privileged place that they don’t merit. We really ought to be calling out Fisher for his trickery and seeing it as such.

#AI #GenAI #GenerativeAI #LLM #StableDiffusion #statistics #StatisticalMethods #DiffusionModels #MachineLearning #ML

Hacker News
@h4ckernews@mastodon.social  ·  activity timestamp 2 months ago

Honda: 2 years of ml vs 1 month of prompting - heres what we learned

https://www.levs.fyi/blog/2-years-of-ml-vs-1-month-of-prompting/

#HackerNews #Honda #ml #prompting #machinelearning #AI #insights

Fabrizio Musacchio boosted
Martin Skrodzki
Martin Skrodzki
@msmathcomputer@mathstodon.xyz  ·  activity timestamp 2 months ago

💡3-Month FULLY FUNDED Summer Internship in Germany!
🇩🇪 Apply for the Internship (Jul-Sep 2026) in Tübingen. Work on cutting-edge research in #ML, #Neuroscience & #DataAnalysis at MPI. Open to BSc./MSc. students.
🗓️Deadline: Nov 20, 2025
🔗 Apply: https://cactus-internship.tuebingen.mpg.de

A green and white poster for the CaCTüS: Computation & Cognition Tübingen Summer Internship (July-Sep 2026). The main graphic is a large green cactus wearing a graduation cap, with Tübingen city skyline silhouettes below.

The 3-month fully funded internship is at the Max-Planck-Institutes and AI Center Tübingen, Germany, focusing on research in machine learning, electrical engineering, theoretical neuroscience, behavioral experiments, and data analysis. It aims to support talented students facing challenges in higher education.

Application Deadline: November 20, 2025. Apply at: cactus-internship.tuebingen.mpg.de. A QR code is also present.

Who can apply: Bachelor/master students in computer science, maths, physics, engineering, neuroscience, psychology, cognitive science, bioinformatics, and related fields. For more info, visit the website or email cactus-internship@tuebingen.mpg.de. Logos for Max Planck Institute and Tübingen AI Center are at the bottom.

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate
