👩💻 Join our event "How to use Grist to manage your databases?" on 24 September at 14:30!
We will introduce:
🔹 the Grist tool and its main features
🔹 using Grist for data cataloguing, and the support available
🔹 a real-world example with lessons learned: the cataloguing of procedures and data carried out by the Direction Générale des Entreprises
💌 To register: https://www.eventbrite.fr/e/comment-utiliser-grist-pour-gerer-ses-bases-de-donnees-tickets-1462900949119?aff=oddtdtcreator
A Strategic Community #Roadmap for an #Australian #FAIR #Vocabulary Ecosystem
https://doi.org/10.25911/N6K8-F540
Three years ago, I participated in a very engaged workshop at #ANU on #vocabularies for FAIR #data management. It sharpened how I think about vocabularies. I now see them primarily as a #KnowledgeTransfer tool for representing domain expertise in an actionable form. And I think we do a terrible job both at highlighting how critical they are (particularly in an age where trusted expertise is harder to find) and also at making them easier for others to find and reuse.
I picture this scenario. A student is about to start collecting data for their thesis. They need to make choices about what variables to observe or what questions to ask participants, and they need to think about how they want to represent the results to support their analysis. In the ideal case, the actual data collecting effort is about populating an imagined but initially empty data matrix. If they could be assisted to find the best structured and most widely used (in their domain) vocabularies for any categorical values in their data, it would be possible to generate that template matrix with in-built validation tools, etc. The data they finally collect would have most of its metadata already defined and would be properly interoperable with data collected by others in their domain. Meta-analysis would be much simpler.
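As a rough illustration of that scenario, here is a minimal sketch of a vocabulary-driven template with built-in validation. The vocabulary terms and field names below are invented placeholders; a real tool would fetch agreed terms from a published, domain-standard vocabulary service rather than hard-coding them.

```python
# Sketch: a controlled vocabulary driving an initially empty "template
# matrix" whose categorical fields validate against agreed terms.

# Hypothetical vocabulary; a real one would come from a domain registry.
HABITAT_VOCAB = {"forest", "grassland", "wetland", "urban"}

def make_template(columns):
    """Return an empty record list and a validator bound to the vocabularies.

    `columns` maps field names to a set of allowed terms, or None for
    free-form fields that carry no controlled vocabulary.
    """
    def validate(record):
        errors = []
        for field, vocab in columns.items():
            value = record.get(field)
            if vocab is not None and value not in vocab:
                errors.append(f"{field}={value!r} is not in the agreed vocabulary")
        return errors
    return [], validate

records, validate = make_template({"site_id": None, "habitat": HABITAT_VOCAB})

ok = {"site_id": "S01", "habitat": "wetland"}
bad = {"site_id": "S02", "habitat": "swampy"}  # free-text value, not interoperable

assert validate(ok) == []   # conforms to the vocabulary
assert validate(bad) != []  # flagged before it pollutes the dataset
```

Because every categorical value is checked against the shared vocabulary at entry time, the resulting data is interoperable by construction, which is what makes later meta-analysis cheap.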
I am interested in why tools like this don't really exist, or at least why they are not mainstream. I think it's because vocabularies are seen as such an ultra-nerdy subset of the nerdy topic of #metadata rather than presented as an opportunity to stand on the shoulders of others. What can be done to make them more friendly and intuitive for such purposes?
Finally, after way too many struggles, we have a report and recommendations from that meeting in 2022. I tried to add some of these ideas to the final product as best I could.
The last few days, #Slack, which I only use via a browser, keeps asking for video and audio permission, even though I'm not joining a call or using their audio feature.
Is Slack trying to steal my #data, and am I only noticing because I monitor and track all such requests due to #privacy concerns? Makes me wonder how many people gave permission without thinking, especially if they use the app, which I imagine harvests a lot of data.
At @pyconau #pyconau Dr Arwen Griffioen presents an anti-oppressive framework for #AI and #data ethics - red team your models for #bias, centre the voices of the people most likely to be harmed, monitor continuously for distributional shift, and ask the hard questions during the design process. Use your privilege to push back.
Will you be in Melbs for @pyconau #PyConAU #PyConAU2025 in the next few days?
Don't miss MDC's @KathyReid speaking on Sun 14th in Ballroom 1 at 1410hrs - introducing a better way to provide and source #data for #AI. Sneak peek:
Hello world!
We're the Mozilla Data Collective - and we want to rebuild the #AI #data #ecosystem - with #communities at the centre.
We can do better than giving communities a false choice between exposing their content to unconsented web scraping, with no say in how that data is used - or being pressured into signing exclusive licensing deals with a megacorp - stifling innovation.
We see another way.
The Mozilla Data Collective.
Launching Wednesday! 🚀 💫
Trump Says America’s Oil Industry Is Cleaner Than Other Countries’. New Data Shows Massive Emissions From Texas Wells.
---
The oil industry touts Texas as a success story in controlling climate-warming methane emissions. The state’s regulator, however, grants nearly every request to burn or vent gas into the atmosphere.
https://www.propublica.org/article/texas-methane-oil-emissions-climate?utm_source=mastodon&utm_medium=social&utm_campaign=mastodon-post
Excellent points by John Pane of @EFA in conversation with Raf Epstein this morning on ABC Melbourne concerning the #biometric #data that is captured during age assurance - and the #privacy dangers it presents.
Are you comfortable with social media companies or third parties they outsource to retaining biometric data such as images of your face?
John also covers the concept of #SurveillanceCapitalism (per Shoshana Zuboff) that underpins social media business models, and calls for greater duty of care to be placed upon those platforms.
We need privacy reform now.
If you trust an #AI Agent with your #data, how do you guarantee that data doesn't get changed without your knowledge? In other words, how do you maintain your data's integrity? This will be especially important in Web 3.0 where ownership will return to data owners. https://spectrum.ieee.org/data-integrity?utm_source=mastodon&utm_medium=social&utm_campaign=fedica-Mastodon-Daily-Pipeline
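One general answer to that question is content hashing: keep a cryptographic digest of the data somewhere the agent cannot touch, and recompute it on return. This is a minimal sketch of the technique, not any specific Web 3.0 protocol, and the data values are invented for illustration.

```python
# Minimal sketch: detecting unauthorised changes to data entrusted to an
# agent, by retaining a cryptographic fingerprint under your own control.
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()

original = b'{"balance": 100}'
fingerprint = digest(original)   # stored where the agent cannot modify it

returned = b'{"balance": 100}'   # data handed back unchanged
tampered = b'{"balance": 1000}'  # data silently altered

assert digest(returned) == fingerprint  # integrity holds
assert digest(tampered) != fingerprint  # the change is detectable
```

A digest only detects tampering after the fact; guaranteeing who may change the data requires signatures and access control on top of this.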
With my #Mozilla Foundation #CommonVoice hat on, I'm delighted to be speaking in just under a fortnight at @pyconau about the #MozillaDataCollective, a new platform initiative that puts you in control of your #datasets.
Better #AI requires better #data - and better data requires collective, collaborative, co-created approaches: the Mozilla Data Collective.
Alaska publishes a quarterly list with the names of Indigenous people reported missing — but it still doesn’t issue a list for those who’ve been killed.
When a local nonprofit requested that information, the state said no.
Excellent op-ed by #CSIRO Chair Ming Long AM and Deputy CEO, Professor Elanor Huntington @profElanor on #SovereignAI and the double bind of #data and #AI.
This position - which carries strong weight given their roles as Deputy CEO and Chair of CSIRO - echoes recent calls from like-minded heavyweights such as Dr Alex Antic and Simon Kriss.
What we need now is concerted action, funding, and collaboration across the government, academic and industry sectors to move forward before the level of dependence is intractable.
The current efforts toward Sovereign #AI in Australia are deeply problematic for a range of reasons.
Business has taken the lead - sensing the market opportunity and profits to be had by developing sovereign models - then selling them back to government and academia.
They're bottle-necked by access to sovereign data for training.
Kangaroo LLM's approach was to scrape all .au websites - without consent from site owners, using volunteer labour pitched as #OpenSource. Maincode is working on Mathilda - pitched as "Australia's LLM". They're hiring ML and NLP PhDs. They're not transparent about how they're collecting data for Mathilda, but claim to have partnerships with government, and to be working on profit-sharing models for entities that provide data for training. Maincode is bootstrapped by a founder who made their money in online gambling.
Academia sees the problem - but is too cash-strapped and in survivability mode to act unilaterally - and needs to partner with industry and government to have any chance of steering sovereign AI ethically, transparently, sustainably and responsibly.
Government is taking policy advice - or should that be, having policy written for them - by the tech giants who gain the most from Australia NOT having sovereign AI capabilities - such as the Tech Council of Australia, which is funded by corporations who now seek a return on their massive up-front investment in training foundation models. We've even seen pitches recently to get governments to fund LLM use for every citizen - on the promise of as-yet-unevidenced productivity gains.
Imagine - every Australian citizen providing training data for a foreign-owned corporation - and the government paying for it!
So, what do we need?
Strong regulation. Access to sovereign data that is legal and compensated. Certainty for business on selling access to sovereign models to government. Well-funded AI centres in universities to provide talent pipelines.
And most of all? We need to back ourselves.
(Note: the term "sovereign" is problematic in Australia - as Keir Winesmith so well articulated to me - because sovereignty was never ceded - but we don't have a better term, yet.)
https://www.afr.com/technology/australian-ai-data-competition-risk-20250827-p5mq8s