Discussion
petersuber
@petersuber@fediscience.org  ·  activity timestamp 4 weeks ago

"#News #publishers limit #InternetArchive access due to #AI scraping concerns."
https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scraping-concerns/

PS: I'm one who thinks AI training on copyrighted content is #FairUse and (separate point) even desirable in the case of academic research.
https://fediscience.org/@petersuber/113443473594224752

But this kind of training will create huge collateral damage, indirectly through publisher action, if it diminishes the @internetarchive.

#Copyright #Journalism

petersuber
@petersuber@fediscience.org  ·  activity timestamp 2 weeks ago

Update. It's happening. "News Publishers Are Now Blocking The Internet Archive, And We May All Regret It."
https://www.techdirt.com/2026/02/13/news-publishers-are-now-blocking-the-internet-archive-and-we-may-all-regret-it/

@mmasnick is right: "In our rush to punish #AI companies, we’re destroying public goods that serve everyone…We’re sacrificing the historical record not because of proven harm, but because publishers are worried about what might happen. That’s a hell of a tradeoff."

#Copyright #InternetArchive #Journalism #Publishers
@internetarchive

petersuber
@petersuber@fediscience.org  ·  activity timestamp last week

Update. But are #publishers right to worry that #AI companies can freely scrape the #WaybackMachine in order to train their tools? No, says Mark Graham, director of the Wayback Machine.
https://www.techdirt.com/2026/02/17/preserving-the-web-is-not-the-problem-losing-it-is/

"The Wayback Machine is built for human readers. We use rate limiting, filtering, and monitoring to prevent abusive access, and we watch for and actively respond to new scraping patterns as they emerge."

#Copyright #InternetArchive #Journalism
@internetarchive

