The fact that Google decided to dump a 4 GiB language model file on every Chrome installation is yet another sign of how the generative AI craze is unsustainable. Don't look at it from the users' side, look at it from Google's side. Having every user download a 4 GiB monster that will need to be routinely updated is a significant cost. It takes a ton of bandwidth to do that, far more than Chrome updates consume. And yet they're doing it because they're desperate to externalize the cost of "AI".
@gabrielesvelto The cost per bit is an invented metric though, isn't it? The cost to Google is in the hardware, which is already spent. Unless they've got caps somewhere that they need to work around, the cost for G to send 4GB vs sending nothing is roughly the same.
I'm no expert, of course...there is technically an upper limit to how much they could send, but I think it's unlikely (with their infrastructure) that they would reach it.
And, with billions bordering on trillions of dollars to sink into any given endeavor, it will be some time before they feel the pinch.
As usual, it's the end users that will get screwed, especially if they're on metered connections.
@roknrol I don't think so, it's really a lot of data. We're talking exabytes here (millions of terabytes) to reach the entire Google Chrome user base; rough math below. Google very much pays for bandwidth. This is from a few years ago: https://arstechnica.com/tech-policy/2022/09/google-fights-latest-attempt-to-have-big-tech-pay-for-isps-network-upgrades/
They care enough about bandwidth to have developed not one but two custom binary patchers for shrinking Chrome updates (Courgette and Zucchini).
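Back-of-the-envelope, to make the scale concrete (the install count below is an assumed round number for illustration, not anything Google has published):

```python
# Rough estimate of the total download volume for one model rollout.
# The install count is an assumption for illustration, not an official figure.
payload_gib = 4                  # size of the model blob, per the thread
installs = 2_000_000_000         # assumed ~2 billion Chrome installations

GIB = 2**30                      # bytes in one GiB
EIB = 2**60                      # bytes in one EiB

total_bytes = payload_gib * GIB * installs
print(f"~{total_bytes / EIB:.1f} EiB per full rollout")  # prints ~7.5 EiB
```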
@gabrielesvelto Ah, fair enough, thank you. I clearly have not kept up.
@gabrielesvelto it's also curious that a 4GB model that can run on consumer hardware is useful enough to force onto every Chrome user, and yet the AI companies somehow expect everyone to pay them for these services.
@gundersen indeed. But there is nothing rational about this market. It's all C-suite FOMO, vibes, smoke and mirrors.
@gabrielesvelto I think you are absolutely correct!
@gabrielesvelto This download is going to happen every time they update the blob, isn't it?
@pwloftus yes. It's basically a bunch of weight matrices that get recomputed every time the model is re-trained. It's very unlikely to be amenable to binary patching or incremental updates.
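A toy sketch of why that tends to be the case (the array size and noise level are arbitrary, purely illustrative, not a claim about the actual model format): if retraining nudges nearly every weight, a byte-level comparison of the serialized blobs shows changes scattered across most of the file, which leaves few long matching runs for a patcher like Zucchini to exploit.

```python
import numpy as np

# Simulate "old" and "new" weights where retraining slightly changes nearly every parameter.
# Sizes and noise scale are arbitrary, chosen only to illustrate the effect.
rng = np.random.default_rng(0)
old = rng.standard_normal(1_000_000).astype(np.float32)
new = old + rng.normal(scale=1e-3, size=old.shape).astype(np.float32)

old_bytes = np.frombuffer(old.tobytes(), dtype=np.uint8)
new_bytes = np.frombuffer(new.tobytes(), dtype=np.uint8)

# Fraction of bytes that differ between the two serialized blobs; the changes are
# spread throughout the file, so a binary diff finds few identical runs to reuse.
print(f"{np.mean(old_bytes != new_bytes):.1%} of bytes differ")
```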
@gabrielesvelto I bet it's actually cheap for Google. They have network caches (or at least used to, for YT) very close to the ISPs, and this kind of payload is very cache friendly.
@fabrice it's still like 5-10 exabytes of extra bandwidth, and it's not going to be diff-friendly. They developed a new dedicated binary patcher to shrink Chrome updates (Zucchini) fairly recently, so it can't be too cheap for them.
@fabrice and BTW I never thought I'd ever use the word exabytes.
@gabrielesvelto @fabrice ... watch a streaming service like Netflix? Every time you or anyone else watches a few 4K movies, that's well over 4GB downloaded. Every night, the streamers have hundreds of millions of users.
For better or worse, mass distribution of 4GB is just "Tuesday" for Google in 2026, with or without the local model aspect.