Discussion

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange · 2 days ago

For those who are skeptical that AI is a bubble, let's look at the possible paths from the current growth:

Scenario 1: Neither training nor inference costs go down significantly.

Current GenAI offerings are heavily subsidised by burning investor money; when that runs out, prices will go up. Only 8% of adults in the US would pay anything for AI in products, and the percentage who would pay the unsubsidised cost is lower. As costs go up, the number of people willing to pay goes down, and the economies of scale start to erode.

End result: Complete crash.

Scenario 2: Inference costs remain high, training costs drop.

This one is largely dependent on AI companies successfully lobbying to make plagiarism legal as long as it's 'for AI'. They've been quite successful at that so far, so there's a reasonable chance of this.

In this scenario, none of the big AI companies has a moat. If training costs go down, the number of people who can afford to build foundation models goes up. This might be good for NVIDIA (you sell fewer chips per customer, to more customers, and hopefully it balances out). OpenAI and Anthropic have nothing of value; they end up playing in a highly competitive market.

This scenario is why DeepSeek spooked the market. If you can train something like ChatGPT for $30M, there are hundreds of companies that can do it. If you can do it for $3M, there are hundreds of companies for which this would be a rounding error in their IT budgets.

Inference is still not at the break-even point, so prices go up, but for use cases where a 2x cost is worthwhile there's still profit.

End result: This is a moderately good case. There will be some economic turmoil, because a few hundred billion dollars has been invested in producing foundation models on the assumption that the models, and the ability to create them, constitute a moat. But companies like Amazon, Microsoft and Google will still be able to sell inference services at a profit. None will have lock-in to a model, so prices will drop to close to cost, though still higher than they are today. With everyone actually paying, there won't be such a rush to put AI in everything. The datacenter investment is not destroyed, because there's still a market for inference. Growth will likely stall, though, so I expect a lot of the speculative building to be wiped out. I'd expect this to push the USA into recession, but that is more the stock market catching up with economic realities.

Scenario 3: Inference costs drop a lot, training costs remain high.

This is the one that a lot of folks are hoping for because it means on-device inference will replace cloud services. Unfortunately, most training is done by companies that expect to recoup that investment selling inference. This is roughly the same problem as COTS software: you do the expensive thing (writing software / training) for free and then hope to make it up by charging for the thing that doesn't cost anything (copying software / inference).

We've seen that this is a precarious situation. It's easy for China to devote a load of state money to training a model and then give it away for the sole purpose of undermining the business model of a load of US companies (and this would be a good strategy for them).

Without a path to recouping their investment, the only people who can afford to train models have no incentive to do so.

End result: All of the equity sunk into building datacentres to sell inference is wasted. Probably close to a trillion dollars wiped off the stock market in the first instance. In the short term, a load of AI startups who are just wrapping OpenAI / Anthropic APIs suddenly become profitable, which may offset the losses.

But new model training becomes economically infeasible. Models become increasingly stale: in programming, they insist on using deprecated or removed language features and APIs instead of their replacements; in translation, they miss modern idioms and slang; in summarisation, they don't work on documents written in newer structures; in search, they don't know anything about recent events; and so on. After a few years, people start noticing that AI products are terrible, but none of the vendors can afford to make them good. RAG can slow this decline a bit, but at the expense of increasingly large contexts (which push up inference compute costs). This is probably a slow-deflate scenario.
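
To make the RAG trade-off concrete, here is a minimal sketch of the naive retrieve-then-prompt pattern; the fake tokeniser, chunk count, and prompt format are purely illustrative assumptions, not any particular vendor's pipeline.

```python
# Minimal sketch of why RAG trades staleness for context length.
# The tokeniser stand-in and chunk count are illustrative assumptions.

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokeniser: ~1 token per whitespace-separated word.
    return len(text.split())

def build_prompt(question: str, retrieved_docs: list[str]) -> str:
    # Every retrieved chunk is pasted into the context ahead of the question,
    # so the prompt grows with the amount of fresh material you retrieve.
    context = "\n\n".join(retrieved_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

question = "Which API replaced the deprecated one?"
chunks = ["(recent release notes, a few hundred words each...)"] * 8

print(count_tokens(question))                        # a handful of tokens
print(count_tokens(build_prompt(question, chunks)))  # many times larger
# Inference cost scales roughly with prompt length, so keeping a stale model
# current this way directly raises per-request compute costs.
```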

Scenario 4: Inference and training costs both drop a lot.

This one is quite interesting because it destroys the moat of the existing players and also wipes out the datacenter investments, but makes it easy for new players to arise.

If it's cheap to train a new model and to do the inference, then a load of SaaS things will train bespoke models and do their own inference. Open-source / cooperative groups will train their own models and be able to embed them in things.

End result: Wipe out a couple of trillion from the stock market and most likely cause a depression, but end up with a proliferation of foundation models in scenarios where they're actually useful (and, if the costs are low enough, in a lot of places where they aren't). The most interesting thing about this scenario is that it's the worst for the economy, but the best outcome for the proliferation of the technology.

Variations:

Costs may come down a bit, but not much. This is quite similar to the no-change scenario.

Inference costs may come down but only on expensive hardware. For example, a $100,000 chip that can run inference for 10,000 users simultaneously, but which can't scale down to a $10 chip that can run the same workloads. This is interesting because it favours cloud vendors, but is otherwise somewhere between cheap and expensive inference costs.
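
As a back-of-the-envelope check on that variation (using only the hypothetical figures above, not real hardware prices):

```python
# Hypothetical numbers from the paragraph above, not real hardware prices.
cloud_chip_cost = 100_000    # one big datacenter accelerator
concurrent_users = 10_000    # users it can serve at once
edge_chip_cost = 10          # the scaled-down part that doesn't exist in this variation

print(cloud_chip_cost / concurrent_users)  # 10.0 dollars of silicon per user:
# the same per-user capital cost as the edge chip, but only a cloud vendor can
# afford the $100k buy-in, so the capability stays in datacenters.
```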

Overall conclusion: There are some scenarios where the outcome for the technology is good, but the outcomes for the economy and the major players are almost always bad. And the cases that are best for widespread adoption of the technology are the ones that are worst for the economy. And that's pretty much the definition of a bubble: a lot of money invested in ways that will result in losing the money.

Dave Wilburn :donor:
@DaveMWilburn@infosec.exchange replied · 2 days ago

@david_chisnall

It might also be important to figure out how the market is segmented. Consumers might have different expectations and demands than big corporations.

I find it hard to believe that consumers are going to pay money for an expensive model if they can just grab a free app that runs a small LLM locally that's "good enough" for their casual use cases. And some of the smaller open source LLMs are already small enough to do inference on regular consumer smartphones.

Companies might have higher demands in terms of model performance and be willing to devote some money to it. But at some point they'll have to pivot from overly excited experimentation into cold calculations about cost and value. IT budgets always shrink, and I don't think LLMs would prove to be an exception. I'd also expect regulations and policy to catch up, such that granting unfettered wholesale access to giant buckets of private and sensitive data to third party inference providers might become tricky. We might need some major breaches and perhaps changes in political power before that happens, though. In any case, I don't think it's a safe assumption that companies will continue writing blank checks without some firm evidence of cost savings or competitive advantage.

But in any case, I don't believe that any of the market segments are going to fork over money for inference in the quantities that AI investors are assuming.

Dan Wallach
@dwallach@discuss.systems replied · 2 days ago

@david_chisnall Your scenario 4 is intriguing. For it to be true, I'm thinking you need the following assumptions:

- Something resembling Moore's Law continues to be true (ergo, it gets cheaper over time to do both training and inference). Alternatively or additionally, the algorithms will get more efficient over time, letting you go faster with the same hardware.

- The size of these models, and the complexity of training and inference, stays about the same. If there's no benefit from going bigger, or simply no more data to train on, then that says today's workloads are it.

If both of those hold, then you eventually get a proliferation of cheap models, tuned to specific use cases, that can run anywhere.

A related question follows: what happens to these enormous gigawatt datacenters after a hypothetical AI crash? If you can buy them for pennies on the dollar, that starts looking like a cheap way to compete for general purpose cloud computing cycles. Of course, the way you build a general purpose datacenter and the way you build an AI datacenter are not the same, but for plenty of workloads, I'll bet they can do a fine job.

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange replied · 2 days ago

@dwallach

  • Something resembling Moore's Law continues to be true (ergo, it gets cheaper over time to do both training and inference). Alternatively or additionally, the algorithms will get more efficient over time, letting you go faster with the same hardware.

Yes. We saw some big wins from things that slightly overlapped these. Moving to lower-precision floating-point formats and moving to formats with a shared exponent across an entire vector gave you more FLOPS/Watt (and FLOPS/$) but at the expense of less generality. I think all of the low-hanging fruit is gone here, and a lot of the recent improvements seem to have been better memory topologies for handling sparse matrices.
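
As a rough illustration of the shared-exponent idea (block floating point), here is a small NumPy sketch; the mantissa width and rounding scheme are illustrative assumptions rather than any specific hardware format.

```python
import numpy as np

def quantize_block(x: np.ndarray, mantissa_bits: int = 8):
    # One exponent shared by the whole vector, chosen so the largest element fits.
    shared_exp = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-30)))
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    # Each element keeps only a small integer mantissa.
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(x / scale), lo, hi).astype(np.int32)
    return mantissas, scale

def dequantize_block(mantissas: np.ndarray, scale: float) -> np.ndarray:
    return mantissas * scale

v = np.array([0.02, -1.3, 0.5, 3.7])
m, s = quantize_block(v)
print(dequantize_block(m, s))  # close to v, but stored as small integers plus one
                               # shared exponent: more FLOPS/Watt, less generality.
```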

  • The size of these models, and the complexity of training and inference, stays about the same. If there's no benefit from going bigger, or simply no more data to train on, then that says today's workloads are it.

Or the models get smaller. This seems more plausible, especially with more specialised models.

The translation models that Firefox uses (and the offline ones for Google Translate) are pretty impressive now, but they're very specialised. The Firefox ones are about two orders of magnitude larger than a dictionary would be. There may be space for improvement there, but they already run nicely on a relatively cheap phone. It may be that there are similar opportunities for other specialised tasks.

  • A related question follows: what happens to these enormous gigawatt datacenters after a hypothetical AI crash?

That's a good question. Part of the answer is that some of them exist only on paper, so nothing happens: they just evaporate. But there's some fun there: some of them are being built by real-estate companies who have loans secured by the expected revenue from the datacenter leases. And if the companies that were supposed to be leasing them break the leases? That will cause a load of loan defaults. This is the contagion case that worries me the most, because I expect at least one bank to end up holding a lot of bad debt, which will cause a liquidity crisis (at the very least) and require coordination from central banks to avoid.

  • If you can buy them for pennies on the dollar, that starts looking like a cheap way to compete for general purpose cloud computing cycles.

But what would you be buying? The buildings? They're expensive to build, sure. And they have power and cooling built in, in useful ways, though at a far higher density than most things need. And some of them have special agreements with the grid for power that are tied to the current owner, so even turning them on would require some expensive contract negotiation.

The GPUs? They're run at such a high burn rate that their reported lifetimes are 1-3 years. Some will work. And, based on some of the things that came from NVIDIA's disclosures, it turns out that a lot of them don't actually have the GPUs in them yet because the companies building them don't have the cash. So it's not clear what you'd actually get.

  • Of course, the way you build a general purpose datacenter and the way you build an AI datacenter are not the same, but for plenty of workloads, I'll bet they can do a fine job.

You can probably do something. The question is whether it's cheaper to start from an empty space or from something that was optimised for the wrong thing. I don't know either way, but I do know that part of the motivation for building these was that converting a normal cloud datacenter into an 'AI' one was more expensive than building a new one. Whether that is true in reverse is not clear.

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange replied · 2 days ago

@dwallach

Oh, one more thing: we still mostly have Moore's Law (the number of transistors on a chip for a fixed cost doubles). The thing we lost was Dennard Scaling (the power consumption of a square mm of transistors was roughly constant, so as processes shrank you got more transistors per Watt).
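
For anyone who wants the first-order arithmetic behind that distinction, the textbook model looks roughly like this (a sketch, not tied to any particular process node):

```latex
% Dynamic power of switching logic:
\[
  P_{\mathrm{dyn}} \approx C \, V^{2} f
\]
% Ideal Dennard scaling by a factor $\kappa$ per generation:
\[
  C \to \frac{C}{\kappa}, \quad V \to \frac{V}{\kappa}, \quad f \to \kappa f
  \;\Longrightarrow\;
  P_{\mathrm{dyn}} \to \frac{P_{\mathrm{dyn}}}{\kappa^{2}}, \quad
  A \to \frac{A}{\kappa^{2}}
\]
% Power density P/A stays constant, so every new transistor can be switched.
% Once V stops scaling (around 2007), per-transistor power falls at best as
% 1/\kappa while area still falls as 1/\kappa^2, so power density rises;
% hence dark silicon and accelerators that are off most of the time.
```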

This is a really important distinction. It's been the thing that has pushed accelerators, because there's a big power saving from having 10 specialised processors where 2-3 are active at any time and the remaining 7-8 are in a low-power state. You can save a lot of power by having specialised things that are either doing a phase of computation efficiently or are turned off.

Before Dennard Scaling ended (around 2007), accelerators had a tendency to die off because the doubling that Moore's Law gave the CPU gradually made it fast enough to do the same thing the accelerator did, fast enough that it didn't matter. Since then, heterogeneous compute has been the way you do power saving. And ML accelerators are homogeneous blobs of matrix multiplication circuitry. Making every problem look like a matrix-multiplication problem is the exact opposite of what you want as a target for power-efficient chips.

This is less true for 'edge AI'. Apple's SoCs, for example, have a bunch of different accelerators, including an AI accelerator. If AI is an intermittent thing that you do sometimes, having an accelerator for it saves power relative to doing it on the CPU or GPU. If it's the thing that you're doing all of the time, that's annoying for designing power-efficient chips.

Peter Bindels
@dascandy@infosec.exchange replied · 2 days ago

@david_chisnall What about the variant in which people unwillingly pay for it, like Microsoft is trying with its Office pricing, which includes the LLM features for whoever doesn't notice them being added to their bill?

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange replied · 2 days ago

@dascandy This one is interesting because it increases the market opportunities for competitors. The cost of M365 is already high enough that it's starting to motivate big companies to seek alternatives.

Both MS and Google put up their prices at the same time to bundle AI things. I wouldn't be surprised if antitrust regulators around the world start looking at this as a cartel action (when the two major players in a market simultaneously increase prices, that's suspicious - in a functioning market, one would put their prices up and the other would capitalise on this by using the price differential to encourage switching).

But Google and MS are not the only players here, only the largest two. And they're still making a loss on the AI bits at that price. If they add another $30/seat to licenses, that's a big incentive for someone else to offer something similar at the old price.

Peter Bindels
@dascandy@infosec.exchange replied · 2 days ago

@david_chisnall With the notable problem that, despite what Microsoft has done over the past ~25 years, nobody seems able to break the Office monopoly in companies. It looks to be too strong a network effect: people are strongly motivated to use the same software as other companies, even when the alternative is almost as good and literally free, and this one is now subscription-only at >$100 per seat.

There hasn't been a functioning "office suite" market for decades.

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange replied · 2 days ago

@dascandy I would point to GSuite as a counterexample. Google has broken the monopoly and it's now a duopoly. This suggests that it is possible to further break it into a... triopoly? (Is that a word?) Especially with things like EU governments deciding that they want to move off US big tech.
