What needs to happen is for companies (or individuals) tired of that to pool money to build new memory products. Then sell them to consumers first, and for non-AI use. Failing that, round-robin scheduling of quantities so the units are spread around more.
If costs are high, they might reserve a certain percentage for big business at market prices (or just under) to cover the chip's mask costs.
After DDR5+ RAM, GDDR5/6 RAM for use with AI accelerators. They might try to jump straight to an HBM alternative; that could be the percentage for AI buyers I just mentioned, especially if they could put 40-80GB on accelerators like Intel's Arc.
If successful enough, they could license MIPS's gaming GPUs to combine with this stuff, with a fully open-source stack and RTOS support for military sales.
Time for my daily "HBF is coming" comment.
The next step for models is to put the weights on flash, connected to the accelerator over a very wide interface. The first users will be datacenters, but it should trickle down to consumer hardware eventually. A single 512GB stack is expected to cost about $200 and provide 1.6TB/s of reads.
You still need some fast DRAM for the KV cache and for activations, but weights should be sitting on flash.
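To see why those numbers matter: dense-model decode at batch size 1 is bandwidth-bound, since every generated token has to stream all the weights once. Here's a rough back-of-envelope sketch; the model size and per-stack figures are illustrative assumptions, not specs.

```python
# Back-of-envelope: decode throughput when weights live on flash and
# decode is bound by weight-read bandwidth (dense model, batch size 1).
# All concrete numbers below are assumptions for illustration.

TB = 1e12
GB = 1e9

def decode_tokens_per_sec(weight_bytes: float, read_bw: float) -> float:
    """Each token streams every weight byte once, so
    throughput ~= read bandwidth / weight size."""
    return read_bw / weight_bytes

hbf_read_bw = 1.6 * TB      # quoted read bandwidth for one 512GB stack
weights = 400 * GB          # hypothetical: a ~800B-param model at 4-bit

print(decode_tokens_per_sec(weights, hbf_read_bw))   # -> 4.0 tokens/s
print(200 / 512)            # quoted cost per GB, ~$0.39
```

So a single stack could already give usable single-stream speeds on a model far bigger than any consumer GPU's DRAM, and stacks in parallel scale both capacity and bandwidth, which is the whole appeal.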