Companies told us AI would replace human workers and cut costs. Turns out the math doesn’t work. An Nvidia VP just confirmed that his team’s compute costs exceed what its employees are paid. Uber burned through its entire 2026 AI budget in four months. And a startup CEO ran up a $113,000 monthly AI bill with a four-person team.

This episode breaks down exactly why AI is costing companies more than the humans they laid off, what “tokenmaxxing” is and why engineers are doing it, and whether any of this spending is actually returning anything. Because at some point the ROI question stops being awkward and starts being a board-level emergency.

  • ArchAengelus@lemmy.dbzer0.com
    2 days ago

    Trying to etch models into a chip is a dead end until we reach “peak” quality.

    Unless they include some kind of LoRA (low-rank adaptation) adapter on the silicon, hard-wiring the weights severely limits the utility of whatever model or architecture they choose. Being able to modify the weights is way more useful.
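
    To make the LoRA point concrete, here is a minimal pure-Python sketch of the general idea: the base weights W stay frozen (standing in for weights etched into silicon) and only two small matrices A and B are changed. The sizes, values, and function names are made up for illustration, and nothing here reflects how any actual chip exposes adapters.

```python
# Minimal LoRA sketch (illustrative assumption, not any vendor's design).
# W is frozen, standing in for weights fixed in hardware.
# A and B are the small, writable low-rank adapter matrices.

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x) without modifying W."""
    base = matvec(W, x)                  # fixed path
    low_rank = matvec(B, matvec(A, x))   # adapter path through a rank-r bottleneck
    return [b + scale * l for b, l in zip(base, low_rank)]

d_in, d_out, r = 4, 3, 1                 # hypothetical toy sizes

W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]               # frozen, d_out x d_in
A = [[0.5, 0.5, 0.0, 0.0]]               # adapter, r x d_in
B = [[0.0], [0.0], [0.0]]                # adapter, d_out x r; zeros make it a no-op

x = [1.0, 2.0, 3.0, 4.0]
y_base = lora_forward(W, A, B, x)        # same as the frozen model alone

B = [[2.0], [0.0], [-1.0]]               # "update" touches only the adapter
y_adapted = lora_forward(W, A, B, x)

frozen_params = d_out * d_in             # values that never change
adapter_params = r * (d_in + d_out)      # values that would need to be writable

print(y_base, y_adapted, frozen_params, adapter_params)
```

    In this toy case the adapter is more than half the size of W, but at realistic layer widths with a small rank it is a tiny fraction of the frozen weights, which is why a writable adapter next to fixed weights is at least plausible in principle.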

    Honestly, diffusion decoders are probably where we’ll go next. Not the end state, but probably the next logical step in the throughput chain.

    General-purpose compute is infinitely more valuable than highly specialized compute during periods of rapid software improvement.

    Things like Tensor Processing Units (TPUs) still aren’t ubiquitous, even though they’ve been around for 10+ years. They’re too specialized to allow reasonable flexibility for testing.

    • NotMyOldRedditName@lemmy.world
      2 days ago

      They claim they can apply LoRAs to it, and that at data centre scale it will pay for itself in a year vs existing GPU methods… but who knows if any of that is true.

      They’d also need a pretty good recycling process set up to get rid of cards that are no longer useful after a couple of years.

      But ya, maybe this is the far future, once we have these amazing models that don’t need to change often.

      Edit: and some models would be better suited for it than others. A creative writing model is less likely to suffer from infrequent updates than a programming model, for example.