simonw · 2026-06-25T16:14:24 1782404064

Hard to make a case that's related to increased RAM or SSD component prices.

cromka · 2026-06-25T20:42:56 1782420176

Exactly. It's the "why not" which makes you wonder whether they actually need to bump prices at all if their contracts are still in place...

simonw · 2026-06-25T15:37:10 1782401830

This HN headline is editorialized, the Bloomberg headline is "Ford AI Hiccups Push Carmaker to Rehire ‘Gray Beard’ Inspectors".

The editorialized headline is also misleading: "Ford rehires 350 engineers after AI fails to preserve expertise or train juniors" - there is nothing in the original story that suggests Ford were expecting AI to "train juniors".

And since the Bloomberg story is behind a paywall, the editorialized headline is most of what we have to go on.

This Verge story would be a better link: "Ford had to hire back former engineers to fix mistakes made by its automated systems" https://www.theverge.com/transportation/956316/ford-quality-...

And the crucial detail: nothing indicates Ford laid off the 350 people who were re-hired. It looks to me like it could be bringing back people who retired.

justonepost2 · 2026-06-25T16:05:19 1782403519

simonw · 2026-06-25T16:21:28 1782404488

What exactly am I coping with here?

The headline gives the impression that Ford fired 350 engineers and tried to get AI to train the replacements and then re-hired them when that didn't work.

That impression is false, which means we're wasting time having conversations about it.

(The top comment thread on here right now - https://news.ycombinator.com/item?id=48674446#48675092 - starts with the assumption that Ford execs made the mistake of laying off 350 people and then discusses if they got good severance packages etc. - here's the best comment I've seen calling that out so far: https://news.ycombinator.com/item?id=48674446#48675486)

simonw · 2026-06-24T20:19:41 1782332381

I don't think it was.

I think the Mintlify designers viewed dozens if not hundreds of examples, then thought very carefully about exactly what they needed to express for their page and how best to express it. Then they built their page step by step, sweating over every detail.

Then Kibu came along, lifted the entire thing, changed "3%" of it and called it their own.

What Kibu did is gross.

paulhebert · 2026-06-24T21:22:46 1782336166

Agreed. As someone who has built landing pages like that professionally, I can say you take inspiration from a wide range of sources.

Directly copying is tacky and immoral - it's also not effective. You should be thinking about how to position _your product_, not how someone else positioned their product.

simonw · 2026-06-24T17:30:01 1782322201

On Hacker News you can indent code samples with two spaces, like this:

  return xr.apply_ufunc(
      lambda x: (x - x[::-1]) / 2,
      conductance,
      input_core_dims=[dims],
      output_core_dims=[dims],
      vectorize=True,
  )

simonw · 2026-06-24T17:19:49 1782321589

I'd love to see credible numbers on the energy usage of thousands of people running models on their own devices compared to sharing data center resources to run big models that serve many different people at the same time.

My hunch is that the energy/water usage of the data centers is a whole lot more efficient than everyone running at home, but I'd be interested in seeing real data on that.

Windchaser · 2026-06-24T18:15:55 1782324955

Water usage goes up with data centers because more cooling is needed when you run the hardware harder.

So: if you're running the models on your own machine, presumably you're not running them as often, and air cooling is sufficient. But, at the same time, this is less efficient in terms of hardware use; the data centers need water cooling specifically because they're getting more bang for their buck from their hardware, by running it harder.

So that's the tradeoff: more hardware-use efficiency means more water usage.

nok22kon · 2026-06-24T18:54:08 1782327248

that water is recirculated. nobody sends it down the drain after one loop

jazzyjackson · 2026-06-24T22:24:06 1782339846

Evaporative cooling is cheaper

> [Google’s] largest data center, located in Council Bluffs, IA, withdrew an average of 3.9 million gallons of water and consumed 2.8 million gallons per day.

https://mostpolicyinitiative.org/science-note/data-center-wa...
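
For a rough sense of scale, here is a quick back-of-the-envelope sketch using only the two figures quoted above. It assumes "consumed" means water that is not returned to its source (the usual withdrawal vs. consumption distinction); the variable names are just for illustration.

```python
# Figures quoted above for the Council Bluffs, IA data center (gallons per day).
withdrawn_per_day = 3.9e6
consumed_per_day = 2.8e6

# Assumption: "consumed" water is not returned to the source,
# so whatever is left over from the withdrawal is returned.
consumed_fraction = consumed_per_day / withdrawn_per_day
returned_per_day = withdrawn_per_day - consumed_per_day

print(f"consumed: {consumed_fraction:.0%} of withdrawals")
print(f"returned: {returned_per_day / 1e6:.1f} million gallons/day")
```

So on those numbers roughly 72% of what is withdrawn is consumed, which is not what you'd expect from a loop that simply recirculates.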

echelon · 2026-06-24T19:47:13 1782330433

Almonds and avocados use two orders of magnitude more water.

You can start adjusting your consumption of those products.

verdverm · 2026-06-24T17:37:17 1782322637

With hardware like the Spark and Strix, the water usage is known to be zero, yea?

On the energy front, I assume less efficient, but I also think there is a tradeoff in efficiency versus freedom, that's why I have my own hardware.

simonw · 2026-06-24T18:37:37 1782326257

Generating the electricity used by the hardware itself consumes water. When people talk about data center water usage they're often also including the water used in electricity generation.

verdverm · 2026-06-24T19:21:36 1782328896

Ironically, 77% of my electricity comes from flowing water rather than boiling it, ~90% renewable overall

Once we get to fully renewable (as a country), there should be no water involved in electricity generation as well.

fluoridation · 2026-06-24T18:07:26 1782324446

All consumer hardware (not counting XOC) uses either air cooling or closed-loop liquid cooling, so the water usage is zero, always. Power is a little trickier. I'd assume it's less efficient, but also the total usage is less, because the user sometimes turns the machine off, and the hardware idles to a deeper sleep state than server hardware.

cold_harbor · 2026-06-24T17:27:59 1782322079

the comparison misses that local LLM usage covers tasks you'd never send to an API — private code, offline work, medical notes. the baseline is 'local vs not-doing-it', not 'local vs cloud'

simonw · 2026-06-24T18:38:40 1782326320

I have bad news about my private code and medical notes...

Chu4eeno · 2026-06-24T18:52:44 1782327164

Looking forward to some pre/non-finetuned frontier model to leak, and people to start completing medical notes.

echelon · 2026-06-24T19:51:22 1782330682

> Looking forward to some pre/non-finetuned frontier model to leak

I'm looking forward to this too. It'll be incredibly useful.

simonw · 2026-06-24T17:18:28 1782321508

... or rising, at least as long as there's a RAM shortage.

mbgerring · 2026-06-24T17:19:11 1782321551

I’d bet that there won’t be a RAM shortage for very long.

simonw · 2026-06-24T17:22:18 1782321738

The best article I've seen about that is this one by David Oks (ignore the headline, the content is much better): https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone

> It was only in 2025, as memory prices began an unprecedented surge, that the memory makers started to build new fabs targeted at HBM, all slated to start producing chips in 2027 or 2028.

fellowmartian · 2026-06-24T17:31:56 1782322316

It still won’t help unless the AI bubble pops. Even old fabs will continue pumping out HBM instead of DRAM as long as hyperscalers gobble it up.

Avicebron · 2026-06-24T17:24:38 1782321878

This seems wildly optimistic, do you have anything to support it?

swiftcoder · 2026-06-24T17:55:43 1782323743

The RAM shortage is predicated on both the huge datacenter buildout (with many projects already mired in delays, and a few even cancelled outright), and the massive memory purchase commitments various hyperscalers have made - hyperscalers who seem to be running short on cash lately...

AnimalMuppet · 2026-06-24T17:41:10 1782322870

History? This isn't the first RAM shortage. When one happens, producers build more fabs. The fabs come online, the availability of memory shoots up, and the shortage goes away, usually replaced by a glut.

If you want to argue that this is different from all previous RAM shortages, you can, but the burden of proof is on you to show the difference.

nok22kon · 2026-06-24T18:51:10 1782327070

there is a glut if demand stops.

this time demand doesn't stop. there is an exponential demand for tokens.

swiftcoder · 2026-06-24T20:30:18 1782333018

> there is an exponential demand for tokens

[citation needed]

There is certainly economic pressure to create an exponential demand for tokens, but we've already seen a pullback from the costly "token maxing" companies were pushing last year.

It's also pretty unclear to what degree the RAM shortage is driven by inference (versus by training). We're rapidly approaching the point where frontier models are "good enough" for everyday use, and at some point we're going to hit diminishing returns on training new trillion-parameter models...

simonw · 2026-06-24T16:51:06 1782319866

I didn't realize "hottest week of the year" in the UK was predictable enough that you could plan your conference schedule around it.

Is there a known week during which nobody should plan any events at all now?

simonw · 2026-06-23T22:17:46 1782253066

> If true then why are neither Anthropic or OpenAI dropping their API pricing to gain market share

Maybe because they're trying to IPO this year, and their IPO prospects will be a lot worse if their S-1s show them to be losing money on inference as opposed to making a healthy profit.

simonw · 2026-06-23T18:33:29 1782239609

I believe HN has some automations now that flag posts that look like they were LLM generated. I got caught out by that recently (in a post that wasn't fully LLM generated but quoted a line that probably was.)

You can email the mods and they'll fix it.

simonw · 2026-06-22T22:14:55 1782166495

I got this working with ONNX (thanks, Claude Opus 4.8) and now I have an interactive demo of the model running entirely in the browser here (~1.3GB download): https://simonw.github.io/moebius-web/ - code here: https://github.com/simonw/moebius-web

(Claude Code transcript: https://gisthost.github.io/?58039ba5c1ca3ed177e8659168996ee4)

Wrote this up in more detail on my blog: https://simonwillison.net/2026/Jun/22/porting-moebius/

K0IN · 2026-06-22T22:37:26 1782167846

Awesome, I wanted to do the exact same thing (used gpt 5.5 + code) but it didn't get the model to work in onnx...

g58892881 · 2026-06-23T05:06:44 1782191204

well done!

unet weights are in fp32. did you by any chance try something lower, fp16?

da_grift_shift · 2026-06-23T12:01:13 1782216073

The model considered it.

There are 25 or so mentions of fp16 and fp32 weights across the 7500+ words of Markdown text it generated. So the next question might be: Did it make the right calls?

https://github.com/simonw/moebius-web/blob/main/notes.md

https://github.com/simonw/moebius-web/blob/main/plan.md

https://github.com/simonw/moebius-web/blob/main/research.md

https://github.com/simonw/moebius-web/blob/main/understandin...