
What model can you realistically run with 512 GB of unified memory? I'm curious who the market for such an offering even is.


DeepSeek R1, for one, quantised but not too cripplingly.


The full R1 takes >512 GB and the 1.58-bit quant takes >128 GB. So enough for agent + app to realize a fully autonomous monolithic AGI humanoid head, potentially, but then it'll be compute limited...


Yeah, I was thinking more of q6_K or so. The q4_K_M is 404 GB, so you can still push it a bit higher than that. Obviously the 1.58-bit doesn't make sense here.
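Rough napkin math, assuming GGUF file size scales with bits per weight (the bits-per-weight figures below are approximate llama.cpp values, and runtime overhead like the KV cache is ignored):

```python
# Estimate the on-disk/in-memory size of a quantised model.
# DeepSeek R1 has ~671B parameters; bpw values are approximations.
def quant_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Model size in GB = params * bits-per-weight / 8 bits-per-byte."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, bpw in [("q4_K_M", 4.85), ("q5_K_M", 5.5), ("q6_K", 6.56)]:
    print(f"{name}: ~{quant_size_gb(671, bpw):.0f} GB")
```

At these rough figures q4_K_M lands near the 404 GB quoted above; remember the KV cache and OS still need headroom on top of the weights.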

I'm never going to pay $10k for that, though. Hopefully cheaper hardware options are coming soon.


I assume they are getting ready for the next year or two of LLM development.

Maybe there's not much of a market right now, but who knows whether DeepSeek R3 or whatever comes next will need something like this.

It would be awesome to have a high-performance, local-only coding assistant, for example (or any other LLM application, for that matter).


The future is local AI agents running on your desktop 24/7. This and NVIDIA DIGITS will be the hardware to do that.



