embedding-shape

a day ago
Depends heavily on the architecture too. I think the free-for-all to find the best sizes is still kind of ongoing, and rightly so. GPT-OSS-120B, for example, fits in around 61GB of VRAM for me when quantized to MXFP4.
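
If anyone wants the napkin math on why it lands around there, here's a rough sketch with my own assumptions baked in (roughly 117B total params, everything at MXFP4's effective 4.25 bits/param since each block of 32 FP4 values shares an 8-bit scale, weights only with no KV cache or activations; est_vram_gb is just a name I made up):

    # Rough weight-memory estimate; decimal GB, weights only.
    def est_vram_gb(params_billions, bits_per_param):
        # params * bits/param -> total bits; /8 -> bytes; the factors of 1e9 cancel
        return params_billions * bits_per_param / 8

    # MXFP4 blocks: (32 * 4-bit values + one 8-bit shared scale) / 32 = 4.25 bits/param
    print(est_vram_gb(117, 4.25))  # -> ~62 GB, in the same ballpark as the ~61GB I see

The real checkpoint isn't uniformly MXFP4 (some parts stay at higher precision), so treat it as a ballpark, not an exact figure.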

Personally, I hope GPU makers instead start adding more VRAM, or if one can dream, expandable VRAM.

refulgentis

a day ago
Unlikely to see more VRAM in the short term; memory prices are through the roof :/ Like, not subtly, 2-4x.

embedding-shape

a day ago
Well, GPUs are getting more VRAM, although it's pricey. We didn't use to have 96GB VRAM GPUs at all, but now they exist :) For those who can afford it, it's at least possible today. Slowly, it increases.

refulgentis

a day ago
Agreed, in the limit, RAM goes up. As billg knows, 640K definitely wasn't enough for everyone :)

embedding-shape

a day ago
I'm already thinking 96GB might not be enough, and I've only had this GPU for 6 months or so :|

refulgentis

19 hours ago
Hehe, me too… went all out on an MBP in 2022, did it again in April. The only upgrade I didn't bother with was going from 64GB to 128GB of RAM. Then GPT-OSS-120B comes out and quickly makes me very sad I can't use it locally.