True, I think local inference is still far more expensive for my use case, given batching effects and my relatively sporadic, hourly usage. That said, I also didn't expect hardware prices (RTX 5090, RAM) to rise this quickly.

