Microsoft backed a tiny hardware startup that just launched its first AI processor that does inference without GPU or expensive HBM memory and a key Nvidia partner is collaborating with it

This was originally published on post
Corsair achieves 60,000 tokens/second for Llama3 models in servers