Tensor Manipulation Units Promise to Slash Latency for Machine Learning and AI

Tiny accelerator delivers outsized gains, reducing end-to-end inference latency in a custom AI chip by more than a third.