The Google TPU paper is out:

As we have been predicting for 10 years, in an SoC you can achieve more than 10x the performance of current GPUs and more than 100x the performance per watt.

All in 8 bits, because you really do not need more than 8 bits for inference. The only reason we did not use 8 bits more often is that they give no advantage on current hardware!
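
To make the 8-bit point concrete, here is a minimal sketch of a dot product done with 8-bit integers. It assumes plain symmetric linear quantization, which is my choice for illustration and not necessarily the scheme the TPU paper uses. The function names (`quantize`, `int8_dot`) and the sample numbers are made up.

```python
def quantize(values, num_bits=8):
    """Symmetric linear quantization: map floats to signed integers.

    Assumption: one scale per vector, chosen from its largest magnitude.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(weights, activations):
    """Dot product computed on quantized integers, then rescaled to float."""
    qw, scale_w = quantize(weights)
    qa, scale_a = quantize(activations)
    # Pure integer multiply-accumulate; hardware would use a wider accumulator.
    acc = sum(w * a for w, a in zip(qw, qa))
    return acc * scale_w * scale_a


# Hypothetical sample values, for illustration only.
weights = [0.12, -0.53, 0.98, -0.07]
activations = [1.5, 0.25, -0.75, 2.0]

exact = sum(w * a for w, a in zip(weights, activations))
approx = int8_dot(weights, activations)
print(f"float: {exact:.4f}  int8: {approx:.4f}  error: {abs(exact - approx):.4f}")
```

The multiply-accumulate in the middle touches only small integers, which is the part dedicated hardware can make cheap. The floats show up only in the two scale factors at the end.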

Well done!