05/15/2026
Expedera's Athish Rahul Rao argues that the core hardware question is no longer how many TOPS (tera-operations per second) can fit within a given power and area budget. It is whether an architecture is built around real multimodal workload behavior, especially memory movement, activation lifetimes, and utilization under irregular graphs, and around the software needed to schedule all of it effectively.
Peak TOPS is becoming a weaker proxy for delivered edge performance.