ASTRA: Communication-Efficient Acceleration for Multi-Device Transformer Inference

Published in The 43rd International Conference on Machine Learning (ICML 2026), 2026