ASTRA: Communication-Efficient Acceleration for Multi-Device Transformer Inference

Published in The 43rd International Conference on Machine Learning (ICML 2026), 2026