AI Insight
This survey examines the growing use of Transformer-based models in autonomous driving, where they capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. The authors analyze these models from a deployment perspective, reviewing compression and acceleration techniques (quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention) that target the substantial latency, memory, and energy overhead of running them in real vehicles. The work emphasizes that compression should be treated as a system-level design consideration rather than an isolated post-processing step, because it directly affects deployability, robustness, and safety.
Why it matters
This research addresses the gap between the capabilities of advanced AI models and their practical deployment in autonomous vehicles, where computational efficiency, safety, and real-time performance are essential. Its deployment-oriented review of efficient Transformer models, together with its call for standardized, safety-aware, and hardware-conscious evaluation, could help accelerate the development of commercially viable autonomous driving systems.
arXiv:2304.10891v3
Abstract: Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. At the same time, their deployment in real vehicles remains difficult because high-capacity attention-based architectures impose substantial latency, memory, and energy overhead. This survey reviews representative Transformer-based autonomous driving models and organizes them by task role, sensing configuration, and architectural design. More importantly, it examines these models from a deployment-oriented perspective and analyzes how efficiency constraints reshape model design choices in practice. We further review compression and acceleration strategies relevant to Transformer-based driving systems, including quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention, and discuss their benefits, limitations, and task-dependent applicability. Rather than treating compression as an isolated post-processing step, we highlight it as a system-level design consideration that directly affects deployability, robustness, and safety. Finally, we identify open challenges and future research directions toward standardized, safety-aware, and hardware-conscious evaluation of efficient autonomous driving systems.
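As a rough illustration of one technique the abstract names, quantization, the following is a minimal sketch of symmetric per-tensor int8 weight quantization in plain Python. It is not taken from the survey: the function names (quantize_int8, dequantize), the per-tensor symmetric scheme, and the sample weights are all assumptions chosen for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of a list of floats to int8 codes.

    Hypothetical helper for illustration; not the survey's method.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Made-up weights, used only to show the round trip.
weights = [0.50, -1.27, 0.03, 0.90]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case reconstruction error; for in-range values it is at most scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error bounded by half the scale. Whether that error is acceptable for a given perception, prediction, or planning task is the kind of task-dependent question the abstract raises.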
Source: Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey