This repository contains the official implementation of the End-to-End Planning task from our paper: "Spatial Retrieval Augmented Autonomous Driving".
We introduce a novel Spatial Retrieval Paradigm that retrieves offline geographic images (Satellite/Streetview) based on GPS coordinates to enhance autonomous driving tasks. For End-to-End Planning, we design a plug-and-play Spatial Retrieval Adapter and a Reliability Estimation Gate to robustly fuse this external knowledge into BEV representations.
The main branch provides the implementation based on VAD.
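The retrieval step described above can be pictured as a nearest-neighbor lookup keyed on GPS position. The following is only a minimal sketch under stated assumptions, not the repository's implementation: the function names, the in-memory `database` dictionary, and the example coordinates are all hypothetical, and the haversine formula is swapped in here as a generic way to measure distance between two latitude/longitude points.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def retrieve_nearest(query, database):
    """Return (image_id, distance_m) for the stored image closest to `query`.

    query:    (lat, lon) of the ego vehicle, in degrees.
    database: hypothetical mapping image_id -> (lat, lon) of an offline image.
    """
    if not database:
        raise ValueError("database is empty")
    best_id = min(database, key=lambda k: haversine_m(*query, *database[k]))
    return best_id, haversine_m(*query, *database[best_id])


# Hypothetical offline index with made-up coordinates.
example_db = {
    "sat_tile_a": (1.3000, 103.8000),
    "sat_tile_b": (42.3400, -71.0500),
}
print(retrieve_nearest((1.3010, 103.8010), example_db))
```

A real index over many images would normally use a spatial data structure rather than a linear scan; the scan is kept here only to make the idea easy to follow.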
- [2025-12-09] Code and checkpoint for End-to-End Planning are released!
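All results are from our own reproduction. On the full validation set, adding geographic information slightly improves performance at the 3s horizon (L2 0.98 → 0.96, collision rate 0.66% → 0.64%), while the averages are essentially unchanged. On the night subset (10% of val), VAD + Geo lowers the average collision rate from 0.55% to 0.48%; in the table, the gains come at the 1s and 2s horizons, and the 3s collision rate stays at 1.30%.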
| Method | Split | L2 1s (m) | L2 2s (m) | L2 3s (m) | L2 Avg (m) | Col 1s (%) | Col 2s (%) | Col 3s (%) | Col Avg (%) | Config | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| VAD | Full | 0.34 | 0.61 | 0.98 | 0.64 | 0.16 | 0.29 | 0.66 | 0.37 | - | - |
| VAD + Geo | Full | 0.35 | 0.62 | 0.96 | 0.64 | 0.14 | 0.29 | 0.64 | 0.36 | config | model |
| VAD | Night | 0.44 | 0.82 | 1.35 | 0.87 | 0.10 | 0.24 | 1.30 | 0.55 | - | - |
| VAD + Geo | Night | 0.46 | 0.83 | 1.33 | 0.87 | 0.00 | 0.15 | 1.30 | 0.48 | config | model |
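The Avg columns look like the arithmetic mean of the 1s, 2s, and 3s values rounded to two decimals. That reading is an assumption, not something the table states, and the short check below only recomputes the averages from the numbers in the table to confirm it; the `ROWS` structure and helper names are made up for this illustration.

```python
# Per-horizon values (1s, 2s, 3s) and reported averages, copied from the table.
ROWS = {
    ("VAD", "Full"): {"l2": (0.34, 0.61, 0.98), "l2_avg": 0.64,
                      "col": (0.16, 0.29, 0.66), "col_avg": 0.37},
    ("VAD + Geo", "Full"): {"l2": (0.35, 0.62, 0.96), "l2_avg": 0.64,
                            "col": (0.14, 0.29, 0.64), "col_avg": 0.36},
    ("VAD", "Night"): {"l2": (0.44, 0.82, 1.35), "l2_avg": 0.87,
                       "col": (0.10, 0.24, 1.30), "col_avg": 0.55},
    ("VAD + Geo", "Night"): {"l2": (0.46, 0.83, 1.33), "l2_avg": 0.87,
                             "col": (0.00, 0.15, 1.30), "col_avg": 0.48},
}


def mean2(values):
    """Arithmetic mean rounded to two decimals."""
    return round(sum(values) / len(values), 2)


def find_average_mismatches(rows):
    """List (row, metric) pairs whose reported average differs from the mean."""
    mismatches = []
    for key, row in rows.items():
        for metric in ("l2", "col"):
            if abs(mean2(row[metric]) - row[metric + "_avg"]) > 1e-9:
                mismatches.append((key, metric))
    return mismatches


print(find_average_mismatches(ROWS))
```

With the values above, every recomputed average matches the reported one, so the list is empty.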
- Python 3.8
- CUDA 11.3
- PyTorch 1.9.1
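Once the environment is set up, a quick way to confirm it matches the versions this README lists and pins is to compare installed package metadata against them. This is a sketch rather than part of the repository: `PINNED`, `installed_version`, and `find_mismatches` are hypothetical names, and local build suffixes such as `+cu111` are ignored when comparing.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Versions taken from this README's requirements and install commands.
PINNED = {
    "torch": "1.9.1",
    "mmcv-full": "1.4.0",
    "mmdet": "2.14.0",
    "mmsegmentation": "0.14.1",
    "nuscenes-devkit": "1.1.9",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that is missing or off-version to what was found."""
    problems = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Drop a local build tag such as "+cu111" before comparing.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    print("Python", sys.version_info[:2], "(this README expects 3.8)")
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```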
conda create -n newvad python=3.8 -y
conda activate newvad
conda install nvidia/label/cuda-11.3.0::cuda
pip install https://mirror.sjtu.edu.cn/pytorch-wheels/cu111/torch-1.9.1+cu111-cp38-cp38-linux_x86_64.whl
pip install torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1
pip install timm
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout -f v0.17.1
pip install -v -e .
cd ..
pip install nuscenes-devkit==1.1.9
pip install yapf==0.40.1
pip install setuptools==59.5.0
Download the nuScenes V1.0 full dataset and the CAN bus expansion data from the official nuScenes website.
Download CAN bus expansion:
# Download 'can_bus.zip'
unzip can_bus.zip
# Move can_bus to data directory
Prepare nuScenes data:
python tools/data_converter/vad_nuscenes_converter.py nuscenes --root-path ./data/nuscenes --out-dir ./data/infos --extra-tag vad_nuscenes --version v1.0 --canbus ./data
This will generate vad_nuscenes_infos_temporal_{train,val}.pkl.
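After the converter finishes, it can help to confirm that the generated info files load and contain something. The helper below is a hedged sketch, not a repository tool: `summarize_info_file` is a hypothetical name, and the check for an `infos` list is an assumption based on how mmdetection3d-style info files are commonly laid out, so the function reports whatever top-level keys it actually finds. Unpickling the real files may also require the packages used to create them (for example NumPy) to be installed.

```python
import pickle


def summarize_info_file(path):
    """Load a pickled info file and describe its top-level structure."""
    with open(path, "rb") as f:
        data = pickle.load(f)

    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["keys"] = sorted(str(k) for k in data)
        # Assumption: per-sample records live under an "infos" key.
        infos = data.get("infos")
        if isinstance(infos, (list, tuple)):
            summary["num_infos"] = len(infos)
    return summary


# Example usage (path taken from this README):
# print(summarize_info_file("data/infos/vad_nuscenes_infos_temporal_val.pkl"))
```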
Following the instructions in the nuScenes-Geography project, prepare both the nuScenes-Geography dataset and its devkit.
After preparation, place the dataset directory under data/ (it appears as nuScenes-Geography-Data/ in the directory layout).
After preparing the base dataset, merge geographic information:
python tools/merge_data.py
Extract the night subset of the validation set:
python tools/extract_night_subset.py \
--input_pkl data/infos/vad_nuscenes_infos_temporal_val.pkl \
--output_pkl data/infos/val_night_subset.pkl \
--nusc_root data/nuscenes \
--version v1.0-trainval \
--verbose
End2End-Planning/
├── mmdetection3d/
├── projects/
├── tools/
├── ckpts/
├── data/
│   ├── infos/
│   │   ├── vad_nuscenes_infos_temporal_train.pkl
│   │   ├── vad_nuscenes_infos_temporal_val.pkl
│   │   ├── val_night_subset.pkl
│   ├── can_bus/
│   ├── nuscenes/
│   │   ├── maps/
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   ├── nuScenes-Geography-Data/
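Before training or evaluation, the directory layout above can be checked with a few lines of Python. This is a convenience sketch rather than a script shipped with the repository: `EXPECTED` simply lists the entries from the layout, and `missing_paths` is a hypothetical helper name.

```python
from pathlib import Path

# Entries copied from the directory layout, relative to the project root.
EXPECTED = [
    "mmdetection3d",
    "projects",
    "tools",
    "ckpts",
    "data/infos/vad_nuscenes_infos_temporal_train.pkl",
    "data/infos/vad_nuscenes_infos_temporal_val.pkl",
    "data/infos/val_night_subset.pkl",
    "data/can_bus",
    "data/nuscenes/maps",
    "data/nuscenes/samples",
    "data/nuscenes/sweeps",
    "data/nuscenes/v1.0-test",
    "data/nuscenes/v1.0-trainval",
    "data/nuScenes-Geography-Data",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the project root; prints one line per missing entry.
    for rel in missing_paths("."):
        print("missing:", rel)
```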
Train VAD_Geo with 8 GPUs:
./scripts/train.sh
Evaluate VAD_Geo with 1 GPU:
./scripts/test.sh
@misc{spad,
title={Spatial Retrieval Augmented Autonomous Driving},
author={Xiaosong Jia and Chenhe Zhang and Yule Jiang and Songbur Wong and Zhiyuan Zhang and Chen Chen and Shaofeng Zhang and Xuanhe Zhou and Xue Yang and Junchi Yan and Yu-Gang Jiang},
year={2025},
eprint={2512.06865},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.06865},
}
This work is based on VAD. It is also greatly inspired by the following outstanding contributions to the open-source community: BEVFormer and PETR.
