Urban Driver: Learning to Drive from Real-world Demonstrations Using Policy Gradients

Oliver Scheel, Luca Bergamini, Maciej Wołczyk, Błażej Osiński, Peter Ondruska

CoRL 2021


Offline reinforcement learning for self-driving

  • State-of-the-art method to learn to drive from human demonstrations.

  • Uses data-driven simulation and large amounts of collected data.

  • Zero hand-engineering.

  • Mitigates shortcomings and outperforms previous ML methods.

  • Deployed on real-world vehicles without the need for sim2real transfer.

How it works

We use large amounts of collected data and a data-driven simulator to learn an imitative driving policy in closed loop. The policy is directly deployable to self-driving vehicles.

Policy architecture: a graph neural network using a PointNet-style architecture and attention layers.
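To make the two building blocks concrete, here is a minimal pure-Python sketch of a PointNet-style encoder followed by one attention step. It is an illustration under assumptions, not the network from the paper: the scene contents, the feature size of 4, the single shared layer, the single attention head, the random weights, and every function name are hypothetical.

```python
import math
import random

def linear(x, weights, bias):
    """Apply y = W x + b to one feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def pointnet_encode(points, weights, bias):
    """PointNet-style encoder: one shared per-point layer, then max-pool over points."""
    per_point = [relu(linear(p, weights, bias)) for p in points]
    # Max-pooling over points makes the feature independent of point order.
    return [max(column) for column in zip(*per_point)]

def attention(query, keys, values):
    """Single-head scaled dot-product attention of one query over all elements."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights

def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights (assumption: a real model would learn these)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-0.1, 0.1) for _ in range(n_out)]
    return weights, bias

rng = random.Random(0)
enc_w, enc_b = random_layer(rng, 4, 2)

# Hypothetical scene: each element is a short polyline of (x, y) points.
scene = {
    "ego_history": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
    "lead_vehicle": [(0.0, 8.0), (0.0, 9.0), (0.0, 10.0)],
    "lane_centre": [(0.0, 0.0), (0.0, 10.0), (0.0, 20.0)],
}

# One feature vector per scene element, then the ego feature attends over all of them.
features = {name: pointnet_encode(pts, enc_w, enc_b) for name, pts in scene.items()}
element_features = list(features.values())
context, attn_weights = attention(features["ego_history"], element_features, element_features)
```

Here `context` is a weighted summary of the scene from the ego vehicle's point of view, and `attn_weights` holds one weight per scene element. In a full policy, a further learned head would map such a summary to a trajectory or control output; that part is omitted.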

Training paradigm: train the network in closed loop by iteratively unrolling the policy and computing the loss.
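The closed-loop idea can be sketched on a toy problem. The example below assumes a hypothetical one-dimensional world, a one-parameter policy (`gain`), and a hand-written differentiable simulator step; none of these come from the paper. What it does share with the description above is the structure: the policy is unrolled on its own outputs, the loss compares the resulting states with the demonstration, and the gradient is propagated through the whole unroll.

```python
def simulate(gain, start, goal, steps):
    """Roll a proportional policy forward: x' = x + gain * (goal - x)."""
    xs = [start]
    for _ in range(steps):
        xs.append(xs[-1] + gain * (goal - xs[-1]))
    return xs

def rollout_loss_and_grad(gain, expert, goal):
    """Unroll the policy in closed loop; return (loss, dloss/dgain)."""
    x = expert[0]   # start from the logged initial state
    dx = 0.0        # sensitivity of the current state to the gain
    loss = 0.0
    grad = 0.0
    steps = len(expert) - 1
    for target in expert[1:]:
        gap = goal - x  # the policy sees the state it produced itself
        # Simulator step x' = x + gain * gap, differentiated w.r.t. gain.
        x, dx = x + gain * gap, dx + gap - gain * dx
        err = x - target
        loss += err * err / steps
        grad += 2.0 * err * dx / steps
    return loss, grad

GOAL = 1.0
# Stand-in for a logged human trajectory (assumption: generated with gain 0.5).
expert = simulate(0.5, 0.0, GOAL, 5)

gain = 0.1   # initial policy parameter
lr = 0.1     # learning rate
history = []
for _ in range(200):
    loss, grad = rollout_loss_and_grad(gain, expert, GOAL)
    history.append(loss)
    gain -= lr * grad  # plain gradient descent on the unrolled loss
```

Because each step's input is the state the policy itself produced, an early mistake changes every later state and is penalised along the whole rollout. An open-loop variant would instead feed the expert state at every step, so the policy would never see the consequences of its own errors.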

Performance in driving scenarios

The trained policy can be directly deployed to drive self-driving vehicles without any need for sim2real transfer.

Crossing busy intersection

Lane following

Reacting to slow lead vehicle

Intersection without lead vehicle

The system handles a variety of complex driving situations. All of its behaviour is learned from data, with no hand-engineering.

Comparison to state-of-the-art

Our method outperforms previous machine-learned self-driving systems.



title={Urban Driver: Learning to Drive from Real-world Demonstrations Using Policy Gradients},
author={Scheel, Oliver and Bergamini, Luca and Wolczyk, Maciej and Osinski, Blazej and Ondruska, Peter},
booktitle={Conference on Robot Learning (CoRL)},