Urban Driver: Learning to Drive from Real-world Demonstrations Using Policy Gradients
Oliver Scheel, Luca Bergamini, Maciej Wołczyk, Błażej Osiński, Peter Ondruska
Offline reinforcement learning for self-driving
State-of-the-art method to learn to drive from human demonstrations.
Uses data-driven simulation and large amounts of collected data.
Mitigates the compounding-error shortcomings of open-loop imitation learning and outperforms previous ML methods.
Deployed on real-world vehicles without the need for sim2real transfer.
How it works
We use large amounts of collected data and a data-driven simulator to learn an imitative driving policy in closed-loop. The policy is directly deployable to self-driving vehicles.
Policy architecture: a graph neural network combining PointNet-style set encoders with attention layers.
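To make the architecture concrete, here is a minimal PyTorch sketch of the two ingredients named above: a PointNet-style encoder that turns each variable-length set of points (a map element or an agent trajectory) into one feature vector, and an attention layer that fuses the encoded scene elements into a trajectory prediction. All class names, dimensions, and the output parameterisation (a horizon of x, y, yaw offsets) are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class PointNetEncoder(nn.Module):
    """PointNet-style set encoder: a shared per-point MLP followed by
    max-pooling over the points of each element (illustrative sizes)."""
    def __init__(self, in_dim=2, hidden=64, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, pts):             # pts: (batch, elements, points, in_dim)
        feats = self.mlp(pts)           # per-point features
        return feats.max(dim=2).values  # pool over points -> (batch, elements, out_dim)

class UrbanDriverPolicy(nn.Module):
    """Toy policy head: attention from a learned ego query over the
    encoded scene elements, then an MLP predicting a short trajectory
    of (x, y, yaw) offsets. A hypothetical stand-in for the real model."""
    def __init__(self, feat_dim=128, horizon=12):
        super().__init__()
        self.encoder = PointNetEncoder(out_dim=feat_dim)
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.query = nn.Parameter(torch.zeros(1, 1, feat_dim))
        self.head = nn.Linear(feat_dim, horizon * 3)
        self.horizon = horizon

    def forward(self, pts):
        elems = self.encoder(pts)                    # (batch, elements, feat_dim)
        q = self.query.expand(pts.shape[0], -1, -1)  # one ego query per sample
        fused, _ = self.attn(q, elems, elems)        # attend over scene elements
        return self.head(fused.squeeze(1)).view(-1, self.horizon, 3)
```

The max-pool makes the encoder invariant to point ordering, and attention lets the policy weight nearby agents and map elements by relevance rather than by a fixed rasterised layout.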
Training paradigm: train the network closed-loop by iteratively unrolling the policy in simulation and computing a loss on the resulting states.
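The closed-loop paradigm can be sketched as follows: roll the policy out inside a differentiable simulator, penalise the deviation from the logged expert positions at every step, and backpropagate through the whole unroll so that errors made early in the rollout receive gradient from later supervision. The `ToyDiffSim` class and the function interfaces below are hypothetical stand-ins for the paper's data-driven simulator, not its actual API.

```python
import torch
import torch.nn as nn

class ToyDiffSim:
    """Minimal differentiable 'simulator': ego state is (x, y, vx, vy)
    and actions are 2-D accelerations. Illustrative only."""
    dt = 0.1

    def reset(self, batch=4):
        return torch.zeros(batch, 4)

    def step(self, state, accel):
        x, v = state[:, :2], state[:, 2:]
        v = v + self.dt * accel               # differentiable dynamics update
        return torch.cat([x + self.dt * v, v], dim=1)

    def ego_xy(self, state):
        return state[:, :2]

def closed_loop_loss(policy, sim, expert_xy, steps=10):
    """Unroll the policy inside the simulator and average the squared
    distance to the expert positions; gradients flow through every
    step of the rollout (backprop through time)."""
    state = sim.reset(batch=expert_xy.shape[1])
    loss = 0.0
    for t in range(steps):
        action = policy(state)                # policy acts on its OWN states
        state = sim.step(state, action)
        loss = loss + ((sim.ego_xy(state) - expert_xy[t]) ** 2).mean()
    return loss / steps
```

Because the policy is trained on states produced by its own actions rather than on logged expert states, this objective directly targets the distribution shift that plagues open-loop behavioural cloning.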
Performance in driving scenarios
The trained policy can be deployed directly on self-driving vehicles, with no need for sim2real transfer.
Crossing busy intersection
Reacting to slow lead vehicle
Intersection without lead vehicle
The system can handle a variety of complex driving situations, with all behaviour learned from data and no hand-engineering.
Comparison to state-of-the-art
Our method outperforms previous machine-learned self-driving systems.