Welcome to Flatland#
Ongoing Challenge
Take part in the Flatland 3 Challenge on AIcrowd!
Flatland tackles a major problem in the transportation world:
How to efficiently manage dense traffic on complex railway networks?
This is a hard question! Driving a single train from point A to point B is easy. But how do you ensure trains won't block each other at intersections? How do you handle trains that randomly break down?
Flatland is an open-source toolkit to develop and compare solutions for this problem.
⚡ Quick start#
Flatland is easy to use whether you're a human or an AI:
$ pip install flatland-rl
$ flatland-demo # show demonstration
$ python <<EOF # random agent
import numpy as np
from flatland.envs.rail_env import RailEnv
env = RailEnv(width=30, height=30)
obs, info = env.reset()
while True:
    # One random action (0-4) for each of agents 0 and 1
    obs, rew, done, info = env.step({
        0: np.random.randint(0, 5),
        1: np.random.randint(0, 5)
    })
    # done maps each agent handle to a flag; "__all__" marks the end of the episode
    if done["__all__"]:
        break
EOF
Want to dive straight in? Make your first submission to the Flatland 3 challenge in 10 minutes!
Flatland Paper#
You can find the Flatland competition paper on arXiv: https://arxiv.org/abs/2012.05893
@misc{mohanty2020flatlandrl,
title={Flatland-RL : Multi-Agent Reinforcement Learning on Trains},
author={Sharada Mohanty and Erik Nygren and Florian Laurent and Manuel Schneider and Christian Scheller and Nilabha Bhattacharya and Jeremy Watson and Adrian Egli and Christian Eichenberger and Christian Baumberger and Gereon Vienken and Irene Sturm and Guillaume Sartoretti and Giacomo Spigler},
year={2020},
eprint={2012.05893},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
The Vehicle Re-scheduling Problem#
At the core of this challenge lies the general vehicle re-scheduling problem (VRSP) proposed by Li, Mirchandani and Borenstein in 2007:
The vehicle rescheduling problem (VRSP) arises when a previously assigned trip is disrupted. A traffic accident, a medical emergency, or a breakdown of a vehicle are examples of possible disruptions that demand the rescheduling of vehicle trips. The VRSP can be approached as a dynamic version of the classical vehicle scheduling problem (VSP) where assignments are generated dynamically.
The Flatland environment aims to address the vehicle rescheduling problem by providing a simplistic grid world environment and allowing for diverse solution approaches. The problems are formulated as a 2D grid environment with restricted transitions between neighboring cells to represent railway networks. On the 2D grid, multiple agents with different objectives must collaborate to maximize global reward.
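To make "restricted transitions between neighboring cells" concrete, here is a minimal, self-contained sketch of a rail grid. It does not use the Flatland library and is not Flatland's actual data structure: the cell encoding (a dict from the heading a train has inside a cell to the headings it may leave with) and every name in it (`Cell`, `SWITCH`, `legal_moves`, and so on) are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

# Headings and their (row delta, column delta). Illustrative names only.
NORTH, EAST, SOUTH, WEST = 0, 1, 2, 3
DELTAS = {NORTH: (-1, 0), EAST: (0, 1), SOUTH: (1, 0), WEST: (0, -1)}

# Hypothetical cell encoding: current heading -> headings the train may leave with.
Cell = Dict[int, Tuple[int, ...]]

EMPTY: Cell = {}
STRAIGHT_EW: Cell = {EAST: (EAST,), WEST: (WEST,)}
STRAIGHT_NS: Cell = {NORTH: (NORTH,), SOUTH: (SOUTH,)}
# A switch joining a west-east line with a branch to the south:
# eastbound trains may continue east or branch south,
# westbound trains go straight, northbound trains merge westward.
SWITCH: Cell = {EAST: (EAST, SOUTH), WEST: (WEST,), NORTH: (WEST,)}

GRID: List[List[Cell]] = [
    [STRAIGHT_EW, SWITCH, STRAIGHT_EW],
    [EMPTY, STRAIGHT_NS, EMPTY],
]


def legal_moves(grid: List[List[Cell]], row: int, col: int,
                heading: int) -> List[Tuple[int, int, int]]:
    """Return the (row, col, heading) states reachable in one step."""
    moves = []
    for out in grid[row][col].get(heading, ()):
        d_row, d_col = DELTAS[out]
        r, c = row + d_row, col + d_col
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        # The neighbor must also have rail that accepts this heading.
        if inside and out in grid[r][c]:
            moves.append((r, c, out))
    return moves


if __name__ == "__main__":
    print(legal_moves(GRID, 0, 1, EAST))   # eastbound train on the switch
    print(legal_moves(GRID, 0, 1, WEST))   # westbound train on the switch
```

In this toy layout an eastbound train on the switch has two options, while a westbound train on the same cell has only one. That asymmetry is what separates a rail grid from a free-movement grid world: trains cannot turn around at will, so two trains that meet head-on on a single track can block each other.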
Design principles#
Real-world, high-impact problem#
The Swiss Federal Railways (SBB) operate the densest mixed railway traffic in the world. SBB maintain and operate the biggest railway infrastructure in Switzerland. Today, there are more than 10,000 trains running each day, being routed over 13,000 switches and controlled by more than 32,000 signals. The Flatland challenge aims to address the vehicle rescheduling problem by providing a simplistic grid world environment and allowing for diverse solution approaches. The challenge is open to any methodological approach, e.g. from the domain of reinforcement learning or of operations research.
Tunable difficulty#
All environments support well-calibrated difficulty settings. While we report results using the hard difficulty setting, we make the easy difficulty setting available for those with limited access to compute power. Easy environments require approximately an eighth of the resources to train.
Environment diversity#
In several environments, it has been observed that agents can overfit to remarkably large training sets. This evidence raises the possibility that overfitting pervades classic benchmarks like the Arcade Learning Environment (ALE), which has long served as a gold standard in reinforcement learning (RL). While the diversity between different games in the ALE is one of the benchmark's greatest strengths, the low emphasis on generalization presents a significant drawback. In each game the question must be asked: are agents robustly learning a relevant skill, or are they approximately memorizing specific trajectories?
Next stops#
📱 Communication#
Join the Discord channel to exchange ideas with other participants:
Use these channels if you have a problem or a question for the organizers: