“Our algorithm leverages classical topological path planning and deep reinforcement learning,” according to the University’s Robotics and Perception Group.

UZH drone flight

In the first step (right), many collision-free paths are found using a probabilistic roadmap method.
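
For readers unfamiliar with the technique, a probabilistic roadmap can be sketched in a few lines of Python. This is a minimal 2D illustration, not the group's implementation: the circular obstacles, the sample count, the connection radius and all function names are assumptions made for the example. It also extracts only a single shortest path, whereas the researchers collect many distinct ones.

```python
import heapq
import math
import random

# Hypothetical 2D world: circular obstacles given as (cx, cy, radius).
OBSTACLES = [(5.0, 5.0, 1.5), (3.0, 7.0, 1.0)]


def point_free(p, obstacles=OBSTACLES):
    """True if point p lies outside every obstacle."""
    return all(math.hypot(p[0] - cx, p[1] - cy) > r for cx, cy, r in obstacles)


def segment_free(a, b, obstacles=OBSTACLES, step=0.1):
    """True if the straight segment a-b stays clear, checked at small steps."""
    n = max(1, int(math.dist(a, b) / step))
    return all(
        point_free((a[0] + (b[0] - a[0]) * i / n,
                    a[1] + (b[1] - a[1]) * i / n), obstacles)
        for i in range(n + 1)
    )


def build_roadmap(start, goal, n_samples=200, radius=2.0, size=10.0, seed=0):
    """Sample collision-free points and link nearby pairs with clear segments."""
    rng = random.Random(seed)
    nodes = [start, goal]  # index 0 is the start, index 1 is the goal
    while len(nodes) < n_samples + 2:
        p = (rng.uniform(0, size), rng.uniform(0, size))
        if point_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            d = math.dist(a, nodes[j])
            if d <= radius and segment_free(a, nodes[j]):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges


def shortest_path(nodes, edges, src=0, dst=1):
    """Dijkstra search over the roadmap; returns waypoints or None."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return [nodes[k] for k in reversed(path)]
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u]:
            new_cost = cost + w
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    return None


nodes, edges = build_roadmap((1.0, 1.0), (9.0, 9.0))
path = shortest_path(nodes, edges)
```

Sampling is random, so the roadmap differs with each seed; a real planner would work in 3D and check collisions against the actual environment model.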

After this, the paths are filtered with different obstacle avoidance strategies (below left).

It is these paths, plus knowledge of the physical dynamics of the real or modelled vehicle, that are used to guide the reinforcement learning algorithm, whose goal is to create a policy that maximises quadcopter progress along the chosen path while avoiding obstacles.
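
One way to picture such a goal is as a per-step reward: the distance gained along the guiding path, minus a penalty for hitting something. The Python sketch below is an illustration under stated assumptions, not the reward the researchers used. The 2D polyline path, the circular obstacles, the penalty value and the function names are all hypothetical.

```python
import math


def progress_along_path(pos, path):
    """Arc length along polyline `path` of the point closest to `pos`."""
    best_dist, best_s, s_start = math.inf, 0.0, 0.0
    for a, b in zip(path, path[1:]):
        seg = (b[0] - a[0], b[1] - a[1])
        seg_len = math.hypot(*seg)
        if seg_len == 0.0:
            continue  # skip degenerate segments
        # Project pos onto the segment and clamp to its end points.
        t = ((pos[0] - a[0]) * seg[0] + (pos[1] - a[1]) * seg[1]) / seg_len ** 2
        t = min(1.0, max(0.0, t))
        closest = (a[0] + t * seg[0], a[1] + t * seg[1])
        d = math.dist(pos, closest)
        if d < best_dist:
            best_dist, best_s = d, s_start + t * seg_len
        s_start += seg_len
    return best_s


def step_reward(prev_pos, pos, path, obstacles, collision_penalty=10.0):
    """Progress made this step, minus a penalty if pos is inside an obstacle.

    Obstacles are (cx, cy, radius) circles; the penalty value is an assumption.
    """
    reward = progress_along_path(pos, path) - progress_along_path(prev_pos, path)
    if any(math.hypot(pos[0] - cx, pos[1] - cy) <= r for cx, cy, r in obstacles):
        reward -= collision_penalty
    return reward


guide = [(0.0, 0.0), (10.0, 0.0)]   # hypothetical guiding path
blocks = [(5.0, 0.0, 1.0)]          # hypothetical obstacle on the path
clear_step = step_reward((1.0, 0.0), (2.0, 0.0), guide, blocks)
crash_step = step_reward((4.0, 0.0), (4.5, 0.0), guide, blocks)
```

Here the first step moves one unit forward in free space and earns a positive reward, while the second ends inside the obstacle and is penalised despite making progress.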

Reinforcement learning uses trial and error to optimise its parameters, and can handle non-linear dynamic systems.

“All this information is used by the neural network to compute a desired collective thrust,” according to the research team. “The policy is then trained, where it first learns to fly slowly around a track…


Source: www.electronicsweekly.com