Learning navigation policies with deep reinforcement learning
Abstract: Humans learn that achieving long-term goals requires first making an effort, taking risks, and putting ourselves in difficult positions, because every decision we make influences not only our immediate state but may also have future implications. In this thesis, we study methods for control problems that involve sequential decision making, in which the actions of intelligent agents affect the environment they operate in. In particular, we focus on solutions that require the least amount of human intervention, seeking general algorithms that help automate the development of intelligent decision-making agents.
We therefore build on the general framework of deep reinforcement learning to learn control policies through interaction with the environment. As navigation is an essential skill for autonomous intelligent systems, this thesis takes learning to navigate as its main running task, setting out to address several challenges that arise when learning optimal policies directly from sensory inputs.
This thesis begins by asking whether it is feasible to replace the traditional navigation pipeline with an end-to-end deep reinforcement learning system, and then proposes algorithms that facilitate transferring learned navigation policies to related task instances. The focus then turns to learning navigation in environments where exploration is challenging: we interface a canonical agent with an external memory inside a fully differentiable neural network, and by learning to write to and read from this memory, the agent can make informed decisions in hard navigation tasks. Next, we target transferring deep reinforcement learning policies learned in simulation to the real world. Questioning the canonical sim-to-real approaches, we propose a real-to-sim algorithm as a lightweight and flexible alternative. In addition, we propose a novel shift loss, agnostic to the downstream task, that imposes consistency constraints and successfully adapts single-frame domain adaptation approaches to sequential problems. Finally, this thesis examines learning control policies in terminal-reward settings, as this scenario requires the fewest human priors and would thus largely automate the training of artificial decision-making agents. Since structured and guided exploration becomes vital in this case, we again question the mainstream approach of using intrinsic motivation as reward bonuses, taking a hierarchical view on accelerating exploration instead. We argue that our proposed approach is a more suitable treatment for intrinsically motivated exploration, as the behavior policy space is implicitly increased exponentially. Moreover, we propose a novel intrinsic reward that takes a temporally extended view on states, which facilitates exploration even further.
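As an illustration of the kind of mechanism described above (not the thesis's actual architecture), a differentiable external memory can be read and written through soft content-based attention, so that gradients flow through both operations. A minimal PyTorch sketch, with all names, shapes, and the addressing scheme assumed:

```python
import torch
import torch.nn.functional as F

def read(memory: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    # memory: (num_slots, slot_dim); key: (slot_dim,).
    # Cosine similarity to every slot, softmaxed into soft read weights.
    weights = F.softmax(
        F.cosine_similarity(memory, key.unsqueeze(0), dim=-1), dim=0)
    return weights @ memory  # weighted sum over slots, fully differentiable

def write(memory: torch.Tensor, key: torch.Tensor,
          value: torch.Tensor) -> torch.Tensor:
    # Blend the new value into the slots addressed by the same soft
    # weights, keeping the memory update differentiable end to end.
    weights = F.softmax(
        F.cosine_similarity(memory, key.unsqueeze(0), dim=-1), dim=0)
    w = weights.unsqueeze(-1)  # (num_slots, 1)
    return (1 - w) * memory + w * value
```

Similarly, one plausible reading of a task-agnostic shift-consistency constraint is that translating the input image by a few pixels should translate the network's output by the same amount; whether this matches the thesis's shift loss exactly is an assumption:

```python
def shift_consistency_loss(net, x: torch.Tensor,
                           dx: int = 4, dy: int = 4) -> torch.Tensor:
    # x: (B, C, H, W) image batch. torch.roll stands in for a pixel shift
    # (it wraps at the borders, which a real implementation would crop away).
    out_of_shifted = net(torch.roll(x, shifts=(dy, dx), dims=(2, 3)))
    shifted_out = torch.roll(net(x), shifts=(dy, dx), dims=(2, 3))
    return F.mse_loss(out_of_shifted, shifted_out)
```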
In summary, this thesis investigates several key aspects of learning control policies through deep reinforcement learning, with a focus on navigation tasks. We hope that our proposed methods offer insights to the learning control community.
- Location: Deutsche Nationalbibliothek Frankfurt am Main
- Extent: Online resource
- Language: English
- Notes: Universität Freiburg, Dissertation, 2021
- Subject headings: Reinforcement learning; Navigation; Learning; Deep Learning
- Event: Publication
- (where): Freiburg
- (who): Universität
- (when): 2021
- DOI: 10.6094/UNIFR/218235
- URN: urn:nbn:de:bsz:25-freidok-2182353
- Rights information: Not open access; access to the object is unrestricted.
- Last updated: 14.08.2025, 10:58 CEST
- Data partner: Deutsche Nationalbibliothek
- Created: 2021