Bipedal locomotion based on a hybrid RL model in IS-MPC
Author: | |
---|---|
Format: | doctoral thesis |
Publication date: | 2025 |
Institution: | Pontificia Universidad Católica del Perú |
Repository: | PUCP-Tesis |
Language: | English |
OAI identifier: | oai:tesis.pucp.edu.pe:20.500.12404/31525 |
Resource link: | http://hdl.handle.net/20.500.12404/31525 |
Access level: | open access |
Subject: | Androids--Locomotion; Predictive control; Machine learning (Artificial intelligence); https://purl.org/pe-repo/ocde/ford#2.00.00 |
Summary:

Maintaining the stability of bipedal walking remains a major challenge in humanoid robotics, primarily due to the large number of hyperparameters involved and the need to adapt to dynamic environments and external disturbances. Traditional methods for determining these hyperparameters, such as heuristic approaches, are often time-consuming and potentially suboptimal. In this thesis, we present an integrated approach combining advanced control and reinforcement learning techniques to improve the stability of bipedal walking, particularly in the face of ground disturbances and speed variations.

Our main contribution lies in the integration of two complementary approaches: (1) an intrinsically stable model predictive control (IS-MPC) scheme combined with whole-body admittance control, and (2) a reinforcement learning module implemented in the mc_rtc framework. This system allows for continuous monitoring of the robot's current state, maintaining recursive feasibility, and optimizing parameters in real time. Additionally, we propose an innovative reward function that combines changes in single- and double-support times, postural recovery, divergent motion control, and action generation based on training optimization. The optimization of the weights of this reward function plays a crucial role, and we systematically explore different configurations to maximize the robot's stability and performance.

Furthermore, this thesis introduces a novel approach that integrates experience variability (a criterion for determining changes in locomotion-manipulation) and experience accumulation (an efficient way to store and select acquired experiences) in the development of reinforcement learning (RL) agents and humanoid robots. This approach not only improves adaptability and efficiency in unpredictable environments but also facilitates more sophisticated modeling of these environments, significantly enhancing the systems' ability to cope with real-world complexities. By combining these techniques with advanced reinforcement learning methods, such as Proximal Policy Optimization (PPO) and Model-Agnostic Meta-Learning (MAML), and integrating stability-based self-learning, we strengthen the systems' generalization capabilities, enabling rapid and effective learning in new and unprecedented situations.

The evaluation of our approach was conducted through simulations and real-world experiments using the HRP-4 robot, demonstrating the effectiveness of the intrinsically stable predictive controller and the proposed reinforcement learning system. The results show a significant improvement in the robot's stability and adaptability, thereby consolidating our contribution to the field of humanoid robotics.
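For context on the "intrinsically stable" and "divergent motion control" terminology used in the summary: IS-MPC is commonly formulated on the Linear Inverted Pendulum model, whose unstable (divergent) mode must be kept bounded by a constraint on the planned ZMP trajectory. The relations below are a general background sketch with assumed symbols (x_c: CoM position, x_z: ZMP, h_c: constant CoM height), not equations reproduced from the thesis:

```latex
% Linear Inverted Pendulum dynamics with natural frequency \omega = \sqrt{g / h_c}
\ddot{x}_c = \omega^2 \left( x_c - x_z \right)

% Divergent component of motion (the unstable mode of the LIP)
x_u = x_c + \frac{\dot{x}_c}{\omega},
\qquad
\dot{x}_u = \omega \left( x_u - x_z \right)

% Boundedness (stability) constraint typically enforced in IS-MPC:
% the current divergent component must equal the discounted future ZMP,
% which keeps the CoM bounded with respect to the planned footsteps.
x_u(t_k) = \omega \int_{t_k}^{\infty} e^{-\omega (t - t_k)}\, x_z(t)\, \mathrm{d}t
```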
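The reward described in the summary combines changes in support times, postural recovery, divergent motion control, and action generation under tunable weights. A minimal sketch of such a weighted combination is given below; the function name, state fields, and weight keys are hypothetical, and only the list of terms comes from the summary:

```python
import numpy as np

# Hypothetical sketch of a weighted step reward combining the terms named in
# the summary; field names, scales, and weights are illustrative assumptions,
# not values from the thesis.
def step_reward(state, action, weights):
    # Penalize deviations of single- and double-support durations from the
    # nominal gait timing.
    r_timing = -(abs(state["t_ss"] - state["t_ss_nominal"])
                 + abs(state["t_ds"] - state["t_ds_nominal"]))

    # Reward postural recovery: small torso roll/pitch deviation from upright.
    r_posture = -float(np.linalg.norm(state["torso_rpy"][:2]))

    # Penalize drift of the divergent component of motion (DCM) away from the
    # reference produced by the walking pattern generator.
    r_dcm = -float(np.linalg.norm(state["dcm"] - state["dcm_ref"]))

    # Discourage large parameter adjustments (the agent's actions).
    r_action = -float(np.linalg.norm(action))

    return (weights["timing"] * r_timing
            + weights["posture"] * r_posture
            + weights["dcm"] * r_dcm
            + weights["action"] * r_action)
```

Systematically exploring configurations of the weights dictionary (for example, a grid search over candidate values) corresponds to the weight-optimization step highlighted in the summary.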
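Experience accumulation, described above as "an efficient way to store and select acquired experiences", is commonly realized in RL practice as a scored replay buffer. The following is a minimal, hypothetical sketch under that assumption; the class name, capacity, and scoring rule are not taken from the thesis:

```python
import random
from collections import deque

class ExperienceBuffer:
    """Stores transitions with a per-sample score and replays high-value
    experiences more often; a generic sketch, not the thesis implementation."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition, score):
        # transition: (state, action, reward, next_state, done)
        # score: e.g. magnitude of the TD error or a stability margin.
        self.buffer.append((score, transition))

    def sample(self, batch_size):
        # Score-weighted selection: higher-scored experiences are
        # proportionally more likely to be chosen for replay.
        scores = [max(s, 1e-6) for s, _ in self.buffer]
        picked = random.choices(list(self.buffer), weights=scores, k=batch_size)
        return [t for _, t in picked]
```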
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or data set. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Open Access Science, Technology and Innovation (ALICIA).