Implemented SKIMO (Skill-Based Model-Based Reinforcement Learning) in a maze environment, reaching a best loss of 14 after training on 3046 trajectories and improving efficiency on complex tasks.
Developed a joint training strategy for skill dynamics using samples of length 256 over 750 iterations, significantly improving sample efficiency.
Validated SKIMO on long-horizon tasks, consistently achieving the objective with a steady 1-point loss drop every 100 iterations, demonstrating the model's effectiveness.











