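v9j-3 / PIE source materials

How to use

When discussing v9j-3, PIE, the velocity command, or the parkour command, open this page first to confirm the paper's original text and the boundary of the reference source code, and only then give an explanation.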

1. PIE original paper

Paper entry points, the location of the local copy inside the project, and the passages that need to be checked.

paper

PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots

arXiv abs
arXiv PDF
Local PDF: 1_survey/papers/Luo2024_PIEParkourwithImplicit.pdf

local text

Local Markdown / translation source

1_survey/papers/md/Luo2024_PIEParkourwithImplicit.md
1_survey/papers/translated/arxiv_batch_2026-06-04/work/PIE2024_ParkourImplicitExplicit/root.tex

paper lines

Key locations to check

Policy / action space: lines 72-88
Momentum before liftoff: lines 111-117
Training command ranges: lines 156-166
Simulation / ablation discussion: lines 219-256

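Paper location	Topic in the paper	Check points recorded on this page
`Luo2024_PIEParkourwithImplicit.md:72-88`	Policy input and action space	The action is a 12-dimensional joint target; the velocity command is part of the policy observation.
`Luo2024_PIEParkourwithImplicit.md:111-117`	Estimator motivation	High and long jumps need momentum built up in advance; exteroception is needed to provide information about the terrain ahead.
`Luo2024_PIEParkourwithImplicit.md:156-166`	Training curriculum	The training terrains, the linear-velocity command range, and the yaw angular-velocity range are given here.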
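
The line ranges listed above can be pulled out of the local Markdown copy with a small helper. This is only a convenience sketch: `read_line_range` is a hypothetical name, not part of the project, and it assumes the file is UTF-8 text and that the ranges are 1-indexed and inclusive.

```python
from pathlib import Path


def read_line_range(path, start, end):
    """Return lines start..end (1-indexed, inclusive) of a UTF-8 text file."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Slicing past the end of the file simply returns fewer lines.
    return lines[start - 1:end]
```

For example, `read_line_range("1_survey/papers/md/Luo2024_PIEParkourwithImplicit.md", 72, 88)` would return the policy / action space passage, provided the local copy has not been re-exported with different line breaks.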

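2. CAI23sbP/Isaaclab_Parkour local source snapshot

This is an upstream reference implementation saved inside the project, not the official PIE source code. This page excerpts only the key functions; for the complete files, the local paths are authoritative.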

git snapshot

Source boundary

remote: CAI23sbP/Isaaclab_Parkour
local HEAD: d6766d8883cffe427f2d4c087717d4eca1f23d8c
commit subject: Added missed script
Boundary: source snapshot from a local git clone; not the official PIE source code.

File	Key lines	Content
`2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_tasks/parkour_tasks/extreme_parkour_task/config/go2/parkour_mdp_cfg.py`	22-34	ParkourCommandCfg configuration.
`2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/parkour_commands/uniform_parkour_command.py`	56-75	Command sampling and heading update.
`2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/rewards.py`	176-182	Forward progress reward.
`2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/observations.py`	63-75	Observations for delta yaw, next delta yaw, and command.

parkour_mdp_cfg.py:22-34

base_velocity = parkour_commands.ParkourCommandCfg(
    asset_name="robot",
    resampling_time_range=(6.0, 6.0),
    heading_control_stiffness=0.8,
    ranges=parkour_commands.ParkourCommandCfg.Ranges(
        lin_vel_x=(0.3, 0.8),
        heading=(-1.6, 1.6),
    ),
    clips=parkour_commands.ParkourCommandCfg.Clips(
        lin_vel_clip=0.2,
        ang_vel_clip=0.4,
    ),
)

uniform_parkour_command.py:56-75

def _resample_command(self, env_ids):
    r = torch.empty(len(env_ids), device=self.device)
    self.vel_command_b[env_ids, 0] = r.uniform_(*self.cfg.ranges.lin_vel_x)
    self.heading_target[env_ids] = r.uniform_(*self.cfg.ranges.heading)
    if self.cfg.small_commands_to_zero:
        self.vel_command_b[env_ids, :2] *= torch.abs(self.vel_command_b[env_ids, 0:1]) > self.cfg.clips.lin_vel_clip

def _update_command(self):
    heading_error = math_utils.wrap_to_pi(self.heading_target - self.robot.data.heading_w) * self.cfg.heading_control_stiffness
    self.vel_command_b[:, 2] = torch.clip(heading_error, min=-1, max=1)
    self.vel_command_b[:, 2] *= torch.abs(self.vel_command_b[:, 2]) > self.cfg.clips.ang_vel_clip
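
The heading update excerpted above can be restated for a single environment in plain Python. This is a minimal sketch, not the upstream code: `yaw_rate_command` and the local `wrap_to_pi` are hypothetical stand-ins (the snapshot calls `math_utils.wrap_to_pi` on tensors), and the defaults `stiffness=0.8` and `ang_vel_clip=0.4` are copied from the `parkour_mdp_cfg.py` excerpt.

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians into [-pi, pi). Stand-in for math_utils.wrap_to_pi."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def yaw_rate_command(heading_target, heading_now, stiffness=0.8, ang_vel_clip=0.4):
    """Scalar restatement of the excerpted _update_command logic."""
    # Proportional term on the wrapped heading error.
    error = wrap_to_pi(heading_target - heading_now) * stiffness
    # Saturate to [-1, 1], as torch.clip does in the excerpt.
    cmd = max(-1.0, min(1.0, error))
    # Dead zone: commands with magnitude at or below ang_vel_clip become zero.
    return cmd if abs(cmd) > ang_vel_clip else 0.0
```

Under these assumptions, a heading error of 0.2 rad yields 0.16 before thresholding, which is below `ang_vel_clip` and is therefore zeroed, while an error of 1.0 rad yields 0.8. In the same config excerpt, `lin_vel_x` is sampled from (0.3, 0.8), which always exceeds `lin_vel_clip=0.2`, so the `small_commands_to_zero` branch, if enabled, would leave the sampled forward command unchanged.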

rewards.py:176-182

target_pos_rel = parkour_event.target_pos_rel
target_vel = target_pos_rel / (torch.norm(target_pos_rel, dim=-1, keepdim=True) + 1e-5)
cur_vel = asset.data.root_vel_w[:, :2]
proj_vel = torch.sum(target_vel * cur_vel, dim=-1)
command_vel = env.command_manager.get_command("base_velocity")[:, 0]
rew_move = torch.minimum(proj_vel, command_vel) / (command_vel + 1e-5)
return rew_move
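
To make the shape of this reward easier to check by hand, the excerpt can be restated for a single robot in plain Python. This is a sketch under stated assumptions: `forward_progress_reward` is a hypothetical name, inputs are 2-D tuples in the world frame, and `eps=1e-5` mirrors the constant in the excerpt.

```python
import math


def forward_progress_reward(target_pos_rel, cur_vel_xy, command_vel, eps=1e-5):
    """Scalar restatement of the excerpted forward progress reward."""
    # Approximately unit direction toward the target (eps guards a zero offset).
    norm = math.hypot(target_pos_rel[0], target_pos_rel[1])
    direction = (target_pos_rel[0] / (norm + eps), target_pos_rel[1] / (norm + eps))
    # Component of the current planar velocity along that direction.
    proj_vel = direction[0] * cur_vel_xy[0] + direction[1] * cur_vel_xy[1]
    # Cap at the commanded speed, then normalise by it.
    return min(proj_vel, command_vel) / (command_vel + eps)
```

With a commanded speed of 0.5, moving toward the target at 0.5 or faster gives a value just under 1, moving sideways gives 0, and moving directly away gives a negative value. Going faster than the command toward the target therefore earns nothing extra in this term.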

observations.py:63-75

if env.common_step_counter % 5 == 0:
    self.delta_yaw = self.parkour_event.target_yaw - wrap_to_pi(yaw)
    self.delta_next_yaw = self.parkour_event.next_target_yaw - wrap_to_pi(yaw)
    self.measured_heights = self._get_heights()
commands = env.command_manager.get_command("base_velocity")
obs_buf = torch.cat((
    self.asset.data.root_ang_vel_b * 0.25,
    imu_obs,
    0 * self.delta_yaw[:, None],
    self.delta_yaw[:, None],
    self.delta_next_yaw[:, None],
    0 * commands[:, 0:2],
    commands[:, 0:1],
), dim=-1)
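
The command-related part of this observation can be laid out explicitly for one environment. The sketch below is only a restatement of the last five entries of the excerpted `torch.cat`; `command_obs_slice` is a hypothetical name, and `command` is assumed to be the three-element `base_velocity` command (forward speed, a second linear component, yaw rate).

```python
def command_obs_slice(delta_yaw, delta_next_yaw, command):
    """Return the six command-related observation values for one environment."""
    return [
        0.0 * delta_yaw,   # zeroed placeholder slot
        delta_yaw,         # yaw offset to the current target
        delta_next_yaw,    # yaw offset to the next target
        0.0 * command[0],  # zeroed placeholder slot
        0.0 * command[1],  # zeroed placeholder slot
        command[0],        # forward velocity command, the only command value passed through
    ]
```

Read this way, the excerpt passes only the forward velocity command through to the observation; the yaw-rate entry `command[2]` does not appear, and turning information reaches the policy through `delta_yaw` and `delta_next_yaw` instead.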

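3. Current v9j-3 code and run configuration in this project

This section lists only the project's actual current code and the formal training configuration synced back from the run, with no interpretation.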

File	Key lines	Content
`2_experiment/baseline_rebuild/robot_lab_lite3_isaaclab_stairs_v9j3_onestage_parkour/src/robot_lab_lite3_stairs_v8_two_phase/tasks/__init__.py`	384-389	v9j-3 command ranges.
`artifacts/2026-06-25_v9j3_onestage_parkour_mainline/run_configs/v9j3_formal2048_10000_20260625_191350/env.yaml`	2392-2415	Command config synced back from the formal training run.
`.pipeline/contracts/wave_c_v9j3_onestage_parkour.md`	20-25, 45-48, 69-85	Positioning, training method, and frozen hyperparameters in the v9j-3 contract.

v9j-3 tasks/__init__.py:384-389

self.commands.base_velocity.ranges.lin_vel_x = (0.0, 1.5)
self.commands.base_velocity.ranges.lin_vel_y = (-0.3, 0.3)
self.commands.base_velocity.ranges.ang_vel_z = (-1.2, 1.2)

v9j-3 formal env.yaml:2392-2415

base_velocity:
  class_type: robot_lab.tasks.manager_based.locomotion.velocity.mdp.commands:UniformThresholdVelocityCommand
  resampling_time_range: [10.0, 10.0]
  heading_command: true
  heading_control_stiffness: 0.5
  rel_standing_envs: 0.02
  rel_heading_envs: 1.0
  ranges:
    lin_vel_x: [0.0, 1.5]
    lin_vel_y: [-0.3, 0.3]
    ang_vel_z: [-1.2, 1.2]
    heading: [-3.141592653589793, 3.141592653589793]