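Before discussing v9j-3, PIE, velocity commands, or parkour commands, open this page first to confirm what comes from the paper and what comes from the reference source code, and only then offer an explanation.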
I. PIE original paper
Paper entry point, location of the local copies inside the project, and the passages that need checking.
Local Markdown / translation source
- 1_survey/papers/md/Luo2024_PIEParkourwithImplicit.md
- 1_survey/papers/translated/arxiv_batch_2026-06-04/work/PIE2024_ParkourImplicitExplicit/root.tex
Passages to check
- Policy / action space: lines 72-88
- Momentum before liftoff: lines 111-117
- Training command ranges: lines 156-166
- Simulation / ablation discussion: lines 219-256
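| Paper location | Topic in the paper | Check point recorded on this page |
|---|---|---|
| Luo2024_PIEParkourwithImplicit.md:72-88 | Policy input and action space | The action is a 12-dimensional joint target; the velocity command is part of the policy observation. |
| Luo2024_PIEParkourwithImplicit.md:111-117 | Estimator motivation | High and long jumps need momentum built up in advance; exteroception is needed to provide information about the terrain ahead. |
| Luo2024_PIEParkourwithImplicit.md:156-166 | Training curriculum | The training terrains, the linear velocity command range, and the yaw angular velocity range are given here. |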
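Since this page points at `file:start-end` locations throughout, a small helper can pull up the referenced lines for checking. The following is a minimal sketch, assuming the locations use 1-indexed inclusive line numbers; `parse_location` and `read_line_range` are hypothetical helper names, not part of the project.

```python
from pathlib import Path


def parse_location(location):
    """Split a 'path:start-end' string into (path, start, end)."""
    path, _, span = location.rpartition(":")
    start, _, end = span.partition("-")
    return path, int(start), int(end)


def read_line_range(path, start, end):
    """Return lines start..end (1-indexed, inclusive) of a UTF-8 text file."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return lines[start - 1:end]
```

For example, `parse_location("Luo2024_PIEParkourwithImplicit.md:72-88")` yields the file name plus the bounds 72 and 88, which can then be passed to `read_line_range` together with the directory listed above.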
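II. CAI23sbP/Isaaclab_Parkour local source snapshot
This is an upstream reference implementation saved inside the project, not the official PIE source code. This page excerpts only the key functions; for the complete files, the local paths are authoritative.
Source boundary
- remote: CAI23sbP/Isaaclab_Parkour
- local HEAD: d6766d8883cffe427f2d4c087717d4eca1f23d8c
- commit subject: Added missed script
- boundary: local git clone source snapshot; not the official PIE source code.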
| File | Key lines | Content |
|---|---|---|
| 2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_tasks/parkour_tasks/extreme_parkour_task/config/go2/parkour_mdp_cfg.py | 22-34 | ParkourCommandCfg configuration. |
| 2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/parkour_commands/uniform_parkour_command.py | 56-75 | Command sampling and heading update. |
| 2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/rewards.py | 176-182 | Forward progress reward. |
| 2_experiment/source_references/puma_external/Isaaclab_Parkour/parkour_isaaclab/envs/mdp/observations.py | 63-75 | Observations for delta yaw, next delta yaw, and the command. |
parkour_mdp_cfg.py:22-34

    base_velocity = parkour_commands.ParkourCommandCfg(
        asset_name="robot",
        resampling_time_range=(6.0, 6.0),
        heading_control_stiffness=0.8,
        ranges=parkour_commands.ParkourCommandCfg.Ranges(
            lin_vel_x=(0.3, 0.8),
            heading=(-1.6, 1.6),
        ),
        clips=parkour_commands.ParkourCommandCfg.Clips(
            lin_vel_clip=0.2,
            ang_vel_clip=0.4,
        ),
    )
uniform_parkour_command.py:56-75

    def _resample_command(self, env_ids):
        r = torch.empty(len(env_ids), device=self.device)
        self.vel_command_b[env_ids, 0] = r.uniform_(*self.cfg.ranges.lin_vel_x)
        self.heading_target[env_ids] = r.uniform_(*self.cfg.ranges.heading)
        if self.cfg.small_commands_to_zero:
            self.vel_command_b[env_ids, :2] *= torch.abs(self.vel_command_b[env_ids, 0:1]) > self.cfg.clips.lin_vel_clip

    def _update_command(self):
        heading_error = math_utils.wrap_to_pi(self.heading_target - self.robot.data.heading_w) * self.cfg.heading_control_stiffness
        self.vel_command_b[:, 2] = torch.clip(heading_error, min=-1, max=1)
        self.vel_command_b[:, 2] *= torch.abs(self.vel_command_b[:, 2]) > self.cfg.clips.ang_vel_clip
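To make the interaction between the config values and the two methods easier to check by hand, here is a single-environment, pure-Python restatement of the excerpts above. It is a sketch under stated assumptions, not the upstream code: the constants are copied from the `parkour_mdp_cfg.py` excerpt, the local `wrap_to_pi` is a stand-in for `math_utils.wrap_to_pi`, and the function names `resample_command` and `yaw_rate_command` are hypothetical.

```python
import math
import random

# Values copied from the parkour_mdp_cfg.py excerpt.
LIN_VEL_X_RANGE = (0.3, 0.8)
HEADING_RANGE = (-1.6, 1.6)
HEADING_STIFFNESS = 0.8
LIN_VEL_CLIP = 0.2
ANG_VEL_CLIP = 0.4


def wrap_to_pi(angle):
    """Stand-in for math_utils.wrap_to_pi: wrap radians into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def resample_command(rng, small_commands_to_zero=True):
    """Scalar version of _resample_command: returns (lin_vel_x, heading_target)."""
    lin_vel_x = rng.uniform(*LIN_VEL_X_RANGE)
    heading_target = rng.uniform(*HEADING_RANGE)
    # Commands at or below the clip threshold are multiplied by False, i.e. zeroed.
    if small_commands_to_zero and not abs(lin_vel_x) > LIN_VEL_CLIP:
        lin_vel_x = 0.0
    return lin_vel_x, heading_target


def yaw_rate_command(heading_target, heading_w):
    """Scalar version of _update_command: proportional heading control."""
    error = wrap_to_pi(heading_target - heading_w) * HEADING_STIFFNESS
    yaw_rate = max(-1.0, min(1.0, error))  # torch.clip(..., min=-1, max=1)
    # Yaw rates at or below the clip threshold are zeroed.
    return yaw_rate if abs(yaw_rate) > ANG_VEL_CLIP else 0.0
```

With these numbers, a heading error of up to 0.4 / 0.8 = 0.5 rad yields a zero yaw command, and the command saturates at ±1 once the error reaches 1.25 rad. The sampled forward speed never falls below 0.3, so the 0.2 linear clip does not zero anything in this configuration.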
rewards.py:176-182

    target_pos_rel = parkour_event.target_pos_rel
    target_vel = target_pos_rel / (torch.norm(target_pos_rel, dim=-1, keepdim=True) + 1e-5)
    cur_vel = asset.data.root_vel_w[:, :2]
    proj_vel = torch.sum(target_vel * cur_vel, dim=-1)
    command_vel = env.command_manager.get_command("base_velocity")[:, 0]
    rew_move = torch.minimum(proj_vel, command_vel) / (command_vel + 1e-5)
    return rew_move
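The reward arithmetic above can be checked with a scalar example. The following is a minimal sketch that mirrors the excerpt for one environment, assuming `target_pos_rel` is a 2D offset in the world xy-plane; `forward_progress_reward` is a hypothetical name used only for illustration.

```python
import math


def forward_progress_reward(target_pos_rel, cur_vel_xy, command_vel, eps=1e-5):
    """Scalar restatement of the rewards.py excerpt for a single environment."""
    norm = math.hypot(target_pos_rel[0], target_pos_rel[1])
    # Approximately unit direction towards the current target.
    direction = (target_pos_rel[0] / (norm + eps), target_pos_rel[1] / (norm + eps))
    # Velocity component along the target direction.
    proj_vel = direction[0] * cur_vel_xy[0] + direction[1] * cur_vel_xy[1]
    # Progress is capped at the commanded speed, then normalized by it.
    return min(proj_vel, command_vel) / (command_vel + eps)


on_target = forward_progress_reward((2.0, 0.0), (0.4, 0.3), 0.5)   # about 0.8
too_fast = forward_progress_reward((2.0, 0.0), (1.0, 0.0), 0.5)    # just under 1
backwards = forward_progress_reward((2.0, 0.0), (-0.5, 0.0), 0.5)  # negative
```

In this form the reward is bounded above by roughly 1: moving towards the target faster than the commanded speed earns nothing extra, while moving away from the target produces a negative value.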
observations.py:63-75

    if env.common_step_counter % 5 == 0:
        self.delta_yaw = self.parkour_event.target_yaw - wrap_to_pi(yaw)
        self.delta_next_yaw = self.parkour_event.next_target_yaw - wrap_to_pi(yaw)
        self.measured_heights = self._get_heights()
    commands = env.command_manager.get_command("base_velocity")
    obs_buf = torch.cat((
        self.asset.data.root_ang_vel_b * 0.25,
        imu_obs,
        0 * self.delta_yaw[:, None],
        self.delta_yaw[:, None],
        self.delta_next_yaw[:, None],
        0 * commands[:, 0:2],
        commands[:, 0:1],
    ), dim=-1)
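To see which command components actually reach the policy in the excerpt above, the concatenation order can be turned into index ranges. This is a sketch under assumptions: it covers only the terms shown in the excerpt, it assumes `root_ang_vel_b` has 3 components, and the width of 2 for `imu_obs` is an assumption because the excerpt does not show how `imu_obs` is built. `OBS_LAYOUT` and `obs_slices` are hypothetical names.

```python
# (term, width, zeroed) in the order of the torch.cat call in the excerpt.
OBS_LAYOUT = [
    ("root_ang_vel_b * 0.25", 3, False),
    ("imu_obs", 2, False),  # ASSUMPTION: width 2; not visible in the excerpt
    ("0 * delta_yaw", 1, True),
    ("delta_yaw", 1, False),
    ("delta_next_yaw", 1, False),
    ("0 * commands[:, 0:2]", 2, True),
    ("commands[:, 0:1]", 1, False),
]


def obs_slices(layout):
    """Return ({term: (start, stop)}, total_width) for a concatenation layout."""
    slices, start = {}, 0
    for term, width, _zeroed in layout:
        slices[term] = (start, start + width)
        start += width
    return slices, start


SLICES, TOTAL_WIDTH = obs_slices(OBS_LAYOUT)
ZEROED_TERMS = [term for term, _width, zeroed in OBS_LAYOUT if zeroed]
```

Under these assumptions, the only non-zero command entry in this part of the observation is the forward speed `commands[:, 0:1]`; the two preceding command slots are multiplied by zero, and turning information is carried by `delta_yaw` and `delta_next_yaw`.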
III. Current v9j-3 code and run configuration in this project
This section lists only the project's actual current code and the formal training configuration synced back from the run, without interpretation.
| File | Key lines | Content |
|---|---|---|
| 2_experiment/baseline_rebuild/robot_lab_lite3_isaaclab_stairs_v9j3_onestage_parkour/src/robot_lab_lite3_stairs_v8_two_phase/tasks/__init__.py | 384-389 | v9j-3 command ranges. |
| artifacts/2026-06-25_v9j3_onestage_parkour_mainline/run_configs/v9j3_formal2048_10000_20260625_191350/env.yaml | 2392-2415 | Command config synced back from the formal training run. |
| .pipeline/contracts/wave_c_v9j3_onestage_parkour.md | 20-25, 45-48, 69-85 | Positioning, training method, and frozen hyperparameters in the v9j-3 contract. |
v9j-3 tasks/__init__.py:384-389

    self.commands.base_velocity.ranges.lin_vel_x = (0.0, 1.5)
    self.commands.base_velocity.ranges.lin_vel_y = (-0.3, 0.3)
    self.commands.base_velocity.ranges.ang_vel_z = (-1.2, 1.2)
v9j-3 formal env.yaml:2392-2415

    base_velocity:
      class_type: robot_lab.tasks.manager_based.locomotion.velocity.mdp.commands:UniformThresholdVelocityCommand
      resampling_time_range: [10.0, 10.0]
      heading_command: true
      heading_control_stiffness: 0.5
      rel_standing_envs: 0.02
      rel_heading_envs: 1.0
      ranges:
        lin_vel_x: [0.0, 1.5]
        lin_vel_y: [-0.3, 0.3]
        ang_vel_z: [-1.2, 1.2]
        heading: [-3.141592653589793, 3.141592653589793]
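The two excerpts above should agree on the ranges they share, and a small check can flag drift if either file changes. This is a minimal sketch, assuming the values are transcribed by hand from the excerpts rather than loaded from the files; `TASK_RANGES`, `ENV_YAML_RANGES`, and `range_mismatches` are hypothetical names.

```python
import math

# Transcribed from the tasks/__init__.py excerpt.
TASK_RANGES = {
    "lin_vel_x": (0.0, 1.5),
    "lin_vel_y": (-0.3, 0.3),
    "ang_vel_z": (-1.2, 1.2),
}

# Transcribed from the env.yaml excerpt (heading appears only there).
ENV_YAML_RANGES = {
    "lin_vel_x": (0.0, 1.5),
    "lin_vel_y": (-0.3, 0.3),
    "ang_vel_z": (-1.2, 1.2),
    "heading": (-math.pi, math.pi),
}


def range_mismatches(code_ranges, yaml_ranges, tol=1e-9):
    """Return (key, expected, actual) for every code range the YAML does not match."""
    mismatches = []
    for key, expected in code_ranges.items():
        actual = yaml_ranges.get(key)
        if actual is None or any(abs(a - e) > tol for a, e in zip(actual, expected)):
            mismatches.append((key, expected, actual))
    return mismatches
```

Calling `range_mismatches(TASK_RANGES, ENV_YAML_RANGES)` on the values as transcribed here returns an empty list; keys that exist only in the YAML, such as `heading`, are deliberately not compared.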