The image shows an academic poster titled "RoHM: Robust Human Motion Reconstruction via Diffusion" on display at a conference poster session. The authors are Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu, Alexander Winkler, Petr Kadleček, Siyu Tang, and Federica Bogo, affiliated with ETH Zürich and Reality Labs Research, Meta. The poster hangs under a "Highlight" banner, which marks it as a highlighted paper.

The poster is divided into five sections:

1. **Overview** - Introduces the goal of the work: reconstructing realistic 3D human motion in global space from monocular videos while staying robust to noise and occlusions. The stated challenges are noisy 2D keypoint detections and body occlusions. Diagrams and small sample images illustrate the task.
2. **Problem Setup** - Describes the input (noisy and incomplete motion), the output (complete 3D motion), and the motion representation, which combines joint-based and SMPL-X-based parameters.
3. **Diffusing Global and Local Motion** - Explains how two diffusion networks split the task: TrajNet reconstructs the global root trajectory and PoseNet reconstructs the local body pose. Flow charts show the pipeline and note that ignoring the correlation between trajectory and pose leads to foot skating.
4. **Controlling Global Motion Reconstruction** - Presents TrajControl, which fine-tunes TrajNet with the local body pose so that global and local motion can be refined iteratively at inference time.
5. **Experiments** - Reports results on AMASS, PROX, and EgoBody. Tables and side-by-side comparison images show higher accuracy, less foot skating, and faster inference, with the poster stating "30x times faster than HuMoR during inference".

The upper portion of the poster carries the logos of ETH Zürich and Meta, along with a QR code that presumably links to additional resources or the publication.
The poster, numbered 181, is mounted on modular display boards in a well-lit hall, consistent with a conference poster session. A cup and some paper are visible on a nearby table.

Text transcribed from the image:

Poster number: 181. Banner: Highlight.

**RoHM: Robust Human Motion Reconstruction via Diffusion**

Siwei Zhang, Bharat Lal Bhatnagar², Yuanlu Xu², Alexander Winkler², Petr Kadleček², Siyu Tang¹, Federica Bogo²
¹ ETH Zürich, ² Reality Labs Research, Meta
The work was done during an internship at Meta.
Logos: ETH Zürich; Meta; VLG Computer Vision and Learning Group; CVPR, Seattle, WA

**Overview**

- Challenges: noisy 2D keypoint detections; body occlusions
- Our goals: reconstruct realistic 3D motions in global space from monocular videos; robust to noise and occlusions
- Our contributions: RoHM, a diffusion-based approach; TrajControl, to model trajectory-pose correlations; various applications (motion reconstruction, denoising, infilling)

**Problem Setup**

- Input: noisy & incomplete motion
- Output: complete 3D motion
- Figure label: RGB(-D) video
- Motion representation: joint-based + SMPLX-based
- Root trajectory: r = (…) [components illegible]
- Local body pose: p = (J, J, θ, β, f) [accents and superscripts illegible]

**Diffusing Global and Local Motion**

- TrajNet: reconstruct the global trajectory, $R_0 = D_R(R_t, t, c_R)$
- PoseNet: reconstruct the local body pose, $(R_0, P_0) = D_P((R_0, P_t), t, c_P)$
- Inference iteration = 1 [diagram: TrajNet followed by PoseNet]
- No trajectory-pose correlation → foot skating

**Controlling Global Motion Reconstruction**

- TrajControl: fine-tuning TrajNet with local body pose
- Iteratively refine local and global motion at inference time
- Inference iteration > 1 [diagram: TrajNet, PoseNet, TrajControl]

**Experiments**

- Training on: AMASS
- Test on: AMASS (synthetic noise + occlusions), PROX (RGBD/RGB), EgoBody (RGB)
- Evaluation metrics:
  - Accuracy: MPJPE
  - Physical plausibility: acceleration + foot skating + foot-floor penetration

Results on AMASS: >30% improvement over accuracy

| Method | GMPJPE-vis | GMPJPE-occ | GMPJPE-all | Contact | Skating |
|---|---|---|---|---|---|
| VPoser-t | 33.0 | 242.6 | 109.2 | [illegible] | 0.219 [?] |
| HuMoR [67] | 42.4 | 167.9 | 88.0 | 0.68 | 0.230 |
| MDM++ | 36.2 | 71.9 | 49.2 | 0.94 | 0.102 |
| Ours | 21.8 | 57.4 | 34.8 | 0.95 | 0.078 |

Results on PROX: >67% (RGB-D) / >17% (RGB) improvement over foot skating

Column groups: RGB-D (Skating, Accel, Dist) and RGB (Skating, Accel, Dist).

| Method | Skating (RGB-D) |
|---|---|
| LEMO [100] | 0.176 |
| HuMoR [67] | 0.117 |
| PhaseMP [72] | [illegible] |
| Ours | 0.038 |

Remaining values from this table, whose row and column assignment is not recoverable: 23, 35.41, 1.8, 46.96, 9.73, 2.2, 1.8, 34.22, 1.9, 54.76, 0.139, 0.180, 1.8, 3.36, 0.116.

Ablation for TrajControl on PROX dataset:

| Method | Skating (RGB-D) | Accel (RGB-D) | Skating (RGB) | Accel (RGB) |
|---|---|---|---|---|
| Ours | 0.038 | 1.8 | 0.116 | 2.2 |
| w/o TrajControl | 0.056 | 2.1 | 0.165 | 2.7 |

→ TrajControl improves motion plausibility

Qualitative comparison labels: HuMoR, Ours, HuMoR, Ours

30x times faster than HuMoR during inference!
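The poster names MPJPE and foot skating as evaluation metrics but does not define them in the legible text. The following is a minimal sketch of how such metrics are commonly computed, not the authors' implementation. The function names, the z-up axis convention, the metre units, and the thresholds `height_tol` and `speed_tol` are all assumptions made for illustration.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error over all frames and joints.

    pred, gt: nested sequences shaped [frame][joint][xyz], in the same units.
    """
    errors = [
        math.dist(p, g)
        for pred_frame, gt_frame in zip(pred, gt)
        for p, g in zip(pred_frame, gt_frame)
    ]
    return sum(errors) / len(errors)


def foot_skating_ratio(foot_xyz, fps=30.0, floor_z=0.0,
                       height_tol=0.05, speed_tol=0.10):
    """Fraction of frame-to-frame steps in which a foot joint is close to
    the floor yet still moves horizontally.

    foot_xyz: sequence of (x, y, z) positions for one foot joint.
    Assumptions (not from the poster): z is up, units are metres,
    and the two tolerances are hypothetical values.
    """
    skating_steps = 0
    total_steps = 0
    for prev, cur in zip(foot_xyz, foot_xyz[1:]):
        horizontal_speed = math.hypot(cur[0] - prev[0], cur[1] - prev[1]) * fps
        near_floor = (cur[2] - floor_z) < height_tol
        if near_floor and horizontal_speed > speed_tol:
            skating_steps += 1
        total_steps += 1
    return skating_steps / total_steps if total_steps else 0.0


def relative_improvement(baseline, ours):
    """Relative reduction of a lower-is-better metric, as a fraction."""
    return (baseline - ours) / baseline
```

As a usage example, if the 0.117 transcribed next to HuMoR and the 0.038 transcribed for "Ours" are both RGB-D skating values on PROX, `relative_improvement(0.117, 0.038)` gives about 0.675, which would be consistent with the poster's ">67% (RGB-D)" claim. Which baseline the poster actually compares against is not legible, so this pairing is an assumption.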