Poster #18 titled "3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting" is displayed at an academic conference. The poster presents research from a collaboration between ETH Zürich, the University of Tübingen, VLG, and the Tübingen AI Center. It details a method for creating animatable clothed human avatars from monocular videos using deformable 3D Gaussian splatting.

The overview section summarizes the method's properties: pose-dependent non-rigid deformation, generalization to unseen poses, fast training (under 30 minutes), real-time rendering (50+ FPS), and monocular input. Logos of ETH Zürich, the University of Tübingen, the Tübingen AI Center, and CVPR appear on the poster.

The methods section describes integrating rigid human articulation with a non-rigid deformation field in the 3DGS framework. Spherical harmonics are replaced with a local-deformation-aware MLP that decodes color, and as-isometric-as-possible (AIAP) regularization is applied to produce realistic deformation under novel poses. A pipeline diagram illustrates the method.
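The poster does not spell out how the two deformation stages are composed. A common formulation, assumed here, adds a pose-dependent offset to each canonical Gaussian center and then poses the result with linear blend skinning over the skeleton's bone transforms. The sketch below handles centers only (not covariances), and every name in it (`deform_center`, `nonrigid_offset`, `blend_transforms`) is hypothetical rather than taken from the authors' code.

```python
from typing import Callable, List, Sequence

Vec3 = List[float]
Mat4 = List[List[float]]  # row-major 4x4 homogeneous transform


def translation(tx: float, ty: float, tz: float) -> Mat4:
    """Build a pure-translation bone transform (used only for the demo)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def blend_transforms(weights: Sequence[float], bones: Sequence[Mat4]) -> Mat4:
    """Linear blend skinning: weighted sum of the per-bone transforms."""
    return [[sum(w * b[r][c] for w, b in zip(weights, bones)) for c in range(4)]
            for r in range(4)]


def apply_transform(m: Mat4, p: Vec3) -> Vec3:
    """Apply a homogeneous transform to a 3D point."""
    x, y, z = p
    return [m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3] for r in range(3)]


def deform_center(center: Vec3,
                  nonrigid_offset: Callable[[Vec3], Vec3],
                  weights: Sequence[float],
                  bones: Sequence[Mat4]) -> Vec3:
    """Canonical center -> non-rigid offset -> rigid (skinned) transform."""
    offset = nonrigid_offset(center)  # stand-in for a learned deformation field
    shifted = [c + o for c, o in zip(center, offset)]
    return apply_transform(blend_transforms(weights, bones), shifted)


# Demo: two bones (identity and a +2 shift in y), equally weighted.
bones = [translation(0.0, 0.0, 0.0), translation(0.0, 2.0, 0.0)]
posed = deform_center([1.0, 0.0, 0.0], lambda p: [0.1, 0.0, 0.0], [0.5, 0.5], bones)
print(posed)
```

In the demo the point is first nudged by the offset and then moved halfway toward the second bone's translation, which is the blending behavior skinning weights are meant to provide.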

The experiments section compares the generated avatars with state-of-the-art methods such as NeuralBody, HumanNeRF, ARAH, and Instant-NVR. The poster reports comparable or better rendering quality than HumanNeRF and ARAH at far lower cost: about 30 minutes of training and 50+ FPS rendering, versus roughly 8 days and 0.1 to 0.2 FPS. The novel view synthesis and novel pose animation subsections show visual results against these baselines.
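The size of the speed gap can be checked directly from the timings printed on the poster (30 min. and 50+ FPS for 3DGS-Avatar; 8 days or more and 0.1 to 0.2 FPS for HumanNeRF and ARAH). The short calculation below is only a sanity check: it treats HumanNeRF's ">8 days" as exactly 8 days and "50+ FPS" as exactly 50, so the ratios are lower bounds, and the helper name `speedups` is illustrative.

```python
from typing import Tuple

MINUTES_PER_DAY = 24 * 60

# Values as printed on the poster; ">8 days" and "50+ FPS" are taken at
# their stated bounds, which is an assumption of this sketch.
training_minutes = {
    "HumanNeRF": 8 * MINUTES_PER_DAY,
    "ARAH": 8 * MINUTES_PER_DAY,
    "3DGS-Avatar": 30,
}
inference_fps = {"HumanNeRF": 0.2, "ARAH": 0.1, "3DGS-Avatar": 50.0}


def speedups(baseline: str, ours: str = "3DGS-Avatar") -> Tuple[float, float]:
    """Return (training speedup, rendering speedup) of `ours` over `baseline`."""
    train = training_minutes[baseline] / training_minutes[ours]
    render = inference_fps[ours] / inference_fps[baseline]
    return train, render


for name in ("HumanNeRF", "ARAH"):
    train, render = speedups(name)
    print(f"{name}: {train:.0f}x faster training, {render:.0f}x faster rendering")
```

Under these assumptions training is about 384 times faster and rendering about 250 to 500 times faster, consistent with the poster's "hundreds of times faster" claim.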

Notably, a QR code is present for more information, leading to the project page, and detailed references are provided at the bottom of the poster. A diverse group of attendees can be seen engaging with the research, indicating interest and interaction at this bustling conference venue.
Text transcribed from the image:
17
ETH Zürich
VLG
Overview
EBERHARD KARLS UNIVERSITÄT TÜBINGEN
Tübingen AI Center
3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting
Zhiyin Qian¹, Shaofei Wang¹,²,³, Marko Mihajlovic¹, Andreas Geiger²,³, Siyu Tang¹
¹ETH Zürich, ²University of Tübingen, ³Tübingen AI Center
https://neuralbodies.github.io/3DGS-Avatar/
Input: a monocular video
30 min. training
50 FPS rendering
Output: animatable 3D Gaussian avatar
Novel view synthesis
Novel pose animation
Experiments
Novel View Synthesis
pose-dependent deformation
novel pose animation
fast training
real-time rendering
monocular input
TL;DR: We create animatable clothed human avatars from
monocular videos using 3D Gaussian Splatting [1]
pose-dependent non-rigid deformation
generalize to unseen poses
fast training (less than 30 minutes)
real-time rendering (50+ FPS)
monocular input
Methods
Canonical 3D Gaussians {G}
Optimizable SMPL Parameters
Non-rigid Deformation (Sec. 4.1)
Rigid Transformation (Sec. 4.2)
CVPR, Seattle, WA, June 17-21, 2024
Project Page
Novel Pose Animation
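[Comparison table: NeuralBody, HumanNeRF, ARAH, Instant-NVR, InstantAvatar, MonoHuman, UV-Volumes, DELIFFAS, and 3DGS-Avatar (Ours), each marked ✓ or ✗ for pose-dependent deformation, novel pose animation, fast training, real-time rendering, and monocular input; the per-cell marks did not survive transcription]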
GT
Method           Training   Inference
Ours             30 min.    50+ FPS
HumanNeRF [4]    >8 days    0.2 FPS
ARAH [5]         8 days     0.1 FPS
Instant-NVR [6]  5 min.     3 FPS
Ours             30 min.    50+ FPS
Latent Code Z
Direction d
Color MLP (Sec. 4.3)
Metric  Ours   NB [3]  HumanNeRF [4]  ARAH [5]  Ours   Instant-NVR [6]
PSNR    30.61  29.08   30.24          30.52     30.91  31.02
SSIM    97.03  96.16   96.80          97.09     97.11  97.08
LPIPS   29.58  52.29   31.73          36.13     28.62  38.40
➤ Compared to current state-of-the-arts [4,5], we achieve
comparable or even better rendering quality while being
hundreds of times faster in both training and inference.
➤ Generalize well to out-of-distribution poses.
HumanNeRF [4]: >8 days, 0.2 FPS
ARAH [5]: 8 days, 0.1 FPS
Diff. Gaussian Rasterization (Eq. …)
Observation Space
Rendered Image
As-isometric-as-possible Regularization [2]
➤ Integrate rigid human articulation with non-rigid
deformation field into the 3DGS framework.
➤ Replace spherical harmonics with a local deformation
aware MLP to decode color.
➤ Apply as-isometric-as-possible regularizations to
generate realistic deformation under novel poses.
Ablation on AIAP Regularization
Full model
w/o L_isocov, L_isopos
➤ AIAP regularization enforces consistent
movement of the Gaussians, thus removing
the artifacts on highly articulated poses.
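The idea behind an as-isometric-as-possible position term can be illustrated with a small sketch: distances between neighboring Gaussian centers should change as little as possible between the canonical and the deformed configuration. The poster does not give the exact loss, so the form below (sum of absolute distance changes over a fixed neighbor list) and the function name `isometric_position_loss` are assumptions. The ablation label also names a covariance term, which is not sketched here.

```python
import math
from typing import Sequence, Tuple

Point = Sequence[float]


def isometric_position_loss(canonical: Sequence[Point],
                            deformed: Sequence[Point],
                            neighbor_pairs: Sequence[Tuple[int, int]]) -> float:
    """Sum of absolute changes in distance over neighboring point pairs."""
    loss = 0.0
    for i, j in neighbor_pairs:
        d_canonical = math.dist(canonical[i], canonical[j])
        d_deformed = math.dist(deformed[i], deformed[j])
        loss += abs(d_canonical - d_deformed)
    return loss


# Three canonical centers and their neighbor pairs (chosen by hand here;
# in practice they would come from a nearest-neighbor search).
canonical = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pairs = [(0, 1), (0, 2), (1, 2)]

# A rigid translation preserves all distances, so it is not penalized.
translated = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in canonical]
# Pulling one center away stretches two of the three pairs.
stretched = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

print(isometric_position_loss(canonical, translated, pairs))
print(isometric_position_loss(canonical, stretched, pairs))
```

The translated configuration incurs no penalty while the stretched one does, which is the behavior that discourages neighboring Gaussians from drifting apart under unseen poses.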
[1] Kerbl et al. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. SIGGRAPH, 2023
[2] Kilian et al. Geometric Modeling in Shape Space. SIGGRAPH, 2007
[3] Peng et al. Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans. CVPR, 2021
[4] Weng et al. HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video. CVPR, 2022
[5] Wang et al. ARAH: Animatable Volume Rendering of Articulated Human SDFs. ECCV, 2022
[6] Geng et al. Learning Neural Volumetric Representations of Dynamic Humans in Minutes. CVPR, 2023
18