In the image, a large white board, apparently a research poster, is placed near the edge of the frame and fills most of it. The poster is a collage of figures, tables, and text. Most of the text is in English, with the Chinese names of the authors' institutions (香港科技大學, The Hong Kong University of Science and Technology, and 清華大學, Tsinghua University) printed beside their logos, so readers who do not know Chinese may find those parts hard to interpret. The board is displayed prominently and is clearly meant to be read by a group of people, most likely as part of a public display or exhibition such as a conference poster session.

Text transcribed from the image:

GenN2N: Generative NeRF2NeRF Translation
Xiangyue Liu, Han Xue, Kunming Luo(1), Ping Tan(1)†, Li Yi(2,3,4)†
(1) HKUST, (2) Tsinghua University, (3) Shanghai AI Laboratory, (4) Shanghai Qi Zhi Institute
Project page: https://xiangyueli (URL truncated on the poster)

Contributions
• A generative NeRF-to-NeRF translation formulation for the universal NeRF editing task, together with a generic solution.
• A 3D VAE-GAN framework that can learn the distribution of all possible 3D NeRF edits corresponding to a set of input edited 2D images.
• A contrastive learning framework to disentangle the 3D edits and the 2D camera views.
• Superior efficiency, quality, and diversity of the NeRF-to-NeRF translation results.

Method
Latent Distill: an encoder MLP extracts edit codes from the translated 2D image set (the same views rendered with different edit styles); these codes serve as the input of the translated NeRF and are shaped by a contrastive loss:
• Increase the distance between edit codes of same-view, different-edit-style images.
• Reduce the distance between edit codes of different-view, same-edit-style images.
A KL loss constrains the latent edit vectors to a Gaussian distribution, while a rendering (reconstruction) loss and a conditional adversarial loss supervise the appearance and geometry of the translated NeRF. The conditional discriminator compares real and fake image pairs and penalizes artifacts (e.g. blur, distortion) in novel-view renderings relative to the target image. At inference, an edit vector is sampled from the Gaussian distribution and the translated NeRF renders a corresponding edited scene; the poster shows several such samples labeled "Ours inference 1/2/3".

Training Loss
L = L_KL + L_recon + L_adv-G + L_adv-D + L_contr
The KL loss acts on the Latent Distill module; the reconstruction, adversarial, and contrastive losses act on the translated NeRF.
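The poster names these loss terms without giving formulas. As a rough illustration only, and not the authors' implementation, a PyTorch-style sketch of the contrastive term over edit codes could look like the following; the tensor shapes, the margin value, and the helper name contrastive_edit_loss are all assumptions.

```python
# Rough sketch of the contrastive term described on the poster (assumed details).
import torch
import torch.nn.functional as F

def contrastive_edit_loss(codes, edit_ids, view_ids, margin=1.0):
    """codes:    (N, D) edit codes from the Latent Distill encoder (assumed shape)
       edit_ids: (N,)   index of the 2D edit style each image was produced with
       view_ids: (N,)   index of the camera view each image was rendered from"""
    codes = F.normalize(codes, dim=-1)
    dist = torch.cdist(codes, codes)  # pairwise distances between edit codes

    same_edit = edit_ids[:, None] == edit_ids[None, :]
    same_view = view_ids[:, None] == view_ids[None, :]
    not_self = ~torch.eye(len(codes), dtype=torch.bool, device=codes.device)

    # Reduce the distance between codes of different-view, same-edit-style images.
    pos = same_edit & ~same_view & not_self
    # Increase the distance between codes of same-view, different-edit-style images.
    neg = ~same_edit & same_view

    pos_loss = dist[pos].pow(2).mean() if pos.any() else codes.new_zeros(())
    neg_loss = F.relu(margin - dist[neg]).pow(2).mean() if neg.any() else codes.new_zeros(())
    return pos_loss + neg_loss

# The total objective on the poster simply sums the terms:
#   L = L_KL + L_recon + L_adv-G + L_adv-D + L_contr
```

In such a setup, each training batch would need to mix several edit styles rendered from several camera views so that both kinds of pairs exist.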
Qualitative and Quantitative Results
The results panels show the original NeRF or scene, renderings of the baselines and of the translated NeRF ("Input / Output / Output 2 / Output 3", "Ours inference 1/2/3"), and a metrics table for each of four editing tasks:
• NeRF super-resolution (PSNR ↑): ResShift [40]+NeRF 19.978, Instruct-NeRF2NeRF 20.299, NeRF-SR [34] 27.957; rows for Ours and the ablations (Ours w/o L_adv, Ours w/o L_contr) are also listed, but the transcription does not let their values (12.555, 15.372, 28.501) be assigned unambiguously.
• NeRF colorization (CF ↑ / FID ↓): DDColor [14]+NeRF 40.435 / 148.957, Instruct-NeRF2NeRF 45.599 / 201.456, PaletteNeRF [16] 39.654 / (FID not legible); rows for Ours w/o L_adv, Ours w/o L_contr, and Ours are present, but their values (e.g. 35.031, 34.829, 35.041, 137.740, 105.750, 65.099) are not reliably attributable in the transcription.
• NeRF inpainting (PSNR ↑): LaMa [31]+NeRF 18.983, Instruct-NeRF2NeRF 16.734, SPin-NeRF [24] 24.369, Ours 26.868.
• Text-driven NeRF editing (CLIP Text-Image Direction Similarity ↑ / CLIP Direction Consistency ↑ / FID ↓): InstructPix2Pix [2]+NeRF 0.1669 / 0.8475 / 270.542; Instruct-NeRF2NeRF 0.2021 / 0.9828 / (FID not legible); Ours w/o L_adv 0.1920 and Ours w/o L_contr 0.2007 (remaining values not legible); the row for Ours and an ablation plot labeled "CLIP Text-Image Direction Similarity" are cut off in the transcription.
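For reference, CLIP Text-Image Direction Similarity is commonly computed as the cosine similarity between the CLIP-space direction from the source image to the edited image and the direction from a source caption to the edit prompt. The sketch below is a minimal, assumed implementation using the OpenAI clip package with hypothetical file paths and prompts; it is not the authors' evaluation code.

```python
# Assumed definition of CLIP Text-Image Direction Similarity (not the authors' code).
import torch
import torch.nn.functional as F
from PIL import Image
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def embed_image(path):
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    return F.normalize(model.encode_image(image), dim=-1)

@torch.no_grad()
def embed_text(prompt):
    return F.normalize(model.encode_text(clip.tokenize([prompt]).to(device)), dim=-1)

def direction_similarity(src_img, edited_img, src_caption, edit_prompt):
    # src_img / edited_img: hypothetical paths to a rendered view before and after editing.
    # src_caption / edit_prompt: text describing the original scene and the desired edit.
    img_dir = F.normalize(embed_image(edited_img) - embed_image(src_img), dim=-1)
    txt_dir = F.normalize(embed_text(edit_prompt) - embed_text(src_caption), dim=-1)
    return (img_dir * txt_dir).sum(dim=-1).item()
```

Averaging this score over many rendered views rewards edits whose change in CLIP space points in the same direction as the requested text edit.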