The image features a scientific poster presented by researchers from The Hong Kong University of Science and Technology and Tsinghua University. The poster is titled "GenN2N: Generative NeRF2NeRF Translation" and details a novel method for translating between Neural Radiance Fields (NeRFs) to facilitate universal NeRF editing. The stated contributions include a generative framework for NeRF-to-NeRF translation, a 3D VAE-GAN framework for learning 3D NeRF edits, a contrastive learning framework, and improvements in the efficiency, quality, and diversity of NeRF-to-NeRF translation results. The poster is divided into several sections:

1. **Contributions:** Displays multiple example images showing various translation outputs such as texture editing and transformations on 3D models.
2. **Method:** Illustrates the training and inference processes, including how latent edit codes are extracted and optimized.
3. **Contrastive Loss and Conditional Adversarial Loss:** Explains the techniques used to improve the translated NeRF results: a contrastive loss that structures distances between edit codes across views and edit styles, and a conditional adversarial loss that penalizes artifacts in novel-view renderings.
4. **Qualitative and Quantitative Results:** Exhibits comparisons on NeRF colorization, super-resolution, inpainting, and text-driven NeRF editing, along with tables of performance metrics.
5. **Training Loss:** Details the components of the training loss, including the KL loss on the latent distillation and the reconstruction, adversarial, and contrastive losses.
6. **Ablation Studies:** Provides insight into various model configurations and their impact on performance through comparative metrics.

Overall, the poster provides an in-depth view of an advanced method for enhancing and editing 3D scenes via NeRF-to-NeRF translation, supported by comprehensive visual examples and methodological explanations.

Text transcribed from the image:

The Hong Kong University of Science and Technology (香港科技大學) · Tsinghua University (清華大學)

**GenN2N: Generative NeRF2NeRF Translation**
Xiangyue Liu¹, Han Xue², Kunming Luo¹, Ping Tan¹†, Li Yi²,³,⁴†
¹HKUST, ²Tsinghua University, ³Shanghai AI Laboratory, ⁴Shanghai Qi Zhi Institute
CVPR, Seattle, WA, June 17-21 · https://xiangyueliu.github.io/Ge

**Qualitative and Quantitative Results**

[Figure panels: NeRF Colorization and NeRF Super-resolution comparisons showing the Original NeRF, Instruct-NeRF2NeRF, PaletteNeRF / NeRF-SR, and three of our inference samples rendered at several poses. The accompanying quantitative tables compare DDColor [14]+NeRF, Instruct-NeRF2NeRF, PaletteNeRF [16], and our ablations/full model on CF ↑ and FID ↓ for colorization, and ResShift [40]+NeRF, Instruct-NeRF2NeRF, NeRF-SR [34], and our ablations/full model on PSNR ↑ and SSIM ↑ for super-resolution; the individual cell values are too garbled in the transcription to assign to rows reliably. A NeRF Inpainting panel shows inputs alongside multiple output samples.]

**Contributions**
- A generative NeRF-to-NeRF translation formulation for the universal NeRF editing task, together with a generic solution.
- A 3D VAE-GAN framework that can learn the distribution of all possible 3D NeRF edits corresponding to a set of input edited 2D images.
- A contrastive learning framework to disentangle the 3D edits and 2D camera views.
- Superior efficiency, quality, and diversity of the NeRF-to-NeRF translation results.
**Method**

[Figure: the training pipeline. A 2D image translator (text-driven editing, colorization, super-resolution, or inpainting) edits rendered views of the original scene; an encoder distills each edited image into an edit code z, which together with the camera pose conditions an MLP representing the translated NeRF, whose renderings at pose i are supervised against the edited images. An inference branch samples z from a Gaussian distribution. Panel labels include "Increase style distance" and "Reduce style distance" for the contrastive loss.]

Network pipeline:
- Latent distill: extract edit codes from the translated image set, which serve as the input of the translated NeRF.
- Optimize: a KL loss to constrain the latent vectors to a Gaussian distribution, and $\mathcal{L}_{recon}$, $\mathcal{L}_{adv}$ and $\mathcal{L}_{contr}$ to optimize the appearance and geometry of the translated NeRF.
- Inference: sample a latent vector from the Gaussian distribution and render a corresponding multi-view-consistent 3D scene with high quality.

**Contrastive Loss**
- Increase the distance between edit codes of same-view, different-edit-style images.
- Reduce the distance between edit codes of different-view, same-edit-style images.

**Conditional Adversarial Loss**
The discriminator distinguishes artifacts (e.g. blur, distortion) in the novel-view rendered image compared with the target image, operating on concatenated fake and real image pairs built from a novel view and views with different edit styles.

**Training Loss**

$$\mathcal{L} = \mathcal{L}_{KL} + \mathcal{L}_{recon} + \mathcal{L}_{adv\text{-}G} + \mathcal{L}_{adv\text{-}D} + \mathcal{L}_{contr}$$

- KL loss on the latent distill, constraining the edit-code distribution to a Gaussian: $\mathcal{L}_{KL} = \sum_{z} P(z_{norm}) \log \frac{P(z_{norm})}{P(z)}$.
- Rendering (reconstruction) loss $\mathcal{L}_{recon} = \lVert \hat{I} - I \rVert$ between the rendered image $\hat{I}$ and the target edited image $I$, together with the adversarial loss $\mathcal{L}_{adv}$ and the contrastive loss $\mathcal{L}_{contr}$, applied on the translated NeRF.

**Qualitative and Quantitative Results (continued)**

NeRF Inpainting (figure compares the Original NeRF, Instruct-NeRF2NeRF, SPin-NeRF, and Ours across inputs and multiple output samples):

| Method | PSNR ↑ | SSIM ↑ |
| --- | --- | --- |
| LaMa [31]+NeRF | 18.983 | 0.3706 |
| Instruct-NeRF2NeRF | 16.734 | 0.3088 |
| SPin-NeRF [24] | 24.369 | 0.7217 |
| Ours | 26.868 | 0.8137 |

Text-driven NeRF Editing (figure compares the Original NeRF, two Instruct-NeRF2NeRF results, and three of our inference samples):

| Method | CLIP Text-Image Direction Similarity ↑ | CLIP Direction Consistency ↑ | FID ↓ |
| --- | --- | --- | --- |
| InstructPix2Pix [2]+NeRF | 0.1669 | 0.8475 | 270.542 |
| Instruct-NeRF2NeRF | 0.2021 | 0.9828 | 148.021 |
| Ours w/o $\mathcal{L}_{adv}$ | 0.1920 | 0.9657 | 162.275 |
| Ours w/o $\mathcal{L}_{contr}$ | 0.2007 | 0.9749 | 156.524 |
| Ours | 0.2089 | 0.9864 | 137.740 |

**Ablation**

| M | CLIP Text-Image Direction Similarity ↑ | CLIP Direction Consistency ↑ |
| --- | --- | --- |
| 1 | 0.2635 | 0.9610 |
| 3 | 0.2807 | 0.9650 |
| 5 | 0.2835 | 0.9638 |

[An efficiency table compares IN2N and Ours on training/inference time (h), iterations, memory (GB), and FLOPS (G); the transcription shows roughly 20000 iterations for IN2N versus 10000 for Ours, but the remaining cell values cannot be reliably separated.]
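The contrastive objective stated above lends itself to a short illustration. The following is a minimal PyTorch-style sketch, not the authors' implementation: the function name `contrastive_edit_code_loss`, the tensor shapes, and the margin value are assumptions, and the triplet (margin) formulation is one common way to realize the stated push/pull behaviour on edit codes.

```python
import torch
import torch.nn.functional as F

def contrastive_edit_code_loss(anchor, positive, negative, margin=1.0):
    """Triplet-style contrastive loss on edit codes (illustrative sketch only).

    anchor:   edit codes of view i under edit style s          -> (B, D)
    positive: edit codes of a different view j, same style s   -> (B, D), pulled closer
    negative: edit codes of the same view i, different style s'-> (B, D), pushed apart
    """
    d_pos = F.pairwise_distance(anchor, positive)  # distance to reduce (different view, same edit)
    d_neg = F.pairwise_distance(anchor, negative)  # distance to increase (same view, different edit)
    return F.relu(d_pos - d_neg + margin).mean()

# Hypothetical usage with random stand-ins for codes from the latent-distill encoder.
B, D = 8, 32
z_view_i_style_s  = torch.randn(B, D)
z_view_j_style_s  = torch.randn(B, D)
z_view_i_style_s2 = torch.randn(B, D)
loss_contr = contrastive_edit_code_loss(z_view_i_style_s, z_view_j_style_s, z_view_i_style_s2)
print(loss_contr.item())
```

In the poster's full objective, a term of this kind would simply be summed with the KL, reconstruction, and adversarial terms given above.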