The image shows a poster presentation titled "BiDiff: Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors," authored by Lihe Ding, Shaocong Dong, Zhanpeng Huang, Zibin Wang, Yiyuan Zhang, Kaixiong Gong, Dan Xu, and Tianfan Xue. The poster carries university logos and the SenseTime (商汤) logo. Key sections of the poster include: 1. Motivation of the Bidirectional Diffusion (BiDiff), which highlights the benefits of combining 2D image and 3D shape diffusion for improved texture diversity, 3D consistency, and geometry. 2. An explanation of the **Bidirectional Diffusion (BiDiff) framework**, which synchronizes the denoising directions of the 2D and 3D diffusion processes: intermediate 3D outputs are rendered into 2D images to guide the 2D diffusion, while intermediate multi-view 2D outputs are re-projected back to assist 3D denoising. Several panels show: - The capability of BiDiff to efficiently generate diverse 3D objects. - **Final results** showcasing detailed geometry and texture for various 3D models. - Methods to control and decouple the texture and geometry of generated models. The poster layout includes diagrams, examples of generated 3D models, and technical workflows. The bottom left of the image also shows an empty soda can placed on the table in front of the poster.

Text transcribed from the image:
• 商汤 SenseTime
• BiDiff: Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors
• Lihe Ding*, Shaocong Dong*, Zhanpeng Huang, Zibin Wang, Yiyuan Zhang, Kaixiong Gong, Dan Xu, Tianfan Xue
• Motivation of the Bidirectional Diffusion (BiDiff):
  - 2D image diffusion has better texture and is more diverse.
  - 3D shape diffusion has better 3D consistency and geometry.
  - Leverage priors from pre-trained foundation 2D & 3D models.
  - Bidirectional guidance to synchronize 2D & 3D denoise direction.
• What can BiDiff do?
  - Prompts: "A bear dancing kick dance" / "A bear dancing ballet" (random seed 27)
  - Final results with detailed geometry and texture
  - [Diagram: Noise → 2D diffusion model → 2D Denoising; Bidirectional Guidance; Noise → 3D diffusion model → 3D Denoising; intermediate steps labeled step=960, step=920, step=800, step=500, step=200, step=0]
  - "A bear dressed as a lumberjack": feedforward results (40s), refined results (20min), textured mesh
  - (i) Efficiently generate diverse 3D objects in 40 seconds. (ii) Users choose the favorite one for further refinement.
  - "A car made out of pizza"
• Bidirectional Diffusion (BiDiff) framework:
  - (a) We render the 3D diffusion's intermediate outputs into 2D images, which then guide the denoising of the 2D diffusion model. Simultaneously, the intermediate multi-view outputs from the 2D diffusion are re-projected to assist 3D denoising. (b) BiDiff sampling outcomes serve as a strong starting point for optimization methods.
  - [Diagram: step t+1 → step t → step t−1; 3D Pipeline: 3D Noise, Noisy Input, 3D Foundation Model, 3D Denoising, 2D→3D Control, 3D SDF Volume, Feature Volume, Volume Rendering (t−1 step); 2D Pipeline: 2D Noise, Volume Encoding, 3D→2D Control, Noisy Input, 2D Denoising, Multi-view images (t−1 step)]
  - Example prompts: "A silver platter piled high with fruits." "A llama wearing a suit." "An orangutan playing accordion with its hands spread wide." "A pig wearing a backpack" "An eagle head." "A Van Gogh style cabin" "A GUNDAM robot" "A Nike sport shoe"
• Decouple Texture and Geometry Control:
  - Paired prompts: "A golden skull." / "A crystal skull"; "An ancient Chinese …" / "An ancient Gothic tower."; "A strong muscular …"; "A blue and white porcelain teapot" / "A blue and white porcelain burger."; "A blue and red Superman clothes" / "A blue and red Superman clothes style car"; "A house in Van Gogh's starry sky style" / "A teapot with Van Gogh's starry sky style painting on it"
  - [Additional prompt variations only partially legible: Red / Metal / Golden; Dog; Astronaut]
• Pipeline: 2D Diffusion Sampling (40s) and 3D Diffusion Sampling (40s) share a 3D feature volume, followed by Refinement (20min).
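The bidirectional guidance described in the framework section, where each diffusion branch's intermediate output steers the other at every denoising step, can be illustrated with a minimal toy sketch. This is not the authors' implementation: `denoise_2d`, `denoise_3d`, `render_to_2d`, `reproject_to_3d`, and the blending weight `guidance` are all hypothetical stand-ins for BiDiff's pre-trained foundation models, volume rendering, and multi-view re-projection.

```python
import numpy as np

# Toy stand-ins for the components named on the poster. In BiDiff these
# are pre-trained 2D/3D foundation diffusion models plus differentiable
# volume rendering and multi-view re-projection; here they are trivial
# placeholders so the control flow of the sampling loop is visible.

def denoise_2d(x2d, t):
    """One denoising step of the 2D image diffusion (toy: shrink noise)."""
    return x2d * 0.9

def denoise_3d(x3d, t):
    """One denoising step of the 3D shape diffusion (toy: shrink noise)."""
    return x3d * 0.9

def render_to_2d(x3d):
    """Render the intermediate 3D volume into a 2D view (toy: mean over depth)."""
    return x3d.mean(axis=0)

def reproject_to_3d(x2d, depth):
    """Re-project 2D features back into a 3D volume (toy: broadcast along depth)."""
    return np.broadcast_to(x2d, (depth,) + x2d.shape).copy()

def bidiff_sample(steps=50, guidance=0.3, size=8, depth=8, seed=27):
    rng = np.random.default_rng(seed)
    x2d = rng.standard_normal((size, size))         # 2D noise
    x3d = rng.standard_normal((depth, size, size))  # 3D noise
    for t in range(steps, 0, -1):
        # Each pre-trained model proposes its own denoising direction.
        x2d = denoise_2d(x2d, t)
        x3d = denoise_3d(x3d, t)
        # Bidirectional guidance: blend each branch toward the other's
        # intermediate output so the two processes stay synchronized.
        x2d = (1 - guidance) * x2d + guidance * render_to_2d(x3d)
        x3d = (1 - guidance) * x3d + guidance * reproject_to_3d(x2d, depth)
    return x2d, x3d
```

The point of the sketch is the coupling inside the loop: neither branch denoises in isolation, which is how BiDiff keeps 2D texture quality and 3D geometric consistency aligned; the fast feed-forward sample can then seed a longer optimization-based refinement.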