Caption: A detailed research poster titled "CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model," presented by authors Jianhao Zeng, Dan Song, Weizhi Nie, Hongshuo Tian, Tongtong Wang, and Anan Liu from Tianjin University and Tencent LightSpeed Studio at CVPR 2024. The poster examines challenges and solutions in using diffusion models for virtual try-on, specifically focusing on acceleration and control. The top-left section highlights the motivation behind the research, stating the challenges of preserving texture and patterns in garment images using diffusion models. The methods section explains the utilized approaches, such as Garment-Conditioned Diffusion Model and Truncation-Based Acceleration Strategy, including specific techniques like ControlNet, DINO-V2, Poisson Blending, GC-DM, and TDPM. A visual comparison of diffusion-based virtual try-on models shows multiple outputs such as VITON-HD, HR-VTON, GP-VTON, and variations of CAT-DM, demonstrating differences in image quality and performance. Below the method descriptions, statistical results of comparative and ablation experiments on the VITON-HD dataset are displayed in tabular form, illustrating metrics like FID, KID, SSIM, and LPIPS. Graphs in the lower-right section detail the effectiveness of the Truncation-Based Acceleration Strategy, showing that CAT-DM can significantly reduce the required sampling steps and refine try-on results. The Conclusion summarizes that CAT-DM enhances controllability and acceleration in diffusion models for virtual try-on tasks, demonstrating its superiority in image quality and computational efficiency. The poster is visually supported by logos of Tianjin University, Tencent, and the hosting conference, with QR codes for additional resources.
Text transcribed from the image: Tianjin University. Tencent 腾讯. Motivation: Challenges faced by using diffusion models for virtual try-on.
Figure labels: Person, Garment Image. Diffusion-based Virtual Try-on Models: PBE, MGD, LaDI-VTON, DCI-VTON.
Accelerated: Diffusion models necessitate a significant number of sampling steps to generate a single try-on image.
Controllable: Diffusion models struggle to preserve the texture and patterns of the input garment images.
CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model
Jianhao Zeng¹, Dan Song¹*, Weizhi Nie¹, Hongshuo Tian¹, Tongtong Wang², Anan Liu¹*
¹Tianjin University ²Tencent LightSpeed Studio
CVPR, Seattle, WA, June 17-21, 2024
Method
Garment-Conditioned Diffusion Model: ControlNet; DINO-V2; Poisson Blending.
Truncation-Based Acceleration Strategy: Pretrained GAN-based model; GC-DM; TDPM.
Method diagram labels: PBE Model, ControlNet, DINO-V2, SD Encoder Block, FC Layer, Loss, Add Noise, Pre-trained GAN-based model, GC-DM.
Experiments
Comparative Experiments on VITON-HD
Columns: Method, FID↓, KID↓, FID↓, KID↓, SSIM↑, LPIPS↓
Table values as transcribed: VITON-HD 14.64 6.10 12.81 5.52 0.848 0.1216 HR-VTON 12.15 3.42 9.92 3.06 0.1038 0.860 2.01 2.23 7.71 0.857 0.0897 GP-VTON 10.49 15.77 6.22 PBE 14.32 5.44 0.763 0.2253 0.827 0.1280 11.12 3.38 MGD 13.44 3.93 9.44 3.90 0.861 0.0968 4.75 LaDI-VTON 12.33 2.93 8.19 0.875 0.0816 DCI-VTON 11.14 3.35 0.862 0.0988 9.67 1.36 1.12 7.11 GC-DM 0.0803 5.60 0.877 0.83 1.37 8.93 CAT-DM
Discussion of Truncation-Based Acceleration Strategy
Plots: FID and KID versus number of sampling steps (2, 4, 8, 16, 32). Legend: CAT-DM with VITON-HD, CAT-DM with HR-VITON, CAT-DM with GP-VTON, VITON-HD, HR-VITON, GP-VTON.
CAT-DM can significantly reduce the number of required sampling steps.
CAT-DM can refine and adjust the try-on results generated by pre-trained GAN-based methods.
Ablation Experiments on VITON-HD
Figure labels: Person Image, Masked Image, Garment Image, GP-VTON, GC-DM, CAT-DM.
Figure labels: Inputs, GP-VTON, PBE, MGD, LaDI-VTON, DCI-VTON, GC-DM (Ours), CAT-DM (Ours).
Table header as transcribed: Extractor, Poisson Blending, FID, KID, FID, KID, SSIM, LPIPS, Process
Table values as transcribed: 0.1069 0.835 8.25 2.09 Direct Generation 10.76 2.53 0.854 0.1033 2.42 8.18 10.57 2.59 0.862 0.0988 DINO-V2 Concatenation 1.12 7.11 1.36 0.1111 0.853 Poisson Blending 9.67 1.38 7.90 1.77 0.847 Poisson Blending 10.21 CLIP IP-Adapter Poisson Blending 11.23 SeeCoder 3.90 8.13 2.86 0.1127 9.94 1.66 7.13 1.58 0.856 0.1049
Figure labels: Person Image, Masked Image, Garment Image, GC-DM with DINO-V2, GC-DM with CLIP, GC-DM with IP-Adapter, GC-DM with SeeCoder.
Conclusion
To enhance the controllability of diffusion models in virtual try-on tasks and accelerate the sampling speed of these models, we introduce the CAT-DM. It combines a specially designed try-on model, GC-DM, with a pre-trained GAN model, utilizing an innovative truncation-based acceleration strategy. A substantial number of experiments demonstrate the superiority of our method in terms of image quality, controllability, and sampling speed.
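The truncation-based acceleration strategy named in the conclusion can be illustrated with a minimal sketch. The poster text does not give the sampler or the noise schedule, so everything below is an assumption for illustration: a linear beta schedule, a deterministic DDIM-style update, and a hypothetical `predict_eps(x, t)` callable standing in for the GC-DM denoiser. The idea shown is only the general one: noise the output of a pre-trained GAN-based model up to an intermediate timestep `t_trunc`, then denoise from there in a few steps instead of starting from pure noise.

```python
import numpy as np


def make_alpha_bar(num_train_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal level alpha_bar[t] for t = 0..num_train_steps.

    The linear schedule and its endpoints are assumed values, not taken
    from the poster. alpha_bar[0] is 1, meaning "no noise".
    """
    betas = np.linspace(beta_start, beta_end, num_train_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])


def add_noise(x0, t, alpha_bar, rng):
    """Forward diffusion: jump from a clean image x0 directly to noise level t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def truncated_sample(x_gan, predict_eps, alpha_bar, t_trunc, num_steps, rng):
    """Noise the GAN output to level t_trunc, then denoise in num_steps steps."""
    if not 0 < num_steps <= t_trunc < len(alpha_bar):
        raise ValueError("need 0 < num_steps <= t_trunc < len(alpha_bar)")
    # Evenly spaced timesteps from t_trunc down to 0, e.g. 400, 300, 200, 100, 0.
    timesteps = np.linspace(t_trunc, 0, num_steps + 1).round().astype(int)
    x = add_noise(x_gan, timesteps[0], alpha_bar, rng)
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        eps = predict_eps(x, t)  # hypothetical stand-in for the GC-DM denoiser
        # Clean-image estimate implied by the predicted noise.
        x0_pred = (x - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
        # Deterministic (DDIM-style) move to the next, lower noise level.
        x = np.sqrt(alpha_bar[t_prev]) * x0_pred + np.sqrt(1.0 - alpha_bar[t_prev]) * eps
    return x
```

In this sketch the denoiser is called exactly `num_steps` times, so the sampling cost is set by `num_steps` rather than by the length of the full schedule. The choice of `t_trunc` trades off how much of the GAN result is kept against how much the diffusion model is free to change.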