In the image, a group of people is gathered in a room, seated on the floor in front of a large screen mounted on the wall. The screen shows a presentation slide with a purple background that contains a data table. The people are focused on the screen, likely following the presentation and studying the information shown. The scene suggests a professional talk or workshop, possibly on data processing and analysis, judging by the content of the slide.

Text transcribed from the image:

Stable Video Diffusion
Scaling latent video diffusion models to large datasets

Data Processing and Annotation

Table 1. Comparison of our dataset before and after filtering with publicly available research datasets.

                     LVD     LVD-F   LVD-10M  LVD-10M-F  WebVid  InternVid
#Clips               577M    152M    9.8M     2.3M       10.7M   234M
Clip Duration (s)    11.58   10.53   12.11    10.99      18.0    11.7
Total Duration (y)   212.09  50.64   3.76     0.78       5.94    86.80
Mean #Frames         325     301     335      320        -       -
Mean Clips/Video     11.09   4.76    1.2      1.1        1.0     32.96
Motion Annotations?  ✓       ✓       ✓        ✓          ✗       ✗

Blattmann et al., "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets," 2023.

MAXIMUM OCCUPANCY 430-912

Copyright Mike Shou, NUS 85
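
The Total Duration row of the transcribed table can be sanity-checked against the other two numeric rows. The sketch below is not taken from the slide or the paper. It assumes that total duration is approximately the number of clips times the mean clip duration, and that a year is 365 days. The names `TABLE_1`, `estimated_years`, and `relative_error` are hypothetical helpers introduced only for this check.

```python
# Rough consistency check on the transcribed Table 1 values.
# Assumptions (not stated on the slide): total duration is approximately
# (#clips) x (mean clip duration), and one year is 365 days.

SECONDS_PER_YEAR = 365 * 24 * 3600

# name: (number of clips, mean clip duration in seconds, reported total in years)
TABLE_1 = {
    "LVD":       (577e6, 11.58, 212.09),
    "LVD-F":     (152e6, 10.53, 50.64),
    "LVD-10M":   (9.8e6, 12.11, 3.76),
    "LVD-10M-F": (2.3e6, 10.99, 0.78),
    "WebVid":    (10.7e6, 18.0, 5.94),
    "InternVid": (234e6, 11.7, 86.80),
}


def estimated_years(num_clips, mean_clip_seconds):
    """Estimate total footage in years from clip count and mean clip length."""
    return num_clips * mean_clip_seconds / SECONDS_PER_YEAR


def relative_error(estimate, reported):
    """Relative gap between the estimate and the value reported in the table."""
    return abs(estimate - reported) / reported


if __name__ == "__main__":
    for name, (clips, seconds, reported) in TABLE_1.items():
        est = estimated_years(clips, seconds)
        print(f"{name:<10} estimated {est:8.2f} y   reported {reported:8.2f} y   "
              f"rel. error {relative_error(est, reported):.1%}")
```

Under these assumptions the estimates land within a few percent of the reported totals. A small gap is expected because the clip counts on the slide are rounded (for example, 577M rather than an exact count).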