This image shows a detailed academic poster presented at CVPR (Conference on Computer Vision and Pattern Recognition) 2024 in Seattle, WA. The title of the research is "Motion Deblurring with Unknown Modal Spatiotemporal Alignment," authored by Jiyuan Zhang, Shiyan Chen, Yajing Zheng, Zhaofei Yu, and Tiejun Huang of Peking University. The poster presents a three-stage image deblurring approach that augments a blurry RGB frame with data from a spike camera. Diagrams illustrate the three stages of the pipeline, the sensing principle of the spike camera, and qualitative comparisons against prior methods, accompanied by sample images and quantitative metrics (PSNR in dB). Contact information and QR codes are provided for further details.

Text transcribed from the image (reconstructed from partially garbled OCR; some characters at the poster's left edge were cut off):

Motion Deblurring with Unknown Modal Spatiotemporal Alignment
Jiyuan Zhang 1,2, Shiyan Chen 1,2, Yajing Zheng 1,2,†, Zhaofei Yu 1,2,3,4, and Tiejun Huang 1,2,3
1 [affiliation truncated], Peking University; 2 National Key Laboratory for Multimedia Information Processing, Peking University; 3 Institute for Artificial Intelligence, Peking University
Contact: Jiyuan Zhang, jyzhang@stu.pku.edu.cn
CVPR, Seattle, WA, June 17-21, 2024

Overview. We propose a three-stage model named Spike-guided Deblur Net (UaSDN):
$\hat{L}_1 = \mathcal{F}_{M_1}(B;\, \theta_{M_1})$,
$\hat{L}_2 = \mathcal{F}_{M_2}(\hat{L}_1,\, \{S_k \mid k \in [1, 2, \ldots, K]\};\, \theta_{M_2})$,
$\hat{L}_3 = \mathcal{F}_{M_3}(\hat{L}_2,\, \{S_k \mid k \in [1, 2, \ldots, K]\};\, \theta_{M_3})$,
where $B$ is the blurry RGB frame and $\{S_k\}$ are $K$ randomly sampled spike segments. Each stage is supervised by its own loss ($L_1$, $L_2$, $L_3$). (A code sketch of this composition follows the transcription.)

Stage 1: Basic Image Deblurring. An image-based Deblur Net maps the blurry frame $B$ to a first estimate $\hat{L}_1$.

Stage 2: Bi-Deformable Aligning Enhancement. The Stage-2 Net refines $\hat{L}_1$ using the spike segments through BIDMA (Bi-Directional Deformable Modal Align) blocks. At each feature level $l$, offsets are predicted from the concatenated spike and image features and applied via deformable convolution in both directions:
$\Delta P_S^l = \mathcal{F}_A([F_S^l, F_L^l])$, $\quad F^l = \mathcal{F}_e\big(\mathcal{F}_{DCN}(F_L^l, \Delta P_S^l),\, \mathrm{Up}(F^{l+1})\big)$, fused as $\mathcal{F}_e(F_L, F_S)$;
$\Delta P_{S2I}^l = \mathcal{F}_A([F_S^l, F_L^l])$, $\quad F^l = \mathcal{F}_e\big(\mathcal{F}_{DCN}(F_S^l, \Delta P_{S2I}^l),\, \mathrm{Up}(F^{l+1})\big)$, fused as $\mathcal{F}_e(F_S, F_L)$.

Stage 3: Flow-guided Dense Fusion. A coarse light-intensity estimation network converts each spike segment into an image-like estimate,
$\hat{L}_{S_k} = \mathcal{F}_{M_{S2I}}(S_k;\, \theta_{M_{S2I}}), \quad k \in [1, 2, \ldots, K]$.
A Flow Net then estimates the motion between each estimate and $\hat{L}_2$ and warps accordingly:
$V_{S_k \to \hat{L}_2} = \mathcal{F}_{flow}(\hat{L}_{S_k}, \hat{L}_2)$, $\quad \hat{L}'_{S_k} = \mathcal{G}(\hat{L}_{S_k},\, V_{S_k \to \hat{L}_2})$,
where $\mathcal{G}$ denotes warping. The warped estimates $\{\hat{L}'_{S_k} \mid k \in [1, 2, \ldots, K]\}$ are concatenated with $\hat{L}_2$ into a tiled image and fed to a U-shaped network containing an encoder and decoder; after processing in the Attention-based Dense Blocks (ADB, built from Channel Attention and Residual Dense Blocks with a Transform layer), the final predicted deblurred images are output.

Principles of Spike Camera. Continuous photons induce an electric current $I_{x,y}(t)$ that each pixel integrates into a voltage; the pixel fires a spike and resets once the voltage reaches a threshold $\theta$:
$V_{x,y}(t) = \begin{cases} V_{x,y}^{-}(t) + I_{x,y}(t), & \text{if } V_{x,y}^{-}(t) < \theta, \\ 0, & \text{otherwise,} \end{cases}$
$S_{x,y}(c) = \begin{cases} 1, & \text{if a spike fires at some } t \in ((c-1)T,\, cT], \\ 0, & \text{otherwise,} \end{cases}$
where $V_{x,y}^{-}(t)$ and $V_{x,y}(t)$ denote the voltage before and after receiving the electric current $I_{x,y}(t)$, and $c$ indexes the readout. The spike stream is output with size $H \times W \times C$ after $C$ readouts with period $T$ μs. (A simulation sketch of this model follows the transcription.)

[Figure: spike camera diagram showing continuous photons $I(t)$, the per-pixel voltage $V$, a binary spike stream readout, and the $K$ spike segments $\{S_k\}$ randomly sampled from the stream.]

[Figure: qualitative comparison panels for Blur Image, HINet, NAFNet, EFNet, REFID, SpkDeblurNet, UaSDN (Ours), and Ground Truth, each crop annotated with its PSNR (values ranging from 23.15 dB to 36.18 dB).]
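To make the spike camera's integrate-and-fire readout concrete, here is a minimal Python/NumPy simulation sketch. It is not the authors' code: the function name simulate_spike_stream, the parameters steps_per_readout and theta, and the threshold value are hypothetical choices for illustration; only the accumulate, fire-and-reset, and per-interval readout behavior follows the equations transcribed above.

```python
# Minimal simulation sketch (not the authors' code) of the spike camera's
# integrate-and-fire model: each pixel accumulates photocurrent I_{x,y}(t)
# into a voltage, fires a spike and resets to 0 once the voltage reaches the
# threshold theta, and firing is read out every T microseconds into an
# H x W x C binary spike stream S_{x,y}(c).
import numpy as np

def simulate_spike_stream(intensity, num_readouts, steps_per_readout, theta=1.0):
    # intensity: (num_readouts * steps_per_readout, H, W) photocurrent per step.
    # theta: firing threshold (hypothetical value; on real hardware it is fixed).
    t_total, h, w = intensity.shape
    assert t_total == num_readouts * steps_per_readout
    voltage = np.zeros((h, w))
    spikes = np.zeros((h, w, num_readouts), dtype=np.uint8)
    for c in range(num_readouts):
        fired = np.zeros((h, w), dtype=bool)
        for s in range(steps_per_readout):
            voltage += intensity[c * steps_per_readout + s]
            over = voltage >= theta          # threshold reached -> spike
            fired |= over
            voltage[over] = 0.0              # reset after firing
        spikes[:, :, c] = fired              # S_{x,y}(c)=1 iff a spike fired in ((c-1)T, cT]
    return spikes

# Usage: brighter pixels accumulate faster and therefore fire more often.
rng = np.random.default_rng(0)
light = rng.uniform(0.0, 0.3, size=(400, 32, 32))
s = simulate_spike_stream(light, num_readouts=40, steps_per_readout=10)
print(s.shape, s.mean())   # (32, 32, 40); mean firing rate grows with intensity
```

Slicing such a stream along its $C$ axis yields the kind of spike segments $\{S_k\}$ that the poster's pipeline samples as input.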
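The three-stage composition $\hat{L}_1 \to \hat{L}_2 \to \hat{L}_3$ can likewise be sketched as code. The PyTorch sketch below uses tiny convolutional stand-ins, so the class and parameter names (UaSDNSketch, TinyStage, k_segments) are hypothetical and the real Stage-2/Stage-3 networks (BIDMA blocks, Flow Net, ADBs) are far richer; only the data flow, the blurry frame into Stage 1 and the K spike segments concatenated into Stages 2 and 3 with supervision at every stage, follows the poster.

```python
# Hedged sketch (hypothetical names, not the released UaSDN code) of the
# three-stage data flow:
#   L1 = F_M1(B; theta_M1)            Stage 1: basic image deblurring
#   L2 = F_M2(L1, {S_k}; theta_M2)    Stage 2: BIDMA enhancement
#   L3 = F_M3(L2, {S_k}; theta_M3)    Stage 3: flow-guided dense fusion
import torch
import torch.nn as nn

class TinyStage(nn.Module):
    """Stand-in for a real stage network: two convs mapping the
    concatenated inputs back to a 3-channel image."""
    def __init__(self, in_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, *tensors):
        return self.body(torch.cat(tensors, dim=1))

class UaSDNSketch(nn.Module):
    def __init__(self, k_segments=5):
        super().__init__()
        self.stage1 = TinyStage(3)               # F_M1: blurry image only
        self.stage2 = TinyStage(3 + k_segments)  # F_M2: L1 plus K spike segments
        self.stage3 = TinyStage(3 + k_segments)  # F_M3: L2 plus K spike segments

    def forward(self, blur, spikes):
        # blur: (N, 3, H, W); spikes: (N, K, H, W) binary spike segments {S_k}
        l1 = self.stage1(blur)
        l2 = self.stage2(l1, spikes)
        l3 = self.stage3(l2, spikes)
        return l1, l2, l3  # each output is supervised by its own loss (L1, L2, L3)

net = UaSDNSketch()
b = torch.rand(1, 3, 64, 64)
s = (torch.rand(1, 5, 64, 64) > 0.7).float()
print([o.shape for o in net(b, s)])  # three (1, 3, 64, 64) estimates
```

Supervising all three outputs, rather than only the last, is what lets each stage (basic deblurring, modal alignment, flow-guided fusion) be trained toward its own intermediate target.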