This image shows an academic poster presentation titled "The Audio-Visual Conversational Graph: From an Egocentric Perspective," a collaboration between the Georgia Institute of Technology, Meta Reality Labs Research, and the University of Illinois Urbana-Champaign. The poster describes a study on understanding concurrent conversations by modeling both egocentric and exocentric conversational behaviors from audio-visual data. The work aims to facilitate effective and efficient communication by analyzing the social states of participants and identifying speaking and listening interactions. Methodology, results, and diagrams of the models are highlighted, showing how conversational attention and multi-channel audio signals are integrated in the approach. The poster appears to be part of a larger academic conference, with attendees interacting and discussing the presented findings.

Text transcribed from the image:

The Audio-Visual Conversational Graph: From an Egocentric Perspective
Wenqi Jia¹,², Miao Liu, Hao Jiang², Ishwarya Ananthabhotla², James M. Rehg
¹Georgia Institute of Technology, ²Meta Reality Labs Research, ³University of Illinois Urbana-Champaign

Motivation
➤ Concurrent conversations are common in life
➤ Could be noisy and ambiguous
➤ Capturing social states of participants helps decide which sound source to enhance for whom
➤ Facilitate effective and efficient communication

Ego-Exo Conversational Graph
➤ Camera wearer as observer (Ego): humans can understand both Egocentric and Exocentric conversational behaviors
➤ Ego-Exocentric Conversational Graph Prediction: the first to explore Exocentric conversational interactions from Egocentric videos
➤ Jointly modeling talking and listening behaviors
➤ Jointly modeling Egocentric and Exocentric behaviors

Method
➤ Input: video V (cropped heads → Image Encoder), multi-channel audio signal A (→ Audio Encoder), head positions
➤ Conversational Attention over concatenated single-head and pairwise-head features from the image and audio encoders
➤ Egocentric + Exocentric = Ego-Exo
➤ Number of single heads: N; number of pairwise heads: C(N,2)
➤ *AV-CONV can generalize to different numbers of N
➤ Cross-Time Self-Attn; Global-Local Self-Attn

Ego-Exo Directional Edge
➤ For each pair of nodes (c, pᵢ) or (pᵢ, pⱼ), we aim to determine:
■ If they are Speaking To (S) each other
■ If they are Listening To (L) each other
➤ Results in four attributes:
■ For each Egocentric Edge annot(c, pᵢ): Is c Speaking to pᵢ; Is c Listening to pᵢ; Is pᵢ Speaking to c; Is pᵢ Listening to c
■ For each Exocentric Edge annot(pᵢ, pⱼ): Is pᵢ Speaking to pⱼ; Is pᵢ Listening to pⱼ; Is pⱼ Speaking to pᵢ; Is pⱼ Listening to pᵢ
(c: camera wearer; pᵢ, pⱼ: subjects i and j)
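To make the graph structure on the poster concrete, the following is a minimal sketch (not the authors' code; all function and variable names here are hypothetical) of how the pairwise-head count C(N,2) and the four directed speaking/listening attributes per edge could be enumerated:

```python
import math
from itertools import combinations

def num_pairwise_heads(n: int) -> int:
    """Pairwise conversational-attention heads for n visible subjects: C(n, 2)."""
    return math.comb(n, 2)

def edge_attributes(camera_wearer: str, subjects: list[str]) -> dict:
    """Enumerate the edges of the ego-exo conversational graph.

    Each unordered pair of nodes carries four binary attributes to predict:
    speaking-to and listening-to, in both directions. Edges touching the
    camera wearer c are egocentric; edges between two subjects are exocentric.
    """
    nodes = [camera_wearer] + subjects
    edges = {}
    for a, b in combinations(nodes, 2):
        kind = "egocentric" if camera_wearer in (a, b) else "exocentric"
        edges[(a, b)] = {
            "type": kind,
            f"{a}_speaking_to_{b}": None,  # binary label, filled by the model
            f"{a}_listening_to_{b}": None,
            f"{b}_speaking_to_{a}": None,
            f"{b}_listening_to_{a}": None,
        }
    return edges

# With the camera wearer c and two subjects p1, p2, the graph has
# two egocentric edges (c,p1), (c,p2) and one exocentric edge (p1,p2).
graph = edge_attributes("c", ["p1", "p2"])
```

Note that the edge set grows as C(N,2) in the number of subjects, which is why the poster emphasizes that AV-CONV generalizes across different values of N.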