The image showcases a detailed research poster presented at the Conference on Computer Vision and Pattern Recognition (CVPR), held June 17-21, 2024. The poster, number 168, bears the visible title "...Modal Lifelong Navigation," apparently truncated from a longer title such as "Multi-Modal Lifelong Navigation." It involves multiple contributors, with some author names only partly visible. The research focuses on using simulator metadata and large language models (LLMs) to generate language goals for navigation tasks. Key findings are illustrated through graphs and charts that emphasize how the efficiency of different navigation methods improves over time and how sensitive they are to noise in goal observations. The results show that skill chaining (SenseAct-NN Skill Chain) achieves state-of-the-art (SOTA) success rate, while the modular method Modular GOAT achieves SOTA on SPL (Success weighted by Path Length). Process diagrams and graphs visualize the pipeline for achieving navigation goals, including object goal localization and language goal generation. A small QR code in the top right corner links to further details, perhaps a digital copy of the poster or supplementary materials. A neighboring poster and the surrounding conference environment provide context, indicating an academic setting with other attendees visible in the background.

Text transcribed from the image (chart axis values and illegible fragments omitted):

"...Modal Lifelong Navigation." Authors (partly legible): Theophile Gervet, Matthew Chang, Zsolt Kira, Roozbeh Mottaghi, and others.

"...simulator metadata and large language models (LLM) to generate language goals."

Language goal generation pipeline: selecting best viewpoint (max frame coverage) and bbox info ("bed," "pillow") → BLIP-2 + prompt → "a large bed with a floral comforter" → "a large bed with a floral comforter and a pillow in the middle."

Goal matching: OBJ GOAL / LANG GOAL / IMG GOAL → CLIP feature matching and SuperGlue keypoint matching → ACTION.

Results: "Skill chaining achieves SOTA on success rate. Modular GOAT achieves SOTA on SPL." Methods compared in the bar charts: Modular GOAT, Modular CLIP on Wheels, SenseAct-NN Skill Chain, SenseAct-NN Monolithic. "Efficiency of SenseAct-NN and Modular method improves over time."

"Looking for full-time roles!" CVPR, June 17-21, 2024. Affiliation logos: Georgia Tech, Simon Fraser University, Stanford, Meta AI, Habitat.
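The "BLIP-2 + prompt" captioning step in the transcribed pipeline is easy to prototype. Below is a minimal sketch, assuming the Hugging Face transformers checkpoint Salesforce/blip2-opt-2.7b and a CUDA GPU; the prompt wording and the describe_goal_object helper are illustrative assumptions, not taken from the poster.

```python
# Minimal sketch of a BLIP-2 caption step for language-goal generation.
# The checkpoint, prompt, and helper name are assumptions for illustration.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to("cuda")

def describe_goal_object(image_path, category):
    """Caption the best-viewpoint crop of a goal object instance."""
    image = Image.open(image_path).convert("RGB")
    # Conditioning the caption on the object category steers it toward the
    # goal instance, as in the poster's "bed" -> "a large bed with a floral
    # comforter" example.
    prompt = f"Question: describe the {category} in the image. Answer:"
    inputs = processor(images=image, text=prompt, return_tensors="pt")
    inputs = inputs.to("cuda", torch.float16)
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.batch_decode(out, skip_special_tokens=True)[0].strip()

print(describe_goal_object("bed_viewpoint.png", "bed"))
```

A second prompting pass over the generated caption (e.g., asking an LLM to fold in nearby objects such as the pillow) would produce the longer goal description shown on the poster.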
What is HSSD? "A dataset of 211 realistic 3D scenes consisting of recreations of houses modeled using a diverse set of 18,556 unique, high-quality 3D models of real objects across 466 semantic categories" (211 scenes, 466 categories).

Line charts (performance vs. number of sub-tasks: 2, 3, 4, 5-10, for Modular GOAT and SenseAct-NN Monolithic): "Efficiency of navigation improves for both modular and end-to-end trained methods."

Bar charts (with memory vs. without memory, for Modular GOAT and SenseAct-NN Monolithic): "End-to-end methods do not show drop in performance when long-term memory is disabled."

Bar charts (object goal, language goal, image goal; without noise vs. with noise; Modular GOAT, SenseAct-NN Skill Chain, SenseAct-NN Monolithic): "Modular methods are more sensitive to noise in goal observations." "End-to-end trained methods are robust to noise in goal specification."

Modular GOAT architecture diagram: instance map, goal localization, dynamic instance mapping, local policy (FMM planner). "Builds explicit map of the environment in combination with path planning for navigation."

Semantic annotations: category hierarchy and category word cloud (e.g., "wall art").
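To make the "local policy (FMM planner)" component concrete, here is a minimal sketch of fast-marching-method path planning on a toy 2D occupancy grid, assuming the scikit-fmm package; the grid, goal, and greedy_step helper are illustrative assumptions, not the actual Modular GOAT implementation.

```python
# Sketch: plan over an explicit 2D occupancy map with the fast marching
# method (pip install scikit-fmm). Toy stand-in for an FMM local policy.
import numpy as np
import skfmm

def fmm_distance_field(occupancy, goal):
    """Geodesic distance-to-goal over free space; obstacles are masked out."""
    phi = np.ones_like(occupancy, dtype=float)
    phi[goal] = -1.0  # zero level set sits around the goal cell
    phi = np.ma.MaskedArray(phi, mask=occupancy.astype(bool))
    return skfmm.distance(phi, dx=1.0).filled(np.inf)

def greedy_step(dist, pos):
    """Move to the 8-neighbour with the smallest distance-to-goal."""
    r, c = pos
    candidates = [
        (r + dr, c + dc)
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
        and 0 <= r + dr < dist.shape[0]
        and 0 <= c + dc < dist.shape[1]
    ]
    return min(candidates, key=lambda p: dist[p])

# Toy 10x10 map with a wall; the agent walks downhill on the distance field.
grid = np.zeros((10, 10), dtype=int)
grid[4, 1:8] = 1  # obstacle row with gaps at both ends
goal = (9, 9)
dist = fmm_distance_field(grid, goal)
pos, path = (0, 0), [(0, 0)]
for _ in range(200):  # safety bound on path length
    if pos == goal:
        break
    pos = greedy_step(dist, pos)
    path.append(pos)
print(path)
```

Following the negative gradient of the FMM distance field gives shortest-path behaviour around obstacles, which is why modular map-based methods pair it with an explicit instance map, as the poster's architecture diagram shows.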