CVPR 2026 Poster Presentation - Detailed Analysis & Overview
In this video, we introduce a novel video object detection framework called D2FANet. D2FANet is the first framework to jointly ...
Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement.
[CVPR 2026 poster] Towards Robust Vision Transformers
Hyun Lee, Hyemin Jeong, Yejin Kim, Hyungwook Choi, Hyunsoo Cho, Soo Kyung Kim, Joonseok Lee. A More Word-like Image ...
Title: Scene-Centric Unsupervised Video Panoptic Segmentation Authors: Christoph Reich*, Oliver Hahn*, Nikita Araslanov, ...
Title: MU-GeNeRF: Multi-view Uncertainty-guided Generalizable Neural Radiance Fields for Distractor-aware Scene ...
DPL: Decoupled Prototype Learning for Enhancing Robustness of Vision–Language Transformers to Missing Modalities
[CVPR 2026 Highlight] Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering Authors/Affiliations: [Sangwoon ...
Adaptive Spatial-Temporal Window: Unlocking the Potential of Event Cameras in Heterogeneous Velocity Scenarios Zhipeng Sui, ...
Omni-Attribute encodes a high-fidelity, attribute-specific image