
DB-Fusion: Dual-Balanced Sensor Fusion for MultiModal 3D Object Detection

Seungjoon Lee*, Taeyeong Kim*, Useok Choi*, Seungjae Kim, Chanse Oh, MyeongAh Cho
* Co-first authors
Under review, 2026

Framework Overview

[Figure: Overview of the DB-Fusion framework]

Abstract

Fusing LiDAR and camera data in Bird's-Eye-View (BEV) space has become a standard paradigm for 3D object detection in autonomous driving, but it still suffers from two fundamental challenges: (1) Modality-Occupancy Imbalance (MOI) arises when dense image features overwhelm sparse LiDAR points, while (2) Semantic-Occupancy Imbalance (SOI) occurs when vast background regions dilute the signals of foreground objects. Existing fusion frameworks lack a unified mechanism to balance both modalities and semantics, resulting in persistent feature imbalance. We propose DB-Fusion, a Dual-Balanced multimodal fusion framework that jointly mitigates both imbalances through a carefully designed pipeline. To address MOI, we first introduce Modality-Balanced Feature Sampling that aligns cross-modal signal density by augmenting LiDAR-guided image sampling with depth-aware pseudo points, and then apply Geometric-based Feature Fusion that adaptively fuses the balanced features using geometry-aware reliability weights. To tackle SOI, Semantic-Balanced Feature Enhancing partitions the fused BEV map into semantic scene classes and performs class-consistent context and instance-scene interactions to suppress background dominance. Extensive experiments on the nuScenes benchmark demonstrate that our DB-Fusion establishes a new state-of-the-art in mAP, surpassing existing multimodal fusion methods.
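To make the modality-balancing idea concrete, here is a minimal, hypothetical PyTorch sketch of geometry-aware, reliability-weighted BEV fusion. It is not the paper's implementation: the module name, the use of per-cell LiDAR point density as the geometric reliability cue, and all tensor shapes are illustrative assumptions.

```python
# Hypothetical sketch (not DB-Fusion's actual code): fuse camera and LiDAR BEV
# features with a per-cell reliability weight derived from LiDAR point density,
# so sparse LiDAR cells lean more on the dense camera stream.
import torch
import torch.nn as nn


class GeometryWeightedFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Small head mapping per-cell point density to a fusion weight in (0, 1).
        self.weight_head = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Projection back to the original channel width after concatenation.
        self.out_proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, lidar_bev, camera_bev, point_density):
        # lidar_bev, camera_bev: (B, C, H, W); point_density: (B, 1, H, W)
        w = self.weight_head(point_density)  # per-cell reliability of LiDAR
        fused = torch.cat([w * lidar_bev, (1.0 - w) * camera_bev], dim=1)
        return self.out_proj(fused)


if __name__ == "__main__":
    fusion = GeometryWeightedFusion(channels=64)
    lidar = torch.randn(2, 64, 128, 128)
    camera = torch.randn(2, 64, 128, 128)
    density = torch.rand(2, 1, 128, 128)  # normalized LiDAR point counts per cell
    print(fusion(lidar, camera, density).shape)  # torch.Size([2, 64, 128, 128])
```

In this sketch the fusion weight is driven only by LiDAR occupancy, so cells with few points defer to the camera features; this mirrors, at a toy level, the MOI balancing described in the abstract.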

Keywords

3D Object Detection · LiDAR-Camera Fusion

Citation