Author(s): Liu, CL (Liu, Changlin); Sun, LJ (Sun, Linjun); Ning, X (Ning, Xin); Xu, J (Xu, Jian); Yu, LA (Yu, Lina); Zhang, KJ (Zhang, Kaijie); Li, WJ (Li, Weijun)
Source: NEURAL NETWORKS Volume: 178 Article Number: 106394
DOI: 10.1016/j.neunet.2024.106394 Published Date: 2024 OCT
Abstract: Stereo matching cost constrains the consistency between pixel pairs. However, the consistency constraint becomes unreliable in ill-posed regions such as occluded or ambiguous regions of the images, making it difficult to explore hidden correspondences. To address this challenge, we introduce an Error-area Feature Refinement Mechanism (EFR) that supplies context features for ill-posed regions. In EFR, we innovatively obtain the suspected error region according to aggregation perturbations, then a simple Transformer module is designed to synthesize global context and correspondence relation with the identified error mask. To better overcome existing texture overfitting, we put forward a Dual-constraint Cost Volume (DCV) that integrates supplementary constraints. This effectively improves the robustness and diversity of disparity clues, resulting in enhanced details and structural accuracy. Finally, we propose a highly accurate stereo matching network called Error-rectify Feature Guided Stereo Matching Network (ERCNet), which is based on DCV and EFR. We evaluate our model on several benchmark datasets, achieving state-of-the-art performance and demonstrating excellent generalization across datasets. The code is available at https://github.com/dean7liu/ERCNet_2023.