Abstract: The key challenge of cross-modal salient object detection lies in the representational discrepancy between different modal inputs. Existing methods typically employ only one encoding mode, ...