PAPER DIGEST
Most Influential SIGIR 2025 Paper · 2026-03 edition

COHESION: Composite Graph Convolutional Network with Dual-Stage Fusion for Multimodal Recommendation

Jinfeng Xu, Zheyu Chen, Wei Wang, Xiping Hu, Sang-Wook Kim, Edith C. H. Ngai

Venue
ACM SIGIR Conference (SIGIR) 2025
Recognition
Most Influential SIGIR 2025 Paper (Rank No. 11)
Edition
2026-03
Impact factor
3
Certificate ID
37c2e309c2ed3ace

Abstract

Recent works in multimodal recommendations, which leverage diverse modal information to address data sparsity and enhance recommendation accuracy, have garnered considerable interest. Two key processes in multimodal recommendations are modality fusion and representation learning. Previous approaches in modality fusion often employ simplistic attentive or pre-defined strategies at early or late stages, failing to effectively handle irrelevant information among modalities. In representation learning, prior research has constructed heterogeneous and homogeneous graph structures encapsulating user-item, user-user, and item-item relationships to better capture user interests and item profiles. Modality fusion and representation learning were considered as two independent processes in previous work. This paper reveals that these two processes are complementary and can support each other. Specifically, powerful representation learning enhances modality fusion, while effective fusion improves representation quality. Stemming from these two processes, we introduce a COmposite grapH convolutional nEtwork with dual-stage fuSION for the multimodal recommendation, named COHESION. Specifically, it introduces a dual-stage fusion strategy to reduce the impact of irrelevant information, refining all modalities using behavior modality in the early stage and fusing their representations at the late stage. It also proposes a composite graph convolutional network that utilizes user-item, user-user, and item-item graphs to extract heterogeneous and homogeneous latent relationships within users and items. Besides, it introduces a novel adaptive optimization to ensure balanced and reasonable representations across modalities. Extensive experiments on three public datasets demonstrate the significant superiority of COHESION over various competitive baselines.

Download PDF certificate