Paper Digest: IJCAI 2025 Papers & Highlights
Note: IJCAI-2025 accepted more than 1,300 papers; this page includes only 500 of them, selected by our daily paper digest algorithm. Interested users can read all 1,300 IJCAI-2025 papers on a separate page, which takes some time to load.
To search for papers presented at IJCAI-2025 on a specific topic, use the search by venue (IJCAI-2025) service. To summarize the latest research published at IJCAI-2025 on a specific topic, use the review by venue (IJCAI-2025) service. To browse papers by author, see our comprehensive list of ~5,200 authors (IJCAI-2025). You may also want to explore our “Best Paper” Digest (IJCAI), which lists the most influential IJCAI papers since 2003.
As a pioneer in the field since 2018, Paper Digest has curated thousands of such lists, drawing on years of accumulated data across decades of conferences and research topics. To ensure users never miss a breakthrough, our daily digest service sifts through tens of thousands of new papers, clinical trials, news articles, and community posts every day, delivering only what matters most to their specific interests. Beyond discovery, Paper Digest offers built-in research tools to help users read articles, write articles, get answers, conduct literature reviews, and generate research reports more efficiently.
Paper Digest Team
New York City, New York, 10017
TABLE 1: Paper Digest: IJCAI 2025 Papers & Highlights
| # | Paper | Author(s) |
|---|---|---|
| 1 | Toward Reliable Scientific Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models. Highlight: Additionally, the hallucination problem in LLMs can lead to the generation of hypotheses that appear plausible but are ultimately incorrect, undermining their reliability. To facilitate the systematic study of these challenges, we introduce TruthHypo, a benchmark for assessing the capabilities of LLMs in generating truthful scientific hypotheses, and KnowHD, a knowledge-based hallucination detector to evaluate how well hypotheses are grounded in existing knowledge. | Guangzhi Xiong; Eric Xie; Corey Williams; Myles Kim; Amir Hassan Shariatmadari; Sikun Guo; Stefan Bekiranov; Aidong Zhang |
| 2 | Exploiting Position Information in Convolutional Kernels for Structural Re-parameterization. Highlight: In this paper, we demonstrate that different kernel positions are of different importance, which depends on the task, dataset and architecture, and adaptively emphasizing the informative parts in convolutional kernels can lead to considerable improvement. | Tianxiang Hao; Hui Chen; Guiguang Ding |
| 3 | Underground Diagnosis in 3D GPR Data By Learning in CuCoRes Model Space. Highlight: This paper proposes subsurface anomaly detection within the Cubic Correlation Reservoir Network (CuCoRes) model space. | Xiren Zhou; Shikang Liu; Xinyu Yan; Xiangyu Wang; Huanhuan Chen |
| 4 | Fault Diagnosis in REDNet Model Space. Highlight: In response, this paper introduces FD in the Reservoir-Embedded-Directional Network (REDNet) model space. | Xiren Zhou; Ziyu Tang; Shikang Liu; Ao Chen; Xiangyu Wang; Huanhuan Chen |
| 5 | Quantifying The Self-Interest Level of Markov Social Dilemmas. Highlight: This paper introduces a novel method for estimating the self-interest level of Markov social dilemmas. We extend the concept of self-interest level from normal-form games to Markov games, providing a quantitative measure of the minimum reward exchange required to align individual and collective interests. We demonstrate our method on three environments from the Melting Pot suite, representing either common-pool resources or public goods. Our results illustrate how reward exchange can enable agents to transition from selfish to collective equilibria in a Markov social dilemma. This work contributes to multi-agent reinforcement learning by providing a practical tool for analysing complex, multistep social dilemmas. Our findings offer insights into how reward structures can promote or hinder cooperation, with potential applications in areas such as mechanism design. | Richard Willis; Yali Du; Joel Z. Leibo; Michael Luck |
| 6 | Diffuse&Refine: Intrinsic Knowledge Generation and Aggregation for Incremental Object Detection. Highlight: However, representation confusion between old and new classes leads to catastrophic forgetting. To alleviate this problem, we propose DiffKA, with intrinsic knowledge generated and aggregated by forward and backward diffusion, gradually establishing rigid class boundaries. | Jianzhou Wang; Yirui Wu; Lixin Yuan; Wenxiao Zhang; Jun Liu; Junyang Chen; Huan Wang; Wenhai Wang |
| 7 | LensNet: An End-to-End Learning Framework for Empirical Point Spread Function Modeling and Lensless Imaging Reconstruction. Highlight: In this paper, we propose LensNet, an end-to-end deep learning framework that integrates spatial-domain and frequency-domain representations in a unified pipeline. | Jiesong Bai; Yuhao Yin; Yihang Dong; Xiaofeng Zhang; Chi-Man Pun; Xuhang Chen |
| 8 | CrossVTON: Mimicking The Logic Reasoning on Cross-Category Virtual Try-On Guided By Tri-Zone Priors. Highlight: To endow the model with robust reasoning capabilities for cross-category scenarios, we propose an iterative data constructor. | Donghao Luo; Yujie Liang; Xu Peng; Xiaobin Hu; Boyuan Jiang; Chengming Xu; Taisong Jin; Chengjie Wang; Yanwei Fu |
| 9 | UltraModel: A Modeling Paradigm for Industrial Objects. Highlight: Second, they fail to fully consider latent relationships within industrial data, limiting the model’s ability to leverage the data and resulting in suboptimal performance. To address these issues, we propose a novel modeling paradigm tailored for MIO tasks, named UltraModel. | Haoran Yang; Yinan Zhang; Qunshan He; Yuqi Ye; Jing Zhao; Wenhai Wang |
| 10 | Balancing Imbalance: Data-Scarce Urban Flow Prediction Via Spatio-Temporal Balanced Transfer Learning. Highlight: Despite cross-city transfer learning emerging as a common strategy to address this issue, it overlooks the inherent distribution imbalances within each city, which could potentially hinder the generalization capabilities of pre-trained models. To overcome this limitation, we propose a Spatio-Temporal Balanced Transfer Learning (STBaT) framework to enhance existing spatio-temporal prediction networks, ensuring both universality and precision in predictions for new urban environments. | Xinyan Hao; Huaiyu Wan; Shengnan Guo; Youfang Lin |
| 11 | NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms. Highlight: We introduce NotaGen, a symbolic music generation model aiming to explore the potential of producing high-quality classical sheet music. | Yashan Wang; Shangda Wu; Jianhuai Hu; Xingjian Du; Yueqi Peng; Yongxin Huang; Shuai Fan; Xiaobing Li; Feng Yu; Maosong Sun |
| 12 | A Unifying Perspective on Model Reuse: From Small to Large Pre-Trained Models. Highlight: Well-designed reuse strategies enable models to be adapted beyond their original scope, enhancing both performance and efficiency in target machine learning systems. | Da-Wei Zhou; Han-Jia Ye |
| 13 | SpectralGap: Graph-Level Out-of-Distribution Detection Via Laplacian Eigenvalue Gaps. Highlight: In this paper, we observe a significant difference in the relationship between the largest and second-largest eigenvalues of the Laplacian matrix for in-distribution (ID) and OOD graph samples: OOD samples often exhibit anomalous spectral gaps (the difference between the largest and second-largest eigenvalues). | Jiawei Gu; Ziyue Qiao; Zechao Li |
| 14 | NeuBM: Mitigating Model Bias in Graph Neural Networks Through Neutral Input Calibration. Highlight: We introduce NeuBM (Neutral Bias Mitigation), a novel approach to mitigate model bias in GNNs through neutral input calibration. | Jiawei Gu; Ziyue Qiao; Xiao Luo |
| 15 | Human-Centric Foundation Models: Perception, Generation and Agentic Modeling. Highlight: Recently, Human-centric Foundation Models (HcFMs), inspired by the success of generalist models such as large language and vision models, have emerged to unify diverse human-centric tasks into a single framework, surpassing traditional task-specific approaches. In this survey, we present a comprehensive overview of HcFMs by proposing a taxonomy that categorizes current approaches into four groups: (1) Human-centric Perception Foundation Models that capture fine-grained features for multi-modal 2D and 3D understanding; (2) Human-centric AIGC Foundation Models that generate high-fidelity, diverse human-related content; (3) Unified Perception and Generation Models that integrate these capabilities to enhance both human understanding and synthesis; and (4) Human-centric Agentic Foundation Models that extend beyond perception and generation to learn human-like intelligence and interactive behaviors for humanoid embodied tasks. | Shixiang Tang; Yizhou Wang; Lu Chen; Yuan Wang; Sida Peng; Dan Xu; Wanli Ouyang |
| 16 | Explain It As Simple As Possible, But No Simpler – Explanation Via Model Simplification for Addressing Inferential Gap (Abstract Reprint). Highlight: In this paper, we introduce a general framework for generating explanations in the presence of inferential capability gaps. | Sarath Sreedharan; Siddharth Srivastava; Subbarao Kambhampati |
| 17 | Deep Learning for Multivariate Time Series Imputation: A Survey. Highlight: In this survey, we provide a comprehensive summary of deep learning approaches for multivariate time series imputation (MTSI) tasks. | Jun Wang; Wenjie Du; Yiyuan Yang; Linglong Qian; Wei Cao; Keli Zhang; Wenjia Wang; Yuxuan Liang; Qingsong Wen |
| 18 | Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction To Generation and Beyond. Highlight: In this survey, we provide a unified review of SpectraML, systematically examining state-of-the-art approaches for both forward tasks (molecule-to-spectrum prediction) and inverse tasks (spectrum-to-molecule inference). | Kehan Guo; Yili Shen; Gisela Abigail Gonzalez-Montiel; Yue Huang; Yujun Zhou; Mihir Surve; Zhichun Guo; Payel Das; Nitesh V. Chawla; Olaf Wiest; Xiangliang Zhang |
| 19 | Enhancing The Logical Reasoning Abilities of Large Language Models. Highlight: Large language models (LLMs) have demonstrated impressive progress in various natural language processing tasks. | Fengxiang Cheng |
| 20 | Point Cloud Mixture-of-Domain-Experts Model for 3D Self-supervised Learning. Highlight: In this paper, we propose to learn a comprehensive Point cloud Mixture-of-Domain-Experts model (Point-MoDE) via a block-to-scene pre-training strategy. | Yaohua Zha; Tao Dai; Hang Guo; Yanzi Wang; Bin Chen; Ke Chen; Shu-Tao Xia |
| 21 | Efficient Differentiable Approximation of Generalized Low-rank Regularization. Highlight: Furthermore, convergence analysis is presented, which rigorously shows that both the bias and the variance of our rank estimator rapidly reduce with increased sample size and iteration steps. In the experimental study, the proposed method is applied to various tasks, which demonstrates its versatility and efficiency. | Naiqi Li; Yuqiu Xie; Peiyuan Liu; Tao Dai; Yong Jiang; Shu-Tao Xia |
| 22 | Unsupervised Feature Transformation Via In-context Generation, Generator-critic LLM Agents, and Duet-play Teaming. Highlight: However, existing methods fall short in efficient navigation of a vast space of feature combinations, and are mostly designed for supervised settings. To fill this gap, our unique perspective is to leverage a generator-critic duet-play teaming framework using LLM agents and in-context learning to derive pseudo-supervision from unsupervised data. | Nanxu Gong; Xinyuan Wang; Wangyang Ying; Haoyue Bai; Sixun Dong; Haifeng Chen; Yanjie Fu |
| 23 | Zero-shot Generalist Graph Anomaly Detection with Unified Neighborhood Prompts. Highlight: This largely limits their applicability in real-world scenarios. To overcome this limitation, we propose a novel zero-shot generalist GAD approach UNPrompt that trains a one-for-all detection model, requiring the training of one GAD model on a single graph dataset and then effectively generalizing to detect anomalies in other graph datasets without any retraining or fine-tuning. | Chaoxi Niu; Hezhe Qiao; Changlu Chen; Ling Chen; Guansong Pang |
| 24 | HARMONY: A Privacy-preserving and Sensor-agnostic Tele-monitoring System. Highlight: In this work, we posit that the motion dynamics of human movement are invariant across sensing modalities, inspiring the design of HARMONY, a privacy-preserving, sensor-agnostic system that supports multi-modal inputs and diverse tele-monitoring tasks. | Qipeng Xie; Hao Guo; Weizheng Wang; Yongzhi Huang; Linshan Jiang; Jiafei Wu; Shuxin Zhong; Lu Wang; Kaishun Wu |
| 25 | Empowering LLMs with Logical Reasoning: A Comprehensive Survey. Highlight: To avoid logical contradictions, we discuss concepts and solutions of various logical consistencies, including implication, negation, transitivity, factuality consistencies, and their composites. | Fengxiang Cheng; Haoxuan Li; Fenrong Liu; Robert van Rooij; Kun Zhang; Zhouchen Lin |
| 26 | Efficient Dynamic Graphs Learning with Refined Batch Parallel Training. Highlight: Despite these efforts, challenges persist, including imprecise coarse-grained memory loss measurement and ineffective compensation modules. To address these challenges, we propose the Refined Batch parallel Training (RBT) framework, which accurately evaluates intra-batch information loss and optimizes batch partitioning to minimize loss, enhancing the training process’s effectiveness and efficiency. | Zhengzhao Feng; Rui Wang; Longjiao Zhang; Tongya Zheng; Ziqi Huang; Mingli Song |
| 27 | A Case for Validation Buffer in Pessimistic Actor-Critic. Highlight: In this paper, we investigate the issue of error accumulation in critic networks updated via pessimistic temporal difference objectives. | Michał Nauman; Mateusz Ostaszewski; Marek Cygan |
| 28 | A Survey of Pathology Foundation Model: Progress and Future Directions. Highlight: In this survey, we present a hierarchical taxonomy organizing PFMs through a top-down philosophy applicable to foundation model analysis in any domain: model scope, model pretraining, and model design. | Conghao Xiong; Hao Chen; Joseph J. Y. Sung |
| 29 | An Ethical Dataset from Real-World Interactions Between Users and Large Language Models. Highlight: To investigate the difference, we create Eagle datasets extracted from actual interactions between ChatGPT and users that exhibit social biases, opinion biases, toxicity, and immoral problems. | Masahiro Kaneko; Danushka Bollegala; Timothy Baldwin |
| 30 | ShortcutProbe: Probing Prediction Shortcuts for Learning Robust Models. Highlight: In this paper, we propose a novel post hoc spurious bias mitigation framework without requiring group labels. | Guangtao Zheng; Wenqian Ye; Aidong Zhang |
| 31 | Modality-Fair Preference Optimization for Trustworthy MLLM Alignment. Highlight: Motivated by our findings, we propose Modality-Fair Preference Optimization (MFPO), which comprises three components: the construction of a multimodal preference dataset in which dispreferred images differ from originals solely in key regions; an image reward loss function encouraging the model to generate answers better aligned with the input images; and an easy-to-hard iterative alignment strategy to stabilize joint modality training. | Songtao Jiang; Yan Zhang; Ruizhe Chen; Tianxiang Hu; Yeying Jin; Qinglin He; Yang Feng; Jian Wu; Zuozhu Liu |
| 32 | Language-Based Bayesian Optimization Research Assistant (BORA). Highlight: Here, we propose the use of Large Language Models (LLMs) for contextualizing Bayesian optimization (BO) via a hybrid optimization framework that intelligently and economically blends stochastic inference with domain knowledge-based insights from the LLM, which is used to suggest new, better-performing areas of the search space for exploration. | Abdoulatif Cissé; Xenophon Evangelopoulos; Vladimir V. Gusev; Andrew I. Cooper |
| 33 | Neuron Similarity-Based Neural Network Verification Via Abstraction and Refinement. Highlight: However, the problem of verifying DNNs has high computational complexity, and existing techniques have limited efficiency, insufficient to deal with large-scale network models. To address this challenge, we propose a novel abstraction-refinement verification method that reduces network size while maintaining verification accuracy. | Yuehao Liu; Yansong Dong; Liang Zhao; Wensheng Wang; Cong Tian |
| 34 | Improving Consistency Identification in Task-oriented Dialogue Through Multi-Agent Collaboration. Highlight: While these models achieve remarkable progress, they still rely on large amounts of labeled data, which is hard to obtain in real-world scenarios. Motivated by this, in this paper, we aim to explore large language models for CI-ToD, which do not require any training data. | Peng Wang; Shuo Li; Ruoxi Zhou; Qiguang Chen; Xiao Xu; Hao Fei; Dagang Li; Wanxiang Che; Libo Qin |
| 35 | FancyVideo: Towards Dynamic and Consistent Video Generation Via Cross-frame Textual Guidance. Highlight: Thus, the model’s capacity to comprehend the temporal logic conveyed in prompts and generate videos with coherent motion is restricted. To tackle this limitation, we introduce FancyVideo, an innovative video generator that improves the existing text-control mechanism with the well-designed Cross-frame Textual Guidance Module (CTGM). | Jiasong Feng; Ao Ma; Jing Wang; Ke Cao; Zhanjie Zhang |
| 36 | HSRMamba: Contextual Spatial-Spectral State Space Model for Single Hyperspectral Image Super-Resolution. Highlight: However, in HSISR, Mamba faces challenges as transforming images into 1D sequences neglects the spatial-spectral structural relationships between locally adjacent pixels, and its performance is highly sensitive to input order, which affects the restoration of both spatial and spectral details. In this paper, we propose HSRMamba, a contextual spatial-spectral modeling state space model for HSISR, to address these issues both locally and globally. | Shi Chen; Lefei Zhang; Liangpei Zhang |
| 37 | GSDNet: Revisiting Incomplete Multimodality-Diffusion Emotion Recognition from The Perspective of Graph Spectrum. Highlight: This is because the model assumes a direct relationship between each pair of nodes and ignores local structural features and sparse connections between nodes, thereby significantly reducing the quality of the generated data. Based on the above ideas, we propose a novel Graph Spectral Diffusion Network (GSDNet), which utilizes a low-rank score-based diffusion model to map Gaussian noise to the graph spectral distribution space of missing modalities and recover the missing data according to its original distribution. | Yuntao Shou; Jun Yao; Tao Meng; Wei Ai; Cen Chen; Keqin Li |
| 38 | Model-Based Closed-Loop Control Algorithm for Stochastic Partial Differential Equation Control. Highlight: On the other hand, this stochasticity also renders control more unstable and thus less accurate. To address this gap, we propose the Model-Based Closed-Loop Control Algorithm (MB-CC), the first model-based closed-loop control method for SPDEs. | Peiyan Hu; Haodong Feng; Yue Wang; Zhiming Ma |
| 39 | Reward Models in Deep Reinforcement Learning: A Survey. Highlight: In this survey, we provide a comprehensive review of reward modeling techniques within the RL literature. | Rui Yu; Shenghua Wan; Yucen Wang; Chen-Xiao Gao; Le Gan; Zongzhang Zhang; De-Chuan Zhan |
| 40 | Streaming Multi-agent Pathfinding. Highlight: However, this setup is unsuitable in the assembly line scenario, which is periodic with long working hours. To address this issue, the study formalizes the streaming MAPF (S-MAPF) problem, which assumes that the agents in the same agent stream have a periodic start time and share the same action sequence. | Mingkai Tang; Lu Gan; Kaichen Zhang |
| 41 | ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection. Highlight: However, existing EEG-based AAD methods overlook the spatio-temporal dependencies of EEG signals, limiting their decoding and generalization abilities. To address these issues, this paper proposes a Lightweight Spatio-Temporal Enhancement Nested Network (ListenNet) for AAD. | Cunhang Fan; Xiaoke Yang; Hongyu Zhang; Ying Chen; Lu Li; Jian Zhou; Zhao Lv |
| 42 | M3ANet: Multi-scale and Multi-Modal Alignment Network for Brain-Assisted Target Speaker Extraction. Highlight: In addition, the speech encoder in current models typically uses basic temporal operations (e.g., one-dimensional convolution), which are unable to effectively extract target speaker information. To address these issues, this paper proposes a multi-scale and multi-modal alignment network (M3ANet) for brain-assisted TSE. | Cunhang Fan; Ying Chen; Jian Zhou; Zexu Pan; Jingjing Zhang; Youdian Gao; Xiaoke Yang; Zhengqi Wen; Zhao Lv |
| 43 | GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving. Highlight: In this paper, we propose the Interaction Scene Graph (ISG) as a unified method to model the interactions among the ego-vehicle, road agents, and map elements. | Yunpeng Zhang; Deheng Qian; Ding Li; Yifeng Pan; Yong Chen; Zhenbao Liang; Zhiyao Zhang; Yingzong Liu; Jianhui Mei; Maolei Fu; Yun Ye; Zhujin Liang; Yi Shan; Dalong Du |
| 44 | Free Lunch of Image-mask Alignment for Anomaly Image Generation and Segmentation. Highlight: This paper aims at generating anomalous images and their segmentation labels to address the lack of real-world anomaly samples and privacy issues. | Xiangyue Li; Xiaoyang Wang; Zhibin Wan; Quan Zhang; Yupei Wu; Tao Deng; Mingjie Sun |
| 45 | How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM. Highlight: In this survey, we present a comprehensive review of methods integrating LLMs with 3D spatial understanding. | Jirong Zha; Yuxuan Fan; Xiao Yang; Chen Gao; Xinlei Chen |
| 46 | Large Language Models for Causal Discovery: Current Landscape and Future Directions. Highlight: We systematically analyze approaches that leverage LLMs for CD tasks, highlighting their innovative use of metadata and natural language for causal inference. | Guangya Wan; Yunsheng Lu; Yuqi Wu; Mengxuan Hu; Sheng Li |
| 47 | FairGNN-WOD: Fair Graph Learning Without Complete Demographics. Highlight: To this end, this paper proposes FairGNN-WOD, a first-of-its-kind framework that considers mitigating unfairness in graph learning without using demographic information. | Zichong Wang; Fang Liu; Shimei Pan; Jun Liu; Fahad Saeed; Meikang Qiu; Wenbin Zhang |
| 48 | Towards Fairness with Limited Demographics Via Disentangled Learning. Highlight: To this end, this paper tackles the pervasive yet overlooked challenge of developing fair machine learning algorithms with limited demographics. | Zichong Wang; Anqi Wu; Nuno Moniz; Shu Hu; Bart Knijnenburg; Xingquan Zhu; Wenbin Zhang |
| 49 | Seeing The Unseen: Composing Outliers for Compositional Zero-Shot Learning. Highlight: However, identifying images with unseen compositions is non-trivial, considering that unseen compositions are absent in training and usually contain only subtle differences from seen compositions. In this paper, we propose a novel compositional zero-shot learning method called COMO, which composes outliers in training for distinguishing seen and unseen compositions and further applying specific strategies for them. | Chenchen Jing; Mingyu Liu; Hao Chen; Yuling Xi; Xingyuan Bu; Dong Gong; Chunhua Shen |
| 50 | Toward Interpretable Time Series Modeling: A Kernel Representation Perspective. Highlight: This work introduces a kernel representation learning (KRL) perspective, rethinking time series modeling through kernel-induced self-representation to effectively capture temporal structures and dynamic transitions. | Kunpeng Xu |
| 51 | Always Clear Depth: Robust Monocular Depth Estimation Under Adverse Weather. Highlight: While existing methods perform well under normal scenarios, their performance declines in adverse weather, due to challenging domain shifts and difficulties in extracting scene information. To address this issue, we present a robust monocular depth estimation method called ACDepth from the perspective of high-quality training data generation and domain adaptation. | Kui Jiang; Jing Cao; Zhaocheng Yu; Junjun Jiang; Jingchun Zhou |
| 52 | MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models. Highlight: Thus, we introduce component-controllable personalization, a new task that enables users to customize and reconfigure individual components within concepts. | Donghao Zhou; Jiancheng Huang; Jinbin Bai; Jiaze Wang; Hao Chen; Guangyong Chen; Xiaowei Hu; Pheng-Ann Heng |
| 53 | A Comprehensive Survey on Physical Risk Control in The Era of Foundation Model-enabled Robotics. Highlight: However, unlike general foundation models, FMRs interact with the physical world, where their actions directly affect the safety of humans and surrounding objects, requiring careful deployment and control. Based on this proposition, our survey comprehensively summarizes robot control approaches to mitigate physical risks by covering the entire lifespan of FMRs, from the pre-deployment to the post-accident stage. | Takeshi Kojima; Yaonan Zhu; Yusuke Iwasawa; Toshinori Kitamura; Gang Yan; Shu Morikuni; Ryosuke Takanami; Alfredo Solano; Tatsuya Matsushima; Akiko Murakami; Yutaka Matsuo |
| 54 | Global Information Compensation Network for Image Denoising. Highlight: Although using fully connected layers or increasing network depth can supplement global information, this results in a significant increase in parameters and computational cost. To address these issues, we propose a global information compensation network (GICN) for image denoising in this paper. | Shifei Ding; Qidong Wang; Lili Guo |
| 55 | Multimodal Prior Learning with Double Constraint Alignment for Snapshot Spectral Compressive Imaging. Highlight: Recognizing that textual descriptions contain rich semantic information that can significantly enhance details, this paper introduces a novel framework, CAMM, which integrates text information into the model to improve performance. | Mingjin Zhang; Longyi Li; Fei Gao; Qiming Zhang; Jie Guo |
| 56 | FreqLLM: Frequency-Aware Large Language Models for Time Series Forecasting. Highlight: We found that embedding frequency-domain signals smooths weight distributions and enhances structured correlations by clearly separating global trends (low-frequency components) from local variations (high-frequency components). Building on these insights, we propose FreqLLM, a novel framework that integrates frequency-domain semantic alignment into LLMs to refine prompts for improved time series analysis. | Shunnan Wang; Min Gao; Zongwei Wang; Yibing Bai; Feng Jiang; Guansong Pang |
| 57 | Image Captioning Evaluation in The Age of Multimodal LLMs: Challenges and Future Perspectives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This survey provides a comprehensive overview of advancements in image captioning evaluation, analyzing the evolution, strengths, and limitations of existing metrics. We assess these metrics across multiple dimensions, including correlation with human judgment, ranking accuracy, and sensitivity to hallucinations. |
Sara Sarto; Marcella Cornia; Rita Cucchiara; |
| 58 | MMGIA: Gradient Inversion Attack Against Multimodal Federated Learning Via Intermodal Correlation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose MMGIA, an intermodal correlation-driven gradient inversion attack that systematically exploits multimodal correlation to enhance data reconstruction quality. |
Lele Zheng; Yang Cao; Leo Yu Zhang; Wei Wang; Yulong Shen; Xiaochun Cao; |
| 59 | A Survey of Optimization Modeling Meets LLMs: Progress and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey presents a comprehensive and timely review of recent advancements that cover the entire technical stack, including data synthesis and fine-tuning for the base model, inference frameworks, benchmark datasets, and performance evaluation. |
Ziyang Xiao; Jingrong Xie; Lilin Xu; Shisi Guan; Jingyan Zhu; Xiongwei Han; Xiaojin Fu; WingYin Yu; Han Wu; Wei Shi; Qingcan Kang; Jiahui Duan; Tao Zhong; Mingxuan Yuan; Jia Zeng; Yuan Wang; Gang Chen; Dongxiang Zhang; |
| 60 | Causality-Inspired Disentanglement for Fair Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing approaches often rely on adversarial learning to mitigate dependencies between sensitive attributes and labels but face challenges due to optimisation difficulties. A key limitation lies in neglecting intrinsic causality, which may lead to the entanglement of sensitive and causal factors, discarding causal factors or retaining sensitive factors in the final prediction, especially on unbalanced datasets. To address this issue, we propose a Causality-inspired Disentangled framework for Fair Graph neural networks (CDFG). |
Guixian Zhang; Debo Cheng; Guan Yuan; Shang Liu; Yanmei Zhang; |
| 61 | SAP: Privacy-Preserving Fine-Tuning on Language Models with Split-and-Privatize Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we identify the data privacy leakage risks in MaaS-based PEFT and propose a Split-and-Privatize (SAP) framework, mitigating the privacy leakage by integrating split learning and differential privacy into MaaS PEFT. |
Xicong Shen; Yang Liu; Yi Liu; Peiran Wang; Huiqi Liu; Jue Hong; Bing Duan; Zirui Huang; Yunlong Mao; Ye Wu; Sheng Zhong; |
| 62 | ASCENT-ViT: Attention-based Scale-aware Concept Learning Framework for Enhanced Alignment in Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Most current research focuses on designing model-agnostic, plug-and-play generic concept-based explainability modules that do not incorporate the inner workings of foundation models (e.g., inductive biases, scale invariance, etc.) during training. To alleviate this issue for ViTs, in this paper, we propose ASCENT-ViT, an attention-based, concept learning framework that effectively composes scale and position-aware representations from multiscale feature pyramids and ViT patch representations, respectively. |
Sanchit Sinha; Guangzhi Xiong; Aidong Zhang; |
| 63 | Out-of-Distribution Detection By Regaining Lost Clues (Abstract Reprint) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, we propose a novel transformer chain (TC), which comprises a sequence of dependent transformers that iteratively regain discarded labeling information and integrate all the labeling information to enhance OOD detection. |
Zhilin Zhao; Longbing Cao; Philip S. Yu; |
| 64 | ExVideo: Extending Video Diffusion Models Via Parameter-Efficient Post-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel post-tuning methodology for video synthesis models, called ExVideo. |
Zhongjie Duan; Hong Zhang; Wenmeng Zhou; Cen Chen; Yaliang Li; Yu Zhang; Yingda Chen; |
| 65 | TEST-V: TEst-time Support-set Tuning for Zero-shot Video Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we draw on each other’s strengths and propose a novel framework, namely TEst-time Support-set Tuning for zero-shot Video Classification (TEST-V). |
Rui Yan; Jin Wang; Hongyu Qu; Xiaoyu Du; Dong Zhang; Jinhui Tang; Tieniu Tan; |
| 66 | VideoHumanMIB: Unlocking Appearance Decoupling for Video Human Motion In-betweening Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose VideoHumanMIB, a novel framework for Video Human Motion In-betweening that enables seamless transitions between different motion video clips, facilitating the generation of longer and more natural digital human videos. |
Haiwei Xue; Zhensong Zhang; Minglei Li; Zonghong Dai; Fei Yu; Fei Ma; Zhiyong Wu; |
| 67 | CADS: A Systematic Literature Review on The Challenges of Abstractive Dialogue Summarization (Abstract Reprint) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article summarizes the research on Transformer-based abstractive summarization for English dialogues by systematically reviewing 1262 unique research papers published between 2019 and 2024, relying on the Semantic Scholar and DBLP databases. We cover the main challenges present in dialog summarization (i.e., language, structure, comprehension, speaker, salience, and factuality) and link them to corresponding techniques such as graph-based approaches, additional training tasks, and planning strategies, which typically overly rely on BART-based encoder-decoder models. |
Frederic Kirstein; Jan Philip Wahle; Bela Gipp; Terry Ruas; |
| 68 | DUQ: Dual Uncertainty Quantification for Text-Video Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Dual Uncertainty Quantification (DUQ) model that separately handles uncertainties in intra-pair interaction and inter-pair exclusion. |
Xin Liu; Shibai Yin; Jun Wang; Jiaxin Zhu; Xingyang Wang; Yee-Hong Yang; |
| 69 | A Multi-Granularity Clustering Approach for Federated Backdoor Defense with The Adam Optimizer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing defense strategies often overlook the presence of non-stationary objectives and noisy gradients across multiple clients, making it challenging to accurately and efficiently identify malicious participants. To address these challenges, we propose a backdoor defense method for Federated Learning with Adam optimizer and multi-granularity Clustering (FLAC), incorporating both coarse-grained and fine-grained clustering mechanisms to neutralize backdoor attacks. |
Jidong Yuan; Qihang Zhang; Naiyue Chen; Shengbo Chen; Baomin Xu; |
| 70 | Dual-Perspective United Transformer for Object Segmentation in Optical Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For that, we propose a novel Dual-Perspective United Transformer (DPU-Former) with a unique structure designed to simultaneously integrate long-range dependencies and spatial details. |
Yanguang Sun; Jiexi Yan; Jianjun Qian; Chunyan Xu; Jian Yang; Lei Luo; |
| 71 | Do Mentioned Items Truly Matter? Enhancing Conversational Recommender Systems with Causal Intervention and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, existing systems often overlook the impact of confounding variables during user interactions, leading to suboptimal recommendations. In this work, we propose a novel hybrid framework that integrates large language models (LLMs) with traditional recommendation techniques to address these limitations. |
Lingzhi Wang; Xingshan Zeng; Kam-Fai Wong; |
| 72 | INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, when VLMs struggle to generalise to some image instances, predicting instance-specific prompts becomes poor. To solve this problem, we introduce Instance-specific Negative Mining for Task-Generic Promptable Segmentation (INT). |
Jian Hu; Zixu Cheng; Shaogang Gong; |
| 73 | Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a denoise-then-retrieve paradigm that explicitly filters text-irrelevant clips from videos and then retrieves the target moment using purified multimodal representations. |
Weijia Liu; Jiuxin Cao; Bo Miao; Zhiheng Fu; Xuelin Zhu; Jiawei Ge; Bo Liu; Mehwish Nasim; Ajmal Mian; |
| 74 | Federated Low-Rank Adaptation for Foundation Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Effectively leveraging private datasets remains a significant challenge in developing foundation models. |
Yiyuan Yang; Guodong Long; Qinghua Lu; Liming Zhu; Jing Jiang; Chengqi Zhang; |
| 75 | Directing Mamba to Complex Textures: An Efficient Texture-Aware State Space Model for Image Restoration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods often struggle to effectively model long-range dependencies and largely overlook the spatial characteristics of image degradation (regions with richer textures tend to suffer more severe damage), making it hard to achieve the best trade-off between restoration quality and efficiency. To address these issues, we propose a novel texture-aware image restoration method, TAMambaIR, which simultaneously perceives image textures and achieves a trade-off between performance and efficiency. |
Long Peng; Xin Di; Zhanfeng Feng; Wenbo Li; Renjing Pei; Yang Wang; Xueyang Fu; Yang Cao; Zheng-Jun Zha; |
| 76 | Deconfounding Multi-Cause Latent Confounders: A Factor-Model Approach to Climate Model Bias Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel bias correction approach to utilize both GCM and observational data to learn a factor model that captures multi-cause latent confounders. |
Wentao Gao; Jiuyong Li; Debo Cheng; Lin Liu; Jixue Liu; Thuc Le; Xiaojing Du; Xiongren Chen; Yun Chen; Yanchang Zhao; |
| 77 | FCKT: Fine-Grained Cross-Task Knowledge Transfer with Semantic Contrastive Learning for Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we address the task of targeted sentiment analysis, which involves two sub-tasks, i.e., identifying specific aspects from reviews and determining their corresponding sentiments. |
Wei Chen; Zhao Zhang; Meng Yuan; Kepeng Xu; Fuzhen Zhuang; |
| 78 | Toward Robust Non-Transferable Learning: A Survey and Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While numerous methods have been proposed in this field, a comprehensive review of existing progress and a thorough analysis of current limitations remain lacking. In this paper, we bridge this gap by presenting the first comprehensive survey on NTL and introducing NTLBench, the first benchmark to evaluate NTL performance and robustness within a unified framework. |
Ziming Hong; Yongli Xiang; Tongliang Liu; |
| 79 | In-Context Meta LoRA Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, existing parameter generation methods fail to capture the correlations among these tasks, making multi-task LoRA parameter generation challenging. To address these limitations, we propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models (LLMs). |
Yihua Shao; Minxi Yan; Yang Liu; Siyu Chen; Wenjie Chen; Xinwei Long; Ziyang Yan; Lei Li; Chenyu Zhang; Nicu Sebe; Hao Tang; Yan Wang; Hao Zhao; Mengzhu Wang; Jingcai Guo; |
| 80 | Do You Steal My Model? Signature Diffusion Embedded Dual-Verification Watermarking for Protecting Intellectual Property of Hyperspectral Image Classification Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the existing model watermarking methods for RGB image classification models ignore the complexity of ground objects and high dimension of HSIs, which makes trigger samples easy to be detected and forged. To address this problem, we propose a signature diffusion embedded dual-verification watermarking method, which generates imperceptible trigger samples with explicit owner information to achieve dual verification of both model ownership and legality of trigger set. |
Yufei Yang; Song Xiao; Lixiang Li; Wenqian Dong; Jiahui Qu; |
| 81 | Physical Adversarial Camouflage Through Gradient Calibration and Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing techniques struggle with variable physical environments, facing two main challenges: 1) inconsistent sampling point densities across distances hinder the gradient optimization from ensuring local continuity, and 2) updating texture gradients from multiple angles causes conflicts, reducing optimization stability and attack effectiveness. To address these issues, we propose a novel adversarial camouflage framework based on gradient optimization. |
Jiawei Liang; Siyuan Liang; Jianjie Huang; Chenxi Si; Ming Zhang; Xiaochun Cao; |
| 82 | EyeSeg: An Uncertainty-Aware Eye Segmentation Framework for AR/VR Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce EyeSeg, a novel eye segmentation framework designed to overcome key challenges that existing approaches struggle with: motion blur, eyelid occlusion, and train-test domain gaps. |
Zhengyuan Peng; Jianqing Xu; Shen Li; Jiazhen Ji; Yuge Huang; Jingyun Zhang; Jinmin Li; Shouhong Ding; Rizen Guo; Xin Tan; Lizhuang Ma; |
| 83 | DiffECG: Diffusion Model-Powered Label-Efficient and Personalized Arrhythmia Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose DiffECG, a diffusion-based self-supervised learning framework for label-efficient and personalized arrhythmia detection. |
Tianren Zhou; Zhenge Jia; Dongxiao Yu; Zhaoyan Shen; |
| 84 | How to Teach Programming in The AI Era? Using LLMs As A Teachable Agent for Debugging (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In Computer Science education, as LLMs are widely recognized as "AI pair programmers," it becomes increasingly important to train students on evaluating and debugging LLM-generated codes. In this work, we introduce HypoCompass, a novel system to facilitate deliberate practice on debugging, where human novices play the role of Teaching Assistants and help LLM-powered teachable agents debug code. We enable effective task delegation between students and LLMs in this learning-by-teaching environment: students focus on hypothesizing the cause of code errors, while adjacent skills like code completion are offloaded to LLM-agents. |
Qianou Ma; Hua Shen; Ken Koedinger; Tongshuang Wu; |
| 85 | A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Misdiagnosis causes significant harm to healthcare systems worldwide, leading to increased costs and patient risks. MedRAG is a smart multimodal healthcare copilot equipped with … |
Xuejiao Zhao; Siyan Liu; Su-Yin Yang; Chunyan Miao; |
| 86 | Drafting and Revision: Advancing High-Fidelity Video Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by the common corrupted painting restoration process of “drawing a draft first and then revising the details later”, this paper proposes a Drafting-and-Revision Completion Network (DRCN) for video inpainting. |
Zhiliang Wu; Kun Li; Hehe Fan; Yi Yang; |
| 87 | Interpreting Pretrained Language Models Via Concept Bottlenecks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This extended abstract introduces C3M (ChatGPT-guided Concept augmentation with Concept-level Mixup), a novel framework for training Concept-Bottleneck-Enabled PLMs (CBE-PLMs). |
Zhen Tan; Lu Cheng; Song Wang; Yuan Bo; Jundong Li; Huan Liu; |
| 88 | KIPPO: Koopman-Inspired Proximal Policy Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present Koopman-Inspired Proximal Policy Optimization (KIPPO), which learns an approximately linear latent-space representation of the underlying system’s dynamics while retaining essential features for effective policy learning. |
Andrei Cozma; Landon Harris; Hairong Qi; |
| 89 | Conditional Denoising Meets Polynomial Modeling: A Flexible Decoupled Framework for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, to model complicated temporal patterns, we propose a Conditional Denoising Polynomial Modeling (CDPM) framework, where probabilistic diffusion models and deterministic linear models are trained end-to-end. |
Jintao Zhang; Mingyue Cheng; Xiaoyu Tao; Zhiding Liu; Daoyu Wang; |
| 90 | A Cross-Modal Densely Guided Knowledge Distillation Based on Modality Rebalancing Strategy for Enhanced Unimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This limits the practical application of multimodal networks. To address this issue, this paper proposes a cross-modal knowledge distillation framework for emotion recognition. |
Shuang Wu; Heng Liang; Yong Zhang; Yanlin Chen; Ziyu Jia; |
| 91 | Where Does This Data Come From? Enhanced Source Inference Attacks in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present an enhanced source inference attack that demonstrates how a malicious server can amplify behavioral differences between clients to more accurately infer data origin. |
Haiyang Chen; Xiaolong Xu; Xiang Zhu; Xiaokang Zhou; Fei Dai; Yansong Gao; Xiao Chen; Shuo Wang; Hongsheng Hu; |
| 92 | ARMR: Adaptively Responsive Network for Medication Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods often struggle to effectively balance the reuse of historical medications with the introduction of new drugs in response to the changing patient conditions. In order to address this challenge, we propose an Adaptively Responsive network for Medication Recommendation (ARMR), a new method which incorporates 1) a piecewise temporal learning component that distinguishes between recent and distant patient history, enabling more nuanced temporal understanding, and 2) an adaptively responsive mechanism that dynamically adjusts attention to new and existing drugs based on the patient’s current health state and medication history. |
Feiyue Wu; Tianxing Wu; Shenqi Jing; |
| 93 | Tensor Network: from The Perspective of AI4Science and Science4AI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Tensor networks have been a promising numerical tool for computational problems across science and AI. Given their emerging and fast development, especially in the intersection between AI and science, this paper tries to present a compact review, regarding both their applications and their own recent technical development, including open-source tools. |
Junchi Yan; Yehui Tang; Xinyu Ye; Hao Xiong; Xiaoqiu Zhong; Yuhan Wang; Yuan Qi; |
| 94 | HygMap: Representing All Types of Map Entities Via Heterogeneous Hypergraph Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel method, HygMap, to represent all map entity types. |
Yifan Yang; Jingyuan Wang; Xie Yu; Yibang Tang; |
| 95 | A Hybrid Multi-Factor Network with Dynamic Sequence Modeling for Early Warning of Intraoperative Hypotension Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a Hybrid Multi-Factor (HMF) network that formulates IOH prediction as a dynamic sequence forecasting task, explicitly capturing both temporal dependencies and physiological non-stationarity. |
Mingyue Cheng; Jintao Zhang; Zhiding Liu; Chunli Liu; |
| 96 | An Empirical Study of Federated Prompt Learning for Vision Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Vision Language Model (VLM) excels in aligning vision and language representations, and prompt learning has emerged as a key technique for adapting such models to downstream tasks. |
Zhihao Wang; Wenke Huang; Tian Chen; Zekun Shi; Guancheng Wan; Yu Qiao; Bin Yang; Jian Wang; Bing Li; Mang Ye; |
| 97 | Balancing User-Item Structure and Interaction with Large Language Models and Optimal Transport for Multimedia Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, the sparsity and imbalance of user-item interactions further hinder effective representation learning. To address these challenges, we propose a framework called BLAST, which balances structures and interactions via large language models and optimal transport for multimodal recommendation. |
Haodong Li; Lianyong Qi; Weiming Liu; Xiaolong Xu; Wanchun Dou; Yang Cao; Xuyun Zhang; Amin Beheshti; Xiaokang Zhou; |
| 98 | RAMer: Reconstruction-based Adversarial Model for Multi-party Multi-modal Multi-label Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing approaches also tend to unify heterogeneous modalities into a single representation, overlooking each modality’s unique characteristics. To address these challenges, we propose RAMer (Reconstruction-based Adversarial Model for Emotion Recognition), which refines multi-modal representations by not only exploring modality commonality and specificity but crucially by leveraging reconstructed features, enhanced by contrastive learning, to overcome data incompleteness and enrich feature quality. |
Xudong Yang; Yizhang Zhu; Hanfeng Liu; Zeyi Wen; Nan Tang; Yuyu Luo; |
| 99 | Beyond Low-rankness: Guaranteed Matrix Recovery Via Modified Nuclear Norm Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce a new modified nuclear norm (MNN) framework, where the MNN family norms are defined by adopting suitable transformations and performing the NN on the transformed matrix. |
Jiangjun Peng; Yisi Luo; Xiangyong Cao; Shuang Xu; Deyu Meng; |
| 100 | NAAST-GNN: Neighborhood Adaptive Aggregation and Spectral Tuning for Graph Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To tackle the aforementioned challenges, we propose a novel Graph Neural Network model called Neighborhood Adaptive Aggregation and Spectral Tuning (NAAST-GNN). |
Ronghui Guo; Xiaowang Zhang; Zhizhi Yu; Minghui Zou; Sai Zhang; Zhiyong Feng; |
| 101 | A First Runtime Analysis of NSGA-III on A Many-Objective Multimodal Problem: Provable Exponential Speedup Via Stochastic Population Update Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper addresses this point and conducts a rigorous runtime analysis of NSGA-III on the many-objective OneJumpZeroJump benchmark (OJZJ for short), providing runtime bounds where the number of objectives is constant. We show that NSGA-III finds the Pareto front of OJZJ in time O(n^(k+d/2) + N n ln(n)) where n is the problem size, d is the number of objectives, k is the gap size, a problem-specific parameter, if its population size N is in 2^(O(n)) and at least (2n/d+1)^(d/2). |
Andre Opris; |
| 102 | MaskDGNN: Self-Supervised Dynamic Graph Neural Networks with Activeness-aware Temporal Masking Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods extracting critical information from dynamic graphs face two key challenges, either overlooking the negative impact of redundant information or struggling in addressing the distribution shifting issue in dynamic graphs. To address these challenges, we propose MaskDGNN, a novel dynamic GNN architecture that consists of two modules: First, self-supervised activeness-aware temporal masking mechanism selectively retains edges between highly active nodes while masking those with low activeness, effectively reducing redundancy. |
Yiming He; Xiang Li; Zhongying Zhao; Haobing Liu; Peilan He; Yanwei Yu; |
| 103 | Deep Reinforcement Learning for Efficient and Fair Allocation of Healthcare Resources Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a transformer-based deep Q-network that integrates individual patient disease progression and interaction effects among patients to enhance allocation decisions. |
Yikuan Li; Chengsheng Mao; Kaixuan Huang; Hanyin Wang; Zheng Yu; Mengdi Wang; Yuan Luo; |
| 104 | Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: C-LoRAE extends the low-rank adaptation (LoRA) method by incorporating a universal expert to learn shared multimodal knowledge from cross-MIE tasks and task-specific experts to learn specialized instructional task features. |
Li Yuan; Yi Cai; Xudong Shen; Qing Li; Qingbao Huang; Zikun Deng; Tao Wang; |
| 105 | Variational Graph Auto-Encoder Driven Graph Enhancement for Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a Variational Graph Auto-Encoder driven Graph Enhancement (VGAE-GE) method for robust augmentation in sequential recommendation. |
Yuwen Liu; Lianyong Qi; Xingyuan Mao; Weiming Liu; Shichao Pei; Fan Wang; Xuyun Zhang; Amin Beheshti; Xiaokang Zhou; |
| 106 | Horae: A Domain-Agnostic Language for Automated Service Regulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents Horae, a unified specification language for modeling (multimodal) regulation rules across a diverse set of domains. |
Yutao Sun; Mingshuai Chen; Tiancheng Zhao; Kangjia Zhao; He Li; Jintao Chen; Zhongyi Wang; Liqiang Lu; Xinkui Zhao; Shuiguang Deng; Jianwei Yin; |
| 107 | Rethinking Graph Contrastive Learning Through Relative Similarity Preservation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This discovery reveals that graphs naturally encode relative similarity patterns, where structurally closer nodes exhibit collectively stronger semantic relationships. Leveraging this insight, we propose RELGCL, a novel GCL framework with complementary pairwise and listwise implementations that preserve these inherent patterns through collective similarity objectives. |
Zhiyuan Ning; Pengfei Wang; Ziyue Qiao; Pengyang Wang; Yuanchun Zhou; |
| 108 | Reliable and Calibrated Semantic Occupancy Prediction By Hybrid Uncertainty Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the gradual alignment of camera-based models with LiDAR in terms of accuracy, a significant reliability gap still persists. To address this concern, we propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks. |
Song Wang; Zhongdao Wang; Jiawei Yu; Wentong Li; Bailan Feng; Junbo Chen; Jianke Zhu; |
| 109 | Towards Robust Deterministic and Probabilistic Modeling for Predictive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Predictive modeling of unannotated spatiotemporal data presents inherent challenges, primarily due to the highly entangled visual dynamics in real-world scenes. To tackle these complexities, we introduce a novel insight through Disentangling Deterministic and Probabilistic (DDP) modeling. |
Xuesong Nie; Haoyuan Jin; Vijayakumar Bhagavatula; Xiaofeng Liu; |
| 110 | Sharpness-aware Zeroth-order Optimization for Graph Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, direct integration of ZOO incurs considerable challenges due to the sharp loss landscape and steep gradients within the GT parameter space. Under the above observations, we propose a Sharpness-aware Zeroth-order Optimizer (SZO) that combines Sharpness-Aware Minimization (SAM) technique facilitating convergence within a flatter neighborhood, and leverages parallel computing for efficient gradient estimation. |
Yang Liu; Chuan Zhou; Yuhan Lin; Shuai Zhang; Yang Gao; Zhao Li; Shirui Pan; |
| 111 | Trace: Structural Riemannian Bridge Matching for Transferable Source Localization in Information Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In light of the issues above, we propose to study transferable source localization from a fresh geometric perspective, and present a novel approach (Trace) on the Riemannian manifold. |
Li Sun; Suyang Zhou; Bowen Fang; Hechuan Zhang; Junda Ye; Yutong Ye; Philip S. Yu; |
| 112 | Counterfactual Strategies for Markov Decision Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We encode such counterfactual strategies as solutions to non-linear optimization problems, and further extend our encoding to synthesize diverse counterfactual strategies. We evaluate our approach on four real-world datasets and demonstrate its practical viability in sophisticated sequential decision-making tasks. |
Paul Kobialka; Lina Gerlach; Francesco Leofante; Erika Ábrahám; Silvia Lizeth Tapia Tarifa; Einar Broch Johnsen; |
| 113 | Towards The Terminator Economy: Assessing Job Exposure to AI Through LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Many researchers have been working on estimating if and to what extent jobs and tasks are exposed to the risk of being automatized by AI-related technologies. Our work tackles this issue through a data-driven approach by: (i) developing a reproducible framework that uses cutting-edge open-source large language models to assess the current capabilities of AI and robotics in performing job-related tasks; (ii) formalizing and computing a measure of AI exposure by occupation, the Task Exposure to AI (TEAI) index, and a measure of Task Replacement by AI (TRAI) index, both validated through a human user evaluation and compared with the state-of-the-art. |
Emilio Colombo; Fabio Mercorio; Mario Mezzanzanica; Antonio Serino; |
| 114 | DFMU: Distribution-based Framework for Modeling Aleatoric Uncertainty in Multimodal Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To specifically address them, we propose DFMU, a Distribution-based Framework for Modeling Aleatoric Uncertainty, which incorporates an uncertainty modeling block capable of encoding uncertainty distributions and adaptively adjusting optimization objectives. |
Chen Tang; Tingrui Shen; Xinrong Gong; Chong Zhao; Tong Zhang; |
| 115 | Spatially Resolved Transcriptomics Data Clustering with Tailored Spatial-scale Modulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods typically construct spatial graphs using a static radius based on spatial coordinates, which hinders the accurate identification of spatial domains and complicates the precise partitioning of boundary nodes within clusters. To address this issue, we introduce a novel spatially resolved transcriptomics data clustering network (TSstc). |
Yuang Xiao; Yanran Zhu; Chang Tang; Xiao Zheng; Yuanyuan Liu; Kun Sun; Xinwang Liu; |
| 116 | SCOUT: Semi-supervised Camouflaged Object Detection By Utilizing Text and Adaptive Data Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we introduce a Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection (SCOUT). |
Weiqi Yan; Lvhai Chen; Shengchuan Zhang; Yan Zhang; Liujuan Cao; |
| 117 | Electron Density-enhanced Molecular Geometry Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an efficient Electronic Density representation framework to enhance molecular Geometric learning (called EDG), which leverages images rendered from ED to boost molecular geometric representations in MLFF. |
Hongxin Xiang; Jun Xia; Xin Jin; Wenjie Du; Li Zeng; Xiangxiang Zeng; |
| 118 | A Generalized Diffusion Framework with Learnable Propagation Dynamics for Source Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods only achieve satisfactory performance within a specific propagation model, which restricts their applicability and generalizability across different scenarios. To address this, we propose a Generalized Diffusion Framework for Source Localization (GDFSL), which enhances probabilistic diffusion models to flexibly capture the underlying dynamics of various propagation scenarios. |
Dongpeng Hou; Yuchen Wang; Chao Gao; Xianghua Li; |
| 119 | Combining Code Generating Large Language Models and Self-Play to Iteratively Refine Strategies in Games Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a self-play approach to generating strategies for playing in multi-player games, where strategies are represented as computer code. |
Yoram Bachrach; Edan Toledo; Karen Hambardzumyan; Despoina Magka; Martin Josifoski; Minqi Jiang; Jakob Foerster; Roberta Raileanu; Tatiana Shavrina; Nicola Cancedda; Avraham Ruderman; Katie Millican; Andrei Lupu; Rishi Hazra; |
| 120 | Localizing Before Answering: A Benchmark for Grounded Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To improve visual reasoning, we propose the Localize-before-Answer (LobA) framework, which trains LMMs to localize target regions of interest and self-prompt to emphasize segmented pathological areas, generating grounded and reliable answers. |
Dung Nguyen; Minh Khoi Ho; Huy Ta; Thanh Tam Nguyen; Qi Chen; Kumar Rav; Quy Duong Dang; Satwik Ramchandre; Son Lam Phung; Zhibin Liao; Minh-Son To; Johan Verjans; Phi Le Nguyen; Vu Minh Hieu Phan; |
| 121 | RPMIL: Rethinking Uncertainty-Aware Probabilistic Multiple Instance Learning for Whole Slide Pathology Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we rethink probabilistic modeling in MIL and propose RPMIL, an uncertainty-aware probabilistic MIL method for whole slide pathology diagnosis. |
Zhikang Zhao; Kaitao Chen; Jing Zhao; |
| 122 | Enhancing Counterfactual Estimation: A Focus on Temporal Treatments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Therefore, highlighting the role of temporal treatments within the model is crucial for accurate counterfactual estimation, which is often overlooked in current methods. To address this, we employ Koopman theory, known for its capability to model complex dynamic systems, and introduce a novel model named the Counterfactual Temporal Dynamics Network via Neural Koopman Operators (CTD-NKO). |
Xin Wang; Shengfei Lyu; Kangyang Luo; Lishan Yang; Huanhuan Chen; Chunyan Miao; |
| 123 | Multimodal Image Matching Based on Cross-Modality Completion Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Deep learning-based matching methods struggle with multimodal images due to the lack of large annotated multimodal datasets. To address these challenges, we propose XCP-Match based on cross-modality completion pre-training. |
Meng Yang; Fan Fan; Jun Huang; Yong Ma; Xiaoguang Mei; Zhanchuan Cai; Jiayi Ma; |
| 124 | T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We train T2S in an interleaved paradigm across multiple lengths, allowing it to generate sequences of arbitrary lengths. |
Yunfeng Ge; Jiawei Li; Yiji Zhao; Haomin Wen; Zhao Li; Meikang Qiu; Hongyan Li; Ming Jin; Shirui Pan; |
| 125 | Instance Relation Learning Network with Label Knowledge Propagation for Few-shot Multi-label Intent Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods rely on representation classification and ignore instance relations, leading to error propagation. To solve the above issues, we propose a multi-label joint learning method for few-shot MID in an end-to-end manner, which constructs an instance relation learning network with label knowledge propagation to eliminate error propagation. |
Shiman Zhao; Shangyuan Li; Wei Chen; Tengjiao Wang; Jiahui Yao; Jiabin Zheng; Kam-Fai Wong; |
| 126 | Efficient Dynamic Ensembling for Multiple LLM Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an efficient Dynamic Ensemble Reasoning paradigm, called DER, to integrate the strengths of multiple LLM experts conditioned on dynamic inputs. |
Jinwu Hu; Yufeng Wang; Shuhai Zhang; Kai Zhou; Guohao Chen; Yu Hu; Bin Xiao; Mingkui Tan; |
| 127 | A Theoretical Perspective on Why Stochastic Population Update Needs An Archive in Evolutionary Multi-objective Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we theoretically show that using an archive allows a small population and may enhance the search performance of SPU-based MOEAs. |
Shengjie Ren; Zimin Liang; Miqing Li; Chao Qian; |
| 128 | CADP: Towards Better Centralized Learning for Decentralized Execution in MARL Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel Centralized Advising and Decentralized Pruning (CADP) framework for MARL, that not only enables an efficacious message exchange among agents during training but also guarantees the independent policies for decentralized execution. |
Yihe Zhou; Shunyu Liu; Yunpeng Qing; Tongya Zheng; Kaixuan Chen; Jie Song; Mingli Song; |
| 129 | OT-DETECTOR: Delving Into Optimal Transport for Zero-shot Out-of-Distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While zero-shot OOD detection, which requires no training on in-distribution (ID) data, has become feasible with the emergence of vision-language models like CLIP, existing methods primarily focus on semantic matching and fail to fully capture distributional discrepancies. To address these limitations, we propose OT-DETECTOR, a novel framework that employs Optimal Transport (OT) to quantify both semantic and distributional discrepancies between test samples and ID labels. |
Yu Liu; Hao Tang; Haiqi Zhang; Jing Qin; Zechao Li; |
| 130 | Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, linear decomposition struggles to capture complex nonlinear associations and may lead to misalignment between the feature space and the label space. To address these two critical challenges, we propose innovative solutions. |
Wanfu Gao; Jun Gao; Qingqi Han; Hanlin Pan; Kunpeng Liu; |
| 131 | Noise-Resistant Label Reconstruction Feature Selection for Partial Multi-Label Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, a PML feature selection method is proposed considering two important characteristics of the dataset: label relationship’s noise-resistance and label connectivity. |
Wanfu Gao; Hanlin Pan; Qingqi Han; Kunpeng Liu; |
| 132 | Two-Stage Feature Generation with Transformer and Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although automated feature generation techniques address these issues to some extent, they often face challenges such as feature redundancy, inefficiency in feature space exploration, and limited adaptability to diverse datasets and tasks. To address these problems, we propose a Two-Stage Feature Generation (TSFG) framework, which integrates a Transformer-based encoder-decoder architecture with Proximal Policy Optimization (PPO). |
Wanfu Gao; Zengyao Man; Zebin He; Yuhao Tang; Jun Gao; Kunpeng Liu; |
| 133 | Dual-Agent Reinforcement Learning for Automated Feature Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Third, there are significant differences between discrete and continuous features in tabular data, requiring different operations for each type. To address these challenges, we propose a novel dual-agent reinforcement learning method for feature generation. |
Wanfu Gao; Zengyao Man; Hanlin Pan; Kunpeng Liu; |
| 134 | Federated Domain Generalization with Decision Insight Matrix Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel approach FedDIM, which leverages the concept of “insight matrix” – a fine-grained representation of the model’s decision-making process derived from element-wise products between feature vectors and classifier weights. |
Tianchi Liao; Binghui Xie; Lele Fu; Sheng Huang; Bowen Deng; Chuan Chen; Zibin Zheng; |
| 135 | DeepShade: Enable Shade Simulation By Text-conditioned Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current routing systems (e.g., online maps) fail to incorporate shade information due to the difficulty of estimating shades directly from noisy satellite imagery and the limited availability of training data for generative models. In this paper, we address these challenges through two main contributions. First, we build an extensive dataset covering diverse longitude-latitude regions, varying levels of building density, and different urban layouts. |
Longchao Da; Xiangrui Liu; Mithun Shivakoti; Thirulogasankar Pranav Kutralingam; Yezhou Yang; Hua Wei; |
| 136 | GE-Chat: A Graph Enhanced RAG Framework for Evidential Response Generation of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This issue is exacerbated by hallucinated responses, which are frequently presented with convincing but incorrect explanations, leading to trust concerns among users. To address this challenge, we propose GE-Chat, a knowledge Graph-enhanced retrieval-augmented generation framework designed to deliver Evidence-based responses. |
Longchao Da; Parth Mitesh Shah; Kuan-Ru Liou; Jiaxing Zhang; Hua Wei; |
| 137 | Let’s Group: A Plug-and-Play SubGraph Learning Method for Memory-Efficient Spatio-Temporal Graph Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, despite the excellent predictive performance of increasingly complex models, their intricate architectures result in significant memory overhead and computational complexity when handling spatio-temporal data, which limits their practical applications. To address these challenges, we propose a plug-and-play SubGraph Learning (SGL) method to reduce the memory overhead without compromising performance. |
Wenchao Weng; Hanyu Jiang; Mei Wu; Xiao Han; Haidong Gao; Guojiang Shen; Xiangjie Kong; |
| 138 | StarFT: Robust Fine-tuning of Zero-shot Models Via Spuriosity Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, in a different context, fine-tuned models with limited data are also prone to learning features that are spurious to humans, such as background or texture. In this paper, we propose StarFT (Spurious Textual Alignment Regularization), a novel framework for fine-tuning zero-shot models to enhance robustness by preventing them from learning spuriosity. |
Younghyun Kim; Jongheon Jeong; Sangkyung Kwak; Kyungmin Lee; Juho Lee; Jinwoo Shin; |
| 139 | Reinforcement Learning for Hybrid Charging Stations Planning and Operation Considering Fixed and Mobile Chargers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To solve the HCSPO problem, we propose a deep reinforcement learning approach enhanced with heuristic scheduling. |
Yanchen Zhu; Honghui Zou; Chufan Liu; Yuyu Luo; Yuankai Wu; Yuxuan Liang; |
| 140 | METEOR: Melody-aware Texture-controllable Symbolic Music Re-Orchestration Via Transformer VAE Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose METEOR, a model for generating Melody-aware Texture-controllable re-Orchestration with a Transformer-based variational auto-encoder (VAE). |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; |
| 141 | Learn Multi-task Anchor: Joint View Imputation and Label Generation for Incomplete Multi-view Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Besides, most methods neglect the potential of anchors for imputing missing views. To address these limitations, we propose a Joint View Imputation and Label Generation (JVILG) method. |
Xinxin Wang; Yongshan Zhang; Yicong Zhou; |
| 142 | SDDiff: Boosting Radar Perception Via Spatial-Doppler Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we observe an underlying correlation between 3D points and ego velocity, which offers reciprocal benefits for PCE and EVE. |
Shengpeng Wang; Xin Luo; Yulong Xie; Wei Wang; |
| 143 | ChronoFact: Timeline-based Temporal Fact Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current systems struggle with the complexities of evaluating the accuracy of these claims, especially when they include multiple, overlapping, or recurring events. We introduce a novel timeline-based fact verification framework that identifies events from both claim and evidence and organizes them into their respective chronological timelines. |
Anab Maulana Barik; Wynne Hsu; Mong Li Lee; |
| 144 | Prompt-Free Conditional Diffusion for Multi-object Image Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate both problems with one stone, we propose a prompt-free conditional diffusion framework for multi-object image augmentation. |
Haoyu Wang; Lei Zhang; Wei Wei; Chen Ding; Yanning Zhang; |
| 145 | Curriculum Hierarchical Knowledge Distillation for Bias-Free Survival Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel approach, PathoKD, based on knowledge distillation. |
Chaozhuo Li; Zhihao Tang; Mingji Zhang; Zhiquan Liu; Litian Zhang; Xi Zhang; |
| 146 | Diffusion Guided Propagation Augmentation for Popularity Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Diffusion Guided Propagation Augmentation (DGPA), a novel framework designed to improve early-stage popularity prediction. |
Chaozhuo Li; Tianqi Yang; Litian Zhang; Xi Zhang; |
| 147 | RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs Via Outlier-Aware Adaptive Rotations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we explore the potential of rotation technique for 2-bit KV quantization and propose RotateKV, which achieves accurate and robust performance through the following innovations: (i) Outlier-Aware Rotation, which utilizes channel-reordering to adapt the rotations to varying channel-wise outlier distributions without sacrificing the computational efficiency of the fast Walsh-Hadamard transform (FWHT); (ii) Pre-RoPE Grouped-Head Rotation, which mitigates the impact of rotary position embedding (RoPE) on proposed outlier-aware rotation and further smooths outliers across heads; (iii) Attention-Sink-Aware Quantization, which leverages the massive activations to precisely identify and protect attention sinks. RotateKV achieves less than 0.3 perplexity (PPL) degradation with 2-bit quantization on WikiText-2 using LLaMA-2-13B, maintains strong CoT reasoning and long-context capabilities, with less than 1.7% degradation on GSM8K, outperforming existing methods even at lower average bit-widths. |
Zunhai Su; Hanyu Wei; Zhe Chen; Wang Shen; Linge Li; Huangqi Yu; Kehong Yuan; |
| 148 | Neuro-Symbolic Artificial Intelligence: Towards Improving The Reasoning Abilities of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Language Models (LLMs) have shown promising results across various tasks, yet their reasoning capabilities remain a fundamental challenge. |
Xiao-Wen Yang; Jie-Jing Shao; Lan-Zhe Guo; Bo-Wen Zhang; Zhi Zhou; Lin-Han Jia; Wang-Zhou Dai; Yu-Feng Li; |
| 149 | GETMusic: Generating Music Tracks with A Unified Representation and Diffusion Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a framework known as GETMusic, with “GET” standing for “GEnerate music Tracks.” |
Ang Lv; Xu Tan; Peiling Lu; Wei Ye; Shikun Zhang; Jiang Bian; Rui Yan; |
| 150 | Optical Flow Estimation for Tiny Objects: New Problem, Specialized Benchmark, and Bioinspired Scheme Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Optical flow is pivotal in video-based tasks, yet existing methods mostly focus on medium-/large-size objects, while underperforming when characterizing the motion of tiny objects. To bridge this gap, we introduce the On-off Time-delay with Hassenstein-Reichardt correlator (OTHR), a computationally efficient scheme inspired by the primate visual cortex’s direction selectivity mechanism. |
Xueyao Ji; Gang Wang; Yizheng Wang; |
| 151 | T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, owing to the quadratic relation between the length of the table and the length of the input sentence sequence, using transformers directly faces two challenges: overly long table sequences and unfair local attention interaction. To address these challenges, we propose a novel Table-Transformer (T-T) for the tagging-based ASTE method. |
Kun Peng; Chaodong Tong; Cong Cao; Hao Peng; Qian Li; Guanlin Wu; Lei Jiang; Yanbing Liu; Philip S. Yu; |
| 152 | Fast Guaranteed Tensor Recovery with Adaptive Tensor Nuclear Norm Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods struggle with either computational inefficiency or weak theoretical guarantees for large-scale data. To address these issues, we propose a fast guaranteed tensor recovery framework based on a new tensor nuclear norm. |
Jiangjun Peng; Hailin Wang; Xiangyong Cao; Shuang Xu; |
| 153 | Multi-Omics Analysis for Cancer Subtype Inference Via Unrolling Graph Smoothness Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Integrating multi-omics datasets through data-driven analysis offers a comprehensive understanding of the complex biological processes underlying various diseases, particularly cancer. Graph Neural Networks (GNNs) have recently demonstrated remarkable ability to exploit relational structures in biological data, enabling advances in multi-omics integration for cancer subtype classification. Existing approaches often neglect the intricate coupling between heterogeneous omics, limiting their capacity to resolve subtle cancer subtype heterogeneity critical for precision oncology. To address these limitations, we propose a framework named Graph Transformer for Multi-omics Cancer Subtype Classification (GTMancer). |
Jielong Lu; Zhihao Wu; Jiajun Yu; Jiajun Bu; Haishuai Wang; |
| 154 | The Devil Is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given that the DSO paradigm might undermine the generalization ability of models, we advocate for a Joint-Dataset Learning (JDL) protocol to alleviate the Fine-tuning Gap. |
Tianjiao Cao; Jiahao Lyu; Weichao Zeng; Weimin Mu; Yu Zhou; |
| 155 | A Timestep-Adaptive Frequency-Enhancement Framework for Diffusion-based Image Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we innovatively investigate their frequency-domain behaviors from a sampling timestep perspective. |
Yueying Li; Hanbin Zhao; Jiaqing Zhou; Guozhi Xu; Tianlei Hu; Gang Chen; Haobo Wang; |
| 156 | Problem-dependent Regret for Lexicographic Multi-Armed Bandits with Adversarial Corruptions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although previous literature has proposed the algorithm for lexicographic MAB, their algorithm suffers from several limitations: (1) it exhibits poor adversarial robustness due to its reliance on stochastic rewards, (2) its regret bound is suboptimal compared to single-objective counterparts, and (3) the regret bound does not adapt to specific problem instances. To address these limitations, we study lexicographic MAB with adversarial corruptions, where an adversary might corrupt the stochastic rewards with a corruption budget of C. First, when the value of C is known, we propose an algorithm achieving a problem-dependent regret bound of O(∑(log T / Δⁱ(a) + C)) for the i-th objective (i ∈ [M]), where Δⁱ(a) is the reward gap for arm a on the i-th objective, and M is the number of objectives. |
Bo Xue; Xi Lin; Yuanyu Wan; Qingfu Zhang; |
| 157 | Time-Frequency Disentanglement Boosted Pre-Training: A Universal Spatio-Temporal Modeling Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a Universal Spatio-Temporal Correlationship pre-training framework (USTC), for spatio-temporal modeling across different cities and tasks. |
Yudong Zhang; Zhaoyang Sun; Xu Wang; Xuan Yu; Kai Wang; Yang Wang; |
| 158 | LiBOG: Lifelong Learning for Black-Box Optimizer Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we explore a novel paradigm of lifelong learning in MetaBBO and introduce LiBOG, a novel approach designed to learn from sequentially encountered problems and generate high-performance optimizers for Black-Box Optimization (BBO). |
Jiyuan Pei; Yi Mei; Jialin Liu; Mengjie Zhang; |
| 159 | SourceDetMamba: A Graph-aware State Space Model for Source Detection in Sequential Hypergraphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite advances in machine learning-based methods, many fail to capture intrinsic dynamics of rumor propagation. In this work, we present SourceDetMamba: A Graph-aware State Space Model for Source Detection in Sequential Hypergraphs, which harnesses the recent success of the state space model Mamba, known for its superior global modeling capabilities and computational efficiency, to address this challenge. |
Le Cheng; Peican Zhu; Yangming Guo; Chao Gao; Zhen Wang; Keke Tang; |
| 160 | Endowing Interpretability for Neural Cognitive Diagnosis By Efficient Kolmogorov-Arnold Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although neural network-based neural cognitive diagnosis models (CDMs) have exhibited significantly better performance than traditional models, neural cognitive diagnosis is criticized for the poor model interpretability due to the multi-layer perceptron (MLP) employed, even with the monotonicity assumption. Therefore, this paper proposes to empower the interpretability of neural cognitive diagnosis models through efficient Kolmogorov-Arnold networks (KANs), named KAN2CD, where KANs are used to enhance interpretability in two manners. |
Shangshang Yang; Linrui Qin; Xiaoshan Yu; Ziwen Wang; Xueming Yan; Haiping Ma; Ye Tian; |
| 161 | DPMamba: Distillation Prompt Mamba for Multimodal Remote Sensing Image Classification with Missing Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a unified Distillation Prompt Mamba (DPMamba) framework for multimodal RSIC with missing modalities. |
Yueguang Yang; Jiahui Qu; Ling Huang; Wenqian Dong; |
| 162 | Finite-Time Analysis of Heterogeneous Federated Temporal Difference Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We devise a heterogeneous federated temporal difference (HFTD) algorithm which iteratively aggregates agents’ local stochastic gradients for TD learning. |
Ye Zhu; Xiaowen Gong; Shiwen Mao; |
| 163 | Reliable Disentanglement Multi-view Learning Against View Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This inevitably leads to the adversarial unreliability problem (AUP) in trusted multi-view learning. To overcome this tricky problem, we propose a novel multi-view learning framework, namely Reliable Disentanglement Multi-view Learning (RDML). |
Xuyang Wang; Siyuan Duan; Qizhi Li; Guiduo Duan; Yuan Sun; Dezhong Peng; |
| 164 | Denoised Attention and Question-Augmented Representations for Knowledge Tracing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These limitations often lead to the attention noise problem: the model assigns non-negligible attention weight to some information that is cognitively irrelevant in nature, thereby generating interference signals. To address this problem, we propose a novel KT model, i.e., DenoiseKT. |
Jiwei Deng; Youheng Bai; Mingliang Hou; Teng Guo; Zitao Liu; Weiqi Luo; |
| 165 | Learnable Frequency Decomposition for Image Forgery Detection and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we observe and analyze the frequency characteristic changes caused by image tampering. |
Dong Li; Jiaying Zhu; Yidi Liu; Xin Lu; Xueyang Fu; Jiawei Liu; Aiping Liu; Zheng-Jun Zha; |
| 166 | Optimizing Personalized Federated Learning Through Adaptive Layer-Wise Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose FLAYER, a novel layer-wise learning method for pFL that optimizes local model personalization performance. |
Weihang Chen; Cheng Yang; Jie Ren; Zhiqiang Li; Zheng Wang; |
| 167 | Hallucination-Aware Prompt Optimization for Text-to-Video Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To boost the effectiveness of Sora-like T2V models, we introduce VidPrompter, an innovative large multi-modal model supporting T2V applications with three key functionalities: (1) generating detailed prompts from raw videos, (2) enhancing prompts from videos grounded with short descriptions, and (3) refining simple user-provided prompts to elevate T2V video quality. |
Jiapeng Wang; Chengyu Wang; Jun Huang; Lianwen Jin; |
| 168 | Dual Encoder Contrastive Learning with Augmented Views for Graph Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce an innovative approach: Dual Encoder Contrastive Learning with Augmented Views for Graph Anomaly Detection, named DECLARE. |
Nannan Wu; Hongdou Dong; Wenjun Wang; Yiming Zhao; |
| 169 | Cross-modal Collaborative Representation Learning for Text-to-Image Person Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose CoRL: a cross-modal Collaborative Representation Learning framework designed to improve TIPR by effectively leveraging the complementarity between modalities. |
Shuanglin Yan; Jun Liu; Neng Dong; Liyan Zhang; Jinhui Tang; |
| 170 | Improving Efficiency of Answer Set Planning with Rough Solutions from Large Language Models for Robotic Task Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it is still challenging to efficiently solve ASP programs that have multiple variables with large domains, which prevents the above application of ASP planning from real-world task planning problems. In this paper, we consider how to reduce the domains of variables without losing possible solutions for ASP planning, while given these rough solutions from LLMs. |
Xinrui Lin; Yangfan Wu; Huanyu Yang; Yuting Huang; Yu Zhang; Jianmin Ji; Yanyong Zhang; |
| 171 | MIRROR: Multi-agent Intra- and Inter-Reflection for Optimized Reasoning in Tool Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose MIRROR, a framework that consists of both intra-reflection, which critically assesses intended actions before execution, and inter-reflection, which further adjusts the trajectory based on observations. |
Zikang Guo; Benfeng Xu; Xiaorui Wang; Zhendong Mao; |
| 172 | ABNet: Mitigating Sample Imbalance in Anomaly Detection Within Dynamic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, GNN-based approaches often prioritize normal samples, neglecting rare anomalies. To address these issues, we propose the Anomaly Balance Network (ABNet), designed to alleviate sample imbalance and enhance anomaly detection. |
Yifan Hong; Muhammad Asif Ali; Huan Wang; Junyang Chen; Di Wang; |
| 173 | NeSyA: Neurosymbolic Automata Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Focusing on the task of sequence classification and tagging, we show that symbolic automata can be integrated with neural-based perception, under probabilistic semantics towards an end-to-end differentiable model. |
Nikolaos Manginas; George Paliouras; Luc De Raedt; |
| 174 | MVP-CBM: Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, we empirically discover the phenomenon of concept preference variation, that is, the concepts are preferably associated with the features at different layers than those only at the final layer; yet a blind last-layer-based association neglects such a preference variation and thus weakens the accurate correspondences between features and concepts, impairing model interpretability. To address this issue, we propose a novel Multi-layer Visual Preference-enhanced Concept Bottleneck Model (MVP-CBM), which comprises two key novel modules: (1) intra-layer concept preference modeling, which captures the preferred association of different concepts with features at various visual layers, and (2) multi-layer concept sparse activation fusion, which sparsely aggregates concept activations from multiple layers to enhance performance. |
Chunjiang Wang; Kun Zhang; Yandong Liu; Zhiyang He; Xiaodong Tao; S. Kevin Zhou; |
| 175 | Prototype-guided Knowledge Propagation with Adaptive Learning for Lifelong Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a novel Prototype-guided Knowledge Propagation (PKP) method, which mitigates discrepancies in similar identity images between old and new tasks by guiding prototype construction through triplet loss constraints. |
Zhijie Lu; Wuxuan Shi; He Li; Mang Ye; |
| 176 | Squeezing Context Into Patches: Towards Memory-Efficient Ultra-High Resolution Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the need for both high accuracy and low memory usage in processing UHR images, we introduce a memory-efficient semantic segmentation approach by squeezing context information into local patches (SCPSeg). |
Wang Liu; Puhong Duan; Xudong Kang; Shutao Li; |
| 177 | Mechanism Design for Large Language Models (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an auction format that operates on a token-by-token basis, and allows LLM agents to influence content creation through single dimensional bids. |
Paul Dütting; Vahab Mirrokni; Renato Paes Leme; Haifeng Xu; Song Zuo; |
| 178 | Good Advisor for Source Localization: Using Large Language Model to Guide The Source Inference Process Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Meanwhile, the high-dimensional embedding of the textual representation introduces significant amounts of redundant features, which also reduces its efficiency in source localization task to some extent. To solve the above problems, this paper proposes a multi-modal fusion framework for rumor source localization, namely Contrastive Rumor Source Localization via LLM (CRSLL), based on the idea of contrastive learning. |
Dongpeng Hou; Wenfei Wei; Chao Gao; Xianghua Li; Zhen Wang; |
| 179 | Zero-Shot Machine Unlearning with Proxy Adversarial Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As such, these methods are inapplicable in a more practical scenario, where only the unlearning samples are available (i.e., zero-shot unlearning). This paper presents a novel framework, ZS-PAG, to fill this gap. |
Huiqiang Chen; Tianqing Zhu; Xin Yu; Wanlei Zhou; |
| 180 | Multi-Sourced Compositional Generalization in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we explore MSCG in the context of visual question answering (VQA), and propose a retrieval-augmented training framework to enhance the MSCG ability of VQA models by learning unified representations for primitives from different modalities. |
Chuanhao Li; Wenbo Ye; Zhen Li; Yuwei Wu; Yunde Jia; |
| 181 | Universal Backdoor Defense Via Label Consistency in Vertical Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the inherent limitations on the defender’s access to the global model and complete training data in VFL environments fundamentally constrain the effectiveness of these conventional methods. To address these limitations, we propose the Universal Backdoor Defense (UBD) framework. |
Peng Chen; Haolong Xiang; Xin Du; Xiaolong Xu; Xuhao Jiang; Zhihui Lu; Jirui Yang; Qiang Duan; Wanchun Dou; |
| 182 | FreEformer: Frequency Enhanced Transformer for Multivariate Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents FreEformer, a simple yet effective model that leverages a Frequency Enhanced Transformer for multivariate time series forecasting. |
Wenzhen Yue; Yong Liu; Xianghua Ying; Bowei Xing; Ruohao Guo; Ji Shi; |
| 183 | SpaceDet: A Large-scale Space-based Image Dataset and RSO Detection for Space Situational Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce SpaceDet, a large-scale realistic space-based image dataset for SSA. |
Jiaping Xiao; Rangya Zhang; Yuhang Zhang; Lu Bai; Qianlei Jia; Mir Feroskhan; |
| 184 | Learning Real Facial Concepts for Independent Deepfake Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This is primarily due to an overreliance on forgery artifacts and a limited understanding of real faces. To address this challenge, we propose a novel approach RealID to enhance generalization by learning a comprehensive concept of real faces while assessing the probabilities of belonging to the real and fake classes independently. |
Ming-Hui Liu; Harry Cheng; Tianyi Wang; Xin Luo; Xin-Shun Xu; |
| 185 | HiTuner: Hierarchical Semantic Fusion Model Fine-Tuning on Text-Attributed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, LLMs face challenges in specialized domains due to their limited task-specific knowledge, and fine-tuning them for specific tasks demands significant resources. To cope with the above challenges, we propose HiTuner, a novel framework that leverages fine-tuned Pre-trained Language Models (PLMs) with domain expertise as tuner to enhance the hierarchical LLM contextualized representations for modeling TAGs. |
Zihan Fang; Zhiling Cai; Yuxuan Zheng; Shide Du; Yanchao Tan; Shiping Wang; |
| 186 | Top-I2P: Explore Open-Domain Image-to-Point Cloud Registration Using Topology Relationship Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we address open-domain I2P registration from the topology relationships perspective. |
Pei An; Jiaqi Yang; Muyao Peng; You Yang; Qiong Liu; Jie Ma; Liangliang Nan; |
| 187 | SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Low-quality samples further distort model predictions, leading to saliency bias. To address these challenges, we propose Spike-navigated Optimal TrAnsport Saliency Region Detection (SOTA), a framework that leverages the strengths of spike cameras while mitigating biases in both spatial and temporal dimensions. |
Wenxuan Liu; Yao Deng; Kang Chen; Xian Zhong; Zhaofei Yu; Tiejun Huang; |
| 188 | HeTa: Relation-wise Heterogeneous Graph Foundation Attack Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel relation-wise heterogeneous graph foundation attack model, HeTa. |
Yuling Wang; Zihui Chen; Pengfei Jiao; Xiao Wang; |
| 189 | Graph Prompts: Adapting Video Graph for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unfortunately, these methods heavily rely on the question text, thus making it challenging to perceive and reason about video content that is not explicitly mentioned in the question. To address the above challenge, we propose Graph Prompts-based VideoQA (GP-VQA), which adopts a video-based graph structure for enhanced video understanding. |
Yiming Li; Xiaoshan Yang; Bing-Kun Bao; Changsheng Xu; |
| 190 | MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing reinforcement learning methods often fail to balance safety and reward under strict safety constraints and diverse environmental conditions. To address these limitations, this paper proposes a novel zero-constraint-violation recovery RL framework tailored for high-speed UAV pursuit-evasion combat games. |
Yang Zhao; Wenzhe Zhao; Xuelong Li; |
| 191 | Proven Approximation Guarantees in Multi-Objective Optimization: SPEA2 Beats NSGA-II Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Different from the NSGA-II, it does not employ the crowding distance (essentially the distance to neighboring solutions) to compare pairwise non-dominating solutions but a complex system of σ-distances that builds on the distances to all other solutions. In this work, we give a first mathematical proof showing that this more complex system of distances can be superior. |
Yasser Alghouass; Benjamin Doerr; Martin S. Krejca; Mohammed Lagmah; |
| 192 | The Graph’s Apprentice: Teaching An LLM Low-Level Knowledge for Circuit Quality Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes augmenting LLMs with predictor networks trained to estimate circuit quality directly from HDL code. |
Reza Moravej; Saurabh Bodhe; Zhanguang Zhang; Didier Chételat; Dimitrios Tsaras; Yingxue Zhang; Hui-Ling Zhen; Jianye Hao; Mingxuan Yuan; |
| 193 | Multi-Objective Neural Bandits with Random Scalarization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using the trade-off capabilities of upper confidence bound (UCB) and Thompson sampling (TS) strategies, we propose two novel algorithms, MONeural-UCB and MONeural-TS. |
Ji Cheng; Bo Xue; Chengyu Lu; Ziqiang Cui; Qingfu Zhang; |
| 194 | Preventing Latent Diffusion Model-Based Image Mimicry Via Angle Shifting and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we reveal that the robustness of the denoising module stems from two key factors: the cancellation effect between adversarial perturbations and estimated noise, and unstable gradients caused by randomly sampled timesteps and Gaussian noise. |
Minghao Li; Rui Wang; Ming Sun; Lihua Jing; |
| 195 | DIIN: Diffusion Iterative Implicit Networks for Arbitrary-scale Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we rethink the forward process of implicit neural functions as a signal diffusion process and propose a novel Diffusion Iterative Implicit Network (DIIN) for arbitrary-scale SR to promote global signal flow with neighborhood interactions. |
Tao Dai; Song Wang; Hang Guo; Jianping Wang; Zexuan Zhu; |
| 196 | Fine-Grained and Efficient Self-Unlearning with Layered Iteration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, such single-step approaches lead to coarse-grained changes in decision boundaries among the remaining classes and impose adverse effects on the model utility. To address these limitations, we propose ‘Self-Unlearning with Layered Iteration (SULI),’ a novel unlearning approach that introduces a layered iteration strategy to re-label the forgetting data iteratively and refine the decision boundaries progressively. |
Hongyi Lyu; Xuyun Zhang; Hongsheng Hu; Shuo Wang; Chaoxiang He; Lianyong Qi; |
| 197 | A Comprehensive and Systematic Review for Deep Learning-Based De Novo Peptide Sequencing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we provide the first review of deep learning-based de novo peptide sequencing techniques from the perspectives of data types, model architectures, decoding strategies, applications and evaluation metrics. |
Jun Xia; Jingbo Zhou; Shaorong Chen; Tianze Ling; Stan Z. Li; |
| 198 | HyperDet: Source Detection in Hypergraphs Via Interactive Relationship Construction and Feature-rich Attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we present a novel approach for Source Detection in Hypergraphs (HyperDet) via Interactive Relationship Construction and Feature-rich Attention Fusion. |
Le Cheng; Peican Zhu; Yangming Guo; Keke Tang; Chao Gao; Zhen Wang; |
| 199 | Gaussian Mixture Model for Graph Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unsupervised domain adaptation (UDA) has been widely studied with the goal of transferring knowledge from a label-rich source domain to a related but unlabeled target domain. |
Mengzhu Wang; Wenhao Ren; Yu Zhang; Yanlong Fan; Dianxi Shi; Luoxi Jing; Nan Yin; |
| 200 | Coupling Category Alignment for Graph Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the problem, we propose an effective framework named Coupling Category Alignment (CoCA) for GDA, which effectively addresses the category alignment issue with theoretical guarantees. |
Nan Yin; Xiao Teng; Zhiguang Cao; Mengzhu Wang; |
| 201 | ST-TAR: An Efficient Spatio-Temporal Learning Framework for Traffic Accident Risk Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In addition, improving efficiency is an urgent requirement for traffic accident forecasting. To overcome these limitations, we propose an efficient Spatio-Temporal learning framework for Traffic Accident Risk forecasting (ST-TAR). |
Hongyu Wang; Lisi Chen; Shuo Shang; Peng Han; Christian S. Jensen; |
| 202 | Learning Robust Multi-view Representation Using Dual-masked VAEs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a holistic method, called Dual-masked Variational Autoencoders (DualVAE), which aims at learning robust multi-view representation. |
Jiedong Wang; Kai Guo; Peng Hu; Xi Peng; Hao Wang; |
| 203 | Perspectives in Play: A Multi-Perspective Approach for More Inclusive NLP Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recognizing that labels reflect the diverse backgrounds, life experiences, and values of individuals, this study proposes a new multi-perspective approach using soft labels to encourage the development of the next generation of perspective-aware models—more inclusive and pluralistic. |
Benedetta Muscato; Lucia Passaro; Gizem Gezici; Fosca Giannotti; |
| 204 | RegionMatch: Pixel-Region Collaboration for Semi-Supervised Semantic Segmentation in Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods mainly rely on pixel-level information, neglecting the strong region consistency inherent in remote sensing images (RSIs), which limits their effectiveness in handling the complex and diverse backgrounds of RSIs. To address this, we propose RegionMatch, a novel approach that leverages unlabeled data from a fresh object-level perspective, which is more tailored to the nature of semantic segmentation. |
Xiaoqian Zhu; Xiangrong Zhang; Tianyang Zhang; Chaowei Fang; Xu Tang; Licheng Jiao; |
| 205 | Towards Cross-Modality Modeling for Time Series Analytics: A Survey in The LLM Era Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this survey, we provide an up-to-date overview of LLMs-based cross-modality modeling for time series analytics. |
Chenxi Liu; Shaowen Zhou; Qianxiong Xu; Hao Miao; Cheng Long; Ziyue Li; Rui Zhao; |
| 206 | Unleashing The Semantic Adaptability of Controlled Diffusion Model for Image Colorization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent data-driven image colorization methods have leveraged pre-trained Text-to-Image (T2I) diffusion models as generative prior, while still suffering from unsatisfactory and inaccurate semantic-level color control. To address these issues, we propose a Semantic Adaptation method (SeAda) that enhances the prior while considering the semantic discrepancy between color and grayscale image pairs. |
Xiangcheng Du; Zhao Zhou; Yanlong Wang; Yingbin Zheng; Xingjiao Wu; Peizhu Gong; Cheng Jin; |
| 207 | DcDsDiff: Dual-Conditional and Dual-Stream Diffusion Model for Generative Image Tampering Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consequently, this paper introduces a denoising diffusion probabilistic model-based DcDsDiff, which comprises a Dual-View Conditional Network (DVCN) and a Dual-Stream Denoising Network (DSDN). |
Qixian Hao; Shaozhang Niu; Jiwei Zhang; Kai Wang; |
| 208 | Towards Regularized Mixture of Predictions for Class-Imbalanced Semi-Supervised Facial Expression Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they are far from real-world applications due to biased pseudo-labels caused by class imbalance. To alleviate this issue, we propose Regularized Mixture of Predictions (ReMoP), a simple yet effective method to generate high-quality pseudo-labels for imbalanced samples. |
Hangyu Li; Yixin Zhang; Jiangchao Yao; Nannan Wang; Bo Han; |
| 209 | EchoGPT: An Interactive Cardiac Function Assessment Model for Echocardiogram Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing studies only analyze echocardiogram video through discriminative models, which have limited question-answering capabilities. Therefore, this study innovatively proposes a large language model with cardiac ultrasound diagnostic capabilities—EchoGPT. |
Bo Xu; Quanhao Zhu; Qingchen Zhang; Mengmeng Wang; Liang Zhao; Hongfei Lin; Jing Ren; Feng Xia; |
| 210 | Efficient Rectification of Neuro-Symbolic Reasoning Inconsistencies By Abductive Reflection (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, for complex learning targets, NeSy systems often generate outputs inconsistent with domain knowledge. Inspired by the human Cognitive Reflection, which promptly detects errors in our intuitive response and revises them by invoking the System 2 reasoning, we propose to improve NeSy systems by introducing Abductive Reflection (ABL-Refl) based on the Abductive Learning (ABL) framework. |
Wen-Chao Hu; Wang-Zhou Dai; Yuan Jiang; Zhi-Hua Zhou; |
| 211 | Pre-defined Keypoints Promote Category-level Articulation Pose Estimation Via Multi-Modal Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Articulations are essential in everyday interactions, yet traditional RGB-based pose estimation methods often struggle with issues such as lighting variations and shadows. To overcome these challenges, we propose a novel Pre-defined keypoint based framework for category-level articulation pose estimation via multi-modal Alignment, coined PAGE. |
Wenbo Xu; Li Zhang; Liu Liu; Yan Zhong; Haonan Jiang; Xue Wang; Rujing Wang; |
| 212 | FairSMOE: Mitigating Multi-Attribute Fairness Problem with Sparse Mixture-of-Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we formulate the multi-attribute fairness issue as an MTL problem and employ SMoE to achieve desirable performance across all attributes simultaneously. We first analyze the feasibility and potential of this approach by formalizing the multi-attribute fairness problem as an MTL problem and mitigating it with SMoE. |
Changdi Yang; Zheng Zhan; Ci Zhang; Yifan Gong; Yize Li; Zichong Meng; Jun Liu; Xuan Shen; Hao Tang; Geng Yuan; Pu Zhao; Xue Lin; Yanzhi Wang; |
| 213 | EVICheck: Evidence-Driven Independent Reasoning and Combined Verification Method for Fact-Checking Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, they rely on simple prompts or few-shot learning for verification, which makes truthfulness judgments less reliable, especially for complex claims. To address these limitations, we propose a novel method to enhance evidence utilization and introduce explicit verification criteria, named EVICheck. |
Lingxiao Wang; Lei Shi; Feifei Kou; Ligu Zhu; Chen Ma; Pengfei Zhang; Mingying Xu; Zeyu Li; |
| 214 | Revisiting Continual Ultra-fine-grained Visual Recognition with Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By analyzing recent pre-trained model (PTM) based continual learning methods on the proposed benchmark, we propose two simple yet effective PTM-based methods to boost the performance of VC-UFG and HC-UFG, respectively. |
Pengcheng Zhang; Xiaohan Yu; Meiying Gu; Yuchen Wu; Yongsheng Gao; Xiao Bai; |
| 215 | Deep Opinion-Unaware Blind Image Quality Assessment By Learning and Adapting from Multiple Annotators Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The resulting dataset is subsequently employed for training our DUBMA. Due to the inherent discrepancies between synthetic and real-world distortions, a domain shift may occur. To address this, we propose an outlier-robust unsupervised domain adaptation approach leveraging optimal transport. |
Zhihua Wang; Xuelin Liu; Jiebin Yan; Jie Wen; Wei Wang; Chao Huang; |
| 216 | Harnessing Vision Models for Time Series Analysis: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This survey highlights the advantages of vision models over LLMs in time series analysis, offering a comprehensive dual-view taxonomy that answers key research questions like how to encode time series as images and how to model imaged time series. |
Jingchao Ni; Ziming Zhao; ChengAo Shen; Hanghang Tong; Dongjin Song; Wei Cheng; Dongsheng Luo; Haifeng Chen; |
| 217 | An Association-based Fusion Method for Speech Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an innovative Association-based Fusion Speech Enhancement method (AFSE), a decoupled method. |
Shijie Wang; Qian Guo; Lu Chen; Liang Du; Zikun Jin; Zhian Yuan; Xinyan Liang; |
| 218 | Stabilizing Holistic Semantics in Diffusion Bridge for Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, this paper proposes a novel Global Structure-Guided Diffusion Bridge framework (GSGDiff), which incorporates an additional structure restorer to stabilize the generation of holistic semantics. |
Jinjia Peng; Mengkai Li; Huibing Wang; |
| 219 | VidEvo: Evolving Video Editing Through Exhaustive Temporal Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A crucial factor leading to the aforementioned issue is the inadequate and implicit tuning of the attention module within existing methods, which is specifically designed to capture temporal information. In light of this, we introduce VidEvo, a novel one-shot video editing method that leverages explicit cues derived from the original video to enhance temporal modeling. |
Sizhe Dang; Huan Liu; Mengmeng Wang; Xin Lai; Guang Dai; Jingdong Wang; |
| 220 | Connecting Giants: Synergistic Knowledge Transfer of Large Multimodal Models for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel framework, Synergistic Knowledge Transfer (SynTrans), which effectively transfers diverse and complementary knowledge from large multimodal models to empower the off-the-shelf few-shot learner. |
Hao Tang; Shengfeng He; Jing Qin; |
| 221 | Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale Datasets for Responsible LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Training LLMs on such unfiltered data risks perpetuating toxic behaviors, spreading misinformation, and amplifying societal biases which can undermine trust in LLM-driven applications and raise ethical concerns about their use. This paper presents a large-scale analysis of inappropriate content across these datasets, offering a comprehensive taxonomy that categorizes harmful webpages into Topical and Toxic based on their intent. |
Sai Krishna Mendu; Harish Yenala; Aditi Gulati; Shanu Kumar; Parag Agrawal; |
| 222 | SandboxSocial: A Sandbox for Social Media Using Multimodal AI Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing simulators lack features needed to capture the complex information-sharing dynamics of platform-based social networks. To bridge this gap, we present SandboxSocial, a new simulator that includes several key innovations, mainly: (1) a virtual social media platform (modelled as Mastodon and mirrored in an actual Mastodon server) that enables a realistic setting in which agents interact; (2) an adapter that uses real-world user data to create more grounded agents and social media content; and (3) multi-modal capabilities that enable our agents to interact using both text and images—just as humans do on social media. |
Maximilian Puelma Touzel; Sneheel Sarangi; Gayatri Krishnakumar; Busra Tugce Gurbuz; Austin Welch; Zachary Yang; Andreea Musulan; Hao Yu; Ethan Kosak-Hine; Tom Gibbs; Camille Thibault; Reihaneh Rabbany; Jean-François Godbout; Dan Zhao; Kellin Pelrine; |
| 223 | Reinforced In-Context Black-Box Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. |
Lei Song; Chen-Xiao Gao; Ke Xue; Chenyang Wu; Dong Li; Jianye Hao; Zongzhang Zhang; Chao Qian; |
| 224 | Omni-Dimensional State Space Model-driven SAM for Pixel-level Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Omni Dimensional State Space Model-driven SAM (ODS-SAM) for pixel-level anomaly detection. |
Chao Huang; Qianyi Li; Jie Wen; Bob Zhang; |
| 225 | Screening, Rectifying, and Re-Screening: A Unified Framework for Tuning Vision-Language Models with Noisy Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their fine-tuning under noisy labels remains an open problem due to challenges like self-confirmation bias and the limitations of conventional small-loss criteria. In this paper, we propose a unified framework to address these issues, consisting of three key steps: Screening, Rectifying, and Re-Screening. |
Chaowei Fang; Hangfei Ma; Zhihao Li; De Cheng; Yue Zhang; Guanbin Li; |
| 226 | Sample-Efficient Behavior Cloning Using General Domain Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enable learning from both general knowledge and specific demonstration trajectories, we use a large language model’s coding capability to instantiate a policy structure based on expert domain knowledge expressed in natural language and tune the parameters in the policy with demonstrations. |
Feiyu Zhu; Jean Oh; Reid Simmons; |
| 227 | MedualTime: A Dual-Adapter Language Model for Medical Time Series-Text Multimodal Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The recent rapid advancements in language models (LMs) have garnered attention in medical time series-text multimodal learning. However, existing contrastive learning-based and prompt-based LM approaches tend to be biased, often assigning a primary role to time series modality while treating text modality as secondary. We classify these approaches under a temporal-primary paradigm, which may overlook the unique and critical task-relevant information embedded in text modality like clinical reports, thus failing to fully leverage mutual benefits and complementarity of different modalities. To fill this gap, we propose a novel textual-temporal multimodal learning paradigm that enables either modality to serve as the primary while being enhanced by the other, thereby effectively capturing modality-specific information and fostering cross-modal interaction. |
Jiexia Ye; Weiqi Zhang; Ziyue Li; Jia Li; Meng Zhao; Fugee Tsung; |
| 228 | Enhanced Graph Similarity Learning Via Adaptive Multi-scale Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, due to inadequate feature representation, existing methods often struggle to cope with complex graph structures, which in turn limits the feature fusion capability and leads to low accuracy of similarity computation. To address these issues, this paper introduces an Adaptive Multi-scale Feature Fusion (AMFF) framework. |
Cuifang Zou; Guangquan Lu; Wenzhen Zhang; Xuxia Zeng; Shilong Lin; Longqing Du; Shichao Zhang; |
| 229 | Wave-wise Discriminative Tracking By Phase-Amplitude Separation, Augmentation and Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By using both phase and amplitude, it captures richer semantics and specific invariances compared to pixel-based methods, and allows for feature fusion across regions for a holistic image representation. Based on this, we propose the Wave-wise Discriminative Transformer Tracker (WDT). |
Huibin Tan; Mingyu Cao; Kun Hu; Xihuai He; Zhe Wang; Hao Li; Long Lan; Mengzhu Wang; |
| 230 | Approximated Behavioral Metric-based State Projection for Federated Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce FedRAG, an FRL framework that learns a computationally practical projection function of states for each client and aggregates the parameters of projection functions at a central server. |
Zengxia Guo; Bohui An; Zhongqi Lu; |
| 231 | SPARC: An AI-Based Speech Processing and Real-Time Correction System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This project aims to develop a real-time voice correction system that automatically detects and corrects speech errors in near real-time while integrating the adjusted audio into ongoing conversations without disrupting the natural flow. |
TingRay Chung; Pin-Yu Chen; |
| 232 | IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods lack critical components, failing to support intent-driven caption-layout generation and personalized generation, making it difficult to generate high-quality memes. To address this limitation, we propose IterMeme, an end-to-end interactive meme creation framework that utilizes a unified Multimodal Large Language Model (MLLM) to facilitate seamless collaboration among multiple components. |
Yaqi Cai; Shancheng Fang; Yadong Qu; Xiaorui Wang; Meng Shao; Hongtao Xie; |
| 233 | Boosting Zero-shot Stereo Matching Using Large-Scale Mixed Images Sources in The Real World Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel framework, BooSTer, that leverages both vision foundation models and large-scale mixed image sources, including synthetic, real, and single-view images. |
Yuran Wang; Yingping Liang; Ying Fu; |
| 234 | LLM-enhanced Score Function Evolution for Causal Structure Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce L-SFE, a framework designed to automatically discover effective score functions by exploring the "score function space". |
Zidong Wang; Fei Liu; Qi Feng; Qingfu Zhang; Xiaoguang Gao; |
| 235 | Incorporating Visual Experts to Resolve The Information Loss in Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, this work introduces a novel method that incorporates multi-task encoders and existing visual tools into the MLLMs training and inference pipeline, aiming to provide a more comprehensive summarization of visual inputs. |
Xin He; Longhui Wei; Lingxi Xie; Qi Tian; |
| 236 | An Out-Of-Distribution Membership Inference Attack Approach for Cross-Domain Graph Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we categorize the distribution diversity issue in real-world MIA scenarios as an Out-Of-Distribution (OOD) problem, and propose a novel Graph OOD Membership Inference Attack (GOOD-MIA) to achieve cross-domain graph attacks. |
Jinyan Wang; Liu Yang; Yuecen Wei; Jiaxuan Si; Chenhao Guo; Qingyun Sun; Xianxian Li; Xingcheng Fu; |
| 237 | Multi-Scale Temporal Neural Network for Stock Trend Prediction Enhanced By Temporal Hyperedge Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These relationships typically provide contextual insights into market investments influencing stock price fluctuations. To tackle these issues, we propose a Multi-Scale Temporal Neural Network (MSTNN) framework tailored for STP. |
Lingyun Song; Haodong Li; Siyu Chen; Xinbiao Gan; Binze Shi; Jie Ma; Yudai Pan; Xiaoqi Wang; Xuequn Shang; |
| 238 | Metapath and Hypergraph Structure-based Multi-Channel Graph Contrastive Learning for Student Performance Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we use an innovative Multi-Channel Graph Contrastive Learning (MCGCL) framework that integrates various high-order interactions for predicting student performance. |
Lingyun Song; Xiaofan Sun; Xinbiao Gan; Yudai Pan; Xiaolin Han; Jie Ma; Jun Liu; Xuequn Shang; |
| 239 | Coming Out of The Dark: Human Pose Estimation in Low-light Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To alleviate the issue, we construct a Low-Light Images and Poses (LLIP) dataset, which includes only paired low-light images and pose annotations obtained using off-the-shelf motion capture devices. |
Yong Su; Defang Chen; Meng Xing; Changjae Oh; Xuewei Liu; Jieyang Li; |
| 240 | MedDiT: A Knowledge-Controlled Diffusion Transformer Framework for Dynamic Medical Image Generation in Virtual Simulated Patient Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the high cost of recruiting qualified SPs and the lack of diverse medical imaging datasets have presented significant challenges. To address these issues, this paper introduces MedDiT, a novel knowledge-controlled conversational framework that can dynamically generate plausible medical images aligned with simulated patient symptoms, enabling diverse diagnostic skill training. |
Yanzeng Li; Cheng Zeng; Jinchao Zhang; Jie Zhou; Lei Zou; |
| 241 | PanComplex: Leveraging Complex-Valued Neural Networks for Enhanced Pansharpening Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To redefine the pansharpening task, we propose a complex-valued spatial-frequency dual-domain framework, PanComplex. |
Chunhui Luo; Dong Li; Xiaoliang Ma; Xin Lu; Zhiyuan Wang; Jiangtong Tan; Xueyang Fu; |
| 242 | MHANet: Multi-scale Hybrid Attention Network for Auditory Attention Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, most AAD methods solely utilize attention mechanisms sequentially and overlook valuable multi-scale contextual information within EEG signals, limiting their ability to capture long-short range spatiotemporal dependencies simultaneously. To address these issues, this paper proposes a multi-scale hybrid attention network (MHANet) for AAD, which consists of the multi-scale hybrid attention (MHA) module and the spatiotemporal convolution (STC) module. |
Lu Li; Cunhang Fan; Hongyu Zhang; Jingjing Zhang; Xiaoke Yang; Jian Zhou; Zhao Lv; |
| 243 | DisPIM: Distilling PreTrained Image Models for Generalizable Visuo-Motor Control Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce DisPIM, a framework that leverages pretrained image models (PIMs) for visuo-motor control. |
Haitao Wang; Hejun Wu; |
| 244 | Tight Runtime Guarantees From Understanding The Population Dynamics of The GSEMO Multi-Objective Evolutionary Algorithm Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we significantly enhance our understanding of the dynamics of the GSEMO, in particular, for the classic CountingOnesCountingZeros (COCZ) benchmark. |
Benjamin Doerr; Martin S. Krejca; Andre Opris; |
| 245 | EFormer: An Effective Edge-based Transformer for Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent neural heuristics for the Vehicle Routing Problem (VRP) primarily rely on node coordinates as input, which may be less effective in practical scenarios where real cost metrics—such as edge-based distances—are more relevant. To address this limitation, we introduce EFormer, an Edge-based Transformer model that uses edge as the sole input for VRPs. |
Dian Meng; Zhiguang Cao; Yaoxin Wu; Yaqing Hou; Hongwei Ge; Qiang Zhang; |
| 246 | Tree-of-AdEditor: Heuristic Tree Reasoning for Automated Video Advertisement Editing with Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these, we propose a novel framework, Tree-of-AdEditor (ToAE), which constructs a reasoning tree to mimic human editors, and incorporates domain-specific theories and heuristic fact-checking to identify optimal editing solutions. |
Yuqi Zhang; Bin Guo; Nuo Li; Ying Zhang; Shijie Wang; Zhiwen Yu; Qing Li; |
| 247 | Generative Agents for Multimodal Controversy Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods typically provide only classification results, failing to identify what aspects are controversial and why, thereby lacking detailed explanations. To address this limitation, we propose a novel Agent-based Multimodal Controversy Detection architecture, termed AgentMCD. |
Tianjiao Xu; Jinfei Gao; Keyi Kong; Jianhua Yin; Tian Gan; Liqiang Nie; |
| 248 | K-Buffers: A Plug-in Method for Enhancing Neural Fields with Multiple Buffers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a plug-in method named K-Buffers that leverages multiple buffers to improve the rendering performance. |
Haofan Ren; Zunjie Zhu; Xiang Chen; Ming Lu; Rongfeng Lu; Chenggang Yan; |
| 249 | Polynomial-Time Relational Probabilistic Inference in Open Universes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by human reasoning, we introduce a method of first-order relational probabilistic inference that satisfies both criteria, and can handle hybrid (discrete and continuous) variables. |
Luise Ge; Brendan Juba; Kris Nilsson; |
| 250 | Probabilistic Multimodal Learning with Von Mises-Fisher Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite significant progress, existing methods struggle with challenges such as noisy inputs, noisy correspondence, and the inherent uncertainty of multimodal data, limiting their reliability and robustness. To address these issues, this paper presents a novel Probabilistic Multimodal Learning framework (PML) that models each data point as a von Mises-Fisher (vMF) distribution, effectively capturing intrinsic uncertainty and enabling robust fusion. |
Peng Hu; Yang Qin; Yuanbiao Gou; Yunfan Li; Mouxing Yang; Xi Peng; |
| 251 | An Efficient Core-Guided Solver for Weighted Partial MaxSAT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents CASHWMaxSAT, an efficient core-guided MaxSAT solver based on two novel ideas. |
Shiwei Pan; Yiyuan Wang; Shaowei Cai; |
| 252 | RobustX: Robust Counterfactual Explanations Made Easy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce RobustX, an open-source Python library implementing a collection of CE generation and evaluation methods, with a focus on the robustness property. |
Junqi Jiang; Luca Marzari; Aaryan Purohit; Francesco Leofante; |
| 253 | SketchAgent: Generating Structured Diagrams from Hand-Drawn Sketches Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To evaluate the effectiveness of our approach, we propose the Sketch2Diagram Benchmark, a comprehensive dataset and evaluation framework encompassing eight diverse diagram categories, such as flowcharts, directed graphs, and model architectures. |
Cheng Tan; Qi Chen; Jingxuan Wei; Gaowei Wu; Zhangyang Gao; Siyuan Li; Bihui Yu; Ruifeng Guo; Stan Z. Li; |
| 254 | CLLMRec: Contrastive Learning with LLMs-based View Augmentation for Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, limited by the scarcity and suboptimal quality of data, these methods struggle to capture subtle differences in user sequences, which results in diminished recommendation accuracy. To address the above issue, we propose a contrastive learning framework with LLMs-based view augmentation (CLLMRec), which effectively mines differences in behavioral sequences through sample generation. |
Fan Lu; Xiaolong Xu; Haolong Xiang; Lianyong Qi; Xiaokang Zhou; Fei Dai; Wanchun Dou; |
| 255 | Multimodal Fake News Detection: MFND Dataset and Shallow-Deep Multitask Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Under shallow inference, we propose the momentum distillation-based light punishment contrastive learning for fine-grained uniform spatial image and text semantic alignment, and an adaptive cross-modal fusion module to enhance mutual modal features. |
Ye Zhu; Yunan Wang; Zitong Yu; |
| 256 | From End-to-end to Step-by-step: Learning to Abstract Via Abductive Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Abductive Abstract Reinforcement Learning (A2RL), a novel neuro-symbolic RL framework bridging the two paradigms based on Abductive Learning (ABL), enabling RL agents to learn abstractions directly from raw sensory inputs without predefined symbols. A2RL induces a finite state machine to represent high-level, step-by-step procedures, where each abstract state corresponds to a sub-algebra of the original Markov Decision Process (MDP). |
Zilong Wang; Jiongda Wang; Xiaoyong Chen; Meng Wang; Ming Ma; ZhiPeng Wang; Zhenyu Zhou; Tianming Yang; Wang-Zhou Dai; |
| 257 | AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel one-step point-based image editing method, named AttentionDrag, which leverages the inherent latent knowledge and feature correlations within pre-trained diffusion models for image editing tasks. |
Biao Yang; Muqi Huang; Yuhui Zhang; Yun Xiong; Kun Zhou; Xi Chen; Shiyang Zhou; Huishuai Bao; Chuan Li; Feng Shi; Hualei Liu; |
| 258 | LEKA: LLM-Enhanced Knowledge Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The more complex task is teaching models about which knowledge can be analogized and transferred. Therefore, we design a knowledge augmentation method, LEKA, for knowledge transfer that actively searches for suitable knowledge sources that can enrich the target domain’s knowledge. |
Xinhao Zhang; Jinghan Zhang; Fengran Mo; Dongjie Wang; Yanjie Fu; Kunpeng Liu; |
| 259 | Dynamic and Adaptive Feature Generation with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These shortcomings frequently hinder and limit the deployment of ML models across varied scenarios. Our research introduces a novel approach adopting large language models (LLMs) and feature-generating prompts to address these challenges. |
Xinhao Zhang; Jinghan Zhang; Banafsheh Rekabdar; Yuanchun Zhou; Pengfei Wang; Kunpeng Liu; |
| 260 | Antibody Design and Optimization with Multi-scale Equivariant Graph Diffusion Models for Accurate Complex Antigen Binding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite recent advancements, these methods often fail to accurately capture molecular interactions and maintain structural integrity. To address these challenges, we propose AbMEGD, an end-to-end framework integrating Multi-scale Equivariant Graph Diffusion for antibody sequence and structure co-design. |
Jiameng Chen; Xiantao Cai; Jia Wu; Wenbin Hu; |
| 261 | Binary Event-Driven Spiking Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we integrate binarization techniques into Transformer-based SNNs and propose the Binary Event-Driven Spiking Transformer, i.e. BESTformer. |
Honglin Cao; Zijian Zhou; Wenjie Wei; Yu Liang; Ammar Belatreche; Dehao Zhang; Malu Zhang; Yang Yang; Haizhou Li; |
| 262 | Heterogeneous Temporal Hypergraph Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Graph representation learning (GRL) has emerged as an effective technique for modeling graph-structured data. When modeling heterogeneity and dynamics in real-world complex … |
Huan Liu; Pengfei Jiao; Mengzhou Gao; Chaochao Chen; Di Jin; |
| 263 | Fast Second-Order Online Kernel Learning Through Incremental Matrix Sketching and Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, the singular value decomposition required to obtain explicit feature mapping is computationally expensive due to the complete decomposition process. To address these issues, we propose FORKS, a fast incremental matrix sketching and decomposition approach tailored for second-order OKL. |
Dongxie Wen; Xiao Zhang; Zhewei Wei; Chenping Hou; Shuai Li; Weinan Zhang; |
| 264 | Uncertainty-aware Predict-Then-Optimize Framework for Equitable Post-Disaster Power Restoration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This disparity makes the current restoration solution inequitable, leaving these communities vulnerable to extended power outages. To address this, we aim to propose an equity-aware power restoration strategy that balances both restoration efficiency and equity across communities. |
Lin Jiang; Dahai Yu; Rongchao Xu; Tian Tang; Guang Wang; |
| 265 | HCRide: Harmonizing Passenger Fairness and Driver Preference for Human-Centered Ride-Hailing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, in this work, we aim to design a human-centered ride-hailing system by considering both passenger fairness and driver preference without compromising the overall system efficiency. |
Lin Jiang; Yu Yang; Guang Wang; |
| 266 | MGCA-Net: Multi-Graph Contextual Attention Network for Two-View Correspondence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods have limitations in local geometric modeling and cross-stage information optimization, which make it difficult to accurately capture the geometric constraints of matched pairs and thus reduce the robustness of the model. To address these challenges, we propose a Multi-Graph Contextual Attention Network (MGCA-Net), which consists of a Contextual Geometric Attention (CGA) module and a Cross-Stage Multi-Graph Consensus (CSMGC) module. |
Shuyuan Lin; Mengtin Lo; Haosheng Chen; Yanjie Liang; Qiangqiang Wu; |
| 267 | TsCA: On The Semantic Consistency Alignment Via Conditional Transport for Compositional Zero-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Concretely, we utilize three distinct yet semantically homologous sets, i.e., patches, primitives, and compositions, to construct pairwise CT costs to minimize their semantic discrepancies. |
Miaoge Li; Jingcai Guo; Richard Yi Da Xu; Dongsheng Wang; Xiaofeng Cao; Zhijie Rao; Song Guo; |
| 268 | Robustness to Spurious Correlations Via Dynamic Knowledge Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Spurious correlations pose a significant challenge to the robustness of statistical models, often resulting in unsatisfactory performance when distributional shifts occur between training and testing data. To address this, we propose to transfer knowledge across spuriously correlated categories within the deep feature space. |
Xiaoling Zhou; Wei Ye; Zhemg Lee; Shikun Zhang; |
| 269 | Meta Label Correction with Generalization Regularizer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, previous label correction methods for dealing with noisy labels often need expensive computation cost to achieve effectiveness and ignore the generalization ability of the model. To address these issues, in this paper, we propose a new meta-based self-correction method to achieve accurate filtering of noisy labels and to enhance the generalization ability of the label correction model. |
Tao Tong; Yujie Mo; Yucheng Xie; Songyue Cai; Xiaoshuang Shi; Xiaofeng Zhu; |
| 270 | Learning Accurate and Interpretable Decision Trees (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we develop a data-driven approach to design decision tree learning algorithms given repeated access to data from the same domain. |
Maria-Florina Balcan; Dravyansh Sharma; |
| 271 | ESBN: Estimation Shift of Batch Normalization for Source-free Universal Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a novel method, ESBN, which addresses the challenge of domain shift by adjusting the placement of normalization layers and replacing BN with Batch-free Normalization (BFN). |
Jiao Li; Houcheng Su; Bingli Wang; Yuandong Min; Mengzhu Wang; Nan Yin; Shanshan Wang; Jingcai Guo; |
| 272 | Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method Via Multi-LoRA Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, human learning also requires System 2 thinking, where knowledge is first acquired and then reinforced through practice. Inspired by such two distinct modes of thinking, we propose a novel method based on the multi-LoRA Interaction for mathematical reasoning Distillation (LoRID). |
Xinhe Li; Jiajun Liu; Peng Wang; |
| 273 | Intoner: For Chinese Poetry Intoning Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing text-to-speech models lack the ability to generate melodious audio, while singing-voice-synthesis models rely on predetermined musical scores, which are all unsuitable for intoning synthesis. Hence, we introduce Chinese Poetry Intoning Synthesis (PIS) as a novel task to reproduce intoning audio and preserve this age-old cultural art. |
Heda Zuo; Liyao Sun; Zeyu Lai; Weitao You; Pei Chen; Lingyun Sun; |
| 274 | TESTN: A Triad-Enhanced Spatio-Temporal Network for Multi-Temporal POI Relationship Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While existing studies have made substantial efforts to model relationships with custom-designed graph neural networks, they face the challenge of leveraging POI contextual information characterized by spatial dependencies and temporal dynamics, as well as capturing the heterogeneity of multi-type relationships. To address these challenges, we propose a Triad-Enhanced Spatio-Temporal Network (TESTN), which conceptualizes triads as interactions between relationships for capturing potential interplay. |
Hongyu Wang; Lisi Chen; Shuo Shang; |
| 275 | Object-Level Backdoor Attacks in RGB-T Semantic Segmentation with Cross-Modality Trigger Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We overcome the critical limitation of current segmentation backdoor attacks that indiscriminately compromise all objects of a victim class, failing to provide fine-grained control for selectively targeting specific objects as required by adversaries. To address this, we introduce a novel Object-level Backdoor Attack pipeline, termed OBA. |
Xianghao Jiao; Di Wang; Jiawei Liang; Jianjie Huang; Wei Wang; Xiaochun Cao; |
| 276 | Odyssey: Empowering Minecraft Agents with Open-World Skills Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Odyssey, a new framework that empowers Large Language Model (LLM)-based agents with open-world skills to explore the vast Minecraft world. |
Shunyu Liu; Yaoru Li; Kongcheng Zhang; Zhenyu Cui; Wenkai Fang; Yuxuan Zheng; Tongya Zheng; Mingli Song; |
| 277 | Hand By Hand: LLM Driving EMS Assistant for Operational Skill Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we introduced an "Align-Analyze-Adjust" strategy and developed FlightAxis, a tool that integrates LLM with Electrical Muscle Stimulation (EMS) for flight skill acquisition, a representative operational skill domain. |
Wei Xiang; Ziyue Lei; Haoyuan Che; Fangyuan Ye; Xueting Wu; Lingyun Sun; |
| 278 | ID-RemovalNet: Identity Removal Network for EEG Privacy Protection with Enhancing Decoding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods also damage the performance of the decoding task. To solve these problems, this paper proposes an identity removal network (ID-RemovalNet) to achieve EEG privacy protection while improving the classification accuracy of the decoding task. |
Huabin Wang; Jie Ruan; Cunhang Fan; Yingfan Cheng; Zhao Lv; |
| 279 | LivePoem: Improving The Learning Experience of Classical Chinese Poetry with AI-Generated Musical Storyboards Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper aims to improve the experience of classical Chinese poetry learning by introducing LivePoem—a system that generates musical storyboards (storyboards with background music) as audiovisual aids to support poetry comprehension. |
Qihao Liang; Xichu Ma; Torin Hopkins; Ye Wang; |
| 280 | Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they lack a comprehensive analysis of the inference paths, and the interference from confounding factors limits their performance. To address these limitations, we propose the Federated Deconfounding and Debiasing Learning (FedDDL) method. |
Zhuang Qi; Sijin Zhou; Lei Meng; Han Hu; Han Yu; Xiangxu Meng; |
| 281 | Adaptive Deep Learning from Crowds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we propose a probabilistic model to capture the informativeness of possible instances for each worker. |
Hang Yang; Zhiwu Li; Witold Pedrycz; |
| 282 | Leveraging Personalized PageRank and Higher-Order Topological Structures for Heterophily Mitigation in Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This leads to suboptimal performance, particularly under noise from conflicting class information across nodes. To address these challenges, we propose HPGNN, a novel model integrating Higher-order Personalized PageRank with Graph Neural Networks. |
Yumeng Wang; Zengyi Wo; Wenjun Wang; Xingcheng Fu; Minglai Shao; |
| 283 | Map2Traj: Street Map Piloted Zero-shot Trajectory Generation Method for Wireless Network Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces Map2Traj, a novel zero-shot trajectory generation method that leverages the diffusion model to capture the intrinsic relationship between street maps and user mobility. |
Zhenyu Tao; Wei Xu; Xiaohu You; |
| 284 | DECASTE: Unveiling Caste Stereotypes in Large Language Models Through Multi-Dimensional Bias Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A critical and underexplored issue is the reinforcement of caste-based biases, particularly towards India’s marginalized caste groups such as Dalits and Shudras. In this paper, we address this gap by proposing DECASTE, a novel, multi-dimensional framework designed to detect and assess both implicit and explicit caste biases in LLMs. |
Prashanth Vijayaraghavan; Soroush Vosoughi; Lamogha Chiazor; Raya Horesh; Rogerio Abreu de Paula; Ehsan Degan; Vandana Mukherjee; |
| 285 | Maximin Share Guarantees for Few Agents with Subadditive Valuations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the problem of fairly allocating a set of indivisible items among a set of agents. |
George Christodoulou; Vasilis Christoforidis; Symeon Mastrakoulis; Alkmini Sgouritsa; |
| 286 | Exploiting Text Semantics for Few and Zero Shot Node Classification on Text-attributed Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Text Semantics Augmentation (TSA) to improve accuracy by introducing more text semantic supervision signals. |
Yuxiang Wang; Xiao Yan; Shiyu Jin; Quanqing Xu; Chuang Hu; Yuanyuan Zhu; Bo Du; Jia Wu; Jiawei Jiang; |
| 287 | Latte: Transfering LLMs’ Latent-level Knowledge for Few-shot Tabular Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite promising results, existing approaches either rely on test-time knowledge extraction, which introduces undesirable latency, or text-level knowledge, which leads to unreliable feature engineering. To overcome these limitations, we propose Latte, a training-time knowledge extraction framework that transfers the latent prior knowledge within LLMs to optimize a more generalized downstream model. |
Ruxue Shi; Hengrui Gu; Hangting Ye; Yiwei Dai; Xu Shen; Xin Wang; |
| 288 | GBGC: Efficient and Adaptive Graph Coarsening Via Granular-ball Computing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, inspired by the application of granular-ball computing in multi-granularity, we propose a new multi-granularity, efficient, and adaptive coarsening method via granular-ball (GBGC), which significantly improves the coarsening results and efficiency. |
Shuyin Xia; Guan Wang; Gaojie Xu; Sen Zhao; Guoyin Wang; |
| 289 | Granular-Ball-Induced Multiple Kernel K-Means Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we leverage granular-ball computing to improve the multi-kernel clustering framework. The core of granular-ball computing is to adaptively fit data distribution by balls from coarse to acceptable levels. |
Shuyin Xia; Yifan Wang; Lifeng Shen; Guoyin Wang; |
| 290 | SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While contemporary speech separation technologies adeptly process lengthy mixed audio waveforms, they are frequently challenged by the intricacies of real-world environments, including noisy and reverberant settings, which can result in artifacts or distortions in the separated speech. To overcome these limitations, we introduce SepALM, a pioneering approach that employs audio language models (ALMs) to rectify and re-synthesize speech within the text domain following preliminary separation. |
Zhaoxi Mu; Xinyu Yang; Gang Wang; |
| 291 | A³-Net: Calibration-Free Multi-View 3D Hand Reconstruction for Enhanced Musical Instrument Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on calibration-free multi-view 3D hand reconstruction in unconstrained scenarios. |
Geng Chen; Xufeng Jian; Yuchen Chen; Pengfei Ren; Jingyu Wang; Haifeng Sun; Qi Qi; Jing Wang; Jianxin Liao; |
| 292 | Constructive Conflict-Driven Multi-Agent Reinforcement Learning for Strategic Diversity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods predominantly focus on designing policies based on individual agent characteristics, often neglecting the interplay and mutual influence among agents during policy formation. To address this gap, we propose Competitive Diversity through Constructive Conflict (CoDiCon), a novel approach that incorporates competitive incentives into cooperative scenarios to encourage policy exchange and foster strategic diversity among agents. |
Yuxiang Mai; Qiyue Yin; Wancheng Ni; Pei Xu; Kaiqi Huang; |
| 293 | Flexible Generalized Low-Rank Regularizer for Tensor RPCA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we design a novel tensor low-rank regularization framework coined FGTNN (Flexible Generalized Tensor Nuclear Norm). |
Zhiyang Gong; Jie Yu; Yutao Hu; Yulong Wang; |
| 294 | Understanding Visual Detail Hallucinations of Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate the capability of advanced large vision-language models (LVLMs) to recognize and interpret small objects in visual data. |
Xiaoxi Sun; Jianxin Liang; Yueqian Wang; Huishuai Zhang; Dongyan Zhao; |
| 295 | OS-GCL: A One-Shot Learner in Graph Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consequently, the one-shot learning nature of GCL leads to the issue of the limited self-supervised signal. To further address the above issue, we propose a One-Shot Learner in Graph Contrastive Learning (OS-GCL). |
Cheng Ji; Chenrui He; Qian Li; Qingyun Sun; Xingcheng Fu; Jianxin Li; |
| 296 | Theoretical Insights Into Fine-Tuning Attention Mechanism: Generalization and Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we explore two remarkable phenomena related to the attention mechanism during the fine-tuning of LLMs (where Wq, Wk, and Wv denote the weights of the query, key, and value layers, respectively). |
Xinhao Yao; Hongjin Qian; Xiaolin Hu; Gengze Xu; Wei Liu; Jian Luan; Bin Wang; Yong Liu; |
| 297 | Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Being aware of these, we propose a novel framework named multimodal large language model (MLLM) embeddings and attribute smoothing guided disentanglement for CZSL. |
Xudong Yan; Songhe Feng; Yang Zhang; Jian Yang; Yueguan Lin; Haojun Fei; |
| 298 | TextMEF: Text-guided Prompt Learning for Multi-exposure Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite significant advancements, current MEF approaches still struggle to handle extremely over- or under-exposed conditions, resulting in unsatisfactory visual effects such as hallucinated details and distorted color tones. In this regard, we propose TextMEF, a prompt-driven fusion method enhanced by prompt learning, for multi-exposure image fusion. |
Jinyuan Liu; Qianjun Huang; Guanyao Wu; Di Wang; Zhiying Jiang; Long Ma; Risheng Liu; Xin Fan; |
| 299 | LLM-based Collaborative Agents with Pedagogy-guided Interaction Modeling for Timely Instructive Feedback Generation in Task-oriented Group Discussions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an LLM-based collaborative agent that innovatively leverages pedagogical strategies to sense discussion stages, detect learning issues, identify the timing of intervention, and generate instructive feedback. |
Qihao Yang; Yu Yang; Sixu An; Tianyong Hao; Guandong Xu; |
| 300 | Categorical Attention: Fine-grained Language-guided Noise Filtering Network for Occluded Person Re-Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the Fine-grained Language-guided Noise Filtering Network (FLaN-Net) for occluded ReID. |
Minghui Chen; Dayan Wu; Chenxu Yang; Qinghang Su; Zheng Lin; |
| 301 | Veracity: An Open-Source AI Fact-Checking System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This demo paper introduces Veracity, an open-source AI system designed to empower individuals to combat misinformation through transparent and accessible fact-checking. |
Taylor Lynn Curtis; Maximilian Puelma Touzel; William Garneau; Manon Gruaz; Mike Pinder; Li Wei Wang; Sukanya Krishna; Luda Cohen; Jean-François Godbout; Reihaneh Rabbany; Kellin Pelrine; |
| 302 | Understanding Matters: Semantic-Structural Determined Visual Relocalization for Large Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the above-mentioned issues, we propose the Semantic-Structural Determined Visual Relocalization method for SCR, which leverages semantic-structural partition learning and partition-determined pose refinement to better understand the semantic and structural information on large scenes. |
Jingyi Nie; Liangliang Cai; Qichuan Geng; Zhong Zhou; |
| 303 | MultiDreamer3D: Multi-concept 3D Customization with Concept-Aware Diffusion Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While single-concept customization has been studied in 3D, multi-concept customization remains largely unexplored. To address this, we propose MultiDreamer3D that can generate coherent multi-concept 3D content in a divide-and-conquer manner. |
Wooseok Song; Seunggyu Chang; Jaejun Yoo; |
| 304 | Empowering Multimodal Road Traffic Profiling with Vision Language Models and Frequency Spectrum Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the joint modeling and multimodal fusion of the textual and visual modalities have been rarely studied in road traffic profiling, which largely hinders the accurate prediction or classification of traffic conditions. To address this issue, we propose a novel multimodal learning and fusion framework for road traffic profiling, named TraffiCFUS. |
Haolong Xiang; Xiaolong Xu; Guangdong Wang; Xuyun Zhang; Xiaoyong Li; Qi Zhang; Amin Beheshti; Wei Fan; |
| 305 | Human Motion Capture from Loose and Sparse Inertial Sensors with Garment-aware Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, this assumption often does not hold in real-world scenarios. In this paper, we present a new task of full-body human pose estimation using sparse, loosely attached IMU sensors. To solve this task, we simulate IMU recordings from an existing garment-aware human motion dataset. We developed transformer-based diffusion models to synthesize loose IMU data and estimate human poses based on this challenging loose IMU data. In addition, we show that incorporating garment-related parameters while training the model on simulated loose data effectively maintains expressiveness and enhances the ability to capture variations introduced by looser or tighter garments. |
Andela Ilic; Jiaxi Jiang; Paul Streli; Xintong Liu; Christian Holz; |
| 306 | GCNT: Graph-Based Transformer Policies for Morphology-Agnostic Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose GCNT, a morphology-agnostic policy network based on improved Graph Convolutional Network (GCN) and Transformer. |
Yingbo Luo; Meibao Yao; Xueming Xiao; |
| 307 | High-Confident Local Structure Guided Consensus Graph Learning For Incomplete Multi-view Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, instances with weak discriminative features usually degrade the precision of the consistent representation or graph across all views. To address these problems, in this paper, we propose a simple but efficient method, called high-confident local structure guided consensus graph learning for incomplete multi-view clustering (HLSCG_IMC). |
Shuping Zhao; Lunke Fei; Qi Lai; Jie Wen; Jinrong Cui; Tingting Chai; |
| 308 | Learn to Think: Bootstrapping LLM Logic Through Graph Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although existing methods have extended the reasoning capabilities of LLMs through structured paradigms, these approaches often rely on task-specific prompts and predefined reasoning processes, which constrain their flexibility and generalizability. To address these limitations, we propose a novel framework that leverages graph learning to enable more flexible and adaptive reasoning capabilities for LLMs. |
Hang Gao; Chenhao Zhang; Tie Wang; Junsuo Zhao; Fengge Wu; Changwen Zheng; Huaping Liu; |
| 309 | Beyond Feature Mapping GAP: Integrating Real HDRTV Priors for Superior SDRTV-to-HDRTV Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by generative approaches, we propose a novel method for SDRTV to HDRTV conversion guided by real HDRTV priors. |
Gang He; Kepeng Xu; Li Xu; Wenxin Yu; Xianyun Wu; |
| 310 | FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we address the challenges of non-IID (not independent and identically distributed) data environments featuring multiple groups of images of different types. |
Chen Hu; Hanchi Ren; Jingjing Deng; Xianghua Xie; Xiaoke Ma; |
| 311 | App2Exa: Accelerating Exact KNN Search Via Dynamic Cache-Guided Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This creates a significant gap in accelerating exact kNN search for low-to-medium dimensional data with dynamic query distributions. To fill this gap, we propose App2Exa, a cache-guided framework that integrates approximate and exact kNN search. |
Ke Li; Leong Hou U; Shuo Shang; |
| 312 | Phenotypic Profile-Informed Generation of Drug-Like Molecules Via Dual-Channel Variational Autoencoders Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, previous methods predominantly rely on expression profiles to guide molecule generation, but overlook the perturbative effect of the molecules on cellular contexts. To overcome this limitation, we propose SmilesGEN, a novel generative model based on variational autoencoder (VAE) architecture to generate molecules with potential therapeutic effects. |
Hui Liu; Shiye Tian; Xuejun Liu; |
| 313 | Towards Debiased Generalized Category Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, we delve into the reason behind this problem: the GCD classifier can be overconfident and biased towards the new class. With this insight, we propose Debiased GCD (DeGCD), a simple but effective approach that mitigates the bias caused by the overconfidence from new categories by a debiased head. |
Pengcheng Guo; Yonghong Song; Boyu Wang; |
| 314 | Universal Graph Self-Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, the difficulty in selecting TRUE positive and negative samples for GCLs limits their universality to both homophilic and heterophilic graphs. To address these drawbacks, this paper introduces a novel GCL framework called GRAph learning via Self-contraSt (GRASS). |
Liang Yang; Yukun Cai; Hui Ning; Jiaming Zhuo; Di Jin; Ziyi Ma; Yuanfang Guo; Chuan Wang; Zhen Wang; |
| 315 | Transferable Relativistic Predictor: Mitigating Cross-Task Cold-Start Issue in NAS Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a transferable relativistic predictor (TRP). |
Nan Li; Bing Xue; Lianbo Ma; Mengjie Zhang; |
| 316 | INFP: INdustrial Video Anomaly Detection Via Frequency Prioritization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Directly applying conventional methods to industrial scenarios can result in an inability to focus on products moving along fixed trajectories, ineffective utilization of their equidistant periodicity, and greater susceptibility to lighting variations. To address these issues, we propose FreqNet, an encoder-decoder framework that learns frequency-domain features from videos to capture periodic and dynamic characteristics, enhancing the model’s robustness. |
Qianzi Yu; Kai Zhu; Yang Cao; Yu Kang; |
| 317 | SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, they encounter significant challenges when addressing complex problems that require reasoning and question decomposition. To tackle this, we propose a self-driven reasoning augmentation process, SRA-MCTS, which incorporates Monte Carlo Tree Search (MCTS) for reasoning data generation. |
Bin Xu; Yiguan Lin; Yinghao Li; Yang Gao; |
| 318 | Can Retelling Have Adequate Information for Reasoning? An Enhancement Method for Imperfect Video Understanding with Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given the proven assessment of the strong reasoning capabilities of LLMs, this paper proposes ERSR, a novel Entity and Relationship based Self-Enhanced Reasoning method for imperfect video understanding. |
Mingxin Li; Wenhao Wang; Hongru Ji; Xianghua Li; Chao Gao; |
| 319 | DERI: Cross-Modal ECG Representation Learning with Deep ECG-Report Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although existing self-supervised learning (SSL) methods have achieved great performance in learning representation for ECG-based cardiac conditions classification, the clinical semantics can not be effectively captured. To overcome this limitation, we proposed to learn cross-modal ECG representations that contain more clinical semantics via a novel framework with Deep ECG-Report Interaction (DERI). |
Jian Chen; Xiaoru Dong; Wei Wang; Shaorui Zhou; Lequan Yu; Xiping Hu; |
| 320 | MSCI: Addressing CLIP’s Inherent Limitations for Compositional Zero-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing studies basically rely on the cross-modal alignment capabilities of CLIP but tend to overlook its limitations in capturing fine-grained local features, which arise from its architectural and training paradigm. To address this issue, we propose a Multi-Stage Cross-modal Interaction (MSCI) model that effectively explores and utilizes intermediate-layer information from CLIP’s visual encoder. |
Yue Wang; Shuai Xu; Xuelin Zhu; Yicong Li; |
| 321 | State Revisit and Re-explore: Bridging Sim-to-Real Gaps in Offline-and-Online Reinforcement Learning with An Imperfect Simulator Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, due to the unrestricted exploration in the imperfect simulator, the hybrid offline-and-online RL methods inevitably suffer from low sample efficiency and insufficient state-action space coverage during training. To solve this problem, we propose a State Revisit and Re-exploration (SR2) hybrid offline-and-online RL framework. |
Xingyu Chen; Jiayi Xie; Zhijian Xu; Ruixun Liu; Shuai Yang; Zeyang Liu; Lipeng Wan; Xuguang Lan; |
| 322 | Most General Explanations of Tree Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we show how to find a most general abductive explanation for an AI decision. |
Yacine Izza; Alexey Ignatiev; Sasha Rubin; Joao Marques-Silva; Peter J. Stuckey; |
| 323 | Imagination-Limited Q-Learning for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To balance exploitation and restriction, we propose an Imagination-Limited Q-learning (ILQ) method, which aims to maintain the optimism that OOD actions deserve within appropriate limits. |
Wenhui Liu; Zhijian Wu; Jingchao Wang; Dingjiang Huang; Shuigeng Zhou; |
| 324 | Improving Generalization in Meta-Learning Via Meta-Gradient Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work proposes a data-independent Meta-Gradient Augmentation (MGAug) method from the perspective of gradient regularization. |
Ren Wang; Haoliang Sun; Yuxiu Lin; Xinxin Zhang; Yilong Yin; |
| 325 | On The Learning with Augmented Class Via Forests Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we focus on learning with augmented class via forests, where an augmented class may appear in testing data yet not in training data. |
Fan Xu; Wuyang Chen; Wei Gao; |
| 326 | CoderAgent: Simulating Student Behavior for Personalized Programming Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this challenge, many approaches attempt to simulate learner practice data, yet they often overlook the fine-grained, iterative nature of programming learning, resulting in a lack of interpretability and granularity. To fill this gap, we propose an LLM-based agent, CoderAgent, to simulate students’ programming processes in a fine-grained manner without relying on real data. |
Yi Zhan; Qi Liu; Weibo Gao; Zheng Zhang; Tianfu Wang; Shuanghong Shen; Junyu Lu; Zhenya Huang; |
| 327 | Disentangled and Personalized Representation Learning for Next Point-of-Interest Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This hinders the effective utilization of context information, and diverse user preferences are also neglected. To tackle these limitations, we propose Disentangled and Personalized Representation Learning (DPRL) as a novel method for next POI recommendation. |
Xuan Rao; Shuo Shang; Lisi Chen; Renhe Jiang; Peng Han; |
| 328 | TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation By Customizing Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recently, multimodal large language models (MLLMs) have received much attention for their impressive capabilities. The evaluation of MLLMs is becoming critical to analyzing … |
Yuxuan Xie; Tianhua Li; Wenqi Shao; Kaipeng Zhang; |
| 329 | All Roads Lead to Rome: Exploring Edge Distribution Shifts for Heterophilic Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces H₂OGNN, a novel framework that reframes edge attribute inference as an out-of-distribution (OOD) detection problem. |
Yi Wang; Changqin Huang; Ming Li; Tingyi Cai; Zhonglong Zheng; Xiaodi Huang; |
| 330 | Dividing Conflicting Items Fairly Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We significantly extend this result by establishing that a maximal EF1 allocation exists for any graph when the two agents have monotone valuations. To compute such an allocation, we present a polynomial-time algorithm for additive valuations, as well as a pseudo-polynomial time algorithm for monotone valuations. |
Ayumi Igarashi; Pasin Manurangsi; Hirotaka Yoneda; |
| 331 | The Core of Approval-Based Committee Elections with Few Seats Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We prove that core committees always exist when k ≤ 8, for any number of candidates m and any number of voters n, by showing that the Proportional Approval Voting (PAV) rule, proposed by Thiele in 1895, always satisfies the core when k ≤ 7 and always selects at least one committee in the core when k = 8. |
Dominik Peters; |
| 332 | Optimized View and Geometry Distillation from Multi-view Diffuser Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we consider the radiance field optimized during geometry extraction as a more rigid consistency prior, compared to volume and ray aggregation used in previous works. |
Youjia Zhang; Zikai Song; Junqing Yu; Yawei Luo; Wei Yang; |
| 333 | Parameterized Approximation Algorithm for Doubly Constrained Fair Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study Fixed-Parameter Tractable (FPT) approximation algorithms for doubly constrained fair clustering under the k-median objective, referred to as Df-k-Med. |
Xiaoliang Wu; Qilong Feng; Junyu Huang; Jianxin Wang; |
| 334 | BinMetric: A Comprehensive Binary Code Analysis Benchmark for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In summary, BinMetric makes a significant step forward in measuring binary analysis capabilities of LLMs, establishing a new benchmark leaderboard, and our study offers valuable insights for advancing LLMs in software security. |
Xiuwei Shang; Guoqiang Chen; Shaoyin Cheng; Benlong Wu; Li Hu; Gangyang Li; Weiming Zhang; Nenghai Yu; |
| 335 | M^2LLM: Multi-view Molecular Representation Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent advancements in large language models (LLMs) demonstrate remarkable reasoning abilities and prior knowledge across scientific domains, leading us to hypothesize that LLMs can generate rich molecular representations when guided to reason in multiple perspectives. To address these gaps, we propose M^2LLM, a multi-view framework that integrates three perspectives: the molecular structure view, the molecular task view, and the molecular rules view. |
Jiaxin Ju; Yizhen Zheng; Huan Yee Koh; Can Wang; Shirui Pan; |
| 336 | LRGR: Self-Supervised Incomplete Multi-View Clustering Via Local Refinement and Global Realignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Incomplete Multi-View Clustering (IMVC) aims to explore comprehensive representations from multiple views with missing samples. Recent studies have revealed that IMVC methods … |
Yanwanyu Xi; Xiao Zheng; Chang Tang; Xingchen Hu; Yuanyuan Liu; Jun-Jie Huang; Xinwang Liu; |
| 337 | M4Bench: A Benchmark of Multi-domain Multi-granularity Multi-image Understanding for Multi-modal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce M4Bench to enhance the capability of aligning and distinguishing multi-images with multi-domain multi-granularity comparison. |
Xiaojun Ye; Guanbao Liang; Chun Wang; Liangcheng Li; Pengfei Ke; Rui Wang; Bingxin Jia; Gang Huang; Qiao Sun; Sheng Zhou; |
| 338 | Multi-view Clustering Via Multi-granularity Ensemble Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Meanwhile, representation and graph fusion-based approaches face challenges such as explicit view alignment and manual weight tuning, making them less effective for heterogeneous views with varying data distributions. To address these limitations, we propose a novel multi-view clustering framework via Multi-granularity Ensemble (MGE), fully using the multi-granularity information across diverse views for accurate and consistent clustering. |
Jie Yang; Wei Chen; Feng Liu; Peng Zhou; Zhongli Wang; Xinyan Liang; Bingbing Jiang; |
| 339 | On The Discrimination and Consistency for Exemplar-Free Class Incremental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In EF-CIL, task-id prediction is more challenging due to the lack of inter-task interaction (e.g., replays of exemplars). To address this issue, we conduct a theoretical analysis of the importance and feasibility of preserving a discriminative and consistent feature space, upon which we propose a novel method termed DCNet. |
Tianqi Wang; Jingcai Guo; Depeng Li; Zhi Chen; |
| 340 | 2D Gaussian Splatting for Outdoor Scene Decomposition and Relighting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose outdoor scene decomposition and relighting with 2D Gaussian splatting (OSDR-GS), a novel inverse rendering strategy under outdoor changing and unknown lighting conditions. |
Wei Feng; Kangrui Ye; Qi Zhang; Qian Zhang; Nan Li; |
| 341 | Game Theory Meets Large Language Models: A Systematic Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Finally, we identify key challenges and future research directions, assessing their feasibility based on the current state of the field. By bridging theoretical rigor with emerging AI capabilities, this survey aims to foster interdisciplinary collaboration and drive progress in this evolving research area. |
Haoran Sun; Yusen Wu; Yukun Cheng; Xu Chu; |
| 342 | DGL: Dynamic Global-Local Information Aggregation for Scalable VRP Generalization with Self-Improvement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While neural network-based VRP solvers have shown impressive results on test instances similar to training data, their performance often degrades when faced with varying scales and unseen distributions, limiting their practical applicability. To overcome these limitations, we introduce DGL (Dynamic Global-Local Information Aggregation), a novel model that combines global and local information to effectively solve VRPs. |
Yubin Xiao; Yuesong Wu; Rui Cao; Di Wang; Zhiguang Cao; Xuan Wu; Peng Zhao; Yuanshu Li; You Zhou; Yuan Jiang; |
| 343 | Diff-LMM: Diffusion Teacher-Guided Spatio-Temporal Perception for Video Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This leads to hallucinated results when interpreting fine-grained objects or scenes. To address these limitations, we propose a novel framework that integrates diffusion models into multimodal video models. |
Jisheng Dang; Ligen Chen; Jingze Wu; Ronghao Lin; Bimei Wang; Yun Wang; Liting Wang; Nannan Zhu; Teng Wang; |
| 344 | FedHAN: A Cache-Based Semi-Asynchronous Federated Learning Framework Defending Against Poisoning Attacks in Heterogeneous Clients Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current recovery methods, which are based on historical update records, are limited in environments with device heterogeneity and asynchronous communication. To address these problems, we introduce FedHAN, a reliable federated learning algorithm designed for asynchronous communication and device heterogeneity. |
Xiaoding Wang; Bin Ye; Li Xu; Lizhao Wu; Sun-Yuan Hsieh; Jie Wu; Limei Lin; |
| 345 | Multimodal Retina Image Analysis Survey: Datasets, Tasks and Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Changes in the retina often indicate both ophthalmic and systemic diseases, aiding in diagnosis and early intervention. While deep learning algorithms have advanced retina image analysis, a comprehensive review of related datasets, tasks, and benchmarking is still lacking. In this survey, we systematically categorize existing retina image datasets based on their available data modalities, and review the tasks these datasets support in multimodal retina image analysis. |
Hongwei Sheng; Heming Du; Xin Shen; Sen Wang; Xin Yu; |
| 346 | Revealing Concept Shift in Spatio-Temporal Graphs Via State Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The reason is that some environment variables in dynamic graphs exert varying effects on evolution patterns, but these variables are not effectively captured by the models, leading to the intractable concept shift issue. To tackle this issue, we propose a State-driven environment inference framework (Samen) to achieve a dynamic graph learning framework equipped with concept generalization ability. |
Kuo Yang; Yunhe Guo; Qihe Huang; Zhengyang Zhou; Yang Wang; |
| 347 | Optimizing Parameters of Quantum Circuits with Sparsity-Inducing Coordinate Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With the quest for a potential speedup, there is a need to run larger quantum circuits, which in turn results in the arduous task of parameter optimization. In this paper, we propose a generic method, called Rotolasso, that utilizes sparsity-inducing coordinate descent (CD) to optimize parameters of a PQC for balancing its accuracy and the number of parameterized gates. |
Rudy Raymond; Zichang He; |
| 348 | A Dynamic Stiefel Graph Neural Network for Efficient Spatio-Temporal Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing graph neural networks struggle to balance effectiveness and efficiency in modeling dynamic spatio-temporal relations. To address this problem, we propose the Dynamic Spatio-Temporal Stiefel Graph Neural Network (DST-SGNN) to efficiently process STTS. |
Jiankai Zheng; Liang Xie; |
| 349 | FedCCH: Automatic Personalized Graph Federated Learning for Inter-Client and Intra-Client Heterogeneity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel automatic personalized graph federated learning (PGFL) scheme named FedCCH to capture both inter-client and intra-client heterogeneity. |
Pengfei Jiao; Zian Zhou; Meiting Xue; Huijun Tang; Zhidong Zhao; HuaMing Wu; |
| 350 | FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the sparsity of high-frequency signals limits computational efficiency for high-dimensional inputs, and fixed-pattern truncation often causes high-frequency signal loss, reducing performance in scenarios such as high-resolution inputs or long-term predictions. To address these challenges, we propose FreqMoE, an efficient and progressive training framework that exploits the dependency of high-frequency signals on low-frequency components. |
Tianyu Chen; Haoyi Zhou; Ying Li; Hao Wang; Zhenzhe Zhang; Tianchen Zhu; Shanghang Zhang; Jianxin Li; |
| 351 | Domain Prompt Learning with Quaternion Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Domain Prompt Learning with Quaternion Networks (DPLQ), which leverages domain-specific foundation models and quaternion-based prompt tuning to effectively transfer recognition capabilities. |
Qinglong Cao; Zhengqin Xu; Yuntian Chen; Chao Ma; Xiaokang Yang; |
| 352 | R2DQG: A Quality Meets Diversity Framework for Question Generation Over Knowledge Bases Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing models often focus on maximizing surface-level similarity to ground-truth questions, neglecting the need for diverse syntactic forms and leading to semantic drift during generation. To overcome these challenges, we propose Refine-Reinforced Diverse Question Generation (R2DQG), a two-phase framework leveraging a generation-then-refinement paradigm. |
Yimeng Ren; Yanhua Yu; Lizi Liao; Yuhu Shang; Kangkang Lu; Mingliang Yan; |
| 353 | Causal Learning Meet Covariates: Empowering Lightweight and Effective Nationwide Air Quality Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We use information theory to illustrate the superiority of the proposed model. |
Jiaming Ma; Zhiqing Cui; Binwu Wang; Pengkun Wang; Zhengyang Zhou; Zhe Zhao; Yang Wang; |
| 354 | Towards Comprehensive and Prerequisite-Free Explainer for Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, two major limitations severely degrade the performance and hinder the generalizability of existing XGNN methods: they (a) fail to capture the complete decision logic of GNNs across diverse distributions in the entire dataset’s sample space, and (b) impose strict prerequisites on edge properties and GNN internal accessibility. To address these limitations, we propose OPEN, a novel cOmprehensive and Prerequisite-free Explainer for GNNs. |
Han Zhang; Yan Wang; Guanfeng Liu; Pengfei Ding; Huaxiong Wang; Kwok-Yan Lam; |
| 355 | Progressive Prefix-Memory Tuning for Complex Logical Query Answering on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing PLM-based KGCQA methods usually overlook the harm of disordered syntax or fragmented contexts within a serialized query, posing the problem of “impossible language” to limit PLMs in grasping the logical semantics. To address this problem, we propose a Progressive Prefix-Memory Tuning (PPMT) framework for KGCQA tasks, which effectively rectifies erroneous segments in serialized queries to assist PLMs in query answering. |
Xingrui Zhuo; Shirui Pan; Jiapu Wang; Gongqing Wu; Zan Zhang; Rui Li; Zizhong Wei; Xindong Wu; |
| 356 | PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To support rapid implementation and standardization, we present the Prompt Compression Toolkit (PCToolkit), a unified plug-and-play framework for LLM prompt compression. |
Zheng Zhang; Jinyi Li; Yihuai Lan; Xiang Wang; Hao Wang; |
| 357 | Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This lack of user-oriented proactivity can lead users to feel unappreciated, reducing their satisfaction and willingness to continue the conversation in human-computer interactions. To address this issue, we propose a User-oriented Proactive Chatbot (UPC) to enhance the user-oriented proactivity. |
Yufeng Wang; Jinwu Hu; Ziteng Huang; Kunyang Lin; Zitian Zhang; Peihao Chen; Yu Hu; Qianyue Wang; Zhuliang Yu; Bin Sun; Xiaofen Xing; Qingfang Zheng; Mingkui Tan; |
| 358 | FAST: A Lightweight Mechanism Unleashing Arbitrary Client Participation in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To alleviate the impact, we propose a lightweight solution, Federated Average with Snapshot (FAST), that supports almost ACP for FL and can seamlessly integrate with other classic FL algorithms. |
Zhe Li; Seyedsina Nabavirazavi; Bicheng Ying; Sitharama Iyengar; Haibo Yang; |
| 359 | Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While existing approaches have contributed to understanding cross-modal consistency, they often fail to leverage modal-specific representations and explicit discrepant features. To address these limitations, we propose a Multimodal Inverse Attention Network (MIAN), a novel framework that explores intrinsic discriminative features based on news content to advance fake news detection. |
Tianlin Zhang; En Yu; Yi Shao; Jiande Sun; |
| 360 | View-Association-Guided Dynamic Multi-View Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing methods fail to fully leverage the complex relationships between views, often treating them independently or using static fusion strategies. In this paper, we propose a View-Association-Guided Dynamic Multi-View Classification method (AssoDMVC) to address these limitations. |
Xinyan Liang; Li Lv; Qian Guo; Bingbing Jiang; Feijiang Li; Liang Du; Lu Chen; |
| 361 | HLMTrans: A Sim-to-Real Transfer Framework for Spatial Crowdsourcing with Human-Guided Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Sim-to-Real Transfer with Human-guided Language Models framework called HLMTrans, which comprises three core modules: RLMs decision for task assignment, sim-to-real transfer with Large Language Models (LLMs), and preference learning from human feedback. |
Qingshun Wu; Yafei Li; Lulu Li; Yuanyuan Jin; Shuo He; Mingliang Xu; |
| 362 | ReplayCAD: Generative Diffusion Replay for Continual Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing CAD methods store image distributions or patch features to mitigate catastrophic forgetting, but they fail to preserve pixel-level detailed features for accurate segmentation. To overcome this limitation, we propose ReplayCAD, a novel diffusion-driven generative replay framework that replays high-quality historical data, thus effectively preserving pixel-level detailed features. |
Lei Hu; Zhiyong Gan; Ling Deng; Jinglin Liang; Lingyu Liang; Shuangping Huang; Tianshui Chen; |
| 363 | Hybrid Local Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to the inherent incompleteness of local information, popular methods from global causal discovery often face new challenges in local causal discovery tasks, such as 1) erroneous symmetry constraint tests and the resulting cascading errors in constraint-based methods, and 2) confusion within score-based approaches caused by local spurious equivalence classes. To address the above issues, we propose a Hybrid Local Causal Discovery algorithm, called HLCD. |
Zhaolong Ling; Honghui Peng; Yiwen Zhang; Debo Cheng; Xingyu Wu; Peng Zhou; Kui Yu; |
| 364 | Fair Incomplete Multi-View Clustering Via Distribution Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To tackle the problem, this work presents a novel Fair Incomplete Multi-View Clustering (FIMVC) method. |
Qianqian Wang; Haiming Xu; Meiling Liu; Wei Feng; Xiangdong Zhang; |
| 365 | Accurate Sublayer Pruning for Large Language Models By Exploiting Latency and Tunability Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: How can we accelerate large language models (LLMs) without sacrificing accuracy? The slow inference speed of LLMs hinders us from benefiting from their remarkable performance in … |
Seungcheol Park; Sojin Lee; Jongjin Kim; Jinsik Lee; Hyunjik Jo; U Kang; |
| 366 | GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data, respectively, which serve as the initialization for gene representations. |
Changxi Chi; Jun Xia; Jingbo Zhou; Jiabei Cheng; Chang Yu; Stan Z. Li; |
| 367 | Interaction-Data-guided Conditional Instrumental Variables for Debiasing Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: It is often challenging to identify a valid instrumental variable (IV), although IV methods have been regarded as effective tools for addressing the confounding bias introduced by latent variables. To deal with this issue, an Interaction-Data-guided Conditional IV (IDCIV) debiasing method is proposed for Recommender Systems, called IDCIV-RS. |
Zhirong Huang; Debo Cheng; Lin Liu; Jiuyong Li; Guangquan Lu; Shichao Zhang; |
| 368 | DM-POSA: Enhancing Open-World Test-Time Adaptation with Dual-Mode Matching and Prompt-Based Open Set Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we emphasize that accurate identification of the open-set samples is rather challenging in TTA. |
Shiji Zhao; Shao-Yuan Li; Chuanxing Geng; Sheng-Jun Huang; Songcan Chen; |
| 369 | The First Theoretical Approximation Guarantees for The Non-Dominated Sorting Genetic Algorithm III (NSGA-III) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work conducts a first theoretical analysis studying how well the NSGA-III approximates the Pareto front when the population size N is less than the Pareto front size. |
Renzhong Deng; Weijie Zheng; Benjamin Doerr; |
| 370 | Learning Neural Jump Stochastic Differential Equations with Latent Graph for Multivariate Temporal Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these models have yet to thoroughly consider the underlying relationships among different event types to enhance their modeling capacity. Therefore, this paper introduces a method that uses neural SDEs with a jump process guided by the latent graph. |
Yuchen Wang; Dongpeng Hou; Chao Gao; Xianghua Li; |
| 371 | Brain-Inspired Stepwise Patch Merging for Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Drawing inspiration from the brain’s ability to integrate global and local information for comprehensive visual understanding, we propose Stepwise Patch Merging (SPM), which enhances the subsequent attention mechanism’s ability to ‘see’ better. |
Yonghao Yu; Dongcheng Zhao; Guobin Shen; Yiting Dong; Yi Zeng; |
| 372 | A Survey on The Feedback Mechanism of LLM-based AI Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this challenge, tremendous efforts have been dedicated to designing diverse feedback mechanisms for LLM-based AI agents. To provide a comprehensive overview of this rapidly evolving field, this paper presents a systematic review of these studies, offering a holistic perspective on the feedback mechanisms in LLM-based AI agents. |
Zhipeng Liu; Xuefeng Bai; Kehai Chen; Xinyang Chen; Xiucheng Li; Yang Xiang; Jin Liu; Hong-Dong Li; Yaowei Wang; Liqiang Nie; Min Zhang; |
| 373 | METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose Mutual EnhancemenT of Objects and Relationships (METOR), a query-based unified framework to jointly model and mutually enhance object detection and relationship classification in open-vocabulary scenarios. |
Yongqi Wang; Xinxiao Wu; Shuo Yang; |
| 374 | FedAPA: Server-side Gradient-Based Adaptive Personalized Aggregation for Federated Learning on Heterogeneous Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose FedAPA, a novel PFL method featuring a server-side, gradient-based adaptive aggregation strategy to generate personalized models, by updating aggregation weights based on gradients of client-parameter changes with respect to the aggregation weights in a centralized manner. |
Yuxia Sun; Aoxiang Sun; Siyi Pan; Zhixiao Fu; Jingcai Guo; |
| 375 | Graph Embedded Contrastive Learning for Multi-View Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, traditional contrastive losses ignore the neighbor relationship in multi-view scenarios and easily lead to false associations in sample pairs. To address these issues, we propose Graph Embedded Contrastive Learning for Multi-View Clustering. |
Hongqing He; Jie Xu; Guoqiu Wen; Yazhou Ren; Na Zhao; Xiaofeng Zhu; |
| 376 | A Survey on Multi-View Knowledge Graph: Generation, Fusion, Applications and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By unifying fragmented methodologies and formalizing MVKG design principles, this survey serves as a roadmap for advancing KG versatility in complex AI-driven scenarios. |
Zihan Yang; Xiaohui Tao; Taotao Cai; Yifu Tang; Haoran Xie; Lin Li; Jianxin Li; Qing Li; |
| 377 | Learning Advanced Self-Attention for Linear Transformers in The Singular Value Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consequently, existing self-attention mechanisms are designed in a rather simplified manner. Therefore, we propose a novel method, called Attentive Graph Filter (AGF), interpreting the self-attention as learning the graph filter in the singular value domain from the perspective of graph signal processing for directed graphs with the linear complexity w.r.t. the input length. |
Hyowon Wi; Jeongwhan Choi; Noseong Park; |
| 378 | Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To exploit the abundant information contained in the input-to-label mapping, our scheme utilizes the network trained from the clean dataset as a trigger generator to produce poisons that significantly raise the success rate of backdoor attacks versus conventional approaches. Specifically, we introduce a new categorization of triggers inspired by adversarial techniques and propose a multi-label and multi-payload Poisoning-based backdoor attack with Positive Triggers (PPT), which strategically manipulates inputs to align them closer to the target label in the feature space of benign classifiers. |
Binxiao Huang; Ngai Wong; |
| 379 | Hybrid Mesh-Gaussian Representation for Efficient Indoor Scene Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, regions with complex textures require numerous Gaussians to capture significant color variations accurately, leading to inefficiencies in rendering speed. To address this challenge, we introduce a hybrid representation for indoor scenes that combines 3DGS with textured meshes. |
Binxiao Huang; Zhihao Li; Shiyong Liu; Xiao Tang; Jiajun Tang; Jiaqi Lin; Yuxin Cheng; Zhenyu Chen; Xiaofei Wu; Ngai Wong; |
| 380 | Exploring Transferable Homogenous Groups for Compositional Zero-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Comparatively, humans are adept at analogizing and reasoning in a hierarchical clustering manner, intuitively grouping categories with similar properties to form cohesive concepts. Motivated by this, we propose Homogeneous Group Representation Learning (HGRL), a new perspective that formulates state (object) representation learning as multiple homogeneous sub-group representation learning. |
Zhijie Rao; Jingcai Guo; Miaoge Li; Yang Chen; Mengzhu Wang; |
| 381 | Taking STEPS Forward: Enhancing Online Peer-Counseling with Schema Therapy Via Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present STEPS, an AI-powered assistive dialog tool for peer-counseling. |
Beng Heng Ang; Sujatha Das Gollapalli; See-Kiong Ng; |
| 382 | Guiding LLM-based Smart Contract Generation with Finite State Machine Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although Large Language Models (LLMs) show great potential in programming tasks, they still face challenges in smart contract generation w.r.t. effectiveness and security. To solve these problems, we propose FSM-SCG, a smart contract generation framework based on finite state machine (FSM) and LLMs, which significantly improves the quality of the generated code by abstracting user requirements to generate FSM, guiding LLMs to generate smart contracts, and iteratively optimizing the code with the feedback of compilation and security checks. |
Hao Luo; Yuhao Lin; Xiao Yan; Xintong Hu; Yuxiang Wang; Qiming Zeng; Hao Wang; Jiawei Jiang; |
| 383 | A Survey on Temporal Interaction Graph Representation Learning: Progress, Challenges, and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we begin by introducing the foundational concepts of TIGs and emphasizing the critical role of temporal dependencies. |
Pengfei Jiao; Hongjiang Chen; Xuan Guo; Zhidong Zhao; Dongxiao He; Di Jin; |
| 384 | Mitigating Message Imbalance in Fraud Detection with Dual-View Graph Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on statistical validation, we propose a novel dual-view graph representation learning method to mitigate Message imbalance in Fraud Detection (MimbFD). |
Yudan Song; Yuecen Wei; Yuhang Lu; Qingyun Sun; Minglai Shao; Li-e Wang; Chunming Hu; Xianxian Li; Xingcheng Fu; |
| 385 | DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, challenges such as the loss or ambiguity of sensor metadata, heterogeneity in data sources, varying sampling frequencies, inconsistent units of measurement, and irregular timestamps make raw IoT time series data difficult to interpret, undermining the effectiveness of smart systems. To address these challenges, we propose a novel deep learning model, DeepFeatIoT, which integrates learned local and global features with non-learned randomized convolutional kernel-based features and features from large language models (LLMs). |
Muhammad Sakib Khan Inan; Kewen Liao; |
| 386 | Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These scenarios amplify the difficulty of distinguishing relationships and hinder accurate extraction. To address these limitations, we propose the variational multi-modal hypergraph attention network (VM-HAN), a novel and robust framework for MMRE. |
Qian Li; Cheng Ji; Shu Guo; Kun Peng; Qianren Mao; Shangguang Wang; |
| 387 | Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing large language models suffer from prevalent hallucinations and poor comprehensibility on multimodal satellite data due to their high professional content threshold and partial information opacity. To address these issues, we propose a multimodal satellite knowledge retrieval-augmented iterative alignment framework (Sat-RIA) for satellite commonsense conversation. |
Qian Li; Xuchen Li; Zongyu Chang; Yuzheng Zhang; Cheng Ji; Shangguang Wang; |
| 388 | MolHFCNet: Enhancing Molecular Graph Representations with Hierarchical Feature Combining and Hybrid Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MolHFCNet, a graph neural network designed to enhance molecular representation learning. |
Duy-Long Nguyen; Duc-Luong Ho-Viet; Anh-Thu Ngo-Tran; Quang H. Nguyen; Binh P. Nguyen; |
| 389 | KgMBQA: Quality Knowledge Graph-driven Multimodal Blind Image Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing unimodal quality indicators have limited representational ability when facing complex contents and distortion types, and the predicted scores also fail to provide explanatory reasons, which further affects the credibility of their prediction results. To address these challenges, we propose a multimodal quality indicator with explanatory text descriptions, called kgMBQA. |
Wuyuan Xie; Tingcheng Bian; Miaohui Wang; |
| 390 | Sanitizing Backdoored Graph Neural Networks: A Multidimensional Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To tackle these threats, we empirically analyze triggers from a multidimensional aspect, and our analysis shows that there are clear distinctions between trigger nodes and normal ones in terms of node feature values, node embeddings, and class prediction probabilities. Based on these findings, we propose a Multidimensional Anomaly Detection framework (MAD) that can effectively minimize the impact of triggers by pruning away anomalous nodes and edges. |
Rong Zhao; Jilian Zhang; Yu Wang; Yinyan Zhang; Jian Weng; |
| 391 | FedCM: Client Clustering and Migration in Federated Learning Via Gradient Path Similarity and Update Direction Deviation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Statistical heterogeneity causes the direction of local model updates to deviate from the global training objective, while data distribution drift leads to a mismatch between local models and their cluster models. To address these challenges, this paper proposes an adaptive clustered federated learning framework, Fed-CM. |
Peng Wang; Shoupeng Lu; Hao Yin; Banglie Yang; Tianli Zhu; Cheng Dai; |
| 392 | GATES: Cost-aware Dynamic Workflow Scheduling Via Graph Attention Networks and Evolution Strategy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Considering the above-mentioned issues, this study proposes a novel DRL method combining Graph Attention Networks-based policy network and Evolution Strategy, referred to as GATES. |
Ya Shen; Gang Chen; Hui Ma; Mengjie Zhang; |
| 393 | ADC-GS: Anchor-Driven Deformable and Compressed Gaussian Splatting for Dynamic Scene Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing 4D Gaussian Splatting methods rely on per-Gaussian deformation from a canonical space to target frames, which overlooks redundancy among adjacent Gaussian primitives and results in suboptimal performance. To address this limitation, we propose Anchor-Driven Deformable and Compressed Gaussian Splatting (ADC-GS), a compact and efficient representation for dynamic scene reconstruction. |
He Huang; Qi Yang; Mufan Liu; Yiling Xu; Zhu Li; |
| 394 | Graph Neural Networks for Databases: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, despite notable advances, there is a lack of a comprehensive review and understanding of how GNNs could improve DB systems. Therefore, this survey aims to bridge this gap by providing a structured and in-depth overview of GNNs for DB systems. |
Ziming Li; Youhuan Li; Yuyu Luo; Guoliang Li; Chuxu Zhang; |
| 395 | Model Rake: A Defense Against Stealing Attacks in Split Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The surrogate models can also be used to reconstruct private training data of the clients (i.e., data stealing). To defend against these stealing attacks, we propose Model Rake (i.e., Rake), which runs two bottom models on each client and differentiates their output spaces to make the two models distinct. |
Qinbo Zhang; Xiao Yan; Yanfeng Zhao; Fangcheng Fu; Quanqing Xu; Yukai Ding; Xiaokai Zhou; Chuang Hu; Jiawei Jiang; |
| 396 | Towards Anytime Retrieval: A Benchmark for Anytime Person Re-Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing ReID tasks and datasets cannot meet this requirement, as they are constrained by available time and only provide training and evaluation for specific scenarios. Therefore, we investigate a new task called Anytime Person Re-identification (AT-ReID), which aims to achieve effective retrieval in multiple scenarios based on variations in time. |
Xulin Li; Yan Lu; Bin Liu; Jiaze Li; Qinhong Yang; Tao Gong; Qi Chu; Mang Ye; Nenghai Yu; |
| 397 | Wrapped Partial Label Dimensionality Reduction Via Dependence Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the decoupling of dimensionality reduction from partial label disambiguation can lead to severe performance degradation. In this paper, we present a novel approach called Wrapped Partial Label Dimensionality Reduction (WPLDR) to address this challenge. |
Xiang-Ru Yu; Deng-Bao Wang; Min-Ling Zhang; |
| 398 | LP-Based Weighted Model Integration Over Non-Linear Real Arithmetic Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a novel method for approximate WMI, which provides more effective support for the wide class of semi-algebraic functions that includes rational and radical functions, with literals defined over non-linear real arithmetic. |
S. Akshay; Supratik Chakraborty; Soroush Farokhnia; Amir Goharshady; Harshit Jitendra Motwani; Đorđe Žikelić; |
| 399 | CD^2: Constrained Dataset Distillation for Few-Shot Class-Incremental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods usually employ an external memory to store previous knowledge and treat it with incremental classes equally, which cannot properly preserve previous essential knowledge. To solve this problem and inspired by recent distillation works on knowledge transfer, we propose a framework termed Constrained Dataset Distillation (CD^2) to facilitate FSCIL, which includes a dataset distillation module (DDM) and a distillation constraint module (DCM). |
Kexin Bao; Daichi Zhang; Hansong Zhang; Yong Li; Yutao Yue; Shiming Ge; |
| 400 | Large-Scale Trade-Off Curve Computation for Incentive Allocation with Cardinality and Matroid Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider a large-scale incentive allocation problem where the entire trade-off curve between budget and profit has to be maintained approximately at all times. |
Yu Cong; Chao Xu; Yi Zhou; |
| 401 | Fusion of Granular-Ball Visual Spatial Representations for Enhanced Facial Expression Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current methods primarily focus on extracting visual representations while overlooking other valuable information. To address this limitation, we propose a novel method called Component Separation and Granular-ball Space Bootstrap Fusion (CS-GBSBF), which leverages granular balls to transform visual images to spatial graphs, thereby enlarging the spatial information embedded in images. |
Shuaiyu Liu; Qiyao Shen; Yunxi Wang; Yazhou Ren; Guoyin Wang; |
| 402 | Injecting Imbalance Sensitivity for Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In line with this perspective, we enhance the existing baseline method by injecting imbalance-sensitivity through the imposition of constraints on the projected norms. |
Zhipeng Zhou; Liu Liu; Peilin Zhao; Wei Gong; |
| 403 | MEGAD: A Memory-Efficient Framework for Large-Scale Attributed Graph Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Memory-Efficient framework for large-scale attributed Graph Anomaly Detection (MEGAD). |
Yifan Zhang; Haolong Xiang; Xiaolong Xu; Zishun Rui; Xiaoyong Li; Lianyong Qi; Fei Dai; |
| 404 | Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, directly applying LLMs to these tasks faces two major challenges: (i) lacking a large-scale chemical synthesis-related instruction dataset; (ii) ignoring the close correlation between reaction and retrosynthesis prediction for the existing fine-tuning strategies. To address these challenges, we propose ChemDual, a novel LLM framework for accurate chemical synthesis. |
Xuan Lin; Qingrui Liu; Hongxin Xiang; Daojian Zeng; Xiangxiang Zeng; |
| 405 | Graph OOD Detection Via Plug-and-Play Energy-based Evaluation and Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, existing out-of-distribution (OOD) detection methods often rely on the softmax confidence score, which makes the OOD data suffer from overconfident posterior distributions. To address the above issues, we propose an Energy Propagation-based Graph Neural Network (EPGNN), which improves the OOD generalization ability by endowing GNN with the capacity to detect the OOD nodes in the graph. |
Yunxia Zhang; Mingchen Sun; Yutong Zhang; Funing Yang; Ying Wang; |
| 406 | General Incomplete Time Series Analysis Via Patch Dropping Without Imputation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose INTER, a novel end-to-end framework for incomplete multivariate time series analysis, which bypasses imputation by leveraging pre-trained language models to learn the distribution of incomplete time series data. |
Yangyang Wu; Yi Yuan; Mengying Zhu; Xiaoye Miao; Meng Xi; |
| 407 | Learn from Global Rather Than Local: Consistent Context-Aware Representation Learning for Multi-View Graph Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the impressive achievements of existing methods, they are limited by a common deficiency, namely, the curse of local manifold while failing to perceive the global manifold structure. In light of this drawback, we propose a Consistent Context-Aware Representation Learning (CCARL) method for MVGC, aiming to learn node representations from global space rather than just local topology. |
Lele Fu; Bowen Deng; Sheng Huang; Tianchi Liao; Chuanfu Zhang; Chuan Chen; |
| 408 | FedSaaS: Class-Consistency Federated Semantic Segmentation Via Global Prototype Supervision and Local Adversarial Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This oversight results in ambiguities between class representations. To overcome this challenge, we propose a novel federated segmentation framework that strikes class consistency, termed FedSaaS. |
Xiaoyang Yu; Xiaoming Wu; Xin Wang; Dongrun Li; Ming Yang; Peng Cheng; |
| 409 | Conditional Independent Test in The Presence of Measurement Error with Causal Structure Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By leveraging high-order cumulants, we derive rank constraints on the cumulant matrix and establish their role in effectively assessing conditional independence, even in the presence of measurement errors. Based on these theoretical results, we leverage the rank constraints of the cumulant matrix as a tool for conditional independence testing and incorporate it into the PC algorithm, resulting in the PC-ME algorithm — a method designed to learn causal structures from observed data while accounting for measurement errors. |
Hongbin Zhang; Kezhou Chen; Nankai Lin; Aimin Yang; Zhifeng Hao; Zhengming Chen; |
| 410 | RenderBender: A Survey on Adversarial Attacks Using Differentiable Rendering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey contributes a framework that unifies diverse goals and tasks, facilitating easy comparison of existing work, identifying research gaps, and highlighting future directions—ranging from expanding attack goals and tasks to account for new modalities, state-of-the-art models, tools, and pipelines, to underscoring the importance of studying real-world threats in complex scenes. |
Matthew Hull; Haoran Wang; Matthew Lau; Alec Helbling; Mansi Phute; Chao Zhang; Zsolt Kira; Willian Lunardi; Martin Andreoni; Wenke Lee; Duen Horng Chau; |
| 411 | Semantic-Space-Intervened Diffusive Alignment for Visual Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they typically face difficulties in finding such a projection due to the two modalities in both the distribution of class-wise samples and the range of their feature values. To address this issue, this paper proposes a novel Semantic-Space-Intervened Diffusive Alignment method, termed SeDA, which models a semantic space as a bridge in the visual-to-textual projection, considering that both types of features share the same class-level information in classification. |
Zixuan Li; Lei Meng; Guoqing Chao; Wei Wu; Yimeng Yang; Xiaoshuo Yan; Zhuang Qi; Xiangxu Meng; |
| 412 | Generative Multi-Agent Collaboration in Embodied AI: A Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a taxonomy that categorizes EMAS by system architectures and embodiment modalities, emphasizing how collaboration spans both physical and virtual contexts. |
Di Wu; Xian Wei; Guang Chen; Hao Shen; Bo Jin; |
| 413 | Multi-Task Curriculum Graph Contrastive Learning with Clustering Entropy Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the strides, most graph contrastive learning models face challenges: 1) graph augmentation is used to improve learning diversity, but commonly used random augmentation methods may destroy inherent semantics and cause noise; 2) the fixed positive and negative sample selection strategy ignores the difficulty distribution of samples when dealing with complex real data, thereby impeding the model’s capability to capture fine-grained patterns and trapping the model in sub-optimal solutions for clustering. To reduce these problems, we propose the Clustering-guided Curriculum Graph contrastive Learning (CurGL) framework. |
Chusheng Zeng; Bocheng Wang; Jinghui Yuan; Mulin Chen; Xuelong Li; |
| 414 | Credit Assignment and Fine-Tuning Enhanced Reinforcement Learning for Collaborative Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although mainstream reinforcement learning (RL) methods have proven effective in task allocation, they face two key obstacles: delayed reward feedback and non-stationary data distributions, both hindering optimal allocation and collaborative efficiency. To address these limitations, we propose CAFE (credit assignment and fine-tuning enhanced), a novel multi-agent RL framework for spatial crowdsourcing. |
Wei Chen; Yafei Li; Baolong Mei; Guanglei Zhu; Jiaqi Wu; Mingliang Xu; |
| 415 | Optimal Metric Distortion for Matching on The Line Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an algorithm that is provided only with ordinal information regarding the agents’ preferences (each agent’s ranking of the items from most- to least-preferred) and returns a matching aiming to minimize the social cost with respect to the agents’ true (cardinal) costs. |
Aris Filos-Ratsikas; Vasilis Gkatzelis; Mohamad Latifian; Emma Rewinski; Alexandros A. Voudouris; |
| 416 | Mamba-Based Graph Convolutional Networks: Tackling Over-smoothing with Selective State Space Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce MbaGCN, a novel graph convolutional architecture that draws inspiration from the Mamba paradigm—originally designed for sequence modeling. |
Xin He; Yili Wang; Wenqi Fan; Xu Shen; Xin Juan; Rui Miao; Xin Wang; |
| 417 | Template3D-AD: Point Cloud Template Matching Method Based on Center Points for 3D Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Considering that the appearance of anomalies is related to the change of surface shape, this paper proposes a curvature-based local feature representation method, which increases the feature difference between abnormal surfaces and normal surfaces. |
Yi Liu; Changsheng Zhang; Yufei Yang; |
| 418 | Modality-Guided Dynamic Graph Fusion and Temporal Diffusion for Self-Supervised RGB-T Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the omission of the object region by erroneous pseudo-label or the introduction of background noise affects the efficiency of modality fusion, while pseudo-label noise triggered by similar object noise can further affect the tracking performance. In this paper, we propose GDSTrack, a novel approach that introduces dynamic graph fusion and temporal diffusion to address the above challenges in self-supervised RGB-T tracking. |
Shenglan Li; Rui Yao; Yong Zhou; Hancheng Zhu; Kunyang Sun; Bing Liu; Zhiwen Shao; Jiaqi Zhao; |
| 419 | RobustHAR: Multi-scale Spatial-temporal Masked Self-supervised Pre-training for Robust Human Activity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Besides, different human activities often span various spatial-temporal scales, which results in activity recognizers failing to capture intricate spatial-temporal semantic information. To address these issues, we propose RobustHAR, a new HAR model with multi-scale spatial-temporal masked self-supervised pre-training designed to improve model performance in the data-missing context. |
Xiao Liu; Guan Yuan; Yanmei Zhang; Shang Liu; Qiuyan Yan; |
| 420 | Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By presenting an in-depth analysis of physics-grounded AIGC, this survey aims to bridge the gap between generative models and physical realism, providing insights that inspire future research in physically consistent content generation. |
Siwei Meng; Yawei Luo; Ping Liu; |
| 421 | Human-Imperceptible, Machine-Recognizable Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A major conflict is exposed relating to software engineers between better developing AI systems and distancing from the sensitive training data. To reconcile this conflict, the paper proposes an efficient privacy-preserving learning paradigm, where images are encrypted to become “human-imperceptible, machine-recognizable” via one of the two encryption strategies: (1) random shuffling equally-sized patches and (2) mixing-up sub-patches. |
Fusheng Hao; Fengxiang He; Yikai Wang; Fuxiang Wu; Jing Zhang; Dacheng Tao; Jun Cheng; |
| 422 | AlphaGAT: A Two-Stage Learning Approach for Adaptive Portfolio Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose AlphaGAT, a novel two-stage learning approach for portfolio selection, designed to adapt to different market scenarios. |
Shicheng Li; Jinshan Zhang; Feng Wang; |
| 423 | Template-based Uncertainty Multimodal Fusion Network for RGBT Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the quality of different modalities changes dynamically in complex scenes, and effectively perceiving modal quality for multimodal fusion remains a significant challenge. To address this challenge, we propose to employ the reliability of the initial template to explore the uncertainty across different modalities, and design a novel template-based uncertainty computation framework for robust multimodal fusion in RGBT tracking. In particular, we introduce an Uncertainty-aware Multimodal Fusion Module (UMFM), which constructs the uncertainty of each modality by leveraging the correlation between the template and search region in the Subjective Logic framework, aiming to achieve robust multimodal fusion. |
Zhaodong Ding; Chenglong Li; Shengqing Miao; Jin Tang; |
| 424 | CoDiCast: Conditional Diffusion Model for Global Weather Forecasting with Uncertainty Quantification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, most machine learning-based weather prediction (MLWP) approaches offer efficiency and accuracy but remain deterministic, lacking the ability to capture forecast uncertainty. To tackle these challenges, we propose a conditional diffusion model, CoDiCast, to generate global weather prediction, integrating accuracy and uncertainty quantification at a modest computational cost. |
Jimeng Shi; Bowen Jin; Jiawei Han; Sundararaman Gopalakrishnan; Giri Narasimhan; |
| 425 | Bi-DiffCD: Bidirectional Diffusion Guided Collaborative Change Detection for Arbitrary-Modal Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a bidirectional diffusion guided collaborative change detection model (Bi-DiffCD) for arbitrary-modal images, which eliminates the modal discrepancy between arbitrary-modal images through the bidirectional diffusion and makes full use of the multilevel complementary advantage features to improve the detection accuracy. |
Jingyu Zhao; Jiahui Qu; Wenqian Dong; |
| 426 | Rethinking Removal Attack and Fingerprinting Defense for Model Intellectual Property Protection: A Frequency Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current model ownership resolution (MOR) methods predominantly address general removal attacks that involve weight modifications, with limited research considering alternative attack perspectives. In this work, we propose a frequency-based model ownership removal attack, grounded in a key observation: modifying a model’s high-frequency coefficients does not significantly impact its performance but does alter its weights and decision boundary. |
Cheng Zhang; Yang Xu; Tingqiao Huang; Zixing Zhang; |
| 427 | Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, in this paper, we propose a novel automatic sampling framework for sequential recommendation, named AutoSAM, to non-uniformly treat historical behaviors. |
Hao Zhang; Mingyue Cheng; Zhiding Liu; Junzhe Jiang; |
| 428 | Aggregation Mechanism Based Graph Heterogeneous Networks Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes an aggregation mechanism enhanced GNN distillation framework (AMEND). |
Xiaobin Hong; Mingkai Lin; Xiangkai Ma; Wenzhong Li; Sanglu Lu; |
| 429 | TrajCogn: Leveraging LLMs for Cognizing Movement Patterns and Travel Purposes from Trajectories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Spatio-temporal trajectories are crucial for data mining tasks, requiring versatile learning methods that can accurately extract movement patterns and travel purposes. While large … |
Zeyu Zhou; Yan Lin; Haomin Wen; Shengnan Guo; Jilin Hu; Youfang Lin; Huaiyu Wan; |
| 430 | Not All Layers of LLMs Are Necessary During Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a simple yet effective algorithm named AdaInfer to adaptively terminate the inference process for an input instance. |
Siqi Fan; Xin Jiang; Xiang Li; Xuying Meng; Peng Han; Shuo Shang; Aixin Sun; Yequan Wang; |
| 431 | GraphProt: Certified Black-Box Shielding Against Backdoored Graph Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address those limitations, we propose GraphProt, a certified black-box defense method to suppress backdoor attacks on GNN-based graph classifiers. |
Xiao Yang; Yuni Lai; Kai Zhou; Gaolei Li; Jianhua Li; Hang Zhang; |
| 432 | Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing research predominantly concentrates on the security of general large language models, lacking specialized methodologies for establishing safety benchmarks and input moderation tailored to embodied agents. To bridge this gap, this paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents. |
Ning Wang; Zihan Yan; Weiyang Li; Chuan Ma; He Chen; Tao Xiang; |
| 433 | Exploring The Frontiers of Animation Video Generation in The Sora Era: Method, Dataset and Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation benchmark. |
Yudong Jiang; Baohan Xu; Siqian Yang; Mingyu Ying; Jing Liu; Chao Xu; Siqi Wang; Yidi Wu; Bingwen Zhu; Yue Zhang; Jinlong Hou; Huyang Sun; |
| 434 | Robust Graph Contrastive Learning for Incomplete Multi-view Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a new IMVC framework, namely robust graph contrastive learning (RGCL). |
Deyin Zhuang; Jian Dai; Xingfeng Li; Xi Wu; Yuan Sun; Zhenwen Ren; |
| 435 | Fair Submodular Maximization Over A Knapsack Constraint Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider fairness in submodular maximization subject to a knapsack constraint, a fundamental problem with various applications in economics, machine learning, and data mining. |
Lijun Li; Chenyang Xu; Liuyi Yang; Ruilong Zhang; |
| 436 | Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation Via Self-distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a result, their performance deteriorates in scenarios where Foley targets are only partially visible. To address this challenge, we propose a simple self-distillation approach to extend V2A models to cinematic language scenarios. |
Feizhen Huang; Yu Wu; Yutian Lin; Bo Du; |
| 437 | Towards Micro-Action Recognition with Limited Annotations: An Asynchronous Pseudo Labeling and Training Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This issue primarily arises from the common practice of directly using the predictions of the classifier as pseudo-labels to train the model. To solve this issue, we propose a novel framework, called Asynchronous Pseudo Labeling and Training (APLT), which explicitly separates the pseudo-labeling process from model training. |
Yan Zhang; Lechao Cheng; Yaxiong Wang; Zhun Zhong; Meng Wang; |
| 438 | Dual Robust Unbiased Multi-View Clustering for Incomplete and Unpaired Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For PSP, due to varying degrees of missing data, incomplete spatial structures can cause clustering centers-shifted problem, resulting in the model learning incorrect correspondences and biased spatial structures. To tackle them, we propose a novel method called Dual Robust Unbiased Multi-View Clustering for Incomplete and Unpaired Information (DRUMVC). |
Liang Zhao; Ziyue Wang; Chuanye He; Qingchen Zhang; Bo Xu; |
| 439 | The Role of Video Generation in Enhancing Data-Limited Action Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we address the data-limited action understanding problem by bridging data scarcity. |
Wei Li; Dezhao Luo; Dongbao Yang; Zhenhang Li; Weiping Wang; Yu Zhou; |
| 440 | Optimal Distributed Training With Co-Adaptive Data Parallelism in Heterogeneous Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most previous distributed training frameworks like DDP and DeepSpeed are primarily designed for co-located clusters under homogeneous computing and communication conditions, and hence cannot account for geo-distributed clusters with both computing and communication heterogeneity. To address this challenge, we develop a new data parallel based distributed training framework called Co-Adaptive Data Parallelism (C-ADP). |
Lifang Chen; Zhichao Chen; Liqi Yan; Yanyu Cheng; Fangli Guan; Pan Li; |
| 441 | Exploiting Self-Refining Normal Graph Structures for Robust Defense Against Unsupervised Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We observe that representations learned from attacked graphs are often ineffective for refinement due to perturbations that cause the endpoints of perturbed edges to become more similar, complicating the defender’s ability to distinguish them. To address this challenge, we propose a robust unsupervised graph learning framework that utilizes cleaner graphs to learn effective representations. |
Bingdao Feng; Di Jin; Xiaobao Wang; Dongxiao He; Jingyi Cao; Zhen Wang; |
| 442 | ST-USleepNet: A Spatial-Temporal Coupling Prominence Network for Multi-Channel Sleep Staging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: (2) Capturing the spatial-temporal coupling patterns essential for accurate sleep staging. To address these challenges, we propose a novel framework named ST-USleepNet, comprising a spatial-temporal graph construction module (ST) and a U-shaped sleep network (USleepNet). |
Jingying Ma; Qika Lin; Ziyu Jia; Mengling Feng; |
| 443 | FADE: Towards Fairness-aware Data Generation for Domain Generalization Via Classifier-Guided Score-based Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although disentanglement has been used to tackle FairDG, it is limited by its strong assumptions. To overcome these limitations, we propose Fairness-aware Classifier-Guided Score-based Diffusion Models (FADE) as a novel approach to effectively address the FairDG issue. |
Yujie Lin; Dong Li; Minglai Shao; Guihong Wan; Chen Zhao; |
| 444 | Volumetric Axial Disentanglement Enabling Advancing in Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces the volumetric axial disentanglement to address the disparities in spatial information along different axial dimensions. |
Xingru Huang; Jian Huang; Yihao Guo; Tianyun Zhang; Zhao Huang; Yaqi Wang; Ruipu Tang; Guangliang Cheng; Shaowei Jiang; Zhiwen Zheng; Jin Liu; Renjie Ruan; Xiaoshuai Zhang; |
| 445 | BTPG: A Platform and Benchmark for Behavior Tree Planning in Everyday Service Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Behavior Tree Planning Gym (BTPG), the first platform and benchmark for BT planning in everyday service robots. |
Xinglin Chen; Yishuai Cai; Minglong Li; Yunxin Mao; Zhou Yang; Wenjing Yang; Weixia Xu; Ji Wang; |
| 446 | FBQuant: FeedBack Quantization for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent efforts to incorporate sub-branches have shown promise for mitigating quantization errors, but these methods either lack robust optimization strategies or rely on suboptimal objectives. To address these gaps, we propose FeedBack Quantization (FBQuant), a novel approach inspired by negative feedback mechanisms in automatic control. FBQuant inherently ensures that the reconstructed weights remain bounded by the quantization process, thereby reducing the risk of overfitting. To further offset the additional latency introduced by sub-branches, we develop an efficient CUDA kernel that reduces the extra inference time by 60%. |
Yijiang Liu; Hengyu Fang; Liulu He; Rongyu Zhang; Yichuan Bai; Yuan Du; Li Du; |
| 447 | SCNNs: Spike-based Coupling Neural Networks for Understanding Structural-Functional Relationships in The Human Brain Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, extant SC-FC coupling analysis methods primarily center on disclosing the statistical association between the topological patterns of structural connectivity (SC) and functional connectivity (FC), while often neglecting the neurobiological mechanisms by which the brain typically transmits and processes information in the form of spikes. To address this, we propose a biologically inspired deep-learning model called spike-based coupling neural networks (SCNNs). |
Shaolong Wei; Shu Jiang; Mingliang Wang; Liang Sun; Haonan Rao; Weiping Ding; Jiashuang Huang; |
| 448 | PALA: Class-imbalanced Graph Domain Adaptation Via Prototype-anchored Learning and Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing methods assume balanced labels in the source graph, which often fails in practice and leads to biased knowledge transfer. To address this, in this paper, we propose a prototype-anchored learning and alignment framework for class-imbalanced graph domain adaptation. |
Xin Ma; Yifan Wang; Siyu Yi; Wei Ju; Bei Wu; Ziyue Qiao; Chenwei Tang; Jiancheng Lv; |
| 449 | Accelerating Diffusion-based Super-Resolution with Dynamic Time-Spatial Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we analyze the frequency- and spatial-domain properties of diffusion-based SR methods, revealing key insights into the temporal and spatial dependencies of high-frequency signal recovery. |
Rui Qin; Qijie Wang; Ming Sun; Haowei Zhu; Chao Zhou; Bin Wang; |
| 450 | DANCE: Resource-Efficient Neural Architecture Search with Data-Aware and Continuous Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose DANCE (Dynamic Architectures with Neural Continuous Evolution), which reformulates architecture search as a continuous evolution problem through learning distributions over architectural components. |
Maolin Wang; Tianshuo Wei; Sheng Zhang; Ruocheng Guo; Wangyu Wang; Shanshan Ye; Lixin Zou; Xuetao Wei; Xiangyu Zhao; |
| 451 | QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Text-to-music (TTM) generation, which converts textual descriptions into audio, opens up innovative avenues for multimedia creation. Achieving high quality and diversity in this process demands extensive, high-quality data, which are often scarce in available datasets. Most open-source datasets frequently suffer from issues like low-quality waveforms and low text-audio consistency, hindering the advancement of music generation models. To address these challenges, we propose a novel quality-aware training paradigm for generating high-quality, high-musicality music from large-scale, quality-imbalanced datasets. |
Chang Li; Ruoyu Wang; Lijuan Liu; Jun Du; Yixuan Sun; Zilu Guo; Zhengrong Zhang; Yuan Jiang; Jianqing Gao; Feng Ma; |
| 452 | From General Relation Patterns to Task-Specific Decision-Making in Continual Multi-Agent Coordination Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we delve into the core of Co-MARL, namely Relation Patterns, which refer to agents’ general understanding of interactions. |
Chang Yao; Youfang Lin; Shoucheng Song; Hao Wu; Yuqing Ma; Sheng Han; Kai Lv; |
| 453 | Sentiment-enhanced Multi-hop Connected Graph Attention Network for Multimodal Aspect-Based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While current research has broadly focused on syntax relation-driven semantic comprehension, the impact of the importance of different syntactic relations on semantic understanding has not been adequately investigated. To address this issue, we propose a Sentiment-enhanced Multi-hop Connected Graph Attention Network (MCG), aiming to enhance the discriminative capability of the model for sentiments and to delve into the syntactic relationships within the text. |
Linlin Zhu; Heli Sun; Xiaoyong Huang; Qi Zhang; Ruichen Cao; Liang He; |
| 454 | ScSiameseClu: A Siamese Clustering Framework for Interpreting Single-cell RNA Sequencing Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we propose scSiameseClu, a novel Siamese Clustering framework for interpreting single-cell RNA-seq data, comprising three key steps: (1) Dual Augmentation Module, which applies biologically informed perturbations to the gene expression matrix and cell graph relationships to enhance representation robustness; (2) Siamese Fusion Module, which combines cross-correlation refinement and adaptive information fusion to capture complex cellular relationships while mitigating over-smoothing; and (3) Optimal Transport Clustering, which utilizes Sinkhorn distance to efficiently align cluster assignments with predefined proportions while maintaining balance. |
Ping Xu; Zhiyuan Ning; Pengjiang Li; Wenhao Liu; Pengyang Wang; Jiaxu Cui; Yuanchun Zhou; Pengfei Wang; |
| 455 | RePST: Language Model Empowered Spatio-Temporal Forecasting Via Semantic-Oriented Reprogramming Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we aim to harness the reasoning and generalization abilities of Pre-trained Language Models (PLMs) for more effective spatio-temporal forecasting, particularly in data-scarce scenarios. |
Hao Wang; Jindong Han; Wei Fan; Leilei Sun; Hao Liu; |
| 456 | Going Beyond Consistency: Target-oriented Multi-view Graph Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, these approaches often lack rigorous theoretical analysis that bridges training data to test data. To address these issues, we propose Target-oriented Graph Neural Network (TGNN), a novel framework that goes beyond traditional consistency by prioritizing task-relevant information, ensuring alignment with the target. |
Sujia Huang; Lele Fu; Shuman Zhuang; Yide Qiu; Bo Huang; Zhen Cui; Tong Zhang; |
| 457 | Contrastive Cross-Course Knowledge Tracing Via Concept Graph Guided Knowledge Transfer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose TransKT, a contrastive cross-course knowledge tracing method that leverages concept graph guided knowledge transfer to model the relationships between learning behaviors across different courses, thereby enhancing knowledge state estimation. |
Wenkang Han; Wang Lin; Liya Hu; Zhenlong Dai; Yiyun Zhou; Mengze Li; Zemin Liu; Chang Yao; Jingyuan Chen; |
| 458 | OMS: One More Step Noise Searching to Enhance Membership Inference Attacks for Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We analyze current MIA methods through the lens of the noise search framework and reveal that they rely on the first residual as the discriminative metric to differentiate members and non-members. Inspired by this observation, we introduce OMS, which augments existing MIA methods by iterating One More fixed-point Step to include a further residual, i.e., the second residual. |
Xiaomeng Fu; Xi Wang; Qiao Li; Jin Liu; Jiao Dai; Jizhong Han; Xingyu Gao; |
| 459 | A Centrality-based Graph Learning Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While node-level learning focuses on individual nodes and their local structures, graph-level learning encounters challenges in capturing the global properties of graphs. In this paper, we conduct a theoretical and experimental analysis of existing graph-level learning frameworks and find that these frameworks typically adopt a single-view perspective based solely on node degree, which limits their ability to capture comprehensive graph characteristics. To address these issues, we propose a multi-view approach that leverages different types of centrality measures to capture diverse aspects of graph structure. |
Jiajun Yu; Zhihao Wu; Jielong Lu; Tianyue Wang; Haishuai Wang; |
| 460 | G3PT: Unleash The Power of Autoregressive Modeling in 3D Generation Via Cross-Scale Querying Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Instead of imposing an artificial order on 3D data, in this paper, we introduce G3PT, a scalable, coarse-to-fine 3D native generative model with cross-scale vector quantization and cross-scale autoregressive modeling. |
Jinzhi Zhang; Feng Xiong; Guangyu Wang; Mu Xu; |
| 461 | Consensus-Guided Incomplete Multi-view Clustering Via Cross-view Affinities Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, dual-view interaction neglects the collaboration effects of multiple views, making it challenging to capture the holistic characteristics across views. In response to these issues, we propose a novel Consensus-Guided Incomplete Multi-view Clustering via Cross-view Affinities Learning (CAL). |
Qian Liu; Huibing Wang; Jinjia Peng; Yawei Chen; Mingze Yao; Xianping Fu; Yang Wang; |
| 462 | CAN-ST: Clustering Adaptive Normalization for Spatio-temporal OOD Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods, nonetheless, often address individual time series in isolation, neglecting correlations across series, which limits their capacity to handle complex spatio-temporal dynamics and results in suboptimal solutions. To overcome these challenges, we propose Clustering Adaptive Normalization (CAN-ST), a general and model-agnostic method that mitigates non-stationarity by capturing both localized distributional changes and shared patterns across nodes via adaptive clustering and a parameter register. |
Min Yang; Yang An; Jinliang Deng; Xiaoyu Li; Bin Xu; Ji Zhong; Xiankai Lu; Yongshun Gong; |
| 463 | Detection and Geographic Localization of Natural Objects in The Wild: A Case Study on Palms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop PRISM (Processing, Inference, Segmentation, and Mapping), a flexible pipeline for detecting and localizing palms in dense tropical forests using large orthomosaic images. |
Kangning Cui; Rongkun Zhu; Manqi Wang; Wei Tang; Gregory D. Larsen; Victor P. Pauca; Sarra Alqahtani; Fan Yang; David Segurado; David A. Lutz; Jean-Michel Morel; Miles R. Silman; |
| 464 | Low-Light Video Enhancement Via Spatial-Temporal Consistent Decomposition Highlight: In this paper, we present an innovative video decomposition strategy that incorporates view-independent and view-dependent components to enhance the performance of LLVE. |
Xiaogang Xu; Kun Zhou; Tao Hu; Jiafei Wu; Ruixing Wang; Hao Peng; Bei Yu; |
| 465 | SMILE: A Scale-aware Multiple Instance Learning Method for Multicenter STAS Lung Cancer Histopathology Diagnosis Highlight: To address the biased, sparse, and heterogeneous nature of STAS, we propose a scale-aware multiple instance learning (SMILE) method for STAS diagnosis of lung cancer. |
Liangrui Pan; Xiaoyu Li; Yutao Dou; Qiya Song; Jiadi Luo; Qingchun Liang; Shaoliang Peng; |
| 466 | Optimal Policy Adaptation Under Covariate Shift Highlight: In this paper, we propose principled approaches for learning the optimal policy in the target domain by leveraging two datasets: one with full information from the source domain and the other from the target domain with only covariates. |
Xueqing Liu; Qinwei Yang; Zhaoqing Tian; Ruocheng Guo; Peng Wu; |
| 467 | ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers Highlight: However, the continuity and significant variability of ECG signals pose a challenge in generating semantically discrete labels. To address this issue, we propose an ECG pretraining framework with a self-distillation semantic tokenizer (ECG2TOK), which maps continuous ECG signals into discrete labels for self-supervised training. |
Xiaoyan Yuan; Wei Wang; Han Liu; Jian Chen; Xiping Hu; |
| 468 | A Fast-Adaptive Cognitive Diagnosis Framework for Computerized Adaptive Testing Systems Highlight: To this end, this paper proposes a Fast Adaptive Cognitive Diagnosis (FACD) framework, which incorporates dynamic collaborative and personalized diagnosis modules. |
Yuanhao Liu; Yiya You; Shuo Liu; Hong Qian; Ying Qian; Aimin Zhou; |
| 469 | Disentangling Multi-view Representations Via Curriculum Learning with Learnable Prior Highlight: To this end, we propose to disentangle view-consistency and view-specificity and learn them gradually. |
Kai Guo; Jiedong Wang; Xi Peng; Peng Hu; Hao Wang; |
| 470 | AdvGrasp: Adversarial Attacks on Robotic Grasping from A Physical Perspective Highlight: Unlike studies that focus solely on neural network predictions while overlooking the physical principles of grasping, this paper introduces AdvGrasp, a framework for adversarial attacks on robotic grasping from a physical perspective. |
Xiaofei Wang; Mingliang Han; Tianyu Hao; Cegang Li; Yunbo Zhao; Keke Tang; |
| 471 | Explanatory Capabilities of Large Language Models in Prescriptive Process Monitoring (Extended Abstract) Highlight: In this paper, we explore the use of Large Language Models (LLMs) to generate explanations for PrPM recommendations. |
Kateryna Kubrak; Lana Botchorishvili; Fredrik Milani; Alexander Nolte; Marlon Dumas; |
| 472 | Scan-and-Print: Patch-level Data Summarization and Augmentation for Content-aware Layout Generation in Poster Design Highlight: To perceive the background images, existing work demanded a high parameter count that far exceeds the size of available training data, which has impeded the model’s real-time performance and generalization ability. To address these challenges, we proposed a patch-level data summarization and augmentation approach, vividly named Scan-and-Print. |
HsiaoYuan Hsu; Yuxin Peng; |
| 473 | FedBG: Proactively Mitigating Bias in Cross-Domain Graph Federated Learning Using Background Data Highlight: Therefore, adjusting the bias before it occurs will hopefully address the learning difficulties caused by the skew. In view of this, we employ background graph data, which works as reference information for local training, to proactively correct bias before it occurs. |
Sheng Huang; Lele Fu; Tianchi Liao; Bowen Deng; Chuanfu Zhang; Chuan Chen; |
| 474 | Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution Highlight: Specifically, we introduce the SAM-Noise Module, which refines Gaussian noise using segmentation masks to preserve spatial and semantic features. |
Zihang Liu; Zhenyu Zhang; Hao Tang; |
| 475 | Distribution-Aware Online Learning for Urban Spatiotemporal Forecasting on Streaming Data Highlight: Existing works often overlook these dynamic shifts, limiting their ability to adapt to evolving trends effectively. To address this challenge, we propose DOL, a novel Distribution-aware Online Learning framework designed to handle the unique shifts in urban ST streams. |
Chengxin Wang; Gary Tan; Swagato Barman Roy; Beng Chin Ooi; |
| 476 | Learning Causally Disentangled Representations for Fair Personality Detection Highlight: We propose an Interventional Personality Detection Network (IPDN) to learn implicit confounders in user-generated posts and exploit the true causal effect to train the detection model. |
Yangfu Zhu; Meiling Li; Yuting Wei; Di Liu; Yuqing Li; Bin Wu; |
| 477 | Hallucination Reduction in Video-Language Models Via Hierarchical Multimodal Consistency Highlight: This limitation results in a biased understanding of the semantics between visual concepts, leading to hallucinations. To address this challenge, we propose a Multi-level Multimodal Alignment (MMA) framework that leverages a text encoder and semantic discriminative loss to achieve multi-level alignment. |
Jisheng Dang; Shengjun Deng; Haochen Chang; Teng Wang; Bimei Wang; Shude Wang; Nannan Zhu; Guo Niu; Jingwen Zhao; Jizhao Liu; |
| 478 | Federated Multi-view Graph Clustering with Incomplete Attribute Imputation Highlight: To mitigate the view incompleteness issue and simultaneously maintain privacy and efficiency, we propose a novel Federated Multi-view Graph Clustering with Incomplete Attribute Imputation (FMVC-IAI). |
Wei Feng; Zeyu Bi; Qianqian Wang; Bo Dong; |
| 479 | Tensorial Multi-view Clustering with Deep Anchor Graph Projection Highlight: However, existing tensorized MVC methods generally overlook deep structures within each view and rely on post-processing to derive clustering results, leading to potential information loss and degraded performance. To address these issues, we develop Tensorial Multi-view Clustering with Deep Anchor Graph Projection (TMVC-DAGP), which performs deep projection on the anchor graph, thus improving model scalability. |
Wei Feng; Dongyuvan Wei; Qianqian Wang; Bo Dong; |
| 480 | A Simple Yet Effective Hypergraph Clustering Network Highlight: High computational demands hinder their real-world application. To address the above issues, we propose a simple yet effective Hypergraph Clustering Network framework (HCN). |
Qianqian Wang; Bowen Zhao; Zhengming Ding; Xiangdong Zhang; Quanxue Gao; |
| 481 | Efficient Multi-view Clustering Via Reinforcement Contrastive Learning Highlight: Contrastive multi-view clustering has demonstrated remarkable potential in complex data analysis, yet existing approaches face two critical challenges: difficulty in constructing high-quality positive and negative pairs and high computational overhead due to static optimization strategies. To address these challenges, we propose an innovative efficient Multi-View Clustering framework with Reinforcement Contrastive Learning (EMVCRCL). |
Qianqian Wang; Haiming Xu; Zihao Zhang; Zhiqiang Tao; Quanxue Gao; |
| 482 | Self-calibration Enhanced Whole Slide Pathology Image Analysis Highlight: Existing methods fail to simultaneously extract global structural and local detail features for comprehensive pathology image analysis efficiently. To address these limitations, we propose a self-calibration enhanced framework for whole slide pathology image analysis, comprising three components: a global branch, a focus predictor, and a detailed branch. |
Haoming Luo; Xiaotian Yu; Shengxuming Zhang; Jiabin Xia; Jian Yang; Yuning Sun; Xiuming Zhang; Jing Zhang; Zunlei Feng; |
| 483 | PerfSeer: An Efficient and Accurate Deep Learning Models Performance Predictor Highlight: To address this, we represent a model as a graph that includes the topology, along with node, edge, and global features, all of which are crucial for effectively capturing the performance of the model. Based on this representation, we propose PerfSeer, a novel predictor that uses a Graph Neural Network (GNN)-based performance prediction model, SeerNet. |
Xinlong Zhao; Jiande Sun; Jia Zhang; Tong Liu; Ke Liu; |
| 484 | Egocentric Object-Interaction Anticipation with Retentive and Predictive Learning Highlight: These limitations stem from a lack of focus on retention, i.e., retaining long-term object-centric interactions, and prediction, i.e., future-centric encoding and future uncertainty modeling. We introduce EgoAnticipator, a novel Retentive and Predictive Learning framework that addresses these challenges. |
Guo Chen; Yifei Huang; Yin-dong Zheng; Yicheng Liu; Jiahao Wang; Tong Lu; |
| 485 | On The Power of Optimism in Constrained Online Convex Optimization Highlight: We present Optimistic-COCO, an adaptive gradient-based algorithm that incorporates optimistic design with the Lyapunov optimization technique. |
Haobo Zhang; Hengquan Guo; Xin Liu; |
| 486 | Differentiable Prompt Learning for Vision Language Models Highlight: How to automate the continuous prompt design is an underexplored area, and a fundamental question arises: is a manually designed deep prompt strategy optimal? To answer this question, we propose a method dubbed differentiable prompt learning (DPL). |
Zhenhan Huang; Tejaswini Pedapati; Pin-Yu Chen; Jianxi Gao; |
| 487 | Expanding The Category of Classifiers with LLM Supervision Highlight: Motivated by the widespread success of large language models (LLMs), we introduce an LLM-driven framework for class-incremental learning that removes the need for human intervention, termed Classifier Expansion with Multi-vIew LLM knowledge (CEMIL). |
Derui Lyu; Xiangyu Wang; Taiyu Ban; Lyuzhou Chen; Xiren Zhou; Huanhuan Chen; |
| 488 | SyncGaussian: Stable 3D Gaussian-Based Talking Head Generation with Enhanced Lip Sync Via Discriminative Speech Features Highlight: Although methods based on 3D Gaussian Splatting (3DGS) offer a promising solution via point-based deformation, they suffer from inconsistent head dynamics and mismatched mouth movements due to unstable Gaussian initialization and incomplete speech features. To overcome these limitations, we introduce SyncGaussian, a 3DGS-based framework that ensures stable head poses, enhanced lip sync, and realistic appearances with real-time rendering. |
Ke Liu; Jiwei Wei; Shiyuan He; Zeyu Ma; Chaoning Zhang; Ning Xie; Yang Yang; |
| 489 | Logic Distillation: Learning from Code Function By Function for Decision-making Tasks Highlight: To tackle the identified challenges, we propose a novel framework called Logic Distillation (LD). |
Dong Chen; Shilin Zhang; Fei Gao; Yueting Zhuang; Siliang Tang; Qidong Liu; Mingliang Xu; |
| 490 | Enhancing Sampling Protocol for Point Cloud Classification Against Corruptions Highlight: As a result, these protocols are highly vulnerable to noise, posing significant safety risks in critical applications like autonomous driving. To address these issues, we propose an enhanced point cloud sampling protocol, PointSP, designed to improve robustness against point cloud corruptions. |
Chongshou Li; Pin Tang; Tianrui Li; Yuheng Liu; Xinke Li; |
| 491 | CompLex: Music Theory Lexicon Constructed By Autonomous Agents for Automatic Music Generation Highlight: We introduce a novel automatic music lexicon construction model that generates a lexicon, named CompLex, comprising 37,432 items derived from just 9 manually input category keywords and 5 sentence prompt templates. |
Zhejing Hu; Yan Liu; Gong Chen; Bruce X. B. Yu; |
| 492 | KnowMDD: Knowledge-guided Cross Contrastive Learning for Major Depressive Disorder Diagnosis Highlight: In this paper, we propose KnowMDD, a novel knowledge-guided cross contrastive learning framework for MDD diagnosis. |
Anchen Lin; Weikun Wang; Haijun Han; Fanwei Zhu; Qi Ma; Zengwei Zheng; Binbin Zhou; |
| 493 | Higher-order Logical Knowledge Representation Learning Highlight: To this end, we propose a higher-order logical knowledge representation learning method, named LORE, which leverages network motifs, the patterns/subgraphs that naturally capture the structural information in graphs, to extract higher-order features and ultimately, learn effective representations of knowledge graphs. |
Suixue Wang; Weiliang Huo; Shilin Zhang; Qingchen Zhang; |
| 494 | POMP: Pathology-omics Multimodal Pre-training Framework for Cancer Survival Prediction Highlight: In this paper, we propose POMP, a pathology-omics multimodal pre-training framework jointly learned with three training tasks for integrating pathological images and omics data for cancer survival prediction. |
Suixue Wang; Shilin Zhang; Huiyuan Lai; Weiliang Huo; Qingchen Zhang; |
| 495 | Explainable Graph Neural Networks Via Structural Externalities Highlight: In this work, we propose a novel explainability framework, GraphEXT, which leverages cooperative game theory and the concept of social externalities. |
Lijun Wu; Dong Hao; Zhiyi Fan; |
| 496 | Empowering Vision Transformers with Multi-Scale Causal Intervention for Long-Tailed Image Classification Highlight: This paper investigates the influence of existing causal models on CNNs and ViT variants, highlighting that ViT’s global feature representation makes it hard for causal methods to model associations between fine-grained features and predictions, which leads to difficulties in classifying tail classes with similar visual appearance. To address these issues, this paper proposes TSCNet, a two-stage causal modeling method to discover fine-grained causal associations through multi-scale causal interventions. |
Xiaoshuo Yan; Zhaochuan Li; Lei Meng; Zhuang Qi; Wei Wu; Zixuan Li; Xiangxu Meng; |
| 497 | Unlocking The Potential of Lightweight Quantized Models for Deepfake Detection Highlight: In this paper, we propose a low-bit quantization framework for lightweight and efficient deepfake detection. |
Renshuai Tao; Ziheng Qin; Yifu Ding; Chuangchuang Tan; Jiakai Wang; Wei Wang; |
| 498 | Enhanced Unsupervised Discriminant Dimensionality Reduction for Nonlinear Data Highlight: Moreover, the subspace dimension learned by LDA must be smaller than the number of clusters, which limits its practical applications. To address these issues, we propose a novel unsupervised LDA method that combines centerless K-means and LDA. |
Qianqian Wang; Mengping Jiang; Wei Feng; Zhengming Ding; |
| 499 | High-Fidelity Road Network Generation with Latent Diffusion Models Highlight: In this paper, we propose a novel approach, called GraphWalker, to generate high-fidelity road network graphs from raw trajectories in an end-to-end manner. |
Jinming Wang; Hongkai Wen; Geyong Min; Man Luo; |
| 500 | Balancing Invariant and Specific Knowledge for Domain Generalization with Online Knowledge Distillation Highlight: However, existing approaches often overlook domain-specific knowledge and rely on an offline distillation strategy, limiting the effectiveness of knowledge transfer. To address these limitations, we propose Balanced Online knowLedge Distillation (BOLD). |
Di Zhao; Jingfeng Zhang; Hongsheng Hu; Philippe Fournier-Viger; Gillian Dobbie; Yun Sing Koh; |
This table only includes 500 papers selected by our daily digest algorithm. To continue with the full list (~1,300 papers), please visit Paper Digest: IJCAI-2025 (Full List).