Paper Digest: AAAI 2023 Highlights
The AAAI Conference on Artificial Intelligence (AAAI) is one of the top artificial intelligence conferences in the world. In 2023, it was held in Washington, D.C.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights and summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: AAAI 2023 Highlights
# | Paper | Author(s) |
---|---|---|
1 | Back to The Future: Toward A Hybrid Architecture for Ad Hoc Teamwork Highlight: Our architecture builds on the principles of step-wise refinement and ecological rationality to enable an ad hoc agent to perform non-monotonic logical reasoning with prior commonsense domain knowledge and models learned rapidly from limited examples to predict the behavior of other agents. | Hasra Dodampegama; Mohan Sridharan; |
2 | Reducing ANN-SNN Conversion Error Through Residual Membrane Potential Highlight: In this paper, we make a detailed analysis of unevenness error and divide it into four categories. | Zecheng Hao; Tong Bu; Jianhao Ding; Tiejun Huang; Zhaofei Yu; |
3 | Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning Highlight: To tackle the challenges of visual perception and logic reasoning on RPMs, we propose a Hierarchical ConViT with Attention-based Relational Reasoner (HCV-ARR). | Wentao He; Jialu Zhang; Jianfeng Ren; Ruibin Bai; Xudong Jiang; |
4 | Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse Highlight: In this study, we model the visual cortex with deep SNNs for the first time, and also with a wide range of state-of-the-art deep CNNs and ViTs for comparison. | Liwei Huang; Zhengyu Ma; Liutao Yu; Huihui Zhou; Yonghong Tian; |
5 | A Semi-parametric Model for Decision Making in High-Dimensional Sensory Discrimination Tasks Highlight: Approaches to characterizing high-dimensional sensory spaces either require strong parametric assumptions about these additional contextual dimensions, or fail to leverage known properties of classical psychometric curves. We overcome both limitations by introducing a semi-parametric model of sensory discrimination that applies traditional psychophysical models along a stimulus intensity dimension, but puts Gaussian process (GP) priors on the parameters of these models with respect to the remaining dimensions. | Stephen Keeley; Benjamin Letham; Craig Sanders; Chase Tymms; Michael Shvartsman; |
6 | A Machine with Short-Term, Episodic, and Semantic Memory Systems Highlight: Inspired by the cognitive science theory of the explicit human memory systems, we have modeled an agent with short-term, episodic, and semantic memory systems, each of which is modeled with a knowledge graph. | Taewoon Kim; Michael Cochez; Vincent Francois-Lavet; Mark Neerincx; Piek Vossen; |
7 | Persuasion Strategies in Advertisements Highlight: Motivated by persuasion literature in social psychology and marketing, we introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies. | Yaman Kumar; Rajat Jha; Arunim Gupta; Milan Aggarwal; Aditya Garg; Tushar Malyan; Ayush Bhardwaj; Rajiv Ratn Shah; Balaji Krishnamurthy; Changyou Chen; |
8 | Intensity-Aware Loss for Dynamic Facial Expression Recognition in The Wild Highlight: Nevertheless, if the expressions with different intensities are treated equally, the features learned by the networks will have large intra-class and small inter-class differences, which are harmful to DFER. To tackle this problem, we propose the global convolution-attention block (GCA) to rescale the channels of the feature maps. | Hanting Li; Hongjing Niu; Zhaoqing Zhu; Feng Zhao; |
9 | AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work Highlight: We introduce AVCAffe, the first Audio-Visual dataset consisting of Cognitive load and Affect attributes. | Pritam Sarkar; Aaron Posen; Ali Etemad; |
10 | ESL-SNNs: An Evolutionary Structure Learning Strategy for Spiking Neural Networks Highlight: Our work presents a brand-new approach for sparse training of SNNs from scratch with biologically plausible evolutionary mechanisms, closing the gap in the expressibility between sparse training and dense training. | Jiangrong Shen; Qi Xu; Jian K. Liu; Yueming Wang; Gang Pan; Huajin Tang; |
11 | Zero-Shot Linear Combinations of Grounded Social Interactions with Linear Social MDPs Highlight: How an agent responds socially should depend on what it thinks the other agent is doing at that point in time. To encode this notion, we take linear combinations of social interactions as defined in Social MDPs, and compute the weights on those combinations on the fly depending on the estimated goals of other agents. | Ravi Tejwani; Yen-Ling Kuo; Tianmin Shu; Bennett Stankovits; Dan Gutfreund; Joshua B. Tenenbaum; Boris Katz; Andrei Barbu; |
12 | Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition Highlight: Here we introduce four types of neuronal dynamics to post-process the sequential patterns generated from the spiking transformer to get the complex dynamic neuron improved spiking transformer neural network (DyTr-SNN). | Qingyu Wang; Tielin Zhang; Minglun Han; Yi Wang; Duzhen Zhang; Bo Xu; |
13 | Self-Supervised Graph Learning for Long-Tailed Cognitive Diagnosis Highlight: To relieve the situation, we proposed a Self-supervised Cognitive Diagnosis (SCD) framework which leverages the self-supervised manner to assist the graph-based cognitive diagnosis, then the performance on those students with sparse data can be improved. | Shanshan Wang; Zhen Zeng; Xun Yang; Xingyi Zhang; |
14 | CMNet: Contrastive Magnification Network for Micro-Expression Recognition Highlight: To this end, we provide a reliable scheme to extract intensity clues while considering their variation on the time scale. | Mengting Wei; Xingxun Jiang; Wenming Zheng; Yuan Zong; Cheng Lu; Jiateng Liu; |
15 | Disentangling Reafferent Effects By Doing Nothing Highlight: Toward the development of more general agents, we develop a framework that enables agents to disentangle self-caused and externally-caused sensory effects. | Benedict Wilkins; Kostas Stathis; |
16 | Learning Temporal-Ordered Representation for Spike Streams Based on Discrete Wavelet Transforms Highlight: In this paper, we propose to mine temporal-robust features of spikes in time-frequency space with wavelet transforms. | Jiyuan Zhang; Shanshan Jia; Zhaofei Yu; Tiejun Huang; |
17 | ScatterFormer: Locally-Invariant Scattering Transformer for Patient-Independent Multispectral Detection of Epileptiform Discharges Highlight: In this work, we propose Scattering Transformer (ScatterFormer), an invariant scattering transform-based hierarchical Transformer that specifically pays attention to subtle features. | Ruizhe Zheng; Jun Li; Yi Wang; Tian Luo; Yuguo Yu; |
18 | Progress and Limitations of Deep Networks to Recognize Objects in Unusual Poses Highlight: We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks for image classification. We show that classifying these images is still a challenge for all networks tested, with an average accuracy drop of 29.5% compared to when the objects are presented upright. | Amro Abbas; Stéphane Deny; |
19 | Denoising After Entropy-Based Debiasing: A Robust Training Method for Dataset Bias with Noisy Labels Highlight: In this study, we find that earlier approaches that used the provided labels to quantify difficulty could be affected by the small proportion of noisy labels. | Sumyeong Ahn; Se-Young Yun; |
20 | Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers Highlight: Current attempts to use ‘general’ input features for model interpretation assume access to a dataset containing those features, which biases the interpretation. Addressing the gap, we introduce a new perspective of input-agnostic saliency mapping that computationally estimates the high-level features attributed by the model to its outputs. | Naveed Akhtar; Mohammad Amir Asim Khan Jalwana; |
21 | Deep Digging Into The Generalization of Self-Supervised Monocular Depth Estimation Highlight: In this paper, we investigate the backbone networks (e.g., CNNs, Transformers, and CNN-Transformer hybrid models) toward the generalization of monocular depth estimation. | Jinwoo Bae; Sungho Moon; Sunghoon Im; |
22 | Self-Contrastive Learning: Single-Viewed Supervised Contrastive Framework Using Sub-network Highlight: To exploit the strength of multi-views while avoiding the high computation cost, we introduce a multi-exit architecture that outputs multiple features of a single image in a single-viewed framework. | Sangmin Bae; Sungnyun Kim; Jongwoo Ko; Gihun Lee; Seungjong Noh; Se-Young Yun; |
23 | Layout Representation Learning with Spatial and Structural Hierarchies Highlight: We present a novel hierarchical modeling method for layout representation learning, the core of design documents (e.g., user interface, poster, template). | Yue Bai; Dipu Manandhar; Zhaowen Wang; John Collomosse; Yun Fu; |
24 | Cross-Modal Label Contrastive Learning for Unsupervised Audio-Visual Event Localization Highlight: To this end, we propose cross-modal label contrastive learning to exploit multi-modal information among unlabeled audio and visual streams as self-supervision signals. | Peijun Bao; Wenhan Yang; Boon Poh Ng; Meng Hwa Er; Alex C. Kot; |
25 | Multi-Level Compositional Reasoning for Interactive Instruction Following Highlight: The tasks given to the agents are often composite thus are challenging as completing them require to reason about multiple subtasks, e.g., bring a cup of coffee. To address the challenge, we propose to divide and conquer it by breaking the task into multiple subgoals and attend to them individually for better navigation and interaction. | Suvaansh Bhambri; Byeonghwi Kim; Jonghyun Choi; |
26 | Self-Supervised Image Local Forgery Detection By JPEG Compression Trace Highlight: In this paper, we firstly analyzed the JPEG compression traces which are mainly caused by different JPEG compression chains, and designed a trace extractor to learn such traces. Then, we utilized the trace extractor as the backbone and trained self-supervised to strengthen the discrimination ability of learned traces. | Xiuli Bi; Wuqing Yan; Bo Liu; Bin Xiao; Weisheng Li; Xinbo Gao; |
27 | VASR: Visual Analogies of Situation Recognition Highlight: We introduce a novel task, Visual Analogies of Situation Recognition, adapting the classical word-analogy task into the visual domain. | Yonatan Bitton; Ron Yosef; Eliyahu Strugo; Dafna Shahaf; Roy Schwartz; Gabriel Stanovsky; |
28 | Parametric Surface Constrained Upsampler Network for Point Cloud Highlight: However, these approaches are prone to produce outlier points due to the lack of explicit surface-level constraints. To solve this problem, we introduce a novel surface regularizer into the upsampler network by forcing the neural network to learn the underlying parametric surface represented by bicubic functions and rotation functions, where the new generated points are then constrained on the underlying surface. | Pingping Cai; Zhenyao Wu; Xinyi Wu; Song Wang; |
29 | Explicit Invariant Feature Induced Cross-Domain Crowd Counting Highlight: In this paper, we propose an innovative explicit Invariant Feature induced Cross-domain Knowledge Transformation framework to address the inconsistent domain-invariant features of different domains. | Yiqing Cai; Lianggangxu Chen; Haoyue Guan; Shaohui Lin; Changhong Lu; Changbo Wang; Gaoqi He; |
30 | Painterly Image Harmonization in Dual Domains Highlight: In this work, we propose a novel painterly harmonization network consisting of a dual-domain generator and a dual-domain discriminator, which harmonizes the composite image in both spatial domain and frequency domain. | Junyan Cao; Yan Hong; Li Niu; |
31 | MMTN: Multi-Modal Memory Transformer Network for Image-Report Consistent Medical Report Generation Highlight: However, they do not fully explore the relationships between multi-modal medical data, and generate inaccurate and inconsistent reports. To address these issues, this paper proposes a Multi-modal Memory Transformer Network (MMTN) to cope with multi-modal medical data for generating image-report consistent medical reports. | Yiming Cao; Lizhen Cui; Lei Zhang; Fuqiang Yu; Zhen Li; Yonghui Xu; |
32 | KT-Net: Knowledge Transfer for Unpaired 3D Shape Completion Highlight: Unpaired 3D object completion aims to predict a complete 3D shape from an incomplete input without knowing the correspondence between the complete and incomplete shapes. In this paper, we propose the novel KTNet to solve this task from the new perspective of knowledge transfer. | Zhen Cao; Wenxiao Zhang; Xin Wen; Zhen Dong; Yu-Shen Liu; Xiongwu Xiao; Bisheng Yang; |
33 | Deconstructed Generation-Based Zero-Shot Model Highlight: However, current literature has overlooked the fundamental principles of these methods and has made limited progress in a complex manner. In this paper, we aim to deconstruct the generator-classifier framework and provide guidance for its improvement and extension. | Dubing Chen; Yuming Shen; Haofeng Zhang; Philip H.S. Torr; |
34 | Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in The Wild Highlight: In this work, we tackle the challenging task of jointly tracking hand object poses and reconstructing their shapes from depth point cloud sequences in the wild, given the initial poses at frame 0. | Jiayi Chen; Mi Yan; Jiazhao Zhang; Yinzhen Xu; Xiaolong Li; Yijia Weng; Li Yi; Shuran Song; He Wang; |
35 | Amodal Instance Segmentation Via Prior-Guided Expansion Highlight: To this end, we propose a prior-guided expansion framework, which builds on a two-stage segmentation model (i.e., Mask R-CNN) and performs box-level (resp., pixel-level) expansion for amodal box (resp., mask) prediction, by retrieving regression (resp., flow) transformations from a memory bank of expansion prior. | Junjie Chen; Li Niu; Jianfu Zhang; Jianlou Si; Chen Qian; Liqing Zhang; |
36 | SwinRDM: Integrate SwinRNN with Diffusion Model Towards High-Resolution and High-Quality Weather Forecasting Highlight: We propose to leverage a two-step strategy to achieve high-resolution predictions at 0.25-degree considering the trade-off between computation memory and forecasting accuracy. | Lei Chen; Fei Du; Yuan Hu; Zhibin Wang; Fan Wang; |
37 | Take Your Model Further: A General Post-refinement Network for Light Field Disparity Estimation Via BadPix Correction Highlight: In this paper, we propose a novel idea called Bad Pixel (BadPix) correction for method modeling, then implement a general post-refinement network for LF disparity estimation: Bad-pixel Correction Network (BpCNet). | Rongshan Chen; Hao Sheng; Da Yang; Sizhe Wang; Zhenglong Cui; Ruixuan Cong; |
38 | Improving Dynamic HDR Imaging with Fusion Transformer Highlight: In this paper, we propose a transformer model for HDR imaging. | Rufeng Chen; Bolun Zheng; Hua Zhang; Quan Chen; Chenggang Yan; Gregory Slabaugh; Shanxin Yuan; |
39 | Self-Supervised Joint Dynamic Scene Reconstruction and Optical Flow Estimation for Spiking Camera Highlight: In this paper, we propose a self-supervised joint learning framework for optical flow estimation and reconstruction of spiking camera. | Shiyan Chen; Zhaofei Yu; Tiejun Huang; |
40 | Bidirectional Optical Flow NeRF: High Accuracy and High Quality Under Fewer Views Highlight: However, due to the lack of spatial consistency of the single-depth image and the poor performance of depth estimation with fewer views, the existing methods still have challenges in addressing this problem. So this paper proposes Bidirectional Optical Flow NeRF (BOF-NeRF), which addresses this problem by mining optical flow information between 2D images. | Shuo Chen; Binbin Yan; Xinzhu Sang; Duo Chen; Peng Wang; Xiao Guo; Chongli Zhong; Huaming Wan; |
41 | Scalable Spatial Memory for Scene Rendering and Navigation Highlight: In this work, we introduce Scene Memory Network (SMN) to achieve online spatial memory construction and expansion for view rendering in novel scenes. | Wen-Cheng Chen; Chu-Song Chen; Wei-Chen Chiu; Min-Chun Hu; |
42 | Hybrid CNN-Transformer Feature Fusion for Single Image Deraining Highlight: To this end, rich local-global information representations are increasingly indispensable for better satisfying rain removal. In this paper, we propose a lightweight Hybrid CNN-Transformer Feature Fusion Network (dubbed as HCT-FFN) in a stage-by-stage progressive manner, which can harmonize these two architectures to help image restoration by leveraging their individual learning strengths. | Xiang Chen; Jinshan Pan; Jiyang Lu; Zhentao Fan; Hao Li; |
43 | MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection Highlight: In addition, we empirically found that existing approaches that use feature magnitudes to represent the degree of anomalies typically ignore the effects of scene variations, and hence result in sub-optimal performance due to the inconsistency of feature magnitudes across scenes. To address this issue, we propose the Feature Amplification Mechanism and a Magnitude Contrastive Loss to enhance the discriminativeness of feature magnitudes for detecting anomalies. | Yingxian Chen; Zhengzhe Liu; Baoheng Zhang; Wilton Fok; Xiaojuan Qi; Yik-Chung Wu; |
44 | Tagging Before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval Highlight: In this paper, we integrate multi-modal information in an explicit manner by tagging, and use the tags as the anchors for better video-text alignment. | Yizhen Chen; Jie Wang; Lijian Lin; Zhongang Qi; Jin Ma; Ying Shan; |
45 | DUET: Cross-Modal Semantic Grounding for Contrastive Zero-Shot Learning Highlight: In this paper, we present a transformer-based end-to-end ZSL method named DUET, which integrates latent semantic knowledge from the pre-trained language models (PLMs) via a self-supervised multi-modal learning paradigm. | Zhuo Chen; Yufeng Huang; Jiaoyan Chen; Yuxia Geng; Wen Zhang; Yin Fang; Jeff Z. Pan; Huajun Chen; |
46 | Imperceptible Adversarial Attack Via Invertible Neural Networks Highlight: In this paper, we introduce a novel Adversarial Attack via Invertible Neural Networks (AdvINN) method to produce robust and imperceptible adversarial examples. | Zihan Chen; Ziyue Wang; Jun-Jie Huang; Wentao Zhao; Xiao Liu; Dejian Guan; |
47 | Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding Highlight: In this paper, we propose an aggregated memory-based cross-modality deep metric learning framework, which benefits from the increasing number of learned modality-aware and modality-agnostic centroid proxies for cluster contrast and mutual information learning. | De Cheng; Xiaolong Wang; Nannan Wang; Zhen Wang; Xiaoyu Wang; Xinbo Gao; |
48 | User-Controllable Arbitrary Style Transfer Via Entropy Regularization Highlight: In this paper, we propose a novel solution ensuring both efficiency and diversity for generating multiple user-controllable AST results by systematically modulating AST behavior at run-time. | Jiaxin Cheng; Yue Wu; Ayush Jaiswal; Xu Zhang; Pradeep Natarajan; Prem Natarajan; |
49 | Neural Architecture Search for Wide Spectrum Adversarial Robustness Highlight: In this research, we aim to find Neural Architectures that have improved robustness on a wide range of adversarial noise strengths through Neural Architecture Search. | Zhi Cheng; Yanxi Li; Minjing Dong; Xiu Su; Shan You; Chang Xu; |
50 | Adversarial Alignment for Source Free Object Detection Highlight: While most existing SFOD methods generate pseudo labels via a source-pretrained model to guide training, these pseudo labels usually contain high noises due to heavy domain discrepancy. In order to obtain better pseudo supervisions, we divide the target domain into source-similar and source-dissimilar parts and align them in the feature space by adversarial learning. Specifically, we design a detection variance-based criterion to divide the target domain. | Qiaosong Chu; Shuyan Li; Guangyi Chen; Kai Li; Xiu Li; |
51 | Weakly Supervised 3D Multi-Person Pose Estimation for Large-Scale Scenes Based on Monocular Camera and Single LiDAR Highlight: Since LiDAR can capture accurate depth information in long-range scenes, it can benefit both the global localization of individuals and the 3D pose estimation by providing rich geometry features. Motivated by this, we propose a monocular camera and single LiDAR-based method for 3D multi-person pose estimation in large-scale scenes, which is easy to deploy and insensitive to light. | Peishan Cong; Yiteng Xu; Yiming Ren; Juze Zhang; Lan Xu; Jingya Wang; Jingyi Yu; Yuexin Ma; |
52 | OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement Highlight: However, previous methods that depend on 3D convolution or frequent multi-head self-attention operations bring huge computations. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. | Mingyue Cui; Junhua Long; Mingjian Feng; Boyang Li; Huang Kai; |
53 | Dual-Domain Attention for Image Deblurring Highlight: In this study, to bridge the gaps between degraded/sharp image pairs in the spatial and frequency domains simultaneously, we develop the dual-domain attention mechanism for image deblurring. | Yuning Cui; Yi Tao; Wenqi Ren; Alois Knoll; |
54 | Multi-Resolution Monocular Depth Map Fusion By Self-Supervised Gradient-Based Composition Highlight: We find that increasing input resolution is helpful to preserve more local details while the estimation at low resolution is more accurate globally. Therefore, we propose a novel depth map fusion module to combine the advantages of estimations with multi-resolution inputs. | Yaqiao Dai; Renjiao Yi; Chenyang Zhu; Hongjun He; Kai Xu; |
55 | Improving Crowded Object Detection Via Copy-Paste Highlight: In this paper, we first underline two main effects of the crowdedness issue: 1) IoU-confidence correlation disturbances (ICD) and 2) confused de-duplication (CDD). | Jiangfan Deng; Dewen Fan; Xiaosong Qiu; Feng Zhou; |
56 | Defending Backdoor Attacks on Vision Transformer Via Patch Processing Highlight: To the best of our knowledge, this paper presents the first defensive strategy that utilizes a unique characteristic of ViTs against backdoor attacks. | Khoa D. Doan; Yingjie Lao; Peng Yang; Ping Li; |
57 | Head-Free Lightweight Semantic Segmentation with Linear Transformer Highlight: In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). | Bo Dong; Pichao Wang; Fan Wang; |
58 | Hierarchical Contrast for Unsupervised Skeleton-Based Action Representation Learning Highlight: This paper targets unsupervised skeleton-based action representation learning and proposes a new Hierarchical Contrast (HiCo) framework. | Jianfeng Dong; Shengkai Sun; Zhonglin Liu; Shujie Chen; Baolong Liu; Xun Wang; |
59 | Exploring Tuning Characteristics of Ventral Stream’s Neurons for Few-Shot Image Classification Highlight: Specifically, we computationally model two groups of neurons found in ventral stream which are respectively sensitive to shape cues and color cues. | Lintao Dong; Wei Zhai; Zheng-Jun Zha; |
60 | Incremental-DETR: Incremental Few-Shot Object Detection Via Self-Supervised Learning Highlight: In this paper, we propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector. | Na Dong; Yongqiang Zhang; Mingli Ding; Gim Hee Lee; |
61 | PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers Highlight: This paper explores a better prediction target for BERT pre-training of vision transformers. | Xiaoyi Dong; Jianmin Bao; Ting Zhang; Dongdong Chen; Weiming Zhang; Lu Yuan; Dong Chen; Fang Wen; Nenghai Yu; Baining Guo; |
62 | Domain-General Crowd Counting in Unseen Scenarios Highlight: In this paper, we instead target to train a model based on a single source domain which can generalize well on any unseen domain. | Zhipeng Du; Jiankang Deng; Miaojing Shi; |
63 | Few-Shot Defect Image Generation Via Defect-Aware Feature Manipulation Highlight: We propose the first defect image generation method in the challenging few-shot cases. | Yuxuan Duan; Yan Hong; Li Niu; Liqing Zhang; |
64 | Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis Highlight: In this paper, we present Frido, a Feature Pyramid Diffusion model performing a multi-scale coarse-to-fine denoising process for image synthesis. | Wan-Cyuan Fan; Yen-Chun Chen; DongDong Chen; Yu Cheng; Lu Yuan; Yu-Chiang Frank Wang; |
65 | Target-Free Text-Guided Image Manipulation Highlight: We tackle the problem of target-free text-guided image manipulation, which requires one to modify the input reference image based on the given text instruction, while no ground truth target image is observed during training. To address this challenging task, we propose a Cyclic-Manipulation GAN (cManiGAN) in this paper, which is able to realize where and how to edit the image regions of interest. | Wan-Cyuan Fan; Cheng-Fu Yang; Chiao-An Yang; Yu-Chiang Frank Wang; |
66 | One Is All: Bridging The Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions. |
Shuangkang Fang; Weixin Xu; Heng Wang; Yi Yang; Yufeng Wang; Shuchang Zhou; |
67 | Weakly-Supervised Semantic Segmentation for Histopathology Images Based on Dataset Synthesis and Feature Consistency Constraint Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most of these methods are based on class activation maps, which suffer from inaccurate segmentation boundaries. To address this problem, we propose a novel weakly-supervised tissue segmentation framework named PistoSeg, which is implemented in a fully-supervised manner by transferring tissue category labels to pixel-level masks. |
Zijie Fang; Yang Chen; Yifeng Wang; Zhi Wang; Xiangyang Ji; Yongbing Zhang; |
68 | Uncertainty-Aware Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an uncertainty-aware image captioning framework, which parallelly and iteratively operates insertion of discontinuous candidate words between existing words from easy to difficult until converged. |
Zhengcong Fei; Mingyuan Fan; Li Zhu; Junshi Huang; Xiaoming Wei; Xiaolin Wei; |
69 | Unsupervised Domain Adaptation for Medical Image Segmentation By Selective Entropy Constraints and Adaptive Semantic Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new unsupervised domain adaptation framework for cross-modality medical image segmentation. |
Wei Feng; Lie Ju; Lin Wang; Kaimin Song; Xin Zhao; Zongyuan Ge; |
70 | SEFormer: Structure Embedding Transformer for 3D Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Structure-Embedding transFormer (SEFormer), which can not only preserve the local structure as a traditional Transformer but also have the ability to encode the local structure. |
Xiaoyu Feng; Heming Du; Hehe Fan; Yueqi Duan; Yongpan Liu; |
71 | Exploit Domain-Robust Optical Flow in Domain Adaptive Video Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to find a domain-robust clue to construct more reliable supervision signals. |
Yuan Gao; Zilei Wang; Jiafan Zhuang; Yixin Zhang; Junjie Li; |
72 | Scene-Level Sketch-Based Image Retrieval with Minimal Pairwise Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, a more general scene-level SBIR task is explored, where sketches and images can both contain multiple object instances. |
Ce Ge; Jingyu Wang; Qi Qi; Haifeng Sun; Tong Xu; Jianxin Liao; |
73 | Causal Intervention for Human Trajectory Prediction with Cross Attention Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on causal intervention rather than conventional likelihood, we propose a Social Environment ADjustment (SEAD) method, to remove the confounding effect of the social environment. |
Chunjiang Ge; Shiji Song; Gao Huang; |
74 | Point-Teaching: Weakly Semi-supervised Object Detection with Point Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Point-Teaching, a weakly- and semi-supervised object detection framework to fully utilize the point annotations. |
Yongtao Ge; Qiang Zhou; Xinlong Wang; Chunhua Shen; Zhibin Wang; Hao Li; |
75 | Progressive Multi-View Human Mesh Recovery with Self-Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Tackling both challenges, we propose a novel simulation-based training pipeline for multi-view human mesh recovery, which (a) relies on intermediate 2D representations which are more robust to synthetic-to-real domain gap; (b) leverages learnable calibration and triangulation to adapt to more diversified camera setups; and (c) progressively aggregates multi-view information in a canonical 3D space to remove ambiguities in 2D representations. |
Xuan Gong; Liangchen Song; Meng Zheng; Benjamin Planche; Terrence Chen; Junsong Yuan; David Doermann; Ziyan Wu; |
76 | Incremental Image De-raining Via Associative Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we argue the importance of sample diversity in the episodes on the iterative optimization, and propose a novel memory management method, Associative Memory, to achieve incremental image de-raining. |
Yi Gu; Chao Wang; Jie Li; |
77 | Flexible 3D Lane Detection By Hierarchical Shape Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, an end-to-end flexible and hierarchical lane detector is proposed to precisely predict 3D lane lines from point clouds. |
Zhihao Guan; Ruixin Liu; Zejian Yuan; Ao Liu; Kun Tang; Tong Zhou; Erlong Li; Chao Zheng; Shuqi Mei; |
78 | Underwater Ranker: Learn Which Is Better and How to Be Better Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a ranking-based underwater image quality assessment (UIQA) method, abbreviated as URanker. |
Chunle Guo; Ruiqi Wu; Xin Jin; Linghao Han; Weidong Zhang; Zhi Chai; Chongyi Li; |
79 | ShadowFormer: Global Context Helps Shadow Removal Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on that, we propose a Shadow-Interaction Module (SIM) with Shadow-Interaction Attention (SIA) in the bottleneck stage to effectively model the context correlation between shadow and non-shadow regions. |
Lanqing Guo; Siyu Huang; Ding Liu; Hao Cheng; Bihan Wen; |
80 | RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR). |
Longwei Guo; Hao Zhu; Yuanxun Lu; Menghua Wu; Xun Cao; |
81 | RankDNN: Learning to Rank for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new few-shot learning pipeline that casts relevance ranking for image retrieval as binary ranking relation classification. |
Qianyu Guo; Gong Haotong; Xujun Wei; Yanwei Fu; Yizhou Yu; Wenqiang Zhang; Weifeng Ge; |
82 | Social Relation Reasoning Based on Triangular Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we formulate the paradigm of the higher-order constraints in social relations into triangular relational closed-loop structures, i.e., triangular constraints, and further introduce the triangular reasoning graph attention network (TRGAT). |
Yunfei Guo; Fei Yin; Wei Feng; Xudong Yan; Tao Xue; Shuqi Mei; Cheng-Lin Liu; |
83 | CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a free-lunch enhancement method, CALIP, to boost CLIP’s zero-shot performance via a parameter-free attention module. |
Ziyu Guo; Renrui Zhang; Longtian Qiu; Xianzheng Ma; Xupeng Miao; Xuming He; Bin Cui; |
84 | Few-Shot Object Detection Via Variational Feature Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As few-shot object detectors are often trained with abundant base samples and fine-tuned on few-shot novel examples, the learned models are usually biased to base classes and sensitive to the variance of novel examples. To address this issue, we propose a meta-learning framework with two novel feature aggregation schemes. |
Jiaming Han; Yuqiang Ren; Jian Ding; Ke Yan; Gui-Song Xia; |
85 | Generating Transferable 3D Adversarial Point Cloud Via Random Perturbation Factorization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we revisit the transferability of adversarial 3D point clouds. |
Bangyan He; Jian Liu; Yiming Li; Siyuan Liang; Jingzhi Li; Xiaojun Jia; Xiaochun Cao; |
86 | Target-Aware Tracking with Long-Term Context Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most deep trackers still follow the guidance of the siamese paradigms and use a template that contains only the target without any contextual information, which makes it difficult for the tracker to cope with large appearance changes, rapid target movement, and attraction from similar objects. To alleviate the above problem, we propose a long-term context attention (LCA) module that can perform extensive information fusion on the target and its context from long-term frames, and calculate the target correlation while enhancing target features. |
Kaijie He; Canlong Zhang; Sheng Xie; Zhixin Li; Zhiwen Wang; |
87 | Weakly-Supervised Camouflaged Object Detection with Scribble Annotations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the first weakly-supervised COD method, using scribble annotations as supervision. |
Ruozhen He; Qihua Dong; Jiaying Lin; Rynson W.H. Lau; |
88 | Efficient Mirror Detection Via Multi-Level Heterogeneous Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present HetNet (Multi-level Heterogeneous Network), a highly efficient mirror detection network. |
Ruozhen He; Jiaying Lin; Rynson W.H. Lau; |
89 | TransVCL: Attention-Enhanced Video Copy Localization Network with Flexible Supervision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TransVCL: an attention-enhanced video copy localization network, which is optimized directly from initial frame-level features and trained end-to-end with three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for similarity matrix generation, and a temporal alignment module for copied segments localization. |
Sifeng He; Yue He; Minlong Lu; Chen Jiang; Xudong Yang; Feng Qian; Xiaobo Zhang; Lei Yang; Jiandong Zhang; |
90 | Open-Vocabulary Multi-Label Classification Via Multi-Modal Knowledge Transfer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the success of OV-based methods, we propose a novel open-vocabulary framework, named multi-modal knowledge transfer (MKT), for multi-label classification. |
Sunan He; Taian Guo; Tao Dai; Ruizhi Qiao; Xiujun Shu; Bo Ren; Shu-Tao Xia; |
91 | Parameter-Efficient Model Adaptation for Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to study parameter-efficient model adaptation strategies for vision transformers on the image classification task. |
Xuehai He; Chunyuan Li; Pengchuan Zhang; Jianwei Yang; Xin Eric Wang; |
92 | DarkFeat: Noise-Robust Feature Detector and Descriptor for Extremely Low-Light RAW Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose DarkFeat, a deep learning model which directly detects and describes local features from extreme low-light RAW images in an end-to-end manner. |
Yuze He; Yubin Hu; Wang Zhao; Jisheng Li; Yong-Jin Liu; Yuxing Han; Jiangtao Wen; |
93 | GAM: Gradient Attention Module of Optimization for Point Clouds Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper argues that fine-grained geometric information (FGGI) plays an important role in the aggregation of local features. Based on this, we propose a gradient-based local attention module to address the above problem, which is called Gradient Attention Module (GAM). |
Haotian Hu; Fanyi Wang; Zhiwang Zhang; Yaonong Wang; Laifeng Hu; Yanhao Zhang; |
94 | Self-Supervised Learning for Multilevel Skeleton-Based Forgery Detection Via Temporal-Causal Consistency of Actions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose an approach to self-supervised learning of the temporal causality behind human action, which can effectively check TII in skeletal sequences. |
Liang Hu; Dora D. Liu; Qi Zhang; Usman Naseem; Zhong Yuan Lai; |
95 | Self-Emphasizing Network for Continuous Sign Language Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a temporal self-emphasizing module to adaptively emphasize those discriminative frames and suppress redundant ones. |
Lianyu Hu; Liqing Gao; Zekang Liu; Wei Feng; |
96 | Store and Fetch Immediately: Everything Is All You Need for Space-Time Video Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the limitation, this paper proposes an immediate store-and-fetch network to promote long-range correlation learning, where the stored information from the past and future can be refetched to help the representation of the current frame. |
Mengshun Hu; Kui Jiang; Zhixiang Nie; Jiahuan Zhou; Zheng Wang; |
97 | PointCA: Evaluating The Robustness of 3D Point Cloud Completion Models Against Adversarial Examples Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to evaluate the robustness of the completion models, we propose PointCA, the first adversarial attack against 3D point cloud completion models. |
Shengshan Hu; Junwei Zhang; Wei Liu; Junhui Hou; Minghui Li; Leo Yu Zhang; Hai Jin; Lichao Sun; |
98 | High-Resolution Iterative Feedback Network for Camouflaged Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Spotting camouflaged objects that are visually assimilated into the background is tricky for both object detection algorithms and humans who are usually confused or cheated by the perfectly intrinsic similarities between the foreground objects and the background surroundings. To tackle this challenge, we aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries. |
Xiaobin Hu; Shuo Wang; Xuebin Qin; Hang Dai; Wenqi Ren; Donghao Luo; Ying Tai; Ling Shao; |
99 | Leveraging Sub-class Discimination for Compositional Zero-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a simple yet effective approach with leveraging sub-class discrimination. |
Xiaoming Hu; Zilei Wang; |
100 | GPTR: Gestalt-Perception Transformer for Diagram Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a gestalt-perception transformer model for diagram object detection, which is based on an encoder-decoder architecture. |
Xin Hu; Lingling Zhang; Jun Liu; Jinfu Fan; Yang You; Yaqiang Wu; |
101 | Resolving Task Confusion in Dynamic Expansion Architectures for Class Incremental Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We conduct extensive experiments on CIFAR100 and ImageNet100 datasets. |
Bingchen Huang; Zhineng Chen; Peng Zhou; Jiayin Chen; Zuxuan Wu; |
102 | ClassFormer: Exploring Class-Aware Dependency with Transformer for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their strong capability of modeling long-range dependencies, the current methods still give rise to two main concerns in a class-level perspective: (1) intra-class problem: the existing methods lacked in extracting class-specific correspondences of different pixels, which may lead to poor object coverage and/or boundary prediction; (2) inter-class problem: the existing methods failed to model explicit category-dependencies among various objects, which may result in inaccurate localization. In light of these two issues, we propose a novel transformer, called ClassFormer, powered by two appealing transformers, i.e., intra-class dynamic transformer and inter-class interactive transformer, to address the challenge of fully exploring compactness and discrepancy. |
Huimin Huang; Shiao Xie; Lanfen Lin; Ruofeng Tong; Yen-Wei Chen; Hong Wang; Yuexiang Li; Yawen Huang; Yefeng Zheng; |
103 | NLIP: Noise-Robust Language-Image Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, to automatically mitigate the impact of noise by solely mining over existing data, we propose a principled Noise-robust Language-Image Pre-training framework (NLIP) to stabilize pre-training via two schemes: noise-harmonization and noise-completion. |
Runhui Huang; Yanxin Long; Jianhua Han; Hang Xu; Xiwen Liang; Chunjing Xu; Xiaodan Liang; |
104 | Symmetry-Aware Transformer-Based Mirror Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing works mainly focus on integrating the semantic features and structural features to mine specific relations between mirror and non-mirror regions, or introducing mirror properties like depth or chirality to help analyze the existence of mirrors. In this work, we observe that a real object typically forms a loose symmetry relationship with its corresponding reflection in the mirror, which is beneficial in distinguishing mirrors from real objects. |
Tianyu Huang; Bowen Dong; Jiaying Lin; Xiaohui Liu; Rynson W.H. Lau; Wangmeng Zuo; |
105 | AudioEar: Single-View Ear Reconstruction for Personalized Spatial Audio Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: One of the key problems of current spatial audio rendering methods is the lack of personalization based on different anatomies of individuals, which is essential to produce accurate sound source positions. In this work, we address this problem from an interdisciplinary perspective. |
Xiaoyang Huang; Yanjun Wang; Yang Liu; Bingbing Ni; Wenjun Zhang; Jinxian Liu; Teng Li; |
106 | Boosting Point Clouds Rendering Via Radiance Mapping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on boosting the image quality of point clouds rendering with a compact model design. |
Xiaoyang Huang; Yi Zhang; Bingbing Ni; Teng Li; Kai Chen; Wenjun Zhang; |
107 | FreeEnricher: Enriching Face Landmarks Without Additional Cost Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Though dense facial landmark is highly demanded in various scenarios, e.g., cosmetic medicine and facial beautification, most works only consider sparse face alignment. To address this problem, we present a framework that can enrich landmark density by existing sparse landmark datasets, e.g., 300W with 68 points and WFLW with 98 points. |
Yangyu Huang; Xi Chen; Jongyoo Kim; Hao Yang; Chong Li; Jiaolong Yang; Dong Chen; |
108 | PATRON: Perspective-Aware Multitask Model for Referring Expression Grounding Using Embodied Multimodal Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To make matters worse, these models are often trained on datasets collected in non-embodied settings without nonverbal gestures and curated from an exocentric perspective. To address these issues, in this paper, we present a perspective-aware multitask learning model, called PATRON, for relation and object grounding tasks in embodied settings by utilizing verbal utterances and nonverbal cues. |
Md Mofijul Islam; Alexi Gladstone; Tariq Iqbal; |
109 | Unifying Vision-Language Representation Space with Single-Tower Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the hypothesis that an image and caption can be regarded as two different views of the underlying mutual information, and train a model to learn a unified vision-language representation space that encodes both modalities at once in a modality-agnostic manner. |
Jiho Jang; Chaerin Kong; DongHyeon Jeon; Seonhoon Kim; Nojun Kwak; |
110 | Delving Deep Into Pixel Alignment Feature for Accurate Multi-View Human Mesh Recovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Pixel-aligned Feedback Fusion (PaFF) for accurate yet efficient human mesh recovery from multi-view images. |
Kai Jia; Hongwen Zhang; Liang An; Yebin Liu; |
111 | Semi-attention Partition for Occluded Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Semi-Attention Partition (SAP) method to learn well-aligned part features for occluded person re-identification (re-ID). |
Mengxi Jia; Yifan Sun; Yunpeng Zhai; Xinhua Cheng; Yi Yang; Ying Li; |
112 | Fast Online Hashing with Multi-Label Projection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Fast Online Hashing (FOH) method which only updates the binary codes of a small part of the database. |
Wenzhe Jia; Yuan Cao; Junwei Liu; Jie Gui; |
113 | Fourier-Net: Fast Image Registration with Band-Limited Deformation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For high-resolution volumetric image data, this process is however resource-intensive and time-consuming. To tackle this problem, we propose the Fourier-Net, replacing the expansive path in a U-Net style network with a parameter-free model-driven decoder. |
Xi Jia; Joseph Bartlett; Wei Chen; Siyang Song; Tianyang Zhang; Xinxing Cheng; Wenqi Lu; Zhaowen Qiu; Jinming Duan; |
114 | Semi-supervised Deep Large-Baseline Homography Estimation with Progressive Equivalence Constraint Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Homography estimation is erroneous in the case of large-baseline due to the low image overlay and limited receptive field. To address it, we propose a progressive estimation strategy by converting large-baseline homography into multiple intermediate ones; cumulatively multiplying these intermediate items can reconstruct the initial homography. |
Hai Jiang; Haipeng Li; Yuhang Lu; Songchen Han; Shuaicheng Liu; |
115 | Multi-Modality Deep Network for Extreme Learned Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Image-based single-modality compression learning approaches have demonstrated exceptionally powerful encoding and decoding capabilities in the past few years, but suffer from blur and severe semantics loss at extremely low bitrates. To address this issue, we propose a multimodal machine learning method for text-guided image compression, in which the semantic information of text is used as prior information to guide image compression for better compression performance. |
Xuhao Jiang; Weimin Tan; Tian Tan; Bo Yan; Liquan Shen; |
116 | PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Hence, in this paper we advocate the exploitation of the Polar coordinate system and propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird’s-eye-view (BEV) taking as input only multi-camera 2D images. |
Yanqin Jiang; Li Zhang; Zhenwei Miao; Xiatian Zhu; Jin Gao; Weiming Hu; Yu-Gang Jiang; |
117 | 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module. |
Zutao Jiang; Guansong Lu; Xiaodan Liang; Jihua Zhu; Wei Zhang; Xiaojun Chang; Hang Xu; |
118 | FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to further promote the efficiency of PETL to meet the extreme storage constraint in real-world applications. |
Shibo Jie; Zhi-Hong Deng; |
119 | Estimating Reflectance Layer from A Single Image: Integrating Reflectance Guidance and Shadow/Specular Aware Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: It becomes more challenging when the input image contains shadows or specular highlights, which often render an inaccurate estimate of the reflectance layer. Therefore, we propose a two-stage learning method, including reflectance guidance and a Shadow/Specular-Aware (S-Aware) network to tackle the problem. |
Yeying Jin; Ruoteng Li; Wenhan Yang; Robby T. Tan; |
120 | Weakly-Guided Self-Supervised Pretraining for Temporal Activity Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel weakly-guided self-supervised pretraining method for detection. |
Kumara Kahatapitiya; Zhou Ren; Haoxiang Li; Zhenyu Wu; Michael S. Ryoo; Gang Hua; |
121 | Correlation Loss: Enforcing Correlation Between Classification and Localization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by these works, we focus on the correlation between classification and localization and make two main contributions: (i) We provide an analysis about the effects of correlation between classification and localization tasks in object detectors. We identify why correlation affects the performance of various NMS-based and NMS-free detectors, and we devise measures to evaluate the effect of correlation and use them to analyze common detectors. (ii) Motivated by our observations, e.g., that NMS-free detectors can also benefit from correlation, we propose Correlation Loss, a novel plug-in loss function that improves the performance of various object detectors by directly optimizing correlation coefficients: E.g., Correlation Loss on Sparse R-CNN, an NMS-free method, yields 1.6 AP gain on COCO and 1.8 AP gain on Cityscapes dataset. |
Fehmi Kahraman; Kemal Oksuz; Sinan Kalkan; Emre Akbas; |
122 | GuidedMixup: An Efficient Mixup Strategy Guided By Saliency Maps Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods incur a significant computational burden to optimize the mixup mask. From this motivation, we propose a novel saliency-aware mixup method, GuidedMixup, which aims to retain the salient regions in mixup images with low computational overhead. |
Minsoo Kang; Suhyun Kim; |
123 | 3D Human Pose Lifting with Grid Convolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In sharp contrast to them, this paper presents Grid Convolution (GridConv), mimicking the wisdom of regular convolution operations in image space. |
Yangyuxuan Kang; Yuyang Liu; Anbang Yao; Shandong Wang; Enhua Wu; |
124 | Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper systematically studies the impact of mixup under the domain adaptive semantic segmentation task and presents a simple yet effective mixup strategy called Bidirectional Domain Mixup (BDM). |
Daehan Kim; Minseok Seo; Kwanyong Park; Inkyu Shin; Sanghyun Woo; In So Kweon; Dong-Geol Choi; |
125 | Frequency Selective Augmentation for Video Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose frequency augmentation (FreqAug), a spatio-temporal data augmentation method in the frequency domain for video representation learning. |
Jinhyung Kim; Taeoh Kim; Minho Shim; Dongyoon Han; Dongyoon Wee; Junmo Kim; |
126 | Pose-Guided 3D Human Generation in Indoor Scene Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the problem of scene-aware 3D human avatar generation based on human-scene interactions. |
Minseok Kim; Changwoo Kang; Jeongin Park; Kyungdon Joo; |
127 | Semantic-Aware Superpixel for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a self-supervised vision transformer to mitigate the heavy efforts on generation of pixel-level annotations. |
Sangtae Kim; Daeyoung Park; Byonghyo Shim; |
128 | Multispectral Invisible Coating: Laminated Visible-Thermal Physical Attack Against Multispectral Object Detectors Using Transparent Low-E Films Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the vulnerability of multispectral detectors against physical attacks by proposing a new physical method: Multispectral Invisible Coating. |
Taeheon Kim; Youngjoon Yu; Yong Man Ro; |
129 | CRAFT: Camera-Radar 3D Object Detection with Spatio-Contextual Fusion Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose a novel proposal-level early fusion approach that effectively exploits both spatial and contextual properties of camera and radar for 3D object detection. |
Youngseok Kim; Sanmin Kim; Jun Won Choi; Dongsuk Kum; |
130 | Simple and Effective Synthesis of Indoor 3D Scenes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our aim is to generate high-resolution images and videos from novel viewpoints, including viewpoints that extrapolate far beyond the input images while maintaining 3D consistency. |
Jing Yu Koh; Harsh Agrawal; Dhruv Batra; Richard Tucker; Austin Waters; Honglak Lee; Yinfei Yang; Jason Baldridge; Peter Anderson; |
131 | MGTANet: Encoding Sequential LiDAR Points Using Long Short-Term Motion-Guided Temporal Attention for 3D Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel 3D object detection architecture, which can encode LiDAR point cloud sequences acquired by multiple successive scans. |
Junho Koh; Junhyung Lee; Youngwoo Lee; Jaekyum Kim; Jun Won Choi; |
132 | InstanceFormer: An Online Video Instance Segmentation Framework Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a single-stage transformer-based efficient online VIS framework named InstanceFormer, which is especially suitable for long and challenging videos. |
Rajat Koner; Tanveer Hannan; Suprosanna Shit; Sahand Sharifzadeh; Matthias Schubert; Thomas Seidl; Volker Tresp; |
133 | Pixel-Wise Warping for Deep Image Stitching Highlight: In this paper, instead of relying on the homography-based warp, we propose a novel deep image stitching framework exploiting the pixel-wise warp field to handle the large-parallax problem. |
Hyeokjun Kweon; Hyeonseong Kim; Yoonsu Kang; Youngho Yoon; WooSeong Jeong; Kuk-Jin Yoon; |
134 | Learning to Learn Better for Video Object Segmentation Highlight: Besides, how to reasonably fuse the target features in the two different branches rather than simply adding them together to avoid the adverse effect of one dominant branch has not been investigated. In this paper, we propose a novel framework that emphasizes Learning to Learn Better (LLB) target features for SVOS, termed LLB, where we design the discriminative label generation module (DLGM) and the adaptive fusion module to address these issues. |
Meng Lan; Jing Zhang; Lefei Zhang; Dacheng Tao; |
135 | Curriculum Multi-Negative Augmentation for Debiased Video Grounding Highlight: Recent studies have found that current VG models are prone to over-rely on the ground-truth moment annotation distribution biases in the training set. To discourage the standard VG model’s behavior of exploiting such temporal annotation biases and improve the model generalization ability, we propose multiple negative augmentations in a hierarchical way, including cross-video augmentations from clip-/video-level, and self-shuffled augmentations with masks. |
Xiaohan Lan; Yitian Yuan; Hong Chen; Xin Wang; Zequn Jie; Lin Ma; Zhi Wang; Wenwu Zhu; |
136 | Weakly Supervised 3D Segmentation Via Receptive-Driven Pseudo Label Consistency and Structural Consistency Highlight: In this paper, we propose a weakly supervised point cloud semantic segmentation framework via receptive-driven pseudo label consistency and structural consistency to mine potential knowledge. |
Yuxiang Lan; Yachao Zhang; Yanyun Qu; Cong Wang; Chengyang Li; Jia Cai; Yuan Xie; Zongze Wu; |
137 | MultiAct: Long-Term 3D Human Motion Generation from Multiple Action Labels Highlight: We present MultiAct, the first framework to generate long-term 3D human motion from multiple action labels. |
Taeryung Lee; Gyeongsik Moon; Kyoung Mu Lee; |
138 | Not All Neighbors Matter: Point Distribution-Aware Pruning for 3D Point Cloud Highlight: This paper proposes a new weight pruning technique for 3D point cloud based on spatial point distribution. |
Yejin Lee; Donghyun Lee; JungUk Hong; Jae W. Lee; Hongil Yoon; |
139 | Symbolic Replay: Scene Graph As Prompt for Continual Learning on VQA Task Highlight: Thus, we propose a real-data-free replay-based method tailored for CL on VQA, named Scene Graph as Prompt for Symbolic Replay. |
Stan Weixian Lei; Difei Gao; Jay Zhangjie Wu; Yuxuan Wang; Wei Liu; Mengmi Zhang; Mike Zheng Shou; |
140 | Linking People Across Text and Images Based on Social Relation Reasoning Highlight: We observe that humans are adept at exploring social relations to assist identifying people. Therefore, we propose a Social Relation Reasoning (SRR) model to address the aforementioned issues. |
Yang Lei; Peizhi Zhao; Pijian Li; Yi Cai; Qingbao Huang; |
141 | ReGANIE: Rectifying GAN Inversion Errors for Accurate Real Image Editing Highlight: Existing encoder-based or optimization-based StyleGAN inversion methods attempt to mitigate the trade-off but see limited performance. To fundamentally resolve this problem, we propose a novel two-phase framework by designating two separate networks to tackle editing and reconstruction respectively, instead of balancing the two. |
Bingchuan Li; Tianxiang Ma; Peng Zhang; Miao Hua; Wei Liu; Qian He; Zili Yi; |
142 | SWBNet: A Stable White Balance Network for sRGB Images Highlight: The white balance methods for sRGB images (sRGB-WB) aim to directly remove their color temperature shifts. Despite achieving promising white balance (WB) performance, the existing methods suffer from WB instability, i.e., their results are inconsistent for images with different color temperatures. We propose a stable white balance network (SWBNet) to alleviate this problem. |
Chunxiao Li; Xuejing Kang; Zhifeng Zhang; Anlong Ming; |
143 | Frequency Domain Disentanglement for Arbitrary Neural Style Transfer Highlight: Therefore, these methods always suffer from low-quality results because of the sub-optimal disentanglement. To address such a challenge, this paper proposes the frequency mixer (FreMixer) module that disentangles and re-entangles the frequency spectrum of content and style components in the frequency domain. |
Dongyang Li; Hao Luo; Pichao Wang; Zhibin Wang; Shang Liu; Fan Wang; |
144 | Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation Highlight: However, existing transformer-based methods treat body joints as equally important inputs and ignore the prior knowledge of human skeleton topology in the self-attention mechanism. To tackle this issue, in this paper, we propose a Pose-Oriented Transformer (POT) with uncertainty guided refinement for 3D HPE. |
Han Li; Bowen Shi; Wenrui Dai; Hongwei Zheng; Botao Wang; Yu Sun; Min Guo; Chenglin Li; Junni Zou; Hongkai Xiong; |
145 | CEE-Net: Complementary End-to-End Network for 3D Human Pose Generation and Estimation Highlight: In this paper, we propose the CEE-Net, a Complementary End-to-End Network for 3D human pose generation and estimation. |
Haolun Li; Chi-Man Pun; |
146 | Real-World Deep Local Motion Deblurring Highlight: Based on ReLoBlur, we propose a Local Blur-Aware Gated network (LBAG) and several local blur-aware techniques to bridge the gap between global and local deblurring: 1) a blur detection approach based on background subtraction to localize blurred regions; 2) a gate mechanism to guide our network to focus on blurred regions; and 3) a blur-aware patch cropping strategy to address data imbalance problem. |
Haoying Li; Ziran Zhang; Tingting Jiang; Peng Luo; Huajun Feng; Zhihai Xu; |
147 | Disentangle and Remerge: Interventional Knowledge Distillation for Few-Shot Object Detection from A Conditional Causal Perspective Highlight: Following the theoretical guidance, we propose a backdoor adjustment-based knowledge distillation method for the few-shot object detection task, namely Disentangle and Remerge (D&R), to perform conditional causal intervention toward the corresponding Structural Causal Model. |
Jiangmeng Li; Yanan Zhang; Wenwen Qiang; Lingyu Si; Chengbo Jiao; Xiaohui Hu; Changwen Zheng; Fuchun Sun; |
148 | Learning Motion-Robust Remote Photoplethysmography Through Arbitrary Resolution Videos Highlight: Different from the previous rPPG models designed for a constant distance between camera and participants, in this paper, we propose two plug-and-play blocks (i.e., physiological signal feature extraction block (PFE) and temporal face alignment block (TFA)) to alleviate the degradation of changing distance and head motion. |
Jianwei Li; Zitong Yu; Jingang Shi; |
149 | FSR: A General Frequency-Oriented Framework to Accelerate Image Super-resolution Networks Highlight: However, existing methods usually require substantial computations by operating in spatial domain. To address this issue, we propose a general frequency-oriented framework (FSR) to accelerate SR networks by considering data characteristics in frequency domain. |
Jinmin Li; Tao Dai; Mingyan Zhu; Bin Chen; Zhi Wang; Shu-Tao Xia; |
150 | Learning Polysemantic Spoof Trace: A Multi-Modal Disentanglement Network for Face Anti-spoofing Highlight: Recently, the spoof trace disentanglement framework has shown great potential for coping with both seen and unseen spoof scenarios, but the performance is largely restricted by the single-modal input. This paper focuses on this issue and presents a multi-modal disentanglement model which targetedly learns polysemantic spoof traces for more accurate and robust generic attack detection. |
Kaicheng Li; Hongyu Yang; Binghui Chen; Pengyu Li; Biao Wang; Di Huang; |
151 | Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration Highlight: In this paper, we propose a deep learning-based character stroke extraction method that takes semantic features and prior information of strokes into consideration. |
Meng Li; Yahan Yu; Yi Yang; Guanghao Ren; Jian Wang; |
152 | Spatial-Spectral Transformer for Hyperspectral Image Denoising Highlight: Unfortunately, though witnessing the development of deep learning in HSI denoising area, existing convolution-based methods face the trade-off between computational efficiency and capability to model non-local characteristics of HSI. In this paper, we propose a Spatial-Spectral Transformer (SST) to alleviate this problem. |
Miaoyu Li; Ying Fu; Yulun Zhang; |
153 | Learning Semantic Alignment with Global Modality Reconstruction for Video-Language Pre-training Towards Retrieval Highlight: To alleviate the problem, we propose a video-language pre-training framework, termed video-language pre-training For lEarning sEmantic aLignments (FEEL), to learn semantic alignments at the sequence level. |
Mingchao Li; Xiaoming Shi; Haitao Leng; Wei Zhou; Hai-Tao Zheng; Kuncai Zhang; |
154 | Layout-Aware Dreamer for Embodied Visual Referring Expression Grounding Highlight: In this work, we study the problem of Embodied Referring Expression Grounding, where an agent needs to navigate in a previously unseen environment and localize a remote object described by a concise high-level natural language instruction. |
Mingxiao Li; Zehao Wang; Tinne Tuytelaars; Marie-Francine Moens; |
155 | NeAF: Learning Neural Angle Fields for Point Normal Estimation Highlight: However, these methods are not generalized well to unseen scenarios and are sensitive to parameter settings. To resolve these issues, we propose an implicit function to learn an angle field around the normal of each point in the spherical coordinate system, which is dubbed as Neural Angle Fields (NeAF). |
Shujuan Li; Junsheng Zhou; Baorui Ma; Yu-Shen Liu; Zhizhong Han; |
156 | CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification Without Concrete Text Labels Highlight: This paper first finds out that simply fine-tuning the visual model initialized by the image encoder in CLIP, has already obtained competitive performances in various ReID tasks. Then we propose a two-stage strategy to facilitate a better visual representation. |
Siyuan Li; Li Sun; Qingli Li; |
157 | DC-Former: Diverse and Compact Transformer for Person Re-identification Highlight: In this paper, we propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting embedding space into multiple diverse and compact subspaces. |
Wen Li; Cheng Zou; Meng Wang; Furong Xu; Jianan Zhao; Ruobing Zheng; Yuan Cheng; Wei Chu; |
158 | Panoramic Video Salient Object Detection with Ambisonic Audio Guidance Highlight: In this paper, we aim to tackle the video salient object detection problem for panoramic videos, with their corresponding ambisonic audios. |
Xiang Li; Haoyuan Cao; Shijie Zhao; Junlin Li; Li Zhang; Bhiksha Raj; |
159 | LWSIS: LiDAR-Guided Weakly Supervised Instance Segmentation for Autonomous Driving Highlight: In this paper, we present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS), which leverages the off-the-shelf 3D data, i.e., Point Cloud, together with the 3D boxes, as natural weak supervisions for training the 2D image instance segmentation models. |
Xiang Li; Junbo Yin; Botian Shi; Yikang Li; Ruigang Yang; Jianbing Shen; |
160 | Adaptive Texture Filtering for Single-Domain Generalized Segmentation Highlight: In this paper, we present a novel adaptive texture filtering mechanism to suppress the influence of texture without using augmentation, thus eliminating the interference of domain-specific features. |
Xinhui Li; Mingjia Li; Yaxing Wang; Chuan-Xian Ren; Xiaojie Guo; |
161 | MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition Highlight: To highlight the multi-label challenge in long-tailed video recognition, we create two additional benchmarks based on Charades and CharadesEgo videos with the multi-label property, called CharadesLT and CharadesEgoLT. |
Xinjie Li; Huijuan Xu; |
162 | Gradient Corner Pooling for Keypoint-Based Object Detection Highlight: To this end, we propose a method named Gradient Corner Pooling. |
Xuyang Li; Xuemei Xie; Mingxuan Yu; Jiakai Luo; Chengwei Rao; Guangming Shi; |
163 | Towards Real-Time Segmentation on The Edge Highlight: In this work, we propose to combine the self attention block with lightweight convolutions to form new building blocks, and employ latency constraints to search an efficient sub-network. |
Yanyu Li; Changdi Yang; Pu Zhao; Geng Yuan; Wei Niu; Jiexiong Guan; Hao Tang; Minghai Qin; Qing Jin; Bin Ren; Xue Lin; Yanzhi Wang; |
164 | BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection Highlight: In this research, we propose a new 3D object detector with a trustworthy depth estimation, dubbed BEVDepth, for camera-based Bird’s-Eye-View (BEV) 3D object detection. |
Yinhao Li; Zheng Ge; Guanyi Yu; Jinrong Yang; Zengran Wang; Yukang Shi; Jianjian Sun; Zeming Li; |
165 | BEVStereo: Enhancing Depth Estimation in Multi-View 3D Object Detection with Temporal Stereo Highlight: In this study, we propose an effective method for creating temporal stereo by dynamically determining the center and range of the temporal stereo. |
Yinhao Li; Han Bao; Zheng Ge; Jinrong Yang; Jianjian Sun; Zeming Li; |
166 | Learning Single Image Defocus Deblurring with Misaligned Training Pairs Highlight: In this paper, we propose a joint deblurring and reblurring learning (JDRL) framework for single image defocus deblurring with misaligned training pairs. |
Yu Li; Dongwei Ren; Xinya Shu; Wangmeng Zuo; |
167 | Curriculum Temperature for Knowledge Distillation Highlight: In this paper, we propose a simple curriculum-based technique, termed Curriculum Temperature for Knowledge Distillation (CTKD), which controls the task difficulty level during the student’s learning career through a dynamic and learnable temperature. |
Zheng Li; Xiang Li; Lingfeng Yang; Borui Zhao; Renjie Song; Lei Luo; Jun Li; Jian Yang; |
168 | Actionness Inconsistency-Guided Contrastive Learning for Weakly-Supervised Temporal Action Localization Highlight: In this work, we propose a novel Actionness Inconsistency-guided Contrastive Learning (AICL) method which utilizes the consistent segments to boost the representation learning of the inconsistent segments. |
Zhilin Li; Zilei Wang; Qinying Liu; |
169 | READ: Large-Scale Neural Scene Rendering for Autonomous Driving Highlight: In this paper, a large-scale neural rendering method is proposed to synthesize the autonomous driving scene (READ), which makes it possible to generate large-scale driving scenes in real time on a PC through a variety of sampling schemes. |
Zhuopeng Li; Lu Li; Jianke Zhu; |
170 | CDTA: A Cross-Domain Transfer-Based Attack with Contrastive Learning Highlight: In this paper, we design a Cross-Domain Transfer-Based Attack (CDTA), which works in the cross-domain scenario. |
Zihan Li; Weibin Wu; Yuxin Su; Zibin Zheng; Michael R. Lyu; |
171 | HybridCap: Inertia-Aid Monocular Capture of Challenging Human Motions Highlight: We present a light-weight, hybrid mocap technique called HybridCap that augments the camera with only 4 Inertial Measurement Units (IMUs) in a novel learning-and-optimization framework. |
Han Liang; Yannan He; Chengfeng Zhao; Mutian Li; Jingya Wang; Jingyi Yu; Lan Xu; |
172 | Global Dilated Attention and Target Focusing Network for Robust Tracking Highlight: However, it brings two challenges: First, its global receptive field has less attention on local structure and inter-channel associations, which limits the semantics to distinguish objects and backgrounds; Second, its feature fusion with linear process cannot avoid the interference of non-target semantic objects. To solve the above issues, this paper proposes a robust tracking method named GdaTFT by defining the Global Dilated Attention (GDA) and Target Focusing Network (TFN). |
Yun Liang; Qiaoqiao Li; Fumian Long; |
173 | Only A Few Classes Confusing: Pixel-Wise Candidate Labels Disambiguation for Foggy Scene Understanding Highlight: The true semantics of most pixels have a high likelihood of appearing in the few top classes according to confidence ranking. In this paper, we replace the one-hot pseudo label with a candidate label set (CLS) that consists of only a few ambiguous classes and exploit its effects on self-training-based unsupervised domain adaptation. |
Liang Liao; Wenyi Chen; Zhen Zhang; Jing Xiao; Yan Yang; Chia-Wen Lin; Shin’ichi Satoh; |
174 | Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation Highlight: In this paper, we propose Actional Atomic-Concept Learning (AACL), which maps visual observations to actional atomic concepts for facilitating the alignment. |
Bingqian Lin; Yi Zhu; Xiaodan Liang; Liang Lin; Jianzhuang Liu; |
175 | Probability Guided Loss for Long-Tailed Multi-Label Image Classification Highlight: Long-tailed multi-label image classification is one subtask and remains challenging and poorly researched. In this paper, we provide a fresh perspective from probability to tackle this problem. |
Dekun Lin; |
176 | Self-Supervised Image Denoising Using Implicit Deep Denoiser Prior Highlight: We devise a new regularization for denoising with self-supervised learning. |
Huangxing Lin; Yihong Zhuang; Xinghao Ding; Delu Zeng; Yue Huang; Xiaotong Tu; John Paisley; |
177 | Accelerating The Training of Video Super-resolution Models Highlight: In this work, we show that it is possible to gradually train video models from small to large spatial/temporal sizes, i.e., in an easy-to-hard manner. |
Lijian Lin; Xintao Wang; Zhongang Qi; Ying Shan; |
178 | SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation Highlight: In this paper, we propose an effective approach, dubbed SelectAugment, to select samples for augmentation in a deterministic and online manner based on the sample contents and the network training status. |
Shiqi Lin; Zhizheng Zhang; Xin Li; Zhibo Chen; |
179 | AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-Realistic Style Transfer Highlight: In this work, we propose the Adaptive ColorMLP (AdaCM), an effective and efficient framework for universal photo-realistic style transfer. |
Tianwei Lin; Honglin Lin; Fu Li; Dongliang He; Wenhao Wu; Meiling Wang; Xin Li; Yong Liu; |
180 | SEPT: Towards Scalable and Efficient Visual Pre-training Highlight: Buttressed by the hypothesis, we propose the first yet novel framework for Scalable and Efficient visual Pre-Training (SEPT) by introducing a retrieval pipeline for data selection. |
Yiqi Lin; Huabin Zheng; Huaping Zhong; Jinjing Zhu; Weijia Li; Conghui He; Lin Wang; |
181 | Cross-Modality Earth Mover’s Distance for Visible Thermal Person Re-identification Highlight: In this paper, we propose the Cross-Modality Earth Mover’s Distance (CM-EMD) that can alleviate the impact of the intra-identity variations during modality alignment. |
Yongguo Ling; Zhun Zhong; Zhiming Luo; Fengxiang Yang; Donglin Cao; Yaojin Lin; Shaozi Li; Nicu Sebe; |
182 | Hypotheses Tree Building for One-Shot Temporal Sentence Localization Highlight: In this paper, we target another more practical and challenging setting: one-shot temporal sentence localization (one-shot TSL), which learns to retrieve the query information among the entire video with only one annotated frame. |
Daizong Liu; Xiang Fang; Pan Zhou; Xing Di; Weining Lu; Yu Cheng; |
183 | The Devil Is in The Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-training Highlight: Different from them, we shift the perspective to the Fourier domain which naturally has global perspective and present a new Masked Image Modeling (MIM), termed Geminated Gestalt Autoencoder (Ge^2-AE) for visual pre-training. |
Hao Liu; Xinghua Jiang; Xin Li; Antai Guo; Yiqing Hu; Deqiang Jiang; Bo Ren; |
184 | M3AE: Multimodal Representation Learning for Brain Tumor Segmentation with Missing Modalities Highlight: In this work, we propose a novel two-stage framework for brain tumor segmentation with missing modalities. |
Hong Liu; Dong Wei; Donghuan Lu; Jinghan Sun; Liansheng Wang; Yefeng Zheng; |
185 | From Coarse to Fine: Hierarchical Pixel Integration for Lightweight Image Super-resolution Highlight: In this paper, we aim to design a new attention block whose insights are from the interpretation of Local Attribution Map (LAM) for SR networks. |
Jie Liu; Chao Chen; Jie Tang; Gangshan Wu; |
186 | Fast Fluid Simulation Via Dynamic Multi-Scale Gridding Highlight: Recent works on learning-based frameworks for Lagrangian (i.e., particle-based) fluid simulation, though bypassing iterative pressure projection via efficient convolution operators, are still time-consuming due to excessive amount of particles. To address this challenge, we propose a dynamic multi-scale gridding method to reduce the magnitude of elements that have to be processed, by observing repeated particle motion patterns within certain consistent regions. |
Jinxian Liu; Ye Chen; Bingbing Ni; Wei Ren; Zhenbo Yu; Xiaoyang Huang; |
187 | TransLO: A Window-Based Masked Point Transformer Framework for Large-Scale LiDAR Odometry Highlight: In this paper, we rethink the structure of point transformer. |
Jiuming Liu; Guangming Wang; Chaokang Jiang; Zhe Liu; Hesheng Wang; |
188 | Low-Light Video Enhancement with Synthetic Event Guidance Highlight: In this paper, inspired by the low latency and high dynamic range of events, we use synthetic events from multiple frames to guide the enhancement and restoration of low-light videos. |
Lin Liu; Junfeng An; Jianzhuang Liu; Shanxin Yuan; Xiangyu Chen; Wengang Zhou; Houqiang Li; Yan Feng Wang; Qi Tian; |
189 | Novel Motion Patterns Matter for Practical Skeleton-Based Action Recognition Highlight: As it is laborious to collect sufficient training samples to enumerate various types of novel motion patterns, this paper presents a practical skeleton-based action recognition task where the training set contains common motion patterns of action samples and the test set contains action samples that suffer from novel motion patterns. |
Mengyuan Liu; Fanyang Meng; Chen Chen; Songtao Wu; |
190 | EMEF: Ensemble Multi-Exposure Image Fusion Highlight: In this paper, we study the MEF problem from a new perspective. |
Renshuai Liu; Chengyang Li; Haitao Cao; Yinglin Zheng; Ming Zeng; Xuan Cheng; |
191 | Reducing Domain Gap in Frequency and Spatial Domain for Cross-Modality Domain Adaptation on Medical Image Segmentation Highlight: In this paper, we propose a simple yet effective UDA method based on frequency and spatial domain transfer under multi-teacher distillation framework. |
Shaolei Liu; Siqi Yin; Linhao Qu; Manning Wang; |
192 | DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding Highlight: In this paper, we study the problem of visual grounding by considering both phrase extraction and grounding (PEG). |
Shilong Liu; Shijia Huang; Feng Li; Hao Zhang; Yaoyuan Liang; Hang Su; Jun Zhu; Lei Zhang; |
193 | Progressive Neighborhood Aggregation for Semantic Segmentation Refinement Highlight: In this paper, we investigate a progressive neighborhood aggregation (PNA) framework to refine the semantic segmentation prediction, resulting in an end-to-end solution that can perform the coarse prediction and refinement in a unified network. |
Ting Liu; Yunchao Wei; Yanning Zhang; |
194 | CoordFill: Efficient High-Resolution Image Inpainting Via Parameterized Coordinate Querying Highlight: Image inpainting aims to fill the missing hole of the input. It is hard to solve this task efficiently when facing high-resolution images due to two reasons: (1) Large reception field needs to be handled for high-resolution image inpainting. (2) The general encoder and decoder network synthesizes many background pixels synchronously due to the form of the image matrix. In this paper, we try to break the above limitations for the first time thanks to the recent development of continuous implicit representation. |
Weihuang Liu; Xiaodong Cun; Chi-Man Pun; Menghan Xia; Yong Zhang; Jue Wang; |
195 | CCQ: Cross-Class Query Network for Partially Labeled Organ Segmentation Highlight: However, existing algorithms of multi-organ segmentation on partially-labeled datasets neglect the semantic relations and anatomical priors between different categories of organs, which is crucial for partially-labeled multi-organ segmentation. In this paper, we tackle the limitations above by proposing the Cross-Class Query Network (CCQ). |
Xuyang Liu; Bingbing Wen; Sibei Yang; |
196 | Counterfactual Dynamics Forecasting – A New Setting of Quantitative Reasoning Highlight: Based on it, we propose a method to infer counterfactual dynamics considering the factual dynamics as demonstration. |
Yanzhu Liu; Ying Sun; Joo-Hwee Lim; |
197 | Self-Decoupling and Ensemble Distillation for Efficient Segmentation Highlight: Moreover, most of the Self-KD algorithms are specific to classification tasks based on soft-labels, and not suitable for semantic segmentation. To alleviate these contradictions, we revisit the label and feature distillation problem in segmentation, and propose Self-Decoupling and Ensemble Distillation for Efficient Segmentation (SDES). |
Yuang Liu; Wei Zhang; Jun Wang; |
198 | Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language Highlight: In this work, based on our analysis of the core ideas of different temporal modeling components in existing approaches, we propose a token mixing strategy to enable cross-frame interactions, which enables transferring from the pre-trained image-language model to video-language tasks through selecting and mixing a key set and a value set from the input video samples. |
Yuqi Liu; Luhui Xu; Pengfei Xiong; Qin Jin; |
199 | StereoDistill: Pick The Cream from LiDAR for Distilling Stereo-Based 3D Object Detection Highlight: In this paper, we propose a cross-modal distillation method named StereoDistill to narrow the gap between the stereo and LiDAR-based approaches via distilling the stereo detectors from the superior LiDAR model at the response level, which is usually overlooked in 3D object detection distillation. |
Zhe Liu; Xiaoqing Ye; Xiao Tan; Errui Ding; Xiang Bai; |
200 | Good Helper Is Around You: Attention-Driven Masked Image Modeling Highlight: In this paper, we propose Attention-driven Masking and Throwing Strategy (AMT), which could solve both problems above. |
Zhengqi Liu; Jie Gui; Hao Luo; |
201 | RADIANT: Radar-Image Association Network for 3D Object Detection Highlight: On the other hand, leveraging radar depths is hampered by difficulties in precisely associating radar returns with 3D estimates from monocular methods, effectively erasing its benefits. This paper proposes a fusion network that addresses this radar-camera association challenge. |
Yunfei Long; Abhinav Kumar; Daniel Morris; Xiaoming Liu; Marcos Castro; Punarjay Chakravarty; |
202 | CRIN: Rotation-Invariant Point Cloud Analysis and Rotation Estimation Via Centrifugal Reference Frame Highlight: In this paper, we propose CRIN, the Centrifugal Rotation-Invariant Network. |
Yujing Lou; Zelin Ye; Yang You; Nianjuan Jiang; Jiangbo Lu; Weiming Wang; Lizhuang Ma; Cewu Lu; |
203 | See Your Emotion from Gait Using Unlabeled Skeleton Data Highlight: In this paper, we propose a Cross-coordinate contrastive learning framework utilizing Ambiguity samples for self-supervised Gait-based Emotion representation (CAGE). |
Haifeng Lu; Xiping Hu; Bin Hu; |
204 | Learning Progressive Modality-Shared Transformers for Effective Visible-Infrared Person Re-identification Highlight: In this paper, we propose a novel deep learning framework named Progressive Modality-shared Transformer (PMT) for effective VI-ReID. |
Hu Lu; Xuezhang Zou; Pingping Zhang; |
205 | Breaking Immutable: Information-Coupled Prototype Elaboration for Few-Shot Object Detection Highlight: In this paper, we propose an Information-Coupled Prototype Elaboration (ICPE) method to generate specific and representative prototypes for each query image. |
Xiaonan Lu; Wenhui Diao; Yongqiang Mao; Junxi Li; Peijin Wang; Xian Sun; Kun Fu; |
206 | ParaFormer: Parallel Attention Transformer for Efficient Feature Matching Highlight: However, existing lightweight networks optimized for Euclidean data cannot address classical feature matching tasks, since sparse keypoint-based descriptors are expected to be matched. This paper tackles this problem and proposes two concepts: 1) a novel parallel attention model entitled ParaFormer and 2) a graph-based U-Net architecture with attentional pooling. |
Xiaoyong Lu; Yaping Yan; Bin Kang; Songlin Du; |
207 | Robust One-Shot Segmentation of Brain Tissues Via Image-Aligned Style Transformation Highlight: In this paper, we propose a novel image-aligned style transformation to reinforce the dual-model iterative learning for robust one-shot segmentation of brain tissues. |
Jinxin Lv; Xiaoyu Zeng; Sheng Wang; Ran Duan; Zhiwei Wang; Qiang Li; |
208 | HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures Highlight: This paper introduces hierarchical reconstruction of document structures as a novel task suitable for NLP and CV fields. |
Jiefeng Ma; Jun Du; Pengfei Hu; Zhenrong Zhang; Jianshu Zhang; Huihui Zhu; Cong Liu; |
209 | Semantic 3D-Aware Portrait Synthesis and Manipulation Based on Compositional Neural Radiance Field Highlight: We propose a Compositional Neural Radiance Field (CNeRF) for semantic 3D-aware portrait synthesis and manipulation. |
Tianxiang Ma; Bingchuan Li; Qian He; Jing Dong; Tieniu Tan; |
210 | CFFT-GAN: Cross-Domain Feature Fusion Transformer for Exemplar-Based Image Translation Highlight: In this paper, we propose a more general learning approach by considering two domain features as a whole and learning both inter-domain correspondence and intra-domain potential information interactions. |
Tianxiang Ma; Bingchuan Li; Wei Liu; Miao Hua; Jing Dong; Tieniu Tan; |
211 | StyleTalk: One-Shot Talking Head Generation with Controllable Speaking Styles Highlight: Although existing one-shot talking head methods have made significant progress in lip sync, natural facial expressions, and stable head motions, they still cannot generate diverse speaking styles in the final talking head videos. To tackle this problem, we propose a one-shot style-controllable talking face generation framework. |
Yifeng Ma; Suzhen Wang; Zhipeng Hu; Changjie Fan; Tangjie Lv; Yu Ding; Zhidong Deng; Xin Yu; |
212 | Intriguing Findings of Frequency Selection for Image Deblurring Highlight: This paper reveals an intriguing phenomenon: simply applying a ReLU operation in the frequency domain of a blurry image, followed by an inverse Fourier transform (i.e., frequency selection), provides faithful information about the blur pattern (e.g., the blur direction and blur level, which implicitly show the kernel pattern). |
Xintian Mao; Yiming Liu; Fengze Liu; Qingli Li; Wei Shen; Yan Wang; |
213 | DocEdit: Language-Guided Document Editing Highlight: To this end, we propose a new task of language-guided localized document editing, where the user provides a document and an open vocabulary editing request, and the intelligent system produces a command that can be used to automate edits in real-world document editing software. |
Puneet Mathur; Rajiv Jain; Jiuxiang Gu; Franck Dernoncourt; Dinesh Manocha; Vlad I. Morariu; |
214 | Progressive Few-Shot Adaptation of Generative Model with Align-Free Spatial Correlation Highlight: Thus, it can bring visual artifacts if source and target domain images are not nicely aligned. In this paper, we propose a few-shot generative model adaptation method free from such an assumption, based on a motivation that generative models are progressively adapting from the source domain to the target domain. |
Jongbo Moon; Hyunjun Kim; Jae-Pil Heo; |
215 | Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition Highlight: A dramatic increase in real-world video volume with extremely diverse and emerging topics naturally forms a long-tailed video distribution in terms of their categories, and it spotlights the need for Video Long-Tailed Recognition (VLTR). In this work, we summarize the challenges in VLTR and explore how to overcome them. |
WonJun Moon; Hyun Seok Seong; Jae-Pil Heo; |
216 | Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia Highlight: In this work, we propose the novel task of captioning Wikipedia images by integrating contextual knowledge. |
Khanh Nguyen; Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas; |
217 | TaCo: Textual Attribute Recognition Via Contrastive Learning Highlight: Moreover, their performance drops severely in real-world scenarios where unexpected and obvious imaging distortions appear. In this paper, we aim to tackle these problems by proposing TaCo, a contrastive framework for textual attribute recognition tailored toward the most common document scenes. |
Chang Nie; Yiqing Hu; Yanqiu Qu; Hao Liu; Deqiang Jiang; Bo Ren; |
218 | GLT-T: Global-Local Transformer Voting for 3D Single Object Tracking in Point Clouds Highlight: Moreover, seed points with different importance are treated equally in the voting process, aggravating this defect. To address these issues, we propose a novel global-local transformer voting scheme to provide more informative cues and guide the model to pay more attention to potential seed points, promoting the generation of high-quality 3D proposals. |
Jiahao Nie; Zhiwei He; Yuxiang Yang; Mingyu Gao; Jing Zhang; |
219 | Adapting Object Size Variance and Class Imbalance for Semi-supervised Object Detection Highlight: Because different object classes usually have different detection difficulty levels due to scale variance and data distribution imbalance, conventional pseudo-labeling-based methods struggle to fully exploit the value of unlabeled data. To address these issues, we propose an adaptive pseudo labeling strategy, which can assign thresholds to classes with respect to their “hardness”. |
Yuxiang Nie; Chaowei Fang; Lechao Cheng; Liang Lin; Guanbin Li; |
220 | MIMO Is All You Need: A Strong Multi-in-Multi-Out Baseline for Video Prediction Highlight: Motivated by that, we conduct a comprehensive investigation in this paper to thoroughly explore how far a simple MIMO architecture can go. |
Shuliang Ning; Mengcheng Lan; Yanran Li; Chaofeng Chen; Qian Chen; Xunlai Chen; Xiaoguang Han; Shuguang Cui; |
221 | Universe Points Representation Learning for Partial Multi-Graph Matching Highlight: In this work, we study the more general partial matching problem with multi-graph cycle consistency guarantees. |
Zhakshylyk Nurlanov; Frank R. Schmidt; Florian Bernard; |
222 | Robust Image Denoising of No-Flash Images Guided By Consistent Flash Images Highlight: We propose a learning-based technique that robustly fuses the image pairs while considering their inconsistency. |
Geunwoo Oh; Jonghee Back; Jae-Pil Heo; Bochang Moon; |
223 | Coarse2Fine: Local Consistency Aware Re-prediction for Weakly Supervised Object Localization Highlight: In this paper, we introduce the Local Consistency Aware Re-prediction (LCAR) framework, which aims to recover the complete fine object mask from a locally inconsistent activation map and hence obtain a tight bounding box. |
Yixuan Pan; Yao Yao; Yichao Cao; Chongjin Chen; Xiaobo Lu; |
224 | Find Beauty in The Rare: Contrastive Composition Feature Clustering for Nontrivial Cropping Box Regression Highlight: We propose a novel Contrastive Composition Clustering (C2C) to regularize the composition features by contrasting dynamically established similar and dissimilar pairs. |
Zhiyu Pan; Yinpeng Chen; Jiale Zhang; Hao Lu; Zhiguo Cao; Weicai Zhong; |
225 | Domain Decorrelation with Potential Energy Ranking Highlight: Behind this, domain shift is one of the primary factors to be blamed. In order to tackle this problem, we propose using Potential Energy Ranking (PoER) to decouple the object feature and the domain feature in given images, promoting the learning of label-discriminative representations while filtering out the irrelevant correlations between the objects and the background. |
Sen Pei; Jiaxi Sun; Richard Yi Da Xu; Shiming Xiang; Gaofeng Meng; |
226 | PDRF: Progressively Deblurring Radiance Field for Fast Scene Reconstruction from Blurry Images Highlight: We present Progressively Deblurring Radiance Field (PDRF), a novel approach to efficiently reconstruct high quality radiance fields from blurry images. |
Cheng Peng; Rama Chellappa; |
227 | Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer Highlight: This paper presents a new method for end-to-end Video Question Answering (VideoQA), aside from the current popularity of using large-scale pre-training with huge feature extractors. |
Min Peng; Chongyang Wang; Yu Shi; Xiang-Dong Zhou; |
228 | CL3D: Unsupervised Domain Adaptation for Cross-LiDAR 3D Detection Highlight: Domain adaptation for Cross-LiDAR 3D detection is challenging due to the large gap in the raw data representation with disparate point densities and point arrangements. By exploring domain-invariant 3D geometric characteristics and motion patterns, we present an unsupervised domain adaptation method that overcomes the above difficulties. |
Xidong Peng; Xinge Zhu; Yuexin Ma; |
229 | Better and Faster: Adaptive Event Conversion for Event-Based Object Detection Highlight: In this paper, we focus on building better and faster event-based object detectors. |
Yansong Peng; Yueyi Zhang; Peilin Xiao; Xiaoyan Sun; Feng Wu; |
230 | CSTAR: Towards Compact and Structured Deep Neural Networks with Adversarial Robustness Highlight: However, the structured sparse models obtained by the existing works suffer severe performance degradation for both benign and robust accuracy, thereby causing a challenging dilemma between robustness and structuredness of compact DNNs. To address this problem, in this paper, we propose CSTAR, an efficient solution that simultaneously imposes Compactness, high STructuredness and high Adversarial Robustness on the target DNN models. |
Huy Phan; Miao Yin; Yang Sui; Bo Yuan; Saman Zonouz; |
231 | Exploring Stochastic Autoregressive Image Modeling for Visual Representation Highlight: In this paper, we try to find the reason why autoregressive modeling does not work well on vision tasks. |
Yu Qi; Fan Yang; Yousong Zhu; Yufei Liu; Liwei Wu; Rui Zhao; Wei Li; |
232 | Context-Aware Transformer for 3D Point Cloud Automatic Annotation Highlight: To this end, we propose a simple yet effective end-to-end Context-Aware Transformer (CAT) as an automated 3D-box labeler to generate precise 3D box annotations from 2D boxes, trained with a small number of human annotations. |
Xiaoyan Qian; Chang Liu; Xiaojuan Qi; Siew-Chong Tan; Edmund Lam; Ngai Wong; |
233 | Data-Efficient Image Quality Assessment with Attention-Panel Decoder Highlight: Blind Image Quality Assessment (BIQA) is a fundamental task in computer vision, which however remains unresolved due to the complex distortion conditions and diversified image contents. To confront this challenge, in this paper we propose a novel BIQA pipeline based on the Transformer architecture, which achieves an efficient quality-aware feature representation with far less data. |
Guanyi Qin; Runze Hu; Yutao Liu; Xiawu Zheng; Haotian Liu; Xiu Li; Yan Zhang; |
234 | FoPro: Few-Shot Guided Robust Webly-Supervised Prototypical Learning Highlight: To this end, we propose a Few-shot guided Prototypical (FoPro) representation learning method, which only needs a few labeled examples from reality and can significantly improve the performance in the real-world domain. |
Yulei Qin; Xingyu Chen; Chao Chen; Yunhang Shen; Bo Ren; Yun Gu; Jie Yang; Chunhua Shen; |
235 | Exposing The Self-Supervised Space-Time Correspondence Learning Via Graph Kernels Highlight: In this work, we propose VideoHiGraph, a space-time correspondence framework based on a learnable graph kernel. |
Zheyun Qin; Xiankai Lu; Xiushan Nie; Yilong Yin; Jianbing Shen; |
236 | Exploring Stroke-Level Modifications for Scene Text Editing Highlight: In this study, we attribute the poor editing performance to two problems: 1) Implicit decoupling structure. |
Yadong Qu; Qingfeng Tan; Hongtao Xie; Jianjun Xu; YuXin Wang; Yongdong Zhang; |
237 | Unsupervised Deep Learning for Phase Retrieval Via Teacher-Student Distillation Highlight: Motivated by the challenge of collecting ground-truth (GT) images in many domains, this paper proposes a fully-unsupervised learning approach for PR, which trains an end-to-end deep model via a GT-free teacher-student online distillation framework. |
Yuhui Quan; Zhile Chen; Tongyao Pang; Hui Ji; |
238 | A Learnable Radial Basis Positional Embedding for Coordinate-MLPs Highlight: We propose a novel method to enhance the performance of coordinate-MLPs (also referred to as neural fields) by learning instance-specific positional embeddings. |
Sameera Ramasinghe; Simon Lucey; |
239 | Action-Conditioned Generation of Bimanual Object Manipulation Sequences Highlight: We evaluate our approach on the KIT Motion Capture and KIT RGBD Bimanual Manipulation datasets and show improvements over a simplified approach that treats the entire body as a single entity, and existing whole-body-only methods. |
Haziq Razali; Yiannis Demiris; |
240 | Mean-Shifted Contrastive Loss for Anomaly Detection Highlight: We take the approach of transferring representations pre-trained on external datasets for anomaly detection. |
Tal Reiss; Yedid Hoshen; |
241 | Two Heads Are Better Than One: Image-Point Cloud Network for Depth-Based 3D Hand Pose Estimation Highlight: In this paper, we propose an Image-Point cloud Network (IPNet) for accurate and robust 3D hand pose estimation. |
Pengfei Ren; Yuchen Chen; Jiachang Hao; Haifeng Sun; Qi Qi; Jingyu Wang; Jianxin Liao; |
242 | MAGIC: Mask-Guided Image Synthesis By Inverting A Quasi-robust Classifier Highlight: We offer a method for one-shot mask-guided image synthesis that allows controlling manipulations of a single image by inverting a quasi-robust classifier equipped with strong regularizers. |
Mozhdeh Rouhsedaghat; Masoud Monajatipoor; C.-C. Jay Kuo; Iacopo Masi; |
243 | Domain Generalised Faster R-CNN Highlight: This is the first paper to address domain generalisation in the context of object detection, with a rigorous mathematical analysis of domain shift, without the covariate shift assumption. |
Karthik Seemakurthy; Charles Fox; Erchan Aptoula; Petra Bosilj; |
244 | MIDMs: Matching Interleaved Diffusion Models for Exemplar-Based Image Translation Highlight: We present a novel method for exemplar-based image translation, called matching interleaved diffusion models (MIDMs). |
Junyoung Seo; Gyuseong Lee; Seokju Cho; Jiyoung Lee; Seungryong Kim; |
245 | JR2Net: Joint Monocular 3D Face Reconstruction and Reenactment Highlight: In particular, we propose a novel cascade framework named JR2Net for Joint Face Reconstruction and Reenactment, which begins with the training of a coarse reconstruction network, followed by a 3D-aware face reenactment network based on the coarse reconstruction results. |
Jiaxiang Shang; Yu Zeng; Xin Qiao; Xin Wang; Runze Zhang; Guangyuan Sun; Vishal Patel; Hongbo Fu; |
246 | HVTSurv: Hierarchical Vision Transformer for Patient-Level Survival Prediction from Whole Slide Image Highlight: In this work, we propose a hierarchical vision Transformer framework named HVTSurv, which can encode the local-level relative spatial information, strengthen WSI-level context-aware communication, and establish patient-level hierarchical interaction. |
Zhuchen Shao; Yang Chen; Hao Bian; Jian Zhang; Guojun Liu; Yongbing Zhang; |
247 | Channel Regeneration: Improving Channel Utilization for Compact DNNs Highlight: Overparameterized deep neural networks have redundant neurons that do not contribute to the network’s accuracy. In this paper, we introduce a novel channel regeneration technique that reinvigorates these redundant channels by re-initializing their batch normalization scaling factor gamma. |
Ankit Sharma; Hassan Foroosh; |
248 | Adaptive Dynamic Filtering Network for Image Denoising Highlight: Recently, dynamic convolution has exhibited powerful capabilities in processing high-frequency information (e.g., edges, corners, textures), but previous works lack sufficient spatial contextual information in filter generation. To alleviate these issues, we propose to employ dynamic convolution to improve the learning of high-frequency and multi-scale features. |
Hao Shen; Zhong-Qiu Zhao; Wandi Zhang; |
249 | Edge Structure Learning Via Low Rank Residuals for Robust Image Classification Highlight: Therefore, filtering out such structural information could hamper the discriminative details in images, especially under heavy corruption. In order to address this limitation, this paper proposes a novel method named ESL-LRR, which preserves image edges by finding image projections from low-rank residuals. |
Xiang-Jun Shen; Stanley Ebhohimhen Abhadiomhen; Yang Yang; Zhifeng Liu; Sirui Tian; |
250 | Memory-Oriented Structural Pruning for Efficient Image Restoration Highlight: In this paper, we reveal the overlooked memory redundancy of the IR models and propose a Memory-Oriented Structural Pruning (MOSP) method. |
Xiangsheng Shi; Xuefei Ning; Lidong Guo; Tianchen Zhao; Enshu Liu; Yi Cai; Yuhan Dong; Huazhong Yang; Yu Wang; |
251 | YOLOV: Making Still Image Object Detectors Great at Video Object Detection Highlight: However, these detectors are usually computationally expensive due to their two-stage nature. This work proposes a simple yet effective strategy to address the above concerns, which incurs marginal overhead while yielding significant gains in accuracy. |
Yuheng Shi; Naiyan Wang; Xiaojie Guo; |
252 | FeedFormer: Revisiting Transformer Decoder for Efficient Semantic Segmentation Highlight: Instead, we aim to directly use the encoder features as the queries. |
Jae-hun Shim; Hyunwoo Yu; Kyeongbo Kong; Suk-Ju Kang; |
253 | Task-Specific Scene Structure Representations Highlight: In this paper, we propose a single general neural network architecture for extracting task-specific structure guidance for scenes. |
Jisu Shin; Seunghyun Shin; Hae-Gon Jeon; |
254 | Diversified and Realistic 3D Augmentation Via Iterative Construction, Random Placement, and HPR Occlusion Highlight: In this work, we develop a diversified and realistic augmentation method that can flexibly construct a whole-body object, freely locate and rotate the object, and apply self-occlusion and external-occlusion accordingly. |
Jungwook Shin; Jaeill Kim; Kyungeun Lee; Hyunghun Cho; Wonjong Rhee; |
255 | SHUNIT: Style Harmonization for Unpaired Image-to-Image Translation Highlight: We propose a novel solution for unpaired image-to-image (I2I) translation. |
Seokbeom Song; Suhyeon Lee; Hongje Seong; Kyoungwon Min; Euntai Kim; |
256 | Siamese-Discriminant Deep Reinforcement Learning for Solving Jigsaw Puzzles with Large Eroded Gaps Highlight: We formulate the puzzle reassembly as a combinatorial optimization problem and propose Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it. |
Xingke Song; Jiahuan Jin; Chenglin Yao; Shihe Wang; Jianfeng Ren; Ruibin Bai; |
257 | CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics Highlight: In this paper, we introduce CLIPVG, a text-guided image manipulation framework using differentiable vector graphics, which is also the first CLIP-based general image manipulation framework that does not require any additional generative models. |
Yiren Song; Xuning Shao; Kang Chen; Weidong Zhang; Zhongliang Jing; Minzhe Li; |
258 | Compact Transformer Tracker with Correlative Masked Modeling Highlight: In this paper, we prove that the vanilla self-attention structure is sufficient for information aggregation, and structural adaption is unnecessary. |
Zikai Song; Run Luo; Junqing Yu; Yi-Ping Phoebe Chen; Wei Yang; |
259 | Text-DIAE: A Self-Supervised Degradation Invariant Autoencoder for Text Recognition and Document Enhancement Highlight: In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. |
Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornés; Yousri Kessentini; Josep Lladós; Lluis Gomez; Dimosthenis Karatzas; |
260 | PUPS: Point Cloud Unified Panoptic Segmentation Highlight: In this paper, we propose a simple but effective point cloud unified panoptic segmentation (PUPS) framework, which uses a set of point-level classifiers to directly predict semantic and instance groupings in an end-to-end manner. |
Shihao Su; Jianyun Xu; Huanyu Wang; Zhenwei Miao; Xin Zhan; Dayang Hao; Xi Li; |
261 | Efficient Edge-Preserving Multi-View Stereo Network for Depth Estimation Highlight: To this end, we present an Efficient edge-Preserving multi-view stereo Network (EPNet) for practical depth estimation. |
Wanjuan Su; Wenbing Tao; |
262 | Referring Expression Comprehension Using Language Adaptive Inference Highlight: Concretely, we propose a neat yet efficient framework named Language Adaptive Dynamic Subnets (LADS), which can extract language-adaptive subnets from the REC model conditioned on the referring expressions. |
Wei Su; Peihan Miao; Huanzhang Dou; Yongjian Fu; Xi Li; |
263 | Rethinking Data Augmentation for Single-Source Domain Generalization in Medical Image Segmentation Highlight: In this paper, we rethink the data augmentation strategy for SDG in medical image segmentation. |
Zixian Su; Kai Yao; Xi Yang; Kaizhu Huang; Qiufeng Wang; Jie Sun; |
264 | Hybrid Pixel-Unshuffled Network for Lightweight Image Super-resolution Highlight: In this paper, we propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task. |
Bin Sun; Yulun Zhang; Songyao Jiang; Yun Fu; |
265 | Learning Event-Relevant Factors for Video Anomaly Detection Highlight: In this paper, we propose to explicitly learn event-relevant factors to eliminate the interferences from event-irrelevant factors on anomaly predictions. |
Che Sun; Chenrui Shi; Yunde Jia; Yuwei Wu; |
266 | Superpoint Transformer for 3D Scene Instance Segmentation Highlight: However, these non-straightforward methods suffer from two drawbacks: 1) Imprecise bounding boxes or unsatisfactory semantic predictions limit the performance of the overall 3D instance segmentation framework. 2) Existing methods require a time-consuming intermediate step of aggregation. To address these issues, this paper proposes a novel end-to-end 3D instance segmentation method based on Superpoint Transformer, named SPFormer. |
Jiahao Sun; Chunmei Qing; Junpeng Tan; Xiangmin Xu; |
267 | Asynchronous Event Processing with Local-Shift Graph Convolutional Network Highlight: This paper proposes a local-shift graph convolutional network (LSNet), which utilizes a novel local-shift operation equipped with a local spatio-temporal attention component to achieve efficient and adaptive aggregation of neighbor features. |
Linhui Sun; Yifan Zhang; Jian Cheng; Hanqing Lu; |
268 | DENet: Disentangled Embedding Network for Visible Watermark Removal Highlight: To this end, inspired by the two-stage coarse-refinement network, we propose a novel contrastive learning mechanism to disentangle the high-level embedding semantic information of the images and watermarks, driving the respective network branch more oriented. |
Ruizhou Sun; Yukun Su; Qingyao Wu; |
269 | Deep Manifold Attack on Point Clouds Via Parameter Plane Stretching Highlight: In this paper, we formulate a novel manifold attack, which deforms the underlying 2-manifold surfaces via parameter plane stretching to generate adversarial point clouds. |
Keke Tang; Jianpeng Wu; Weilong Peng; Yawen Shi; Peng Song; Zhaoquan Gu; Zhihong Tian; Wenping Wang; |
270 | Fair Generative Models Via Transfer Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under this setup, a weakly-supervised approach has been proposed, which achieves state-of-the-art quality and fairness in generated samples. In our work, based on this setup, we propose a simple yet effective approach. |
Christopher T.H. Teo; Milad Abdollahzadeh; Ngai-Man Cheung; |
271 | Learning Context-Aware Classifier for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Different from the mainstream literature where the efficacy of strong backbones and effective decoder heads has been well studied, in this paper, additional contextual hints are instead exploited via learning a context-aware classifier whose content is data-conditioned, decently adapting to different latent distributions. |
Zhuotao Tian; Jiequan Cui; Li Jiang; Xiaojuan Qi; Xin Lai; Yixin Chen; Shu Liu; Jiaya Jia; |
272 | TopicFM: Robust and Interpretable Topic-Assisted Feature Matching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel image-matching method that applies a topic-modeling strategy to encode high-level contexts in images. |
Khang Truong Giang; Soohwan Song; Sungho Jo; |
273 | Learning Fractals By Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel approach that learns the parameters underlying a fractal image via gradient descent. |
Cheng-Hao Tu; Hong-You Chen; David Carlyn; Wei-Lun Chao; |
274 | Leveraging Weighted Cross-Graph Attention for Visual and Semantic Enhanced Video Captioning Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These semantic features contain significant information that helps to generate highly informative human description-like captions. Therefore, we propose a novel visual and semantic enhanced video captioning network, named as VSVCap, that efficiently utilizes multiple ground truth captions. |
Deepali Verma; Arya Haldar; Tanima Dutta; |
275 | Doodle to Object: Practical Zero-Shot Sketch-Based 3D Shape Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we contribute a new Doodle2Object (D2O) dataset consisting of 8,992 3D shapes and over 7M sketches spanning 50 categories. |
Bingrui Wang; Yuan Zhou; |
276 | Controlling Class Layout for Deep Ordinal Classification Via Constrained Proxies Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we propose two kinds of strategies: hard layout constraint and soft layout constraint. |
Cong Wang; Zhiwei Jiang; Yafeng Yin; Zifeng Cheng; Shiping Ge; Qing Gu; |
277 | Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose to learn an event representation optimized for event-based object detection. |
Dongsheng Wang; Xu Jia; Yang Zhang; Xinyu Zhang; Yaoyuan Wang; Ziyang Zhang; Dong Wang; Huchuan Lu; |
278 | Text to Point Cloud Localization with Relation-Enhanced Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the two challenges, we propose a unified Relation-Enhanced Transformer (RET) to improve representation discriminability for both point cloud and natural language queries. |
Guangzhi Wang; Hehe Fan; Mohan Kankanhalli; |
279 | UCoL: Unsupervised Learning of Discriminative Facial Representations Via Uncertainty-Aware Contrast Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel uncertainty-aware consistency K-nearest neighbors algorithm to generate predicted positive pairs, which enables efficient discriminative learning from large-scale open-world unlabeled data. |
Hao Wang; Min Li; Yangyang Song; Youjian Zhang; Liying Chi; |
280 | Calibrated Teacher for Sparsely Annotated Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Therefore, the current methods with fixed thresholds have sub-optimal performance and are difficult to apply to other detectors. To resolve this obstacle, we propose a Calibrated Teacher, whose confidence estimation of the prediction is well calibrated to match its real precision. |
Haohan Wang; Liang Liu; Boshen Zhang; Jiangning Zhang; Wuhao Zhang; Zhenye Gan; Yabiao Wang; Chengjie Wang; Haoqian Wang; |
281 | Towards Real-Time Panoptic Narrative Grounding By An End-to-End Grounding Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a one-stage network for real-time PNG, termed End-to-End Panoptic Narrative Grounding network (EPNG), which directly generates masks for referents. |
Haowei Wang; Jiayi Ji; Yiyi Zhou; Yongjian Wu; Xiaoshuai Sun; |
282 | LeNo: Adversarial Robust Salient Object Detection Networks with Learnable Noise Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Different from ROSA, which relies on various pre- and post-processing steps, this paper proposes a light-weight Learnable Noise (LeNo) to defend against adversarial attacks for SOD models. |
He Wang; Lin Wan; He Tang; |
283 | Defending Black-Box Skeleton-Based Human Activity Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose the first black-box defense method for skeleton-based HAR to our best knowledge. |
He Wang; Yunfeng Diao; Zichang Tan; Guodong Guo; |
284 | Exploring CLIP for Assessing The Look and Feel of Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we go beyond the conventional paradigms by exploring the rich visual language prior encapsulated in Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images without explicit task-specific training. |
Jianyi Wang; Kelvin C.K. Chan; Chen Change Loy; |
285 | Robust Video Portrait Reenactment Via Personalized Representation Quantization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the Video Portrait via Non-local Quantization Modeling (VPNQ) framework, which produces pose- and disturbance-robust reenactable video portraits. |
Kaisiyuan Wang; Changcheng Liang; Hang Zhou; Jiaxiang Tang; Qianyi Wu; Dongliang He; Zhibin Hong; Jingtuo Liu; Errui Ding; Ziwei Liu; Jingdong Wang; |
286 | De-biased Teacher: Rethinking IoU Matching for Semi-supervised Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To de-bias the training proposals generated by the pseudo-label-based IoU matching, we propose a general framework — De-biased Teacher, which abandons both the IoU matching and pseudo labeling processes by directly generating favorable training proposals for consistency regularization between the weak/strong augmented image pairs. |
Kuo Wang; Jingyu Zhuang; Guanbin Li; Chaowei Fang; Lechao Cheng; Liang Lin; Fan Zhou; |
287 | Learning to Generate An Unbiased Scene Graph By Using Attribute-Guided Predicate Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, a decoupled learning framework is proposed for unbiased scene graph generation by using attribute-guided predicate features to construct a balanced training set. |
Lei Wang; Zejian Yuan; Badong Chen; |
288 | Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new model architecture with alignment-enriched tuning (dubbed AETNet) upon pre-trained document image models, to adapt to downstream tasks with the joint task-specific supervised and alignment-aware contrastive objective. |
Lei Wang; Jiabang He; Xing Xu; Ning Liu; Hui Liu; |
289 | Flora: Dual-Frequency LOss-Compensated ReAl-Time Monocular 3D Video Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a real-time monocular 3D video reconstruction approach named Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner. |
Likang Wang; Yue Gong; Qirui Wang; Kaixuan Zhou; Lei Chen; |
290 | Efficient Image Captioning for Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose LightCap, a lightweight image captioner for resource-limited devices. |
Ning Wang; Jiangrong Xie; Hang Luo; Qinglin Cheng; Jihao Wu; Mingbo Jia; Linlin Li; |
291 | Controllable Image Captioning Via Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that a unified model is qualified to perform well in diverse domains and freely switch among multiple styles. |
Ning Wang; Jiahao Xie; Jihao Wu; Mingbo Jia; Linlin Li; |
292 | ECO-3D: Equivariant Contrastive Learning for Pre-training on Perturbed 3D Point Cloud Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate contrastive learning on perturbed point clouds and find that the contrasting process may widen the domain gap caused by random perturbations, making the pre-trained network fail to generalize on testing data. |
Ruibin Wang; Xianghua Ying; Bowei Xing; Jinfa Yang; |
293 | Global-Local Characteristic Excited Cross-Modal Attacks from Images to Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an effective cross-modal attack method which considers both the global and local characteristics of video data. |
Ruikui Wang; Yuanfang Guo; Yunhong Wang; |
294 | Fine-Grained Retrieval Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation. |
Shijie Wang; Jianlong Chang; Zhihui Wang; Haojie Li; Wanli Ouyang; Qi Tian; |
295 | Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution. |
Tao Wang; Kaihao Zhang; Tianrun Shen; Wenhan Luo; Bjorn Stenger; Tong Lu; |
296 | 3D Assembly Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose FiT, a framework for Finishing the incomplete 3D assembly with Transformer. |
Weihao Wang; Rufeng Zhang; Mingyu You; Hongjun Zhou; Bin He; |
297 | A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on existing ICD datasets, this paper constructs a new dataset by adding 100,000 and 24,252 hard negative pairs to the training and test sets, respectively. |
Wenhao Wang; Yifan Sun; Yi Yang; |
298 | Revisiting Unsupervised Local Descriptor Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. |
Wufan Wang; Lei Zhang; Hua Huang; |
299 | Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unfortunately, MVS often suffers from texture-less regions, non-Lambertian surfaces, and moving objects, especially in real-world video sequences without known camera motion and depth supervision. Therefore, we propose MOVEDepth, which exploits the MOnocular cues and VElocity guidance to improve multi-frame Depth learning. |
Xiaofeng Wang; Zheng Zhu; Guan Huang; Xu Chi; Yun Ye; Ziwei Chen; Xingang Wang; |
300 | Learning Continuous Depth Representation Via Geometric Spatial Aggregator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While arbitrary scale DSR is a more realistic setting in this scenario, previous approaches predominantly suffer from the issue of inefficient real-numbered scale upsampling. To explicitly address this issue, we propose a novel continuous depth representation for DSR. |
Xiaohang Wang; Xuanhong Chen; Bingbing Ni; Zhengyan Tong; Hang Wang; |
301 | SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these UDA solutions yield unsatisfactory 3D detection results when there is a severe domain shift, e.g., from Waymo (64-beam) to nuScenes (32-beam). To address this, we present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D), where only a small amount of labeled target data is available, yet it can significantly improve the adaptation performance. |
Yan Wang; Junbo Yin; Wei Li; Pascal Frossard; Ruigang Yang; Jianbing Shen; |
302 | High-Resolution GAN Inversion for Degraded Images in Large Diverse Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A generic method for generating a high-quality image from the degraded one is in demand. In this paper, we present a novel GAN inversion framework that utilizes the powerful generative ability of StyleGAN-XL for this problem. |
Yanbo Wang; Chuming Lin; Donghao Luo; Ying Tai; Zhizhong Zhang; Yuan Xie; |
303 | GAN Prior Based Null-Space Learning for Consistent Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While the realness has been dramatically improved with the use of GAN prior, the state-of-the-art methods still suffer from inconsistencies in local structures and colors (e.g., teeth and eyes). In this paper, we show that these inconsistencies can be analytically eliminated by learning only the null-space component while fixing the range-space part. |
Yinhuai Wang; Yujie Hu; Jiwen Yu; Jian Zhang; |
304 | Contrastive Masked Autoencoders for Self-Supervised Video Hashing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective one-stage SSVH method called ConMH, which incorporates video semantic information and video similarity relationship understanding in a single stage. |
Yuting Wang; Jinpeng Wang; Bin Chen; Ziyun Zeng; Shu-Tao Xia; |
305 | MicroAST: Towards Super-fast Ultra-Resolution Arbitrary Style Transfer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the recent rapid progress, existing AST methods are either incapable or too slow to run at ultra-resolutions (e.g., 4K) with limited resources, which heavily hinders their further applications. In this paper, we tackle this dilemma by learning a straightforward and lightweight model, dubbed MicroAST. |
Zhizhong Wang; Lei Zhao; Zhiwen Zuo; Ailin Li; Haibo Chen; Wei Xing; Dongming Lu; |
306 | Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two new strategies for video analysis with noisy labels: 1) a lightweight channel selection method dubbed as Channel Truncation for feature-based label noise detection. |
Zixiao Wang; Junwu Weng; Chun Yuan; Jue Wang; |
307 | Active Token Mixer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose an innovative token-mixer, dubbed Active Token Mixer (ATM), to actively incorporate contextual information from other tokens in the global scope into the given query token. |
Guoqiang Wei; Zhizheng Zhang; Cuiling Lan; Yan Lu; Zhibo Chen; |
308 | Exploring Non-target Knowledge for Improving Ensemble Universal Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we simply adopt a KL loss that only considers the non-target classes for addressing the dominant bias issue. |
Juanjuan Weng; Zhiming Luo; Zhun Zhong; Dazhen Lin; Shaozi Li; |
309 | Towards Good Practices for Missing Modality Robust Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper seeks a set of good practices for multi-modal action recognition, with a particular interest in circumstances where some modalities are not available at an inference time. |
Sangmin Woo; Sumin Lee; Yeonju Park; Muhammad Adi Nugroho; Changick Kim; |
310 | Reject Decoding Via Language-Vision Models for Text-to-Image Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we build tiny multi-modal models to evaluate the similarities between the partial paths and the caption at multi scales. |
Fuxiang Wu; Liu Liu; Fusheng Hao; Fengxiang He; Lei Wang; Jun Cheng; |
311 | Transformation-Equivariant 3D Object Detection for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present TED, an efficient Transformation-Equivariant 3D Detector to overcome the computation cost and speed issues. |
Hai Wu; Chenglu Wen; Wei Li; Xin Li; Ruigang Yang; Cheng Wang; |
312 | Super-efficient Echocardiography Video Segmentation Via Proxy- and Kernel-Based Semi-supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Particularly, the real-time demand in clinical practice makes this task even harder. In this paper, we propose a novel proxy- and kernel-based semi-supervised segmentation network (PKEcho-Net) to comprehensively address these challenges. |
Huisi Wu; Jingyin Lin; Wende Xie; Jing Qin; |
313 | ACL-Net: Semi-supervised Polyp Segmentation Via Affinity Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel semi-supervised polyp segmentation framework using affinity contrastive learning (ACL-Net), which is implemented between student and teacher networks to consistently refine the pseudo-labels for semi-supervised polyp segmentation. |
Huisi Wu; Wende Xie; Jingyin Lin; Xinrong Guo; |
314 | Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce, for the first time, a bi-reconstruction mechanism that can simultaneously accommodate inter-class and intra-class variations. |
Jijie Wu; Dongliang Chang; Aneeshan Sain; Xiaoxu Li; Zhanyu Ma; Jie Cao; Jun Guo; Yi-Zhe Song; |
315 | Preserving Structural Consistency in Arbitrary Artist and Artwork Style Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These methods not only homogenize the artist-style of different artworks of the same artist but also lack generalization for the unseen artists. To solve these challenges, we propose a double-style transferring module (DSTM). |
Jingyu Wu; Lefan Hou; Zejian Li; Jun Liao; Li Liu; Lingyun Sun; |
316 | End-to-End Zero-Shot HOI Detection Via Vision and Language Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The fundamental challenges are to discover potential human-object pairs and identify novel HOI categories. To overcome the above challenges, we propose a novel End-to-end zero-shot HOI Detection (EoID) framework via vision-language knowledge distillation. |
Mingrui Wu; Jiaxin Gu; Yunhang Shen; Mingbao Lin; Chao Chen; Xiaoshuai Sun; |
317 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we focus on transferring knowledge for video classification tasks. |
Wenhao Wu; Zhun Sun; Wanli Ouyang; |
318 | Scene Graph to Image Synthesis Via Knowledge Consensus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study graph-to-image generation conditioned exclusively on scene graphs, in which we seek to disentangle the veiled semantics between knowledge graphs and images. |
Yang Wu; Pengxu Wei; Liang Lin; |
319 | Synthetic Data Can Also Teach: Synthesizing Effective Data for Unsupervised Visual Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, synthetic data usually has lower quality than real data, and using synthetic data may not improve CL compared with using real data. To tackle this problem, we propose a data generation framework with two methods to improve CL training by joint sample generation and contrastive learning. |
Yawen Wu; Zhepeng Wang; Dewen Zeng; Yiyu Shi; Jingtong Hu; |
320 | Multi-Stream Representation Learning for Pedestrian Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the procedure that the hippocampus processes and integrates spatio-temporal information to form memories, we propose a novel multi-stream representation learning module to learn complex spatio-temporal features of pedestrian trajectory. |
Yuxuan Wu; Le Wang; Sanping Zhou; Jinghai Duan; Gang Hua; Wei Tang; |
321 | Pixel Is All You Need: Adversarial Trajectory-Ensemble Active Learning for Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper attempts to answer this unexplored question by proving a hypothesis: there is a point-labeled dataset on which trained saliency models can achieve performance equivalent to that of models trained on the densely annotated dataset. To prove this conjecture, we propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL). |
Zhenyu Wu; Lin Wang; Wei Wang; Qing Xia; Chenglizhao Chen; Aimin Hao; Shuo Li; |
322 | Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, inevitable errors from estimated depth priors may lead to misaligned semantic information and 3D localization, hence resulting in feature smearing and suboptimal predictions. To mitigate this issue, we propose ADD, an Attention-based Depth knowledge Distillation framework with 3D-aware positional encoding. |
Zizhang Wu; Yunzhe Wu; Jian Pu; Xianzhi Li; Xiaoquan Wang; |
323 | Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most learning-based methods struggle for two reasons: 1) each move in figure skating changes quickly, hence simply applying traditional frame sampling will lose a lot of valuable information, especially in videos lasting 3 to 5 minutes; 2) prior methods rarely considered the critical audio-visual relationship in their models. For these reasons, we introduce a novel architecture, named Skating-Mixer. |
Jingfei Xia; Mingchen Zhuge; Tiantian Geng; Shun Fan; Yuantai Wei; Zhenyu He; Feng Zheng; |
324 | SVFI: Spiking-Based Video Frame Interpolation for High-Speed Motion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of estimating motions by optical flow from RGB frames, we present a new dual-modal pipeline adopting both RGB frames and the corresponding spike stream as inputs (SVFI). |
Lujie Xia; Jing Zhao; Ruiqin Xiong; Tiejun Huang; |
325 | FEditNet: Few-Shot Editing of Latent Semantics in GAN Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a GAN-based method called FEditNet, aiming to discover latent semantics using very few labeled data without any pretrained predictors or prior knowledge. |
Mengfei Xia; Yezhi Shu; Yuji Wang; Yu-Kun Lai; Qiang Li; Pengfei Wan; Zhongyuan Wang; Yong-Jin Liu; |
326 | Toward Robust Diagnosis: A Contour Attention Preserving Adversarial Defense for COVID-19 Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The existing adversarial training strategies are difficult to generalize to the medical imaging field, which is challenged by complex medical texture features. To overcome this challenge, we propose a Contour Attention Preserving (CAP) method based on lung cavity edge extraction. |
Kun Xiang; Xing Zhang; Jinwen She; Jinpeng Liu; Haohan Wang; Shiqi Deng; Shancheng Jiang; |
327 | Boosting Semi-Supervised Semantic Segmentation with Probabilistic Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there exist inaccurate pseudo-labels which map the ambiguous representations of pixels to the wrong classes due to the limited cognitive ability of the model. In this paper, we define pixel-wise representations from a new perspective of probability theory and propose a Probabilistic Representation Contrastive Learning (PRCL) framework that improves representation quality by taking its probability into consideration. |
Haoyu Xie; Changqi Wang; Mingkai Zheng; Minjing Dong; Shan You; Chong Fu; Chang Xu; |
328 | Less Is More Important: An Attention Module Guided By Probability Density Function for Convolutional Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing attention modules are heuristic without a sound interpretation, and thus, require empirical engineering to design structure and operators within the modules. To handle the above issue, based on our ‘less is more important’ observation, we propose an Attention Module guided by Probability Density Function (PDF), dubbed PdfAM, which enjoys a rational motivation and requires few empirical structure designs. |
Jingfen Xie; Jian Zhang; |
329 | Mitigating Artifacts in Real-World Video Super-resolution Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on the observations, we propose a Hidden State Attention (HSA) module to mitigate artifacts in real-world video super-resolution. |
Liangbin Xie; Xintao Wang; Shuwei Shi; Jinjin Gu; Chao Dong; Ying Shan; |
330 | Just Noticeable Visual Redundancy Forecasting: A Deep Multimodal-Driven Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we investigate the JND modeling from an end-to-end homologous multimodal perspective, namely hmJND-Net. |
Wuyuan Xie; Shukang Wang; Sukun Tian; Lirong Huang; Ye Liu; Miaohui Wang; |
331 | Cross-Modal Contrastive Learning for Domain Adaptation in 3D Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on it, in this paper, we propose a novel cross-modal contrastive learning scheme to further improve the adaptation effects. |
Bowei Xing; Xianghua Ying; Ruibin Wang; Jinfa Yang; Taiyan Chen; |
332 | ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit feature fusion between depth and semantic information and propose an efficient local adaptive attention method for geometric aware representation enhancement. |
Daitao Xing; Jinglin Shen; Chiuman Ho; Anthony Tzes; |
333 | LORE: Logical Location Regression Network for Table Structure Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, they either count on additional heuristic rules to recover the table structures, or require a huge amount of training data and time-consuming sequential decoders. In this paper, we propose an alternative paradigm. |
Hangdi Xing; Feiyu Gao; Rujiao Long; Jiajun Bu; Qi Zheng; Liangcheng Li; Cong Yao; Zhi Yu; |
334 | Revisiting The Spatial and Temporal Modeling for Few-Shot Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SloshNet, a new framework that revisits the spatial and temporal modeling for few-shot action recognition in a finer manner. |
Jiazheng Xing; Mengmeng Wang; Yong Liu; Boyu Mu; |
335 | Unsupervised Multi-Exposure Image Fusion Breaking Exposure Limits Via Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an unsupervised multi-exposure image fusion (MEF) method via contrastive learning, termed as MEF-CL. |
Han Xu; Liang Haochen; Jiayi Ma; |
336 | CasFusionNet: A Cascaded Network for Point Cloud Semantic Scene Completion By Dense Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our work, we present CasFusionNet, a novel cascaded network for point cloud semantic scene completion by dense feature fusion. |
Jinfeng Xu; Xianzhi Li; Yuan Tang; Qiao Yu; Yixue Hao; Long Hu; Min Chen; |
337 | Learning A Generalized Gaze Estimator from Gaze-Consistent Feature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new domain generalization method based on gaze-consistent features. |
Mingjie Xu; Haofei Wang; Feng Lu; |
338 | Class Overwhelms: Mutual Conditional Blended-Target Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address these, we propose a categorical domain discriminator guided by uncertainty to explicitly model and directly align categorical distributions P(Z|Y). |
Pengcheng Xu; Boyu Wang; Charles Ling; |
339 | Self Correspondence Distillation for End-to-End Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a simple and novel Self Correspondence Distillation (SCD) method to refine pseudo-labels without introducing external supervision. |
Rongtao Xu; Changwei Wang; Jiaxi Sun; Shibiao Xu; Weiliang Meng; Xiaopeng Zhang; |
340 | Deep Parametric 3D Filters for Joint Video Denoising and Illumination Enhancement in Video Super Resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new parametric representation called the Deep Parametric 3D Filters (DP3DF), which incorporates local spatiotemporal information to enable simultaneous denoising, illumination enhancement, and SR efficiently in a single encoder-and-decoder network. |
Xiaogang Xu; Ruixing Wang; Chi-Wing Fu; Jiaya Jia; |
341 | Inter-image Contrastive Consistency for Multi-Person Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a novel framework, termed Inter-image Contrastive consistency (ICON), to strengthen the keypoint consistency among images for MPPE. |
Xixia Xu; Yingguo Gao; Xingjia Pan; Ke Yan; Xiaoyu Chen; Qi Zou; |
342 | DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a novel MTL model by combining both merits of deformable CNN and query-based Transformer for multi-task learning of dense prediction. |
Yangyang Xu; Yibo Yang; Lefei Zhang; |
343 | VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Following the human perception process, where the scene is effectively understood by decomposing it into visual (e.g. human, animal) and non-visual components (e.g. action, relations) under the mutual influence of vision and language, we first propose a visual-linguistic (VL) feature. In the proposed VL feature, the scene is modeled by three modalities including (i) a global visual environment; (ii) local visual main agents; (iii) linguistic scene elements. We then introduce an autoregressive Transformer-in-Transformer (TinT) to simultaneously capture the semantic coherence of intra- and inter-event contents within a video. |
Kashu Yamazaki; Khoa Vo; Quang Sang Truong; Bhiksha Raj; Ngan Le; |
344 | Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume and may fail when the range is too large or unreliable. To address this problem, we propose a disparity-based MVS method based on the epipolar disparity flow (E-flow), called DispMVS, which infers the depth information from the pixel movement between two views. |
Qingsong Yan; Qiang Wang; Kaiyong Zhao; Bo Li; Xiaowen Chu; Fei Deng; |
345 | Video-Text Pre-training with Learned Regions for Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a simple yet effective module for video-text representation learning, namely RegionLearner, which can take into account the structure of objects during pre-training on large-scale video-text pairs. |
Rui Yan; Mike Zheng Shou; Yixiao Ge; Jinpeng Wang; Xudong Lin; Guanyu Cai; Jinhui Tang; |
346 | DesNet: Decomposed Scale-Consistent Network for Unsupervised Depth Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we propose the decomposed scale-consistent learning (DSCL) strategy, which disintegrates the absolute depth into relative depth prediction and global scale estimation, contributing to individual learning benefits. |
Zhiqiang Yan; Kun Wang; Xiang Li; Zhenyu Zhang; Jun Li; Jian Yang; |
347 | Self-Supervised Video Representation Learning Via Latent Time Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This leads to loss of pertinent information related to temporal relationships, rendering actions such as ‘enter’ and ‘leave’ indistinguishable. To mitigate this limitation, we propose Latent Time Navigation (LTN), a time-parameterized contrastive learning strategy that is streamlined to capture fine-grained motions. |
Di Yang; Yaohui Wang; Quan Kong; Antitza Dantcheva; Lorenzo Garattoni; Gianpiero Francesca; François Brémond; |
348 | One-Shot Replay: Boosting Incremental Object Detection Via Retrospecting One Object Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a novel One-Shot Replay (OSR) method for incremental object detection, which is an augmentation-based method. |
Dongbao Yang; Yu Zhou; Xiaopeng Hong; Aoting Zhang; Weiping Wang; |
349 | Video Event Extraction Via Tracking Visual States of Arguments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the definition of events as changes of states, we propose a novel framework to detect video events by tracking the changes in the visual states of all involved arguments, which are expected to provide the most informative evidence for the extraction of video events. |
Guang Yang; Manling Li; Jiajie Zhang; Xudong Lin; Heng Ji; Shih-Fu Chang; |
350 | CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a single-model self-supervised hybrid pre-training framework for RGB and depth modalities, termed as CoMAE. |
Jiange Yang; Sheng Guo; Gangshan Wu; Limin Wang; |
351 | Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Self-Asymmetric Invertible Network (SAIN) for compression-aware image rescaling. |
Jinhai Yang; Mengxi Guo; Shijie Zhao; Junlin Li; Li Zhang; |
352 | Stop-Gradient Softmax Loss for Deep Metric Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this letter, we look into the characteristic of softmax-based approaches and propose a novel learning objective function Stop-Gradient Softmax Loss (SGSL) to solve the convergence problem in softmax-based deep metric learning with L2-normalization. |
Lu Yang; Peng Wang; Yanning Zhang; |
353 | Local Path Integration for Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We devise a method to identify the local input distribution and propose a technique to stochastically integrate the model gradients over the paths defined by the references sampled from that distribution. |
Peiyu Yang; Naveed Akhtar; Zeyi Wen; Ajmal Mian; |
354 | Spatiotemporal Deformation Perception for Fisheye Video Rectification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For different frames of the fisheye video, the existing image correction methods ignore the correlation of sequences, resulting in temporal jitter in the corrected video. To solve this problem, we propose a temporal weighting scheme to get a plausible global optical flow, which mitigates the jitter effect by progressively reducing the weight of frames. |
Shangrong Yang; Chunyu Lin; Kang Liao; Yao Zhao; |
355 | Contrastive Multi-Task Dense Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel multi-task contrastive regularization method based on the consistency to effectively boost the representation learning of the different sub-tasks, which can also be easily generalized to different multi-task dense prediction frameworks, and costs no additional computation in the inference. |
Siwei Yang; Hanrong Ye; Dan Xu; |
356 | AutoStegaFont: Synthesizing Vector Fonts for Hiding Information in Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, none of the existing methods can satisfy these requirements well and simultaneously. To satisfy the above requirements, we propose AutoStegaFont, an automatic vector font synthesis scheme for hiding information in documents. |
Xi Yang; Jie Zhang; Han Fang; Chang Liu; Zehua Ma; Weiming Zhang; Nenghai Yu; |
357 | Towards Global Video Scene Segmentation with Context-Aware Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce a novel Context-Aware Transformer (CAT) with a self-supervised learning framework to learn high-quality shot representations, for generating well-bounded scenes. |
Yang Yang; Yurui Huang; Weili Guo; Baohua Xu; Dingyin Xia; |
358 | Low-Light Image Enhancement Network Based on Multi-Scale Feature Complementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although most current enhancement methods can obtain high-contrast images, they still suffer from noise amplification and color distortion. To address these issues, this paper proposes a low-light image enhancement network based on multi-scale feature complementation (LIEN-MFC), which is a U-shaped encoder-decoder network supervised by multiple images of different scales. |
Yong Yang; Wenzhi Xu; Shuying Huang; Weiguo Wan; |
359 | Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a simple yet effective alternative for progressively learning discriminative multi-modal features. |
Zhao Yang; Jiaqi Wang; Yansong Tang; Kai Chen; Hengshuang Zhao; Philip H.S. Torr; |
360 | LidarMultiNet: Towards A Unified Multi-Task Network for LiDAR Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a LiDAR-based multi-task network that unifies these three major LiDAR perception tasks. |
Dongqiangzi Ye; Zixiang Zhou; Weijia Chen; Yufei Xie; Yu Wang; Panqu Wang; Hassan Foroosh; |
361 | DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, the point label form exploited in previous works implies the reading order of humans, which impedes the detection robustness from our observation. To address these challenges, this paper proposes a concise Dynamic Point Text DEtection TRansformer network, termed DPText-DETR. |
Maoyuan Ye; Jing Zhang; Shanshan Zhao; Juhua Liu; Bo Du; Dacheng Tao; |
362 | Learning Second-Order Attentive Context for Efficient Correspondence Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an effective and efficient method for correspondence pruning. |
Xinyi Ye; Weiyue Zhao; Hao Lu; Zhiguo Cao; |
363 | Infusing Definiteness Into Randomness: Rethinking Composition Styles for Deep Image Matting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we first show that naive foreground combination can be problematic and therefore derive an alternative formulation to reasonably combine foregrounds. Our second contribution is an observation that matting performance can benefit from a certain occurrence frequency of combined foregrounds and their associated source foregrounds during training. Inspired by this, we introduce a novel composition style that binds the source and combined foregrounds in a definite triplet. |
Zixuan Ye; Yutong Dai; Chaoyi Hong; Zhiguo Cao; Hao Lu; |
364 | Can We Find Strong Lottery Tickets in Generative Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate strong lottery tickets in generative models, the subnetworks that achieve good generative performance without any weight update. |
Sangyeop Yeo; Yoojin Jang; Jy-yong Sohn; Dongyoon Han; Jaejun Yoo; |
365 | Class-Independent Regularization for Learning with Noisy Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a class-independent regularization (CIR) method that can effectively alleviate the negative impact of noisy labels in DNN training. |
Rumeng Yi; Dayan Guan; Yaping Huang; Shijian Lu; |
366 | Unbiased Heterogeneous Scene Graph Generation with Relation-Aware Message Passing Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an unbiased heterogeneous scene graph generation (HetSGG) framework that captures relation-aware context using message passing neural networks. |
Kanghoon Yoon; Kibum Kim; Jinyoung Moon; Chanyoung Park; |
367 | Lifelong Person Re-identification Via Knowledge Refreshing and Consolidation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: More specifically, a knowledge refreshing scheme is incorporated with the knowledge rehearsal mechanism to enable bi-directional knowledge transfer by introducing a dynamic memory model and an adaptive working model. |
Chunlin Yu; Ye Shi; Zimo Liu; Shenghua Gao; Jingya Wang; |
368 | Generalizing Multiple Object Tracking to Unseen Domains By Introducing Natural Language Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To bridge this gap, we first draw the observation that the high-level information contained in natural language is domain invariant to different tracking domains. Based on this observation, we propose to introduce natural language representation into visual MOT models for boosting the domain generalization ability. |
En Yu; Songtao Liu; Zhuoling Li; Jinrong Yang; Zeming Li; Shoudong Han; Wenbing Tao; |
369 | Rethinking Rotation Invariance with Point Cloud Registration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we review rotation invariance (RI) in terms of point cloud registration (PCR) and propose an effective framework for rotation invariance learning via three sequential stages, namely rotation-invariant shape encoding, aligned feature integration, and deep feature registration. |
Jianhui Yu; Chaoyi Zhang; Weidong Cai; |
370 | Frame-Level Label Refinement for Skeleton-Based Weakly-Supervised Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by advances in handling the noisy label problem, we introduce a label cleaning strategy of the frame-level pseudo labels to guide the learning process. |
Qing Yu; Kent Fujiwara; |
371 | Recurrent Structure Attention Guidance for Depth Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a recurrent structure attention guided (RSAG) framework, consisting of two important parts. |
Jiayi Yuan; Haobo Jiang; Xiang Li; Jianjun Qian; Jun Li; Jian Yang; |
372 | Structure Flow-Guided Network for Real Depth Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel structure flow-guided DSR framework, where a cross-modality flow map is learned to guide the RGB-structure information transferring for precise depth upsampling. |
Jiayi Yuan; Haobo Jiang; Xiang Li; Jianjun Qian; Jun Li; Jian Yang; |
373 | Pseudo Label-Guided Model Inversion Attack Via Conditional Generative Adversarial Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Besides, the widely used cross-entropy loss in these attacks suffers from gradient vanishing. To address these problems, we propose Pseudo Label-Guided MI (PLG-MI) attack via conditional GAN (cGAN). |
Xiaojian Yuan; Kejiang Chen; Jie Zhang; Weiming Zhang; Nenghai Yu; Yang Zhang; |
374 | Cyclically Disentangled Feature Translation for Face Anti-spoofing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we tackle cross-scenario face anti-spoofing by proposing a novel domain adaptation method called cyclically disentangled feature translation network (CDFTN). |
Haixiao Yue; Keyao Wang; Guosheng Zhang; Haocheng Feng; Junyu Han; Errui Ding; Jingdong Wang; |
375 | FlowFace: Semantic Flow-Guided Shape-Aware Face Swapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a semantic flow-guided two-stage framework for shape-aware face swapping, namely FlowFace. |
Hao Zeng; Wei Zhang; Changjie Fan; Tangjie Lv; Suzhen Wang; Zhimeng Zhang; Bowen Ma; Lincheng Li; Yu Ding; Xin Yu; |
376 | Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we pioneer a degree-free hypergraph solution that models many-to-many relations to address the challenge of heterogeneous sources and heterogeneous modalities. |
Yawen Zeng; Qin Jin; Tengfei Bao; Wenfeng Li; |
377 | Learnable Blur Kernel for Single-Image Defocus Deblurring in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the deblurred image generated by the defocus deblurring network lacks high-frequency details, which is unsatisfactory in human perception. To overcome this issue, we propose a novel defocus deblurring method that uses the guidance of the defocus map to implement image deblurring. |
Jucai Zhai; Pengcheng Zeng; Chihao Ma; Jie Chen; Yong Zhao; |
378 | Darwinian Model Upgrades: Model Evolving with Selective Compatibility Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Darwinian Model Upgrades (DMU), which disentangle the inheritance and variation in the model evolving with selective backward compatibility and forward adaptation, respectively. |
Binjie Zhang; Shupeng Su; Yixiao Ge; Xuyuan Xu; Yexin Wang; Chun Yuan; Mike Zheng Shou; Ying Shan; |
379 | Mx2M: Masked Cross-Modality Modeling in Domain Adaptation for 3D Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results are not ideal when the domain gap is large. To solve the problem of lacking supervision, we introduce masked modeling into this task and propose a method Mx2M, which utilizes masked cross-modality modeling to reduce the large domain gap. |
Boxiang Zhang; Zunran Wang; Yonggen Ling; Yuanyuan Guan; Shenghao Zhang; Wenhui Li; |
380 | Few-Shot 3D Point Cloud Semantic Segmentation Via Stratified Class-Specific Attention Based Transformer Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we further address these problems by developing a new multi-layer transformer network for few-shot point cloud semantic segmentation. |
Canyu Zhang; Zhenyao Wu; Xinyi Wu; Ziyu Zhao; Song Wang; |
381 | PaRot: Patch-Wise Rotation-Invariant Network Via Feature Disentanglement and Pose Restoration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel Patch-wise Rotation-invariant network (PaRot), which achieves rotation invariance via feature disentanglement and produces consistent predictions for samples with arbitrary rotations. |
Dingxin Zhang; Jianhui Yu; Chaoyi Zhang; Weidong Cai; |
382 | Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the potential of adopting strong augmentations and propose a general hierarchical consistent contrastive learning framework (HiCLR) for skeleton-based action recognition. |
Jiahang Zhang; Lilang Lin; Jiaying Liu; |
383 | ImageNet Pre-training Also Transfers Non-robustness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ImageNet pre-training has enabled state-of-the-art results on many tasks. In spite of its recognized contribution to generalization, we observed in this study that ImageNet pre-training also transfers adversarial non-robustness from pre-trained model into fine-tuned model in the downstream classification tasks. |
Jiaming Zhang; Jitao Sang; Qi Yi; Yunfan Yang; Huiwen Dong; Jian Yu; |
384 | Language-Assisted 3D Feature Learning for Semantic Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To guide 3D feature learning toward important geometric attributes and scene context, we explore the help of textual scene descriptions. |
Junbo Zhang; Guofan Fan; Guanghan Wang; Zhengyuan Su; Kaisheng Ma; Li Yi; |
385 | IKOL: Inverse Kinematics Optimization Layer for 3D Human Pose and Shape Estimation Via Gauss-Newton Differentiation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an inverse kinematic optimization layer (IKOL) for 3D human pose and shape estimation that leverages the strength of both optimization- and regression-based methods within an end-to-end framework. |
Juze Zhang; Ye Shi; Yuexin Ma; Lan Xu; Jingyi Yu; Jingya Wang; |
386 | Mind The Gap: Polishing Pseudo Labels for Accurate Semi-supervised Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the limited generalization capacity of the teacher detector caused by the scarce annotations, the produced pseudo labels often deviate from ground truth, especially those with relatively low classification confidences, thus limiting the generalization performance of SSOD. To mitigate this problem, we propose a dual pseudo-label polishing framework for SSOD. |
Lei Zhang; Yuxuan Sun; Wei Wei; |
387 | ConvMatch: Rethinking Network Design for Two-View Correspondence Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, from a novel perspective, we design a correspondence learning network called ConvMatch that for the first time can leverage convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks. |
Shihua Zhang; Jiayi Ma; |
388 | Cross-View Geo-Localization Via Learning Disentangled Geometric Layout Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose GeoDTR which explicitly disentangles geometric information from raw features and learns the spatial correlations among visual features from aerial and ground pairs with a novel geometric layout extractor module. |
Xiaohan Zhang; Xingyu Li; Waqas Sultani; Yi Zhou; Safwan Wshah; |
389 | Video Compression Artifact Reduction By Fusing Motion Compensation and Global Context in A Swin-CNN Based Parallel Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The key idea of this paper is to fuse the motion compensation and global context together to gain more compensation information to improve the quality of compressed videos. |
Xinjian Zhang; Su Yang; Wuyang Luo; Longwen Gao; Weishan Zhang; |
390 | MRCN: A Novel Modality Restitution and Compensation Network for Visible-Infrared Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Modality Restitution and Compensation Network (MRCN) to narrow the gap between the two modalities. |
Yukang Zhang; Yan Yan; Jie Li; Hanzi Wang; |
391 | A Simple Baseline for Multi-Camera 3D Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present SimMOD, a Simple baseline for Multi-camera Object Detection, to solve the problem. |
Yunpeng Zhang; Wenzhao Zheng; Zheng Zhu; Guan Huang; Jiwen Lu; Jie Zhou; |
392 | Positional Label for Self-Supervised Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: General effectiveness has been proven in ViT. In our work, we propose to train ViT to recognize the positional label of patches of the input image; this apparently simple task actually yields a meaningful self-supervisory task. |
Zhemin Zhang; Xun Gong; |
393 | Cross-Category Highlight Detection Via Feature Decomposition and Modality Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under this framework, we propose a novel module, named Multi-task Feature Decomposition Branch which jointly conducts label prediction, cyclic feature reconstruction, and adversarial feature reconstruction to decompose the video features into two independent components: highlight-related component and category-related component. |
Zhenduo Zhang; |
394 | TrEP: Transformer-Based Evidential Prediction for Pedestrian Intention with Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a transformer module towards the temporal correlations among the input features within pedestrian video sequences and a deep evidential learning model to capture the AI uncertainty under scene complexities. |
Zhengming Zhang; Renran Tian; Zhengming Ding; |
395 | DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous works fail to generate high-fidelity dubbing results. To address the above problem, this paper proposes a Deformation Inpainting Network (DINet) for high-resolution face visually dubbing. |
Zhimeng Zhang; Zhipeng Hu; Wenjin Deng; Changjie Fan; Tangjie Lv; Yu Ding; |
396 | ShiftDDPMs: Exploring Conditional Diffusion Models By Shifting Diffusion Trajectories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and flexible conditional diffusion model by introducing conditions into the forward process. |
Zijian Zhang; Zhou Zhao; Jun Yu; Qi Tian; |
397 | Combating Unknown Bias with Effective Bias-Conflicting Scoring and Gradient Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, for challenge A, we propose an effective bias-conflicting scoring method to boost the identification accuracy with two practical strategies — peer-picking and epoch-ensemble. |
Bowen Zhao; Chen Chen; Qian-Wei Wang; Anfeng He; Shu-Tao Xia; |
398 | RLogist: Fast Observation Strategy on Whole-Slide Images with Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we develop RLogist, a benchmarking deep reinforcement learning (DRL) method for fast observation strategy on WSIs. |
Boxuan Zhao; Jun Zhang; Deheng Ye; Jian Cao; Xiao Han; Qiang Fu; Wei Yang; |
399 | Learning to Super-resolve Dynamic Scenes for Neuromorphic Spike Camera Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, as a trade-off for high temporal resolution, its spatial resolution is limited, resulting in inferior reconstruction details. To address this issue, this paper develops a network (SpikeSR-Net) to super-resolve a high-resolution image sequence from the low-resolution binary spike streams. |
Jing Zhao; Ruiqin Xiong; Jian Zhang; Rui Zhao; Hangfan Liu; Tiejun Huang; |
400 | TinyNeRF: Towards 100 X Compression of Voxel Radiance Fields Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods accelerate the original NeRF at the expense of extra storage demand, which hinders their applications in many scenarios. To solve this limitation, we present TinyNeRF, a three-stage pipeline: frequency domain transformation, pruning and quantization that work together to reduce the storage demand of the voxel grids with little to no effect on their speed and synthesis quality. |
Tianli Zhao; Jiayuan Chen; Cong Leng; Jian Cheng; |
401 | BEST: BERT Pre-training for Sign Language Recognition with Coupling Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we are dedicated to leveraging the BERT pre-training success and modeling the domain-specific statistics to fertilize the sign language recognition (SLR) model. |
Weichao Zhao; Hezhen Hu; Wengang Zhou; Jiaxin Shi; Houqiang Li; |
402 | MulGT: Multi-Task Graph-Transformer with Task-Aware Knowledge Injection and Domain Knowledge-Driven Pooling for Whole Slide Image Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a novel multi-task framework (i.e., MulGT) for WSI analysis by the specially designed Graph-Transformer equipped with Task-aware Knowledge Injection and Domain Knowledge-driven Graph Pooling modules. |
Weiqin Zhao; Shujun Wang; Maximus Yeung; Tianye Niu; Lequan Yu; |
403 | Grouped Knowledge Distillation for Deep Face Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We experimentally found that (1) Primary-KD and Binary-KD are indispensable for KD, and (2) Secondary-KD is the culprit restricting KD at the bottleneck. Therefore, we propose a Grouped Knowledge Distillation (GKD) that retains the Primary-KD and Binary-KD but omits Secondary-KD in the ultimate KD loss calculation. |
Weisong Zhao; Xiangyu Zhu; Kaiwen Guo; Xiao-Yu Zhang; Zhen Lei; |
404 | Style-Content Metric Learning for Multidomain Remote Sensing Object Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a style-content metric learning framework to address the generalizable remote sensing object recognition issue. |
Wenda Zhao; Ruikai Yang; Yu Liu; You He; |
405 | Occupancy Planes for Single-View RGB-D Human Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more accurate results we propose the occupancy planes (OPlanes) representation, which makes it possible to formulate single-view RGB-D human reconstruction as occupancy prediction on planes that slice through the camera’s view frustum. |
Xiaoming Zhao; Yuan-Ting Hu; Zhongzheng Ren; Alexander G. Schwing; |
406 | Deep Equilibrium Models for Snapshot Compressive Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose deep equilibrium models (DEQ) for video SCI, fusing data-driven regularization and stable convergence in a theoretically sound manner. |
Yaping Zhao; Siming Zheng; Xin Yuan; |
407 | Unsupervised Deep Video Denoising with Untrained Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Collecting noise-free videos can be costly and challenging in many applications. Therefore, this paper aims to develop an unsupervised deep learning method for video denoising that only uses a single test noisy video for training. |
Huan Zheng; Tongyao Pang; Hui Ji; |
408 | Attack Can Benefit: An Adversarial Approach to Recognizing Facial Expressions Under Noisy Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the aforementioned issues, in this paper, we propose a novel and flexible method to spot noisy labels by leveraging adversarial attack, termed as Geometry Aware Adversarial Vulnerability Estimation (GAAVE). |
Jiawen Zheng; Bo Li; Shengchuan Zhang; Shuang Wu; Liujuan Cao; Shouhong Ding; |
409 | Phrase-Level Temporal Relationship Mining for Temporal Sentence Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the problem of video temporal sentence localization, which aims to localize a target moment from videos according to a given language query. |
Minghang Zheng; Sizhe Li; Qingchao Chen; Yuxin Peng; Yang Liu; |
410 | Learning Semantic Degradation-Aware Guidance for Recognition-Driven Unsupervised Low-Light Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose to learn a Semantic Degradation-Aware Guidance (SDAG) that perceives the low-light degradation effect on semantic levels in a self-supervised manner, which is further utilized to guide the ULLIE methods. |
Naishan Zheng; Jie Huang; Man Zhou; Zizheng Yang; Qi Zhu; Feng Zhao; |
411 | Memory-Aided Contrastive Consensus Learning for Co-salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To learn better group consensus, we propose the Group Consensus Aggregation Module (GCAM) to abstract the common features of each image group; meanwhile, to make the consensus representation more discriminative, we introduce the Memory-based Contrastive Module (MCM), which saves and updates the consensus of images from different groups in a queue of memories. |
Peng Zheng; Jie Qin; Shuo Wang; Tian-Zhu Xiang; Huan Xiong; |
412 | MaskBooster: End-to-End Self-Training for Sparsely Supervised Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose MaskBooster for sparsely supervised instance segmentation (SpSIS) with comprehensive usage of pseudo masks. |
Shida Zheng; Chenshu Chen; Xi Yang; Wenming Tan; |
413 | RSPT: Reconstruct Surroundings and Predict Trajectory for Generalizable Active Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a framework called RSPT to form a structure-aware motion representation by Reconstructing Surroundings and Predicting the target Trajectory. |
Fangwei Zhong; Xiao Bi; Yudi Zhang; Wei Zhang; Yizhou Wang; |
414 | STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose STOA-VLP, a pre-training framework that jointly models object and action information across spatial and temporal dimensions. |
Weihong Zhong; Mao Zheng; Duyu Tang; Xuan Luo; Heng Gong; Xiaocheng Feng; Bing Qin; |
415 | Refined Semantic Enhancement Towards Frequency Diffusion for Video Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens. |
Xian Zhong; Zipeng Li; Shuqin Chen; Kui Jiang; Chen Chen; Mang Ye; |
416 | Aesthetically Relevant Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study image AQA and IAC together and present a new IAC method termed Aesthetically Relevant Image Captioning (ARIC). |
Zhipeng Zhong; Fei Zhou; Guoping Qiu; |
417 | Polarization-Aware Low-Light Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Stokes-domain enhancement pipeline along with a dual-branch neural network to handle the problem in a polarization-aware manner. |
Chu Zhou; Minggui Teng; Youwei Lyu; Si Li; Chao Xu; Boxin Shi; |
418 | Progressive Bayesian Inference for Scribble-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The scribble-supervised semantic segmentation is an important yet challenging task in the field of computer vision. To deal with the pixel-wise sparse annotation problem, we propose a Progressive Bayesian Inference (PBI) framework to boost the performance of the scribble-supervised semantic segmentation, which can effectively infer the semantic distribution of these unlabeled pixels to guide the optimization of the segmentation network. |
Chuanwei Zhou; Chunyan Xu; Zhen Cui; |
419 | Exploratory Inference Learning for Scribble Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel exploratory inference learning (EIL) framework, which facilitates efficient probing on unlabeled pixels and promotes selecting confident candidates for boosting the evolved segmentation. |
Chuanwei Zhou; Zhen Cui; Chunyan Xu; Cao Han; Jian Yang; |
420 | Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We observe that such a scheme is sub-optimal, i.e., for better distinguishing anomaly one needs to understand what is a normal state, and may yield a higher false alarm rate. To address this issue, we propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data. |
Hang Zhou; Junqing Yu; Wei Yang; |
421 | Unsupervised Hierarchical Domain Adaptation for Adverse Weather Optical Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose the first unsupervised framework for adverse weather optical flow via hierarchical motion-boundary adaptation. |
Hanyu Zhou; Yi Chang; Gang Chen; Luxin Yan; |
422 | PASS: Patch Automatic Skip Scheme for Efficient Real-Time Video Perception on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a general and task-independent Patch Automatic Skip Scheme (PASS), a novel end-to-end learning pipeline to support diverse video perception settings by decoupling acceleration and tasks. |
Qihua Zhou; Song Guo; Jun Pan; Jiacheng Liang; Zhenda Xu; Jingren Zhou; |
423 | Robust Feature Rectification of Pretrained Vision Models for Object Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a RObust FEature Rectification module (ROFER) to improve the performance of pretrained models against degradations. |
Shengchao Zhou; Gaofeng Meng; Zhaoxiang Zhang; Richard Yi Da Xu; Shiming Xiang; |
424 | Video Object of Interest Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a new computer vision task named video object of interest segmentation (VOIS). |
Siyuan Zhou; Chunru Zhan; Biao Wang; Tiezheng Ge; Yuning Jiang; Li Niu; |
425 | Tree-Structured Trajectory Encoding for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that the sequential encoding may largely lose this kind of fine-grained structure in the trajectory, which could hamper the later state estimation and decision making. In order to solve this problem, this work proposes a novel tree-structured trajectory encoding strategy. |
Xinzhe Zhou; Yadong Mu; |
426 | Self-Supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a Partial Spatio-Temporal Learning (PSTL) framework to exploit the local relationship from partial skeleton sequences built by a unique spatio-temporal masking strategy. |
Yujie Zhou; Haodong Duan; Anyi Rao; Bing Su; Jiaqi Wang; |
427 | Debiased Fine-Tuning for Vision-Language Models By Prompt Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg). |
Beier Zhu; Yulei Niu; Saeil Lee; Minhoe Hur; Hanwang Zhang; |
428 | Improving Scene Text Image Super-resolution Via Dual Prior Modulation Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our work addresses these gaps and proposes a plug-and-play module dubbed Dual Prior Modulation Network (DPMN), which leverages dual image-level priors to bring performance gain over existing approaches. |
Shipeng Zhu; Zuoyan Zhao; Pengfei Fang; Hui Xue; |
429 | SRoUDA: Meta Self-Training for Robust Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a new meta self-training pipeline, named SRoUDA, for improving adversarial robustness of UDA models. |
Wanqing Zhu; Jia-Li Yin; Bo-Hao Chen; Ximeng Liu; |
430 | Gradient-Based Graph Attention for Scene Text Image Super-resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel gradient-based graph attention method to embed patch-wise text layout contexts into image feature representations for high-resolution text image reconstruction in an implicit and elegant manner. |
Xiangyuan Zhu; Kehua Guo; Hui Fang; Rui Ding; Zheng Wu; Gerald Schaefer; |
431 | RGBD1K: A Large-Scale Dataset and Benchmark for RGB-D Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the dataset deficiency issue, a new RGB-D dataset named RGBD1K is released in this paper. |
Xue-Feng Zhu; Tianyang Xu; Zhangyong Tang; Zucheng Wu; Haodong Liu; Xiao Yang; Xiao-Jun Wu; Josef Kittler; |
432 | Learn More for Food Recognition Via Progressive Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of locating multiple regions, we propose a Progressive Self-Distillation (PSD) method, which progressively enhances the ability of the network to mine more details for food recognition. |
Yaohui Zhu; Linhu Liu; Jiang Tian; |
433 | Generative Image Inpainting with Segmentation Confusion Adversarial Training and Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new adversarial training framework for image inpainting with segmentation confusion adversarial training (SCAT) and contrastive learning. |
Zhiwen Zuo; Lei Zhao; Ailin Li; Zhizhong Wang; Zhanjie Zhang; Jiafu Chen; Wei Xing; Dongming Lu; |
434 | Improved Algorithms for Maximum Satisfiability and Its Special Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the (n,3)-MAXSAT problem, we design an O*(1.1749^n) algorithm improving on the previous record running time of O*(1.191^n). |
Kirill Brilliantov; Vasily Alferov; Ivan Bliznets; |
435 | Lifting (D)QBF Preprocessing and Solving Techniques to (D)SSAT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To date, no decision procedure has been implemented for solving DSSAT formulas. This work provides the first such tool by converting DSSAT into SSAT with dependency elimination, similar to converting dependency quantified Boolean formula (DQBF) to quantified Boolean formula (QBF). |
Che Cheng; Jie-Hong R. Jiang; |
436 | NuWLS: Improving Local Search for (Weighted) Partial MaxSAT By New Weighting Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify two issues of existing clause weighting techniques for (W)PMS, and propose two ideas correspondingly. |
Yi Chu; Shaowei Cai; Chuan Luo; |
437 | Separate But Equal: Equality in Belief Propagation for Single Cycle Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We prove that on a single cycle graph, belief equality can be avoided only when the algorithm converges to the optimal solution. |
Erel Cohen; Omer Lev; Roie Zivan; |
438 | Complexity of Reasoning with Cardinality Minimality Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the CardMinSat problem, which asks, given a formula φ and an atom x, whether x is true in some cardinality-minimal model of φ. |
Nadia Creignou; Frédéric Olive; Johannes Schmidt; |
439 | DASH: A Distributed and Parallelizable Algorithm for Size-Constrained Submodular Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the SMCC problem in a distributed setting and propose the first MR algorithms with sublinear adaptive complexity. |
Tonmoy Dey; Yixin Chen; Alan Kuhnle; |
440 | SharpSSAT: A Witness-Generating Stochastic Boolean Satisfiability Solver Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a new witness-generating SSAT solver, SharpSSAT, which integrates techniques, including component caching, clause learning, and pure literal detection. |
Yu-Wei Fan; Jie-Hong R. Jiang; |
441 | Submodular Maximization Under The Intersection of Matroid and Knapsack Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the problem of submodular maximization under the intersection of two commonly used constraints, i.e., k-matroid constraint and m-knapsack constraint, and propose a new algorithm SPROUT by incorporating partial enumeration into the simultaneous greedy framework. |
Yu-Ran Gu; Chao Bian; Chao Qian; |
442 | A Framework to Design Approximation Algorithms for Finding Diverse Solutions in Combinatorial Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a main result, we propose a framework to design approximation algorithms for finding diverse solutions, which yields several outcomes including constant-factor approximation algorithms for finding diverse matchings in graphs and diverse common bases in two matroids and PTASes for finding diverse minimum cuts and interval schedulings. |
Tesshu Hanaka; Masashi Kiyomi; Yasuaki Kobayashi; Yusuke Kobayashi; Kazuhiro Kurita; Yota Otachi; |
443 | An Improved Approximation Algorithm for Wage Determination and Online Task Allocation in Crowd-Sourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We tackle an optimization problem for wage determination and online task allocation in crowd-sourcing and propose a fast 1-1/(k+3)^(1/2)-approximation algorithm, where k is the minimum of tasks’ budgets (numbers of possible assignments). |
Yuya Hikima; Yasunori Akagi; Hideaki Kim; Taichi Asami; |
444 | Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the notion of a correction function, and an additional penalty term in the loss function, modelling practical scenarios where an estimated optimal solution can be modified into a feasible solution after the true parameters are revealed, but at an additional cost. |
Xinyi Hu; Jasper C.H. Lee; Jimmy H.M. Lee; |
445 | Solving Explainability Queries with Quantification: The Case of Feature Relevancy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast with earlier work that studied FRP for specific classifiers, this paper proposes a novel algorithm for the FRP quantification problem which is applicable to any ML classifier that meets minor requirements. |
Xuanxiang Huang; Yacine Izza; Joao Marques-Silva; |
446 | Second-Order Quantified Boolean Logic Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the second-order quantified Boolean logic with the following main results: First, we present a procedure of quantifier elimination converting SOQBFs to QBFs and a game interpretation of SOQBF semantics. Second, we devise a sound and complete refutation-proof system for SOQBF. Third, we develop an algorithm for countermodel extraction from a refutation proof. |
Jie-Hong R. Jiang; |
447 | Learning Markov Random Fields for Combinatorial Structures Via Sampling Through Lovász Local Lemma Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop NEural Lovász Sampler (NELSON), which embeds the sampler through Lovász Local Lemma (LLL) as a fully differentiable neural network layer. |
Nan Jiang; Yi Gu; Yexiang Xue; |
448 | Fast Converging Anytime Model Counting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper designs a new anytime approach called PartialKC for approximate model counting. |
Yong Lai; Kuldeep S. Meel; Roland H.C. Yap; |
449 | Finding Good Partial Assignments During Restart-Based Branch and Bound Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an approach to find good partial assignments to jumpstart search at each restart for general COPs, which are identified by comparing different best solutions found in different restart runs. |
Hongbo Li; Jimmy H.M. Lee; |
450 | Hybrid Learning with New Value Function for The Maximum Common Induced Subgraph Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new value function and a hybrid selection strategy used in reinforcement learning to define a new vertex selection method, and propose a new BnB algorithm, called McSplitDAL, for MCIS. |
Yanli Liu; Jiming Zhao; Chu-Min Li; Hua Jiang; Kun He; |
451 | Self-Supervised Primal-Dual Learning for Constrained Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper takes a different route and proposes the idea of Primal-Dual Learning (PDL), a self-supervised training method that does not require a set of pre-solved instances or an optimization solver for training and inference. |
Seonho Park; Pascal Van Hentenryck; |
452 | Reinforcement Learning for Branch-and-Bound Optimisation Using Retrospective Trajectories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose retro branching, a simple yet effective approach to RL for branching. |
Christopher W. F. Parsonson; Alexandre Laterre; Thomas D. Barrett; |
453 | Constraint Optimization Over Semirings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The present work investigates the complexity of constraint optimization problems over semirings. |
A. Pavan; Kuldeep S. Meel; N. V. Vinodchandran; Arnab Bhattacharyya; |
454 | Generalized Confidence Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, it is restricted to a conjunction of binary inequalities. In this paper, we generalize the Confidence constraint to any constraint and propose an implementation based on Multi-valued Decision Diagrams (MDDs). |
Guillaume Perez; Steve Malalel; Gael Glorian; Victor Jung; Alexandre Papadopoulos; Marie Pelleau; Wijnand Suijlen; Jean-Charles Régin; Arnaud Lallouet; |
455 | Circuit Minimization with QBF-Based Exact Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a rewriting method for Boolean circuits that minimizes small subcircuits with exact synthesis. |
Franz-Xaver Reichl; Friedrich Slivovsky; Stefan Szeider; |
456 | Probabilistic Generalization of Backdoor Trees with Application to SAT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the fact that in a ρ-backdoor-based decomposition a portion of hard subproblems remain, in practice the narrowing of the search space often allows solving the problem faster with such a backdoor than without it. In this paper, we significantly improve on the concept of ρ-backdoors by extending this concept to backdoor trees: we introduce ρ-backdoor trees, show the interconnections between SBS, ρ-backdoors, and the corresponding backdoor trees, and establish some new theoretical properties of backdoor trees. |
Alexander Semenov; Daniil Chivilikhin; Stepan Kochemazov; Ibragim Dzhiblavi; |
457 | The Expressive Power of Ad-Hoc Constraints for Modelling CSPs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we ask a more fundamental question that bears on modelling constraints in a CSP as ad-hoc constraints: how does the choice of constraints and operations affect tractability? |
Ruiwei Wang; Roland H.C. Yap; |
458 | Graphs, Constraints, and Search for The Abstraction and Reasoning Corpus Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Abstract Reasoning with Graph Abstractions (ARGA), a new object-centric framework that first represents images using graphs and then performs a search for a correct program in a DSL that is based on the abstracted graph space. |
Yudong Xu; Elias B. Khalil; Scott Sanner; |
459 | Eliminating The Impossible, Whatever Remains Must Be True: On Extracting and Applying Background Knowledge in The Context of Formal Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show how one can apply background knowledge to give more succinct “why” formal explanations, that are presumably easier to interpret by humans, and give more accurate “why not” explanations. |
Jinqiang Yu; Alexey Ignatiev; Peter J. Stuckey; Nina Narodytska; Joao Marques-Silva; |
460 | Farsighted Probabilistic Sampling: A General Strategy for Boosting Local Search MaxSAT Solvers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we observe that most local search (W)PMS solvers usually flip a single variable per iteration. |
Jiongzhi Zheng; Kun He; Jianrong Zhou; |
461 | LANCER: A Lifetime-Aware News Recommender System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By further developing the characteristics of the lifetime of news, we then present a novel approach for news recommendation, namely, Lifetime-Aware News reCommEndeR System (LANCER), that carefully exploits the lifetime of news during training and recommendation. |
Hong-Kyun Bae; Jeewon Ahn; Dongwon Lee; Sang-Wook Kim; |
462 | Win-Win: A Privacy-Preserving Federated Framework for Dual-Target Cross-Domain Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A small number of recent CDR works have investigated privacy protection, but they still struggle to satisfy practical requirements (e.g., limited privacy-preserving ability) and to prevent the potential risk of negative transfer. To address the above challenging problems, we propose a novel and unified privacy-preserving federated framework for dual-target CDR, namely P2FCDR. |
Gaode Chen; Xinghua Zhang; Yijun Su; Yantong Lai; Ji Xiang; Junbo Zhang; Yu Zheng; |
463 | Enhanced Multi-Relationships Integration Graph Convolutional Network for Inferring Substitutable and Complementary Items Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The substitutable items are interchangeable and might be compared with each other before purchasing, while the complementary items are used in conjunction and are usually bought together with the query item. In this paper, we focus on two issues of inferring the substitutable and complementary items: 1) how to model their mutual influence to improve the performance of downstream tasks, 2) how to further discriminate them by considering the strength of relationship for different item pairs. |
Huajie Chen; Jiyuan He; Weisheng Xu; Tao Feng; Ming Liu; Tianyu Song; Runfeng Yao; Yuanyuan Qiao; |
464 | PaTeCon: A Pattern-Based Temporal Constraint Mining Method for Conflict Detection on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We start from the common pattern of temporal facts and constraints and propose a pattern-based temporal constraint mining method, PaTeCon. |
Jianhao Chen; Junyang Ren; Wentao Ding; Yuzhong Qu; |
465 | End-to-End Entity Linking with Hierarchical Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose to model the EL task as a hierarchical decision-making process and design a hierarchical reinforcement learning algorithm to solve the problem. |
Lihan Chen; Tinghui Zhu; Jingping Liu; Jiaqing Liang; Yanghua Xiao; |
466 | Entity-Agnostic Representation Learning for Parameter-Efficient Knowledge Graph Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an entity-agnostic representation learning method for handling the problem of inefficient parameter storage costs brought by embedding knowledge graphs. |
Mingyang Chen; Wen Zhang; Zhen Yao; Yushan Zhu; Yang Gao; Jeff Z. Pan; Huajun Chen; |
467 | Dual Low-Rank Graph Autoencoder for Semantic and Topological Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, limited works on GAE were devoted to leveraging both semantic and topological graphs, and they only indirectly extracted the relationships between graphs via weights shared by features. To better capture the connections between nodes from these two types of graphs, this paper proposes a graph neural network dubbed Dual Low-Rank Graph AutoEncoder (DLR-GAE), which takes both semantic and topological homophily into consideration. |
Zhaoliang Chen; Zhihao Wu; Shiping Wang; Wenzhong Guo; |
468 | Dynamic Multi-Behavior Sequence Modeling for Next Item Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we first address the characteristics of multi-behavior sequences that should be considered in SRSs, and then propose novel methods for Dynamic Multi-behavior Sequence modeling named DyMuS, which is a light version, and DyMuS+, which is an improved version, considering the characteristics. |
Junsu Cho; Dongmin Hyun; Dong won Lim; Hyeon jae Cheon; Hyoung-iel Park; Hwanjo Yu; |
469 | Learning Representations of Bi-level Knowledge Graphs for Reasoning Beyond Link Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we define a higher-level triplet to represent a relationship between triplets, e.g., a triplet in which PrerequisiteFor is the higher-level relation. |
Chanyoung Chung; Joyce Jiyoung Whang; |
470 | Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. |
Yuanning Cui; Yuxin Wang; Zequn Sun; Wenqiang Liu; Yiqiao Jiang; Kexin Han; Wei Hu; |
471 | Uniform Sequence Better: Time Interval Aware Data Augmentation for Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In fact, we conducted an empirical study to validate this observation, and found that a sequence with uniformly distributed time interval (denoted as uniform sequence) is more beneficial for performance improvement than that with greatly varying time interval. Therefore, we propose to augment sequence data from the perspective of time interval, which is not studied in the literature. |
Yizhou Dang; Enneng Yang; Guibing Guo; Linying Jiang; Xingwei Wang; Xiaoxiao Xu; Qinghui Sun; Hong Liu; |
472 | Rule Induction in Knowledge Graphs Using Linear Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a simple linear programming (LP) based method to learn compact and interpretable sets of rules encoding the facts in a knowledge graph (KG) and use these rules to solve the KG completion problem. |
Sanjeeb Dash; Joao Goncalves; |
473 | Spatio-Temporal Neural Structural Causal Models for Bike Flow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In addition, due to the disturbance of incomplete observations in the data, random contextual conditions lead to spurious correlations between data and features, making the prediction of the model ineffective in special scenarios. To overcome this issue, we propose a Spatio-temporal Neural Structure Causal Model (STNSCM) from the perspective of causality. |
Pan Deng; Yu Zhao; Junting Liu; Xiaofeng Jia; Mulan Wang; |
474 | DAMix: Exploiting Deep Autoregressive Model Zoo for Improving Lossless Compression Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Compared with traditional compression methods, deep learning methods have intrinsic flaws for OoD generalization. In this work, we make the attempt to tackle this challenge via exploiting a zoo of Deep Autoregressive models (DAMix). |
Qishi Dong; Fengwei Zhou; Ning Kang; Chuanlong Xie; Shifeng Zhang; Jiawei Li; Heng Peng; Zhenguo Li; |
475 | Soft Target-Enhanced Matching Framework for Deep Entity Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Then, we propose a novel Soft Target-EnhAnced Matching (Steam) framework, which exploits the automatically generated soft targets as label-wise regularizers to constrain the model training. Specifically, Steam regards the EM model trained in previous iteration as a virtual teacher and takes its softened output as the extra regularizer to train the EM model in the current iteration. |
Wenzhou Dou; Derong Shen; Xiangmin Zhou; Tiezheng Nie; Yue Kou; Hang Cui; Ge Yu; |
476 | DropMessage: Unifying Random Dropping for Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel random dropping method called DropMessage, which performs dropping operations directly on the propagated messages during the message-passing process. |
Taoran Fang; Zhiqing Xiao; Chunping Wang; Jiarong Xu; Xuan Yang; Yang Yang; |
477 | Contrastive Pre-training with Adversarial Perturbations for Check-In Sequence Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper we propose a contrastive pre-training model with adversarial perturbations for check-in sequence representation learning (CACSR). |
Letian Gong; Youfang Lin; Shengnan Guo; Yan Lin; Tianyi Wang; Erwen Zheng; Zeyu Zhou; Huaiyu Wan; |
478 | MA-GCL: Model Augmentation Tricks for Graph Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, previous GCL methods employ two view encoders with exactly the same neural architecture and tied parameters, which further harms the diversity of augmented views. To address this limitation, we propose a novel paradigm named model augmented GCL (MA-GCL), which will focus on manipulating the architectures of view encoders instead of perturbing graph inputs. |
Xumeng Gong; Cheng Yang; Chuan Shi; |
479 | Generic and Dynamic Graph Representation Learning for Crowd Flow Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Different from the existing research, this paper aims to provide a generic and dynamic representation learning method for crowd flow modeling. |
Liangzhe Han; Ruixing Zhang; Leilei Sun; Bowen Du; Yanjie Fu; Tongyu Zhu; |
480 | Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. |
Han Huang; Leilei Sun; Bowen Du; Weifeng Lv; |
481 | SAH: Shifting-Aware Asymmetric Hashing for Reverse K Maximum Inner Product Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the first subquadratic-time algorithm, i.e., Shifting-aware Asymmetric Hashing (SAH), to tackle the RkMIPS problem. |
Qiang Huang; Yanhao Wang; Anthony K. H. Tung; |
482 | Learned Distributed Image Compression with Multi-Scale Patch Matching in Feature Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the patch matching at the image domain is not robust to the variance of scale, shape, and illumination caused by the different viewing angles, and cannot make full use of the rich texture information of the side information image. To resolve this issue, we propose Multi-Scale Feature Domain Patch Matching (MSFDPM) to fully utilize side information at the decoder of the distributed image compression model. |
Yujun Huang; Bin Chen; Shiyu Qin; Jiawei Li; Yaowei Wang; Tao Dai; Shu-Tao Xia; |
483 | Constrained Market Share Maximization By Signal-Guided Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel two-stage optimization method to address the challenges. |
Bo Hui; Yuchen Fang; Tian Xia; Sarp Aykent; Wei-Shinn Ku; |
484 | T2-GNN: Graph Neural Networks for Graphs with Incomplete Features and Structure Via Teacher-Student Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper we propose a general GNN framework based on teacher-student distillation to improve the performance of GNNs on incomplete graphs, namely T2-GNN. |
Cuiying Huo; Di Jin; Yawen Li; Dongxiao He; Yu-Bin Yang; Lingfei Wu; |
485 | Detecting Sources of Healthcare Associated Infections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior techniques for showing submodularity, such as the "live graph" technique, are not applicable for the load sharing model, and our key technical contribution is to use a more sophisticated "coupling" technique to show the submodularity result. We propose algorithms for our two problem formulations by extending existing algorithmic results from submodular optimization and combining these with an expectation propagation heuristic for the load sharing model that leads to orders-of-magnitude speedup. |
Hankyu Jang; Andrew Fu; Jiaming Cui; Methun Kamruzzaman; B. Aditya Prakash; Anil Vullikanti; Bijaya Adhikari; Sriram V. Pemmaraju; |
486 | Spatio-Temporal Self-Supervised Learning for Traffic Flow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While previous work has made great efforts to model spatio-temporal correlations, existing methods still suffer from two key limitations: i) Most models collectively predict all regions’ flows without accounting for spatial heterogeneity, i.e., different regions may have skewed traffic flow distributions. ii) These models fail to capture the temporal heterogeneity induced by time-varying traffic patterns, as they typically model temporal correlations with a shared parameterized space for all time periods. To tackle these challenges, we propose a novel Spatio-Temporal Self-Supervised Learning (ST-SSL) traffic prediction framework which enhances the traffic pattern representations to be reflective of both spatial and temporal heterogeneity, with auxiliary self-supervised learning paradigms. |
Jiahao Ji; Jingyuan Wang; Chao Huang; Junjie Wu; Boren Xu; Zhenhe Wu; Junbo Zhang; Yu Zheng; |
487 | PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a novel Propagation Delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction. |
Jiawei Jiang; Chengkai Han; Wayne Xin Zhao; Jingyuan Wang; |
488 | Continuous Trajectory Generation Based on Two-Stage GAN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although many previous works have studied the problem of trajectory generation, the continuity of the generated trajectories has been neglected, which makes these methods useless for practical urban simulation scenarios. To solve this problem, we propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network, namely TS-TrajGen, which efficiently integrates prior domain knowledge of human mobility with model-free learning paradigm. |
Wenjun Jiang; Wayne Xin Zhao; Jingyuan Wang; Jiawei Jiang; |
489 | Let Graph Be The Go Board: Gradient-Free Node Injection Attack for Graph Neural Networks Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we model the node injection attack as a Markov decision process and propose Gradient-free Graph Advantage Actor Critic, namely G2A2C, a reinforcement learning framework in the fashion of advantage actor critic. |
Mingxuan Ju; Yujie Fan; Chuxu Zhang; Yanfang Ye; |
490 | GLCC: A General Framework for Graph-Level Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general graph-level clustering framework named Graph-Level Contrastive Clustering (GLCC) given multiple graphs. |
Wei Ju; Yiyang Gu; Binqi Chen; Gongbo Sun; Yifang Qin; Xingyuming Liu; Xiao Luo; Ming Zhang; |
491 | Parameterized Algorithms for Colored Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such smaller parameters are obtained by considering the difference between k or r and some lower bound on these values. We give both algorithms and lower bounds for Colored Clustering with such parameterizations. |
Leon Kellerhals; Tomohiro Koana; Pascal Kunz; Rolf Niedermeier; |
492 | Towards Reliable Item Sampling for Recommendation Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, despite existing efforts, there is still a lack of rigorous theoretical understanding of the proposed metric estimators, and the basic item sampling also suffers from the “blind spot” issue, i.e., estimation accuracy to recover the top-K metrics when K is small can still be rather substantial. In this paper, we provide an in-depth investigation into these problems and make two innovative contributions. First, we propose a new item-sampling estimator that explicitly optimizes the error with respect to the ground truth, and theoretically highlights its subtle difference against prior work. |
Dong Li; Ruoming Jin; Zhenming Liu; Bin Ren; Jing Gao; Zhi Liu; |
493 | Multiple Robust Learning for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multiple robust (MR) estimator that can take advantage of multiple candidate imputation and propensity models to achieve unbiasedness. |
Haoxuan Li; Quanyu Dai; Yuru Li; Yan Lyu; Zhenhua Dong; Xiao-Hua Zhou; Peng Wu; |
494 | Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, it is a challenging task due to the complex distribution and the irregular shapes of objects, and the lack of abnormal samples. To tackle these problems, an anomaly segmentation model based on pixel descriptors (ASD) is proposed for anomaly segmentation in HSR imagery. |
Jingtao Li; Xinyu Wang; Hengwei Zhao; Shaoyu Wang; Yanfei Zhong; |
495 | Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To train and deploy the CTR models efficiently and economically, it is necessary to compress their embedding tables. |
Shiwei Li; Huifeng Guo; Lu Hou; Wei Zhang; Xing Tang; Ruiming Tang; Rui Zhang; Ruixuan Li; |
496 | Signed Laplacian Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel signed graph representation learning framework, called Signed Laplacian Graph Neural Network (SLGNN), which combines the advantages of both. |
Yu Li; Meng Qu; Jian Tang; Yi Chang; |
497 | PPGenCDR: A Stable and Robust Framework for Privacy-Preserving Cross-Domain Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work on cross-domain recommendation (CDR) reaches advanced and satisfying recommendation performance, but mostly neglects preserving privacy. To fill this gap, we propose a privacy-preserving generative cross-domain recommendation (PPGenCDR) framework for PPCDR. |
Xinting Liao; Weiming Liu; Xiaolin Zheng; Binhui Yao; Chaochao Chen; |
498 | COLA: Improving Conversational Recommender Systems By Collaborative Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, they still need support in efficiently capturing user preferences since the information reflected in a single conversation is limited. Inspired by collaborative filtering, we propose a collaborative augmentation (COLA) method to simultaneously improve both item representation learning and user preference modeling to address these issues. |
Dongding Lin; Jian Wang; Wenjie Li; |
499 | Scalable and Effective Conductance-Based Graph Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on our framework, we propose two novel algorithms PCon_core and PCon_de with linear time and space complexity, which can efficiently and effectively identify clusters from massive graphs with more than a few billion edges. |
Longlong Lin; Ronghua Li; Tao Jia; |
500 | Multi-Domain Generalized Graph Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the multi-domain generalized graph meta learning problem, which is challenging due to non-Euclidean data, inequivalent feature spaces, and heterogeneous distributions. |
Mingkai Lin; Wenzhong Li; Ding Li; Yizhou Chen; Guohao Li; Sanglu Lu; |
501 | IterDE: An Iterative Knowledge Distillation Framework for Knowledge Graph Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose IterDE, a novel knowledge distillation framework for KGEs. |
Jiajun Liu; Peng Wang; Ziyu Shang; Chenxiao Wu; |
502 | Learning By Applying: A General Framework for Mathematical Reasoning Via Enhancing Explicit Knowledge Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general Learning by Applying (LeAp) framework to enhance existing models (backbones) in a principled way by explicit knowledge learning. |
Jiayu Liu; Zhenya Huang; ChengXiang Zhai; Qi Liu; |
503 | Low-Resource Personal Attribute Prediction from Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel framework PEARL to predict personal attributes from conversations by leveraging the abundant personal attribute knowledge from utterances under a low-resource setting in which no labeled utterances or external data are utilized. |
Yinan Liu; Hu Chen; Wei Shen; Jiaoyan Chen; |
504 | Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a result, current methods are hard to generalize to heterophilic graphs where dissimilar nodes are widely connected, and also vulnerable to adversarial attacks. To address this issue, we propose a novel unsupervised Graph Representation learning method with Edge hEterophily discriminaTing (GREET) which learns representations by discriminating and leveraging homophilic edges and heterophilic edges. |
Yixin Liu; Yizhen Zheng; Daokun Zhang; Vincent CS Lee; Shirui Pan; |
505 | On Generalized Degree Fairness in Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the bias in the context of node classification, we propose a novel GNN framework called Generalized Degree Fairness-centric Graph Neural Network (DegFairGNN). |
Zemin Liu; Trung-Kien Nguyen; Yuan Fang; |
506 | Time Series Contrastive Learning with Information-Aware Augmentations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the problem by encouraging both high fidelity and variety based on information theory. |
Dongsheng Luo; Wei Cheng; Yingheng Wang; Dongkuan Xu; Jingchao Ni; Wenchao Yu; Xuchao Zhang; Yanchi Liu; Yuncong Chen; Haifeng Chen; Xiang Zhang; |
507 | NQE: N-ary Query Embedding for Complex Query Answering Over Hyper-Relational Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, previous CQA methods can only make predictions for a few given types of queries and cannot be flexibly extended to more complex logical queries, which significantly limits their applications. To overcome these challenges, in this work, we propose a novel N-ary Query Embedding (NQE) model for CQA over hyper-relational knowledge graphs (HKGs), which include massive n-ary facts. |
Haoran Luo; Haihong E; Yuhao Yang; Gengxian Zhou; Yikai Guo; Tianyu Yao; Zichen Tang; Xueyuan Lin; Kaiyang Wan; |
508 | FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, this paper presents a simple two-stream feature interaction model, namely FinalMLP, which employs only MLPs in both streams yet achieves surprisingly strong performance. |
Kelong Mao; Jieming Zhu; Liangcai Su; Guohao Cai; Yuru Li; Zhenhua Dong; |
509 | GMDNet: A Graph-Based Mixture Density Network for Estimating Packages’ Multimodal Travel Time Distribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Graph-based Mixture Density Network, named GMDNet, which takes the benefits of both graph neural network and mixture density network for estimating MTTD conditioned on graph-structure data (i.e., the logistics network). |
Xiaowei Mao; Huaiyu Wan; Haomin Wen; Fan Wu; Jianbin Zheng; Yuting Qiang; Shengnan Guo; Lixia Wu; Haoyuan Hu; Youfang Lin; |
510 | Logic and Commonsense-Guided Temporal Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Besides, the previous TKG completion (TKGC) approaches cannot represent both the timeliness and the causality properties of events, simultaneously. To address these challenges, we propose a Logic and Commonsense-Guided Embedding model (LCGE) to jointly learn the time-sensitive representation involving timeliness and causality of events, together with the time-independent representation of events from the perspective of commonsense. |
Guanglin Niu; Bo Li; |
511 | Graph Structure Learning on User Mobility Data for Social Relationship Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Social Relationship Inference Network (SRINet), a novel Graph Neural Network (GNN) framework, to improve inference performance by learning to remove noisy data. |
Guangming Qin; Lexue Song; Yanwei Yu; Chao Huang; Wenzhe Jia; Yuan Cao; Junyu Dong; |
512 | Online Random Feature Forests for Learning in Varying Feature Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new online learning algorithm tailored for data streams described by varying feature spaces (VFS), wherein new features constantly emerge and old features may stop being observed over various time spans. |
Christian Schreckenberger; Yi He; Stefan Lüdtke; Christian Bartelt; Heiner Stuckenschmidt; |
513 | Scaling Law for Recommendation Models: Towards General-Purpose User Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we explore the possibility of general-purpose user representation learning by training a universal user encoder at large scales. |
Kyuyong Shin; Hanock Kwak; Su Young Kim; Max Nihlén Ramström; Jisu Jeong; Jung-Woo Ha; Kyung-Min Kim; |
514 | Cross-Domain Adaptative Learning for Online Advertisement Customer Lifetime Value Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, predicting LTV in real-world applications is not an easy task since the user consumption data is usually insufficient within a specific domain. To tackle this problem, we propose a novel cross-domain adaptative framework (CDAF) to leverage consumption data from different domains. |
Hongzu Su; Zhekai Du; Jingjing Li; Lei Zhu; Ke Lu; |
515 | Self-Supervised Interest Transfer Network Via Prototypical Contrastive Learning for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a cross-domain recommendation method: Self-supervised Interest Transfer Network (SITN), which can effectively transfer invariant knowledge between domains via prototypical contrastive learning. |
Guoqiang Sun; Yibin Shen; Sijin Zhou; Xiang Chen; Hongyan Liu; Chunming Wu; Chenyi Lei; Xianhui Wei; Fei Fang; |
516 | Opinion Optimization in Directed Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study a problem of opinion optimization based on the popular Friedkin-Johnsen (FJ) model for opinion dynamics in an unweighted directed social network with n nodes and m edges. |
Haoxin Sun; Zhongzhi Zhang; |
517 | Self-Supervised Continual Graph Learning in Adaptive Riemannian Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel self-supervised Riemannian Graph Continual Learner (RieGrace). |
Li Sun; Junda Ye; Hao Peng; Feiyang Wang; Philip S. Yu; |
518 | Self-Organization Preserved Graph Structure Learning with Principle of Relevant Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We proposed PRI-GSL, a Graph Structure Learning framework guided by the Principle of Relevant Information, providing a simple and unified framework for identifying the self-organization and revealing the hidden structure. |
Qingyun Sun; Jianxin Li; Beining Yang; Xingcheng Fu; Hao Peng; Philip S. Yu; |
519 | Efficient Embeddings of Logical Variables for Query Answering Over Incomplete Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This approach, however, can be computationally expensive during inference, and cannot deal with queries involving negation. In this paper, we propose a novel approach that addresses all of these limitations. |
Dingmin Wang; Yeyuan Chen; Bernardo Cuenca Grau; |
520 | Human-Instructed Deep Hierarchical Generative Learning for Automated Urban Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We rethink the urban planning generative task from a unique functionality perspective, where we summarize planning requirements into different functionality projections for better urban plan generation. To this end, we develop a three-stage generation process from a target area to zones to grids. |
Dongjie Wang; Lingfei Wu; Denghui Zhang; Jingbo Zhou; Leilei Sun; Yanjie Fu; |
521 | Easy Begun Is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose ST-Curriculum Dropout, a novel and easy-to-implement strategy for spatial-temporal graph modeling. |
Hongjun Wang; Jiyuan Chen; Tong Pan; Zipei Fan; Xuan Song; Renhe Jiang; Lingyu Zhang; Yi Xie; Zhongyi Wang; Boyuan Zhang; |
522 | Cross-Domain Graph Anomaly Detection Via Anomaly-Aware Contrastive Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we introduce a novel domain adaptation approach, namely Anomaly-aware Contrastive alignmenT (ACT), for GAD. |
Qizhou Wang; Guansong Pang; Mahsa Salehi; Wray Buntine; Christopher Leckie; |
523 | WSiP: Wave Superposition Inspired Pooling for Dynamic Interactions-Aware Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a wave superposition inspired social pooling (Wave-pooling for short) method for dynamically aggregating the high-order interactions from both local and global neighbor vehicles. |
Renzhi Wang; Senzhang Wang; Hao Yan; Xiang Wang; |
524 | Beyond Graph Convolutional Network: An Interpretable Regularizer-Centered Optimization Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, by revisiting the original GCN, we induce an interpretable regularizer-centered optimization framework, in which by building appropriate regularizers we can interpret most GCNs, such as APPNP, JKNet, DAGNN, and GNN-LF/HF. |
Shiping Wang; Zhihao Wu; Yuhong Chen; Yong Chen; |
525 | Augmenting Affective Dependency Graph Via Iterative Incongruity Graph Learning for Sarcasm Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Errors produced during the graph construction step cannot be remedied and may accrue to the following stages, resulting in poor performance. To surmount the above limitations, we explore a novel Iterative Augmenting Affective Graph and Dependency Graph (IAAD) framework to jointly and iteratively learn the incongruity graph structure. |
Xiaobao Wang; Yiqi Dong; Di Jin; Yawen Li; Longbiao Wang; Jianwu Dang; |
526 | Structure Aware Incremental Learning with Personalized Imitation Weights for Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel training strategy that adaptively learns personalized imitation weights for each user to balance the contribution from the recent data and the amount of knowledge to be distilled from previous time periods. |
Yuening Wang; Yingxue Zhang; Antonios Valkanas; Ruiming Tang; Chen Ma; Jianye Hao; Mark Coates; |
527 | Online Semi-supervised Learning with Mix-Typed Streaming Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our key idea to solve the new problem is to leverage a copula model to align the data instances with different feature spaces so as to make their distance measurable. |
Di Wu; Shengda Zhuo; Yu Wang; Zhong Chen; Yi He; |
528 | Few-Shot Composition Learning for Image Retrieval with Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of composition learning for image retrieval, for which we learn to retrieve target images with search queries in the form of a composition of a reference image and a modification text that describes desired modifications of the image. |
Junda Wu; Rui Wang; Handong Zhao; Ruiyi Zhang; Chaochao Lu; Shuai Li; Ricardo Henao; |
529 | ConTextual Masked Auto-Encoder for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes CoT-MAE (ConTextual Masked Auto-Encoder), a simple yet effective generative pre-training method for dense passage retrieval. |
Xing Wu; Guangyuan Ma; Meng Lin; Zijia Lin; Zhongyuan Wang; Songlin Hu; |
530 | Jointly Imputing Multi-View Data with Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a generative imputation model named Git with optimal transport theory to jointly impute the missing features/values, conditional on all observed values from the multi-view data. |
Yangyang Wu; Xiaoye Miao; Xinyu Huang; Jianwei Yin; |
531 | Knowledge Graph Embedding By Normalizing Flows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified perspective of embedding and introduce uncertainty into KGE from the view of group theory. |
Changyi Xiao; Xiangnan He; Yixin Cao; |
532 | Temporal Knowledge Graph Reasoning with Historical Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a new event forecasting model called Contrastive Event Network (CENET), based on a novel training framework of historical contrastive learning. |
Yi Xu; Junjie Ou; Hui Xu; Luoyi Fu; |
533 | SCI: A Spectrum Concentrated Implicit Neural Compression for Biomedical Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Further, we propose a Spectrum Concentrated Implicit neural compression (SCI) which adaptively partitions the complex biomedical data into blocks matching INR’s concentrated spectrum envelope, and design a funnel-shaped neural network capable of representing each block with a small number of parameters. Based on this design, we conduct compression via optimization under given budget and allocate the available parameters with high representation accuracy. |
Runzhao Yang; Tingxiong Xiao; Yuxiao Cheng; Qianni Cao; Jinyuan Qu; Jinli Suo; Qionghai Dai; |
534 | Unsupervised Legal Evidence Retrieval Via Contrastive Learning with Approximate Aggregated Positive Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To build a practical Legal AI application and free the judges from manual search work, we introduce the task of Legal Evidence Retrieval, which aims at automatically retrieving the precise fact-related verbal evidence within a single case. |
Feng Yao; Jingyuan Zhang; Yating Zhang; Xiaozhong Liu; Changlong Sun; Yun Liu; Weixing Shen; |
535 | One-for-All: Proposal Masked Cross-Class Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One of the main challenges for anomaly detection (AD) is how to learn one unified and generalizable model to adapt to multi-class, especially cross-class, settings: the model is trained with normal samples from seen classes with the objective to detect anomalies from both seen and unseen classes. In this work, we propose a novel Proposal Masked Anomaly Detection (PMAD) approach for such challenging multi- and cross-class anomaly detection. |
Xincheng Yao; Chongyang Zhang; Ruoqi Li; Jun Sun; Zhenyu Liu; |
536 | Analogical Inference Enhanced Knowledge Graph Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, knowledge graphs often contain incomplete triples that are difficult to inductively infer by KGEs. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework AnKGE to enhance KGE models with analogical inference capability. |
Zhen Yao; Wen Zhang; Mingyang Chen; Yufeng Huang; Yi Yang; Huajun Chen; |
537 | A Noise-Tolerant Differentiable Learning Approach for Single Occurrence Regular Expression with Interleaving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most of the previous studies only learn restricted SOIREs and are not robust on noisy data. To tackle these issues, we propose a noise-tolerant differentiable learning approach SOIREDL for SOIRE. |
Rongzhen Ye; Tianqu Zhuang; Hai Wan; Jianfeng Du; Weilin Luo; Pingjia Liang; |
538 | Learning from The Wisdom of Crowds: Exploiting Similar Sessions for Session Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel Similar Session-enhanced Ranking (SSR) model to improve the session search performance using historical sessions with similar intents. |
Yuhang Ye; Zhonghua Li; Zhicheng Dou; Yutao Zhu; Changwang Zhang; Shangquan Wu; Zhao Cao; |
539 | Next POI Recommendation with Dynamic Graph and Explicit Dependency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose the Sequence-based Neighbour search and Prediction Model (SNPM) for next POI recommendation. |
Feiyu Yin; Yong Liu; Zhiqi Shen; Lisi Chen; Shuo Shang; Peng Han; |
540 | Predicting Temporal Sets with Simplified Fully Connected Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a succinct architecture that is solely built on the Simplified Fully Connected Networks (SFCNs) for temporal sets prediction to bring both effectiveness and efficiency together. |
Le Yu; Zihang Liu; Tongyu Zhu; Leilei Sun; Bowen Du; Weifeng Lv; |
541 | Learning to Count Isomorphisms with Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, expecting a fixed representation of the input graph to match diversely structured query graphs is unrealistic. In this paper, we propose a novel GNN called Count-GNN for subgraph isomorphism counting, to deal with the above challenges. |
Xingtong Yu; Zemin Liu; Yuan Fang; Xinming Zhang; |
542 | Untargeted Attack Against Federated Recommendation Systems Via Poisonous Item Embeddings and The Defense Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing untargeted attack methods are either inapplicable or ineffective against FedRec systems. In this paper, we delve into the untargeted attack and its defense for FedRec systems. |
Yang Yu; Qi Liu; Likang Wu; Runlong Yu; Sanshi Lei Yu; Zaixi Zhang; |
543 | Practical Cross-System Shilling Attacks with Limited Access to Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the properties a practical shilling attack method should have and propose a new concept of Cross-system Attack. |
Meifang Zeng; Ke Li; Bingchuan Jiang; Liujuan Cao; Hui Li; |
544 | Query-Aware Quantization for Maximum Inner Product Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a quantization method based on the distribution of queries combined with sampled softmax. |
Jin Zhang; Defu Lian; Haodi Zhang; Baoyun Wang; Enhong Chen; |
545 | TOT:Topology-Aware Optimal Transport for Multimodal Hate Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The leveraged cross-modal attention mechanisms also suffer from the distributional modality gap and lack logical interpretability. To address these semantic gap issues, we propose TOT: a topology-aware optimal transport framework to decipher the implicit harm in memes scenario, which formulates the cross-modal aligning problem as solutions for optimal transportation plans. |
Linhao Zhang; Li Jin; Xian Sun; Guangluan Xu; Zequn Zhang; Xiaoyu Li; Nayu Liu; Qing Liu; Shiyao Yan; |
546 | Cross-Domain Few-Shot Graph Classification with A Reinforced Task Coordinator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The combat with the domain shift issue is hindered due to the coarse utilization of source domains and the ignorance of accessible prompts. To address these challenges, in this paper, we design a novel Cross-domain Task Coordinator to leverage a small set of labeled target domain data as prompt tasks, then model the association and discover the relevance between meta-tasks from the source domain and the prompt tasks. |
Qiannan Zhang; Shichao Pei; Qiang Yang; Chuxu Zhang; Nitesh V. Chawla; Xiangliang Zhang; |
547 | AutoSTL: Automated Spatio-Temporal Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To cope with the problems above, we propose an Automated Spatio-Temporal multi-task Learning (AutoSTL) method to handle multiple spatio-temporal tasks jointly. |
Zijian Zhang; Xiangyu Zhao; Hao Miao; Chunxu Zhang; Hongwei Zhao; Junbo Zhang; |
548 | Fair Representation Learning for Recommendation: A Mutual Information Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we re-define recommendation fairness with a novel two-fold mutual information objective. |
Chen Zhao; Le Wu; Pengyang Shao; Kun Zhang; Richang Hong; Meng Wang; |
549 | Deep Graph Structural Infomax Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an effective model called Deep Graph Structural Infomax (DGSI) to learn node representation. |
Wenting Zhao; Gongping Xu; Zhen Cui; Siqiang Luo; Cheng Long; Tong Zhang; |
550 | Causal Conditional Hidden Markov Model for Multimodal Traffic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the physical concepts affecting the generation of multimode traffic flow from the perspective of the observation generation principle and propose a Causal Conditional Hidden Markov Model (CCHMM) to predict multimodal traffic flow. |
Yu Zhao; Pan Deng; Junting Liu; Xiaofeng Jia; Mulan Wang; |
551 | ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware) that are cheaper to obtain for anomaly detection. |
Yue Zhao; Guoqing Zheng; Subhabrata Mukherjee; Robert McCann; Ahmed Awadallah; |
552 | A Provable Framework of Learning Graph Embeddings Via Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose GELSUMM, a well-formulated graph embedding learning framework based on graph summarization, in which we show the theoretical ground of learning from summary graphs and the restoration with the three well-known graph embedding approaches in a closed form. Through extensive experiments on real-world datasets, we demonstrate that our methods can learn graph embeddings with matching or better performance on downstream tasks. This work provides theoretical analysis for learning node embeddings via summarization and helps explain and understand the mechanism of the existing works. |
Houquan Zhou; Shenghua Liu; Danai Koutra; Huawei Shen; Xueqi Cheng; |
553 | GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To resolve the problem, in this paper we seek to automatically augment the minority classes from the massive unlabelled nodes of the graph. |
Mengting Zhou; Zhiguo Gong; |
554 | Detecting Multivariate Time Series Anomalies with Zero Known Label Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MTGFlow, an unsupervised anomaly detection approach for Multivariate Time series anomaly detection via dynamic Graph and entity-aware normalizing Flow, leaning only on a widely accepted hypothesis that abnormal instances exhibit sparser densities than normal ones. |
Qihang Zhou; Jiming Chen; Haoyu Liu; Shibo He; Wenchao Meng; |
555 | GRLSTM: Trajectory Similarity Computation with Graph-Based Residual LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods have been designed primarily for trajectories in Euclidean space, which overlooks the fact that real-world trajectories are often generated on road networks. This paper addresses this gap by proposing a novel framework, called GRLSTM (Graph-based Residual LSTM). |
Silin Zhou; Jing Li; Hao Wang; Shuo Shang; Peng Han; |
556 | Heterogeneous Region Embedding with Prompt Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel framework, HREP (Heterogeneous Region Embedding with Prompt learning), which addresses both intra-region and inter-region correlations through two key modules: Heterogeneous Region Embedding (HRE) and prompt learning for different downstream tasks. |
Silin Zhou; Dan He; Lisi Chen; Shuo Shang; Peng Han; |
557 | Show Me The Way! Bilevel Search for Synthesizing Programmatic Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we introduce a bilevel search algorithm that searches concurrently in the space of programs and in a space of state features. |
David S. Aleixo; Levi H.S. Lelis; |
558 | Anytime User Engagement Prediction in Information Cascades for Arbitrary Observation Periods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on split population multi-variate survival processes, we develop a discriminative approach that, unlike prior works, leads to a single model for predicting whether individual users of an information network will engage a given cascade for arbitrary forecast horizons and observation periods. |
Akshay Aravamudan; Xi Zhang; Georgios C. Anagnostopoulos; |
559 | Principled Data-Driven Decision Support for Cyber-Forensic Investigations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this data-driven approach, called DISCLOSE, is based on a heuristic that utilizes only a subset of the available information and does not approximate optimal decisions. To improve upon this heuristic, we introduce a principled approach for data-driven decision support for cyber-forensic investigations. |
Soodeh Atefi; Sakshyam Panda; Emmanouil Panaousis; Aron Laszka; |
560 | BETA-CD: A Bayesian Meta-Learned Cognitive Diagnosis Framework for Personalized Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general Bayesian mETA-learned Cognitive Diagnosis framework (BETA-CD), which addresses the two challenges by prior knowledge exploitation and model uncertainty quantification, respectively. |
Haoyang Bi; Enhong Chen; Weidong He; Han Wu; Weihao Zhao; Shijin Wang; Jinze Wu; |
561 | Set-to-Sequence Ranking-Based Concept-Aware Learning Path Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on developing path recommendation systems that aim to generate and recommend an entire learning path to the given user in each session. |
Xianyu Chen; Jian Shen; Wei Xia; Jiarui Jin; Yakun Song; Weinan Zhang; Weiwen Liu; Menghui Zhu; Ruiming Tang; Kai Dong; Dingyin Xia; Yong Yu; |
562 | Unsupervised Deep Embedded Fusion Representation of Single-Cell Transcriptomics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a Single-Cell Deep Embedding Fusion Representation (scDEFR) model, which develops a deep embedded fusion representation to learn a fused heterogeneous latent embedding that contains both the transcriptome gene-level information and the cell topology information. |
Yue Cheng; Yanchi Su; Zhuohan Yu; Yanchun Liang; Ka-Chun Wong; Xiangtao Li; |
563 | Constrained Submodular Optimization for Vaccine Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, the genetic variability of the human immune system makes it difficult to design peptide vaccines that provide widespread immunity in vaccinated populations. We introduce a framework for evaluating and designing peptide vaccines that uses probabilistic machine learning models, and demonstrate its ability to produce designs for a SARS-CoV-2 vaccine that outperform previous designs. |
Zheng Dai; David K. Gifford; |
564 | Flow-Based Robust Watermarking with Invertible Noise Layer for Black-Box Distortions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, one potential drawback of such a framework is that the encoder and the decoder may not be well coupled, resulting in the fact that the encoder may embed some redundant features into the host image thus influencing the invisibility and robustness of the whole algorithm. To address this limitation, this paper proposes a flow-based robust watermarking framework. |
Han Fang; Yupeng Qiu; Kejiang Chen; Jiyi Zhang; Weiming Zhang; Ee-Chien Chang; |
565 | Identifying and Eliminating Majority Illusion in Social Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From a system engineering point of view, this motivates the search for algorithms to detect and, where possible, correct this undesirable phenomenon. In this paper we initiate the computational study of majority illusion in social networks, providing NP-hardness and parametrised complexity results for its occurrence and elimination. |
Umberto Grandi; Lawqueen Kanesh; Grzegorz Lisowski; Ramanujan Sridharan; Paolo Turrini; |
566 | A Domain-Knowledge-Inspired Music Embedding Space and A Novel Attention Mechanism for Symbolic Music Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Fundamental Music Embedding (FME) for symbolic music based on a bias-adjusted sinusoidal encoding within which both the absolute and the relative attributes can be embedded and the fundamental musical properties (e.g., translational invariance) are explicitly preserved. |
Zixun Guo; Jaeyong Kang; Dorien Herremans; |
567 | MSDC: Exploiting Multi-State Power Consumption in Non-intrusive Load Monitoring Based on A Dual-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging recent progress on deep learning techniques, we design a new neural NILM model, Multi-State Dual CNN (MSDC). |
Jialing He; Jiamou Liu; Zijian Zhang; Yang Chen; Yiwei Liu; Bakh Khoussainov; Liehuang Zhu; |
568 | Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new setting, optimize-and-estimate structured bandits. |
Peter Henderson; Ben Chugg; Brandon Anderson; Kristen Altenburger; Alex Turk; John Guyton; Jacob Goldin; Daniel E. Ho; |
569 | MGTCF: Multi-Generator Tropical Cyclone Forecasting with Heterogeneous Meteorological Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing methods lack a generic framework for adapting heterogeneous meteorological data and do not focus on the importance of the environment. Therefore, we propose a Multi-Generator Tropical Cyclone Forecasting model (MGTCF), a generic, extensible, multi-modal TC prediction model with the key modules of Generator Chooser Network (GC-Net) and Environment Net (Env-Net). |
Cheng Huang; Cong Bai; Sixian Chan; Jinglin Zhang; YuQuan Wu; |
570 | MDM: Molecular Diffusion Model for 3D Molecule Generation Related Papers Related Patents Rel |