Paper Digest: AAAI 2022 Highlights
The AAAI Conference on Artificial Intelligence (AAAI) is one of the top artificial intelligence conferences in the world. In 2022, it is to be held in Washington DC.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: AAAI 2022 Highlights
# | Paper | Author(s) |
---|---|---|
1 | Pinpointing Fine-Grained Relationships Between Hateful Tweets and Replies Highlight: Recent studies in the hate and counter hate domain have provided the grounds for investigating how to detect this pervasive content in social media. |
Abdullah Albanyan; Eduardo Blanco; |
2 | Cross-Modal Coherence for Text-to-Image Retrieval Highlight: In this paper, we train a Cross-Modal Coherence Model for text-to-image retrieval task. |
Malihe Alikhani; Fangda Han; Hareesh Ravi; Mubbasir Kapadia; Vladimir Pavlovic; Matthew Stone; |
3 | Enhanced Story Comprehension for Large Language Models Through Dynamic Document-Based Knowledge Graphs Highlight: In order to mitigate the document length limitations that come with finite context windows, we introduce a novel architecture that augments story processing with an external dynamic knowledge graph. |
Berkeley R Andrus; Yeganeh Nasiri; Shilong Cui; Benjamin Cullen; Nancy Fulda; |
4 | Diagnostics-Guided Explanation Generation Highlight: Other diagnostic properties are Data Consistency, which measures how similar explanations are for similar input instances, and Confidence Indication, which shows whether the explanation reflects the confidence of the model. In this work, we show how to directly optimise for these diagnostic properties when training a model to generate sentence-level explanations, which markedly improves explanation quality, agreement with human rationales, and downstream task performance on three complex reasoning tasks. |
Pepa Atanasova; Jakob Grue Simonsen; Christina Lioma; Isabelle Augenstein; |
5 | Mitigating Reporting Bias in Semi-supervised Temporal Commonsense Inference with Probabilistic Soft Logic Highlight: In this paper, we propose a novel neural-logic based Soft Logic Enhanced Event Temporal Reasoning (SLEER) model for acquiring unbiased TCS knowledge, in which the complementary relationship among dimensions are explicitly represented as logic rules and modeled by t-norm fuzzy logics. |
Bibo Cai; Xiao Ding; Bowen Chen; Li Du; Ting Liu; |
6 | Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation Highlight: In this work, we propose a novel feature-level adversarial training method named FLAT. |
Hanjie Chen; Yangfeng Ji; |
7 | Unsupervised Editing for Counterfactual Stories Highlight: In this paper, we propose EDUCAT, an editing-based unsupervised approach for counterfactual story rewriting. |
Jiangjie Chen; Chun Gan; Sijie Cheng; Hao Zhou; Yanghua Xiao; Lei Li; |
8 | LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification Highlight: In this paper, we propose LOREN, an approach for interpretable fact verification. |
Jiangjie Chen; Qiaoben Bao; Changzhi Sun; Xinbo Zhang; Jiaze Chen; Hao Zhou; Yanghua Xiao; Lei Li; |
9 | ContrastNet: A Contrastive Learning Framework for Few-Shot Text Classification Highlight: In this work, we propose a contrastive learning framework named ContrastNet to tackle both discriminative representation and overfitting problems in few-shot text classification. |
Junfan Chen; Richong Zhang; Yongyi Mao; Jie Xu; |
10 | From Good to Best: Two-Stage Training for Cross-Lingual Machine Reading Comprehension Highlight: However, without sufficient training data, it is not powerful enough to capture the nuances between the accurate answer and those approximate ones. Based on this observation, we develop a two-stage approach to enhance the model performance. |
Nuo Chen; Linjun Shou; Ming Gong; Jian Pei; |
11 | Probing Linguistic Information for Logical Inference in Pre-trained Language Models Highlight: We propose a methodology for probing knowledge for inference that logical systems require but often lack in pre-trained language model representations. |
Zeming Chen; Qiyue Gao; |
12 | On The Transferability of Pre-trained Language Models: A Study from Artificial Datasets Highlight: In this work, we study what specific traits in the pre-training data, other than the semantics, make a pre-trained LM superior to their counterparts trained from scratch on downstream tasks. |
Cheng-Han Chiang; Hung-yi Lee; |
13 | C2L: Causally Contrastive Learning for Robust Text Classification Highlight: We thus aim to leverage contrastive learning and counterfactual augmentation for robustness. |
Seungtaek Choi; Myeongho Jeong; Hojae Han; Seung-won Hwang; |
14 | Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning Highlight: In this paper, we concentrate on two contributions to the task: (1) we propose Retrieval Augmented Prompt Tuning (RAPT) as a parameter-efficient method to adapt large pre-trained language models for paraphrase generation; (2) we propose Novelty Conditioned RAPT (NC-RAPT) as a simple model-agnostic method of using specialized prompt tokens for controlled paraphrase generation with varying levels of lexical novelty. |
Jishnu Ray Chowdhury; Yong Zhuang; Shuyi Wang; |
15 | Flexible Instance-Specific Rationalization of NLP Models Highlight: However, previous research has shown that there is no clear best scoring method across various text classification tasks while practitioners typically have to make several other ad-hoc choices regarding the length and the type of the rationale (e.g. short or long, contiguous or not). Inspired by this, we propose a simple yet effective and flexible method that allows selecting optimally for each data instance: (1) a feature scoring method; (2) the length; and (3) the type of the rationale. |
George Chrysostomou; Nikolaos Aletras; |
16 | InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation Highlight: In this paper, we introduce InfoLM a family of untrained metrics that can be viewed as a string-based metric that addresses the aforementioned flaws thanks to a pre-trained masked language model. |
Pierre Jean A. Colombo; Chloé Clavel; Pablo Piantanida; |
17 | Nice Perfume. How Long Did You Marinate in It? Multimodal Sarcasm Explanation Highlight: In this paper, we propose a novel problem, Multimodal Sarcasm Explanation (MuSE): given a multimodal sarcastic post containing an image and a caption, we aim to generate a natural language explanation to reveal the intended sarcasm. |
Poorav Desai; Tanmoy Chakraborty; Md Shad Akhtar; |
18 | Zero-Shot Commonsense Question Answering with Cloze Translation and Consistency Optimization Highlight: In this paper, we instead focus on better utilizing the implicit knowledge stored in pre-trained language models. |
Zi-Yi Dou; Nanyun Peng; |
19 | Synthetic Disinformation Attacks on Automated Fact Verification Systems Highlight: Furthermore, the development of modern NLP tools that can produce coherent, fabricated content would allow malicious actors to systematically generate adversarial disinformation for fact-checkers. In this work, we explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings: ADVERSARIAL ADDITION, where we fabricate documents and add them to the evidence repository available to the fact-checking system, and ADVERSARIAL MODIFICATION, where existing evidence source documents in the repository are automatically altered. |
Yibing Du; Antoine Bosselut; Christopher D. Manning; |
20 | Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement Highlight: However, this process only involves two-tuple data at each stage, and this loose coupling fails to fully exploit the association between triplet data. In this paper, we attempt to model the joint probability of transcription and translation based on the speech input to directly leverage such triplet data. |
Yichao Du; Zhirui Zhang; Weizhi Wang; Boxing Chen; Jun Xie; Tong Xu; |
21 | Play The Shannon Game with Language Models: A Human-Free Approach to Summary Evaluation Highlight: The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained language model to estimate the information content shared between a document and its summary. |
Nicholas Egan; Oleg Vasilyev; John Bohannon; |
22 | Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis Highlight: Here, we propose to leverage sentiment-carrying discourse markers to generate large-scale weakly-labeled data, which in turn can be used to adapt language models for sentiment analysis. |
Liat Ein-Dor; Ilya Shnayderman; Artem Spector; Lena Dankin; Ranit Aharonov; Noam Slonim; |
23 | Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models Highlight: We investigate the use of multimodal information contained in images as an effective method for enhancing the commonsense of Transformer models for text generation. |
Steven Y. Feng; Kevin Lu; Zhuofu Tao; Malihe Alikhani; Teruko Mitamura; Eduard Hovy; Varun Gangal; |
24 | Language Model Priming for Cross-Lingual Event Extraction Highlight: We present a novel, language-agnostic approach to "priming" language models for the task of event extraction, providing particularly effective performance in low-resource and zero-shot cross-lingual settings. |
Steven Fincke; Shantanu Agarwal; Scott Miller; Elizabeth Boschee; |
25 | Language Modelling Via Learning to Rank Highlight: We consider language modelling (LM) as a multi-label structured prediction task by re-framing training from solely predicting a single ground-truth word to ranking a set of words which could continue a given context. |
Arvid Frydenlund; Gagandeep Singh; Frank Rudzicz; |
26 | NAREOR: The Narrative Reordering Problem Highlight: Reordering a narrative can impact the temporal, causal, event-based, and other inferences readers draw from it, which in turn can have strong effects both on its interpretation and interestingness. In this paper, we propose and investigate the task of Narrative Reordering (NAREOR) which involves rewriting a given story in a different narrative order while preserving its plot. |
Varun Gangal; Steven Y. Feng; Malihe Alikhani; Teruko Mitamura; Eduard Hovy; |
27 | UNISON: Unpaired Cross-Lingual Image Captioning Highlight: In this work, we present a novel unpaired cross-lingual method to generate image captions without relying on any caption corpus in the source or the target language. |
Jiahui Gao; Yi Zhou; Philip L. H. Yu; Shafiq Joty; Jiuxiang Gu; |
28 | AutoBERT-Zero: Evolving BERT Backbone from Scratch Highlight: In this work, we make the first attempt to automatically discover novel pre-trained language model (PLM) backbone on a flexible search space containing the most fundamental operations from scratch. |
Jiahui Gao; Hang Xu; Han Shi; Xiaozhe Ren; Philip L. H. Yu; Xiaodan Liang; Xin Jiang; Zhenguo Li; |
29 | ISEEQ: Information Seeking Question Generation Using Dynamic Meta-Information Retrieval and Knowledge Graphs Highlight: A key open sub-problem in CIS that remains unaddressed in the literature is generating Information Seeking Questions (ISQs) based on a short initial query from the end-user. To address this open problem, we propose Information SEEking Question generator (ISEEQ), a novel approach for generating ISQs from just a short user query, given a large text corpus relevant to the user query. |
Manas Gaur; Kalpa Gunaratna; Vijay Srinivasan; Hongxia Jin; |
30 | Explainable Metaphor Identification Inspired By Conceptual Metaphor Theory Highlight: In this work, we propose the first explainable metaphor identification model, inspired by Conceptual Metaphor Theory. |
Mengshi Ge; Rui Mao; Erik Cambria; |
31 | Confidence Calibration for Intent Detection Via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss Highlight: Unfortunately, mainstream neural networks are poorly calibrated, with a large gap between accuracy and confidence. To handle this problem defined as confidence calibration, we propose a model using the hyperspherical space and rebalanced accuracy-uncertainty loss. |
Yantao Gong; Cao Liu; Fan Yang; Xunliang Cai; Guanglu Wan; Jiansong Chen; Weipeng Zhang; Houfeng Wang; |
32 | SSAST: Self-Supervised Audio Spectrogram Transformer Highlight: This paper focuses on audio and speech classification, and aims to reduce the need for large amounts of labeled data for the AST by leveraging self-supervised learning using unlabeled data. |
Yuan Gong; Cheng-I Lai; Yu-An Chung; James Glass; |
33 | Block-Skim: Efficient Question Answering for Transformer Highlight: However, different from other tasks such as sequence classification, answering the raised question does not necessarily need all the tokens in the context paragraph. Following this motivation, we propose Block-skim, which learns to skim unnecessary context in higher hidden layers to improve and accelerate the Transformer performance. |
Yue Guan; Zhengyi Li; Zhouhan Lin; Yuhao Zhu; Jingwen Leng; Minyi Guo; |
34 | Deep Clustering of Text Representations for Supervision-Free Probing of Syntax Highlight: We consider two notions of syntax: Part of Speech Induction (POSI) and Constituency Labelling (CoLab) in this work. |
Vikram Gupta; Haoyue Shi; Kevin Gimpel; Mrinmaya Sachan; |
35 | Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-training Highlight: In this paper, we present the most comprehensive study of cross-lingual stance detection to date: we experiment with 15 diverse datasets in 12 languages from 6 language families, and with 6 low-resource evaluation settings each. |
Momchil Hardalov; Arnav Arora; Preslav Nakov; Isabelle Augenstein; |
36 | Attention Biasing and Context Augmentation for Zero-Shot Control of Encoder-Decoder Transformers for Natural Language Generation Highlight: In this work, we propose novel approaches for controlling encoder-decoder transformer-based NLG models in zero shot. |
Devamanyu Hazarika; Mahdi Namazifar; Dilek Hakkani-Tür; |
37 | GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-supervised Learning and Explicit Policy Injection Highlight: In this paper, we propose GALAXY, a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora via semi-supervised learning. |
Wanwei He; Yinpei Dai; Yinhe Zheng; Yuchuan Wu; Zheng Cao; Dermot Liu; Peng Jiang; Min Yang; Fei Huang; Luo Si; Jian Sun; Yongbin Li; |
38 | Protecting Intellectual Property of Language Generation APIs with Lexical Watermark Highlight: This work targets at protecting IP of NLG APIs by identifying the attackers who have utilized watermarked responses from the victim NLG APIs. |
Xuanli He; Qiongkai Xu; Lingjuan Lyu; Fangzhao Wu; Chenguang Wang; |
39 | BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents Highlight: Specifically, we propose a pre-trained language model, named BROS (BERT Relying On Spatiality), that encodes relative positions of texts in 2D space and learns from unlabeled documents with area-masking strategy. |
Teakgyu Hong; DongHyun Kim; Mingi Ji; Wonseok Hwang; Daehyun Nam; Sungrae Park; |
40 | Non-autoregressive Translation with Layer-Wise Prediction and Deep Supervision Highlight: In this work, we propose DSLP, a highly efficient and high-performance model for machine translation. |
Chenyang Huang; Hao Zhou; Osmar R. Zaïane; Lili Mou; Lei Li; |
41 | Word Level Robustness Enhancement: Fight Perturbation with Perturbation Highlight: In this paper, we design a robustness enhancement method to defend against word substitution perturbation, whose basic idea is to fight perturbation with perturbation. |
Pei Huang; Yuting Yang; Fuqi Jia; Minghao Liu; Feifei Ma; Jian Zhang; |
42 | Predicting Above-Sentence Discourse Structure Using Distant Supervision from Topic Segmentation Highlight: To overcome the data sparsity issue, distantly supervised approaches from tasks like sentiment analysis and summarization have been recently proposed. |
Patrick Huber; Linzi Xing; Giuseppe Carenini; |
43 | Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge Highlight: However, existing conversational agents and datasets do not consider such comprehensive information, and thus they have a limitation in generating the utterances where the knowledge and persona are fused properly. To address this issue, we introduce a call For Customized conversation (FoCus) dataset where the customized answers are built with the user’s persona and Wikipedia knowledge. |
Yoonna Jang; Jungwoo Lim; Yuna Hur; Dongsuk Oh; Suhyune Son; Yeonsoo Lee; Donghoon Shin; Seungryong Kim; Heuiseok Lim; |
44 | Towards Building ASR Systems for The Next Billion Users Highlight: In this work, we make multiple contributions towards building ASR systems for low resource languages from the Indian subcontinent. |
Tahir Javed; Sumanth Doddapaneni; Abhigyan Raman; Kaushal Santosh Bhogale; Gowtham Ramesh; Anoop Kunchukuttan; Pratyush Kumar; Mitesh M. Khapra; |
45 | Span-Based Semantic Role Labeling with Argument Pruning and Second-Order Inference Highlight: In this paper, we propose a framework consisting of two networks: a predicate-agnostic argument pruning network that reduces the number of candidate arguments to O(n), and a semantic role labeling network with an optional second-order decoder that is unfolded from an approximate inference algorithm. |
Zixia Jia; Zhaohui Yan; Haoyi Wu; Kewei Tu; |
46 | Incorporating Constituent Syntax for Coreference Resolution Highlight: In this work, we propose a simple yet effective graph-based method to incorporate constituent syntactic structures. |
Fan Jiang; Trevor Cohn; |
47 | XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge Highlight: In this paper, we propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training. |
Xiaoze Jiang; Yaobo Liang; Weizhu Chen; Nan Duan; |
48 | Hierarchical Context Tagging for Utterance Rewriting Highlight: This can occur in languages like English that introduce tokens such as prepositions into the rewrite for grammaticality. We propose a hierarchical context tagger (HCT) that mitigates this issue by predicting slotted rules (e.g., "besides _") whose slots are later filled with context spans. |
Lisa Jin; Linfeng Song; Lifeng Jin; Dong Yu; Daniel Gildea; |
49 | Search and Learn: Improving Semantic Coverage for Data-to-Text Generation Highlight: In other words, important input slots tend to be missing in the generated text. To this end, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve the semantic coverage. |
Shailza Jolly; Zi Xuan Zhang; Andreas Dengel; Lili Mou; |
50 | Braid: Weaving Symbolic and Neural Knowledge Into Coherent Logical Explanations Highlight: In this paper, we describe the reasoning algorithms used in Braid, and their implementation in a distributed task-based framework that builds proof/explanation graphs for an input query. |
Aditya Kalyanpur; Tom Breloff; David A Ferrucci; |
51 | Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data Highlight: In this paper, we investigate whether it is possible to pre-train an audio-text multimodal model with extremely low-resource parallel data and extra non-parallel unimodal data. |
Yu Kang; Tianqiao Liu; Hang Li; Yang Hao; Wenbiao Ding; |
52 | Bridging The Gap: Using Deep Acoustic Representations to Learn Grounded Language from Percepts and Raw Speech Highlight: In this work, we demonstrate the feasibility of performing grounded language acquisition on paired visual percepts and raw speech inputs. |
Gaoussou Youssouf Kebe; Luke E. Richards; Edward Raff; Francis Ferraro; Cynthia Matuszek; |
53 | ALP: Data Augmentation Using Lexicalized PCFGs for Few-Shot Text Classification Highlight: As a result, they fail to generate samples with plausible and diverse sentence structures. Motivated by this, we present the data Augmentation using Lexicalized Probabilistic context-free grammars (ALP) that generates augmented samples with diverse syntactic structures with plausible grammar. |
Hazel H. Kim; Daecheol Woo; Seong Joon Oh; Jeong-Won Cha; Yo-Sub Han; |
54 | CAISE: Conversational Agent for Image Search and Editing Highlight: Thus, we propose a dataset of an automated Conversational Agent for Image Search and Editing (CAISE). |
Hyounghun Kim; Doo Soon Kim; Seunghyun Yoon; Franck Dernoncourt; Trung Bui; Mohit Bansal; |
55 | Dual Task Framework for Improving Persona-Grounded Dialogue Dataset Highlight: This paper introduces a simple yet effective data-centric approach for the task of improving persona-conditioned dialogue agents. |
Minju Kim; Beong-woo Kwak; Youngwook Kim; Hong-in Lee; Seung-won Hwang; Jinyoung Yeo; |
56 | Minimally-Supervised Joint Learning of Event Volitionality and Subject Animacy Classification Highlight: This paper proposes a novel method that jointly learns volitionality and subject animacy at a low cost, heuristically labeling events in a raw corpus. |
Hirokazu Kiyomaru; Sadao Kurohashi; |
57 | From Fully Trained to Fully Random Embeddings: Improving Neural Machine Translation with Compact Word Embedding Tables Highlight: Embedding matrices are key components in neural natural language processing (NLP) models that are responsible to provide numerical representations of input tokens (i.e. words or subwords). In this paper, we analyze the impact and utility of such matrices in the context of neural machine translation (NMT). |
Krtin Kumar; Peyman Passban; Mehdi Rezagholizadeh; Yiusing Lau; Qun Liu; |
58 | SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems Highlight: The Schema-Guided Dialogue (SGD) dataset introduced a paradigm for enabling models to support any service in zero-shot through schemas, which describe service APIs to models in natural language. |
Harrison Lee; Raghav Gupta; Abhinav Rastogi; Yuan Cao; Bin Zhang; Yonghui Wu; |
59 | Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction Highlight: Intuitively, a model that produces helpful explanations should be more robust against adversarial attacks, because we cannot trust the model that outputs explanations but changes its prediction under small perturbations. To this end, we propose a joint classification and rationale extraction model named AT-BMC. |
Dongfang Li; Baotian Hu; Qingcai Chen; Tujie Xu; Jingcong Tao; Yunan Zhang; |
60 | Text Revision By On-the-Fly Representation Optimization Highlight: In this paper, we present an iterative in-place editing approach for text revision, which requires no parallel data. |
Jingjing Li; Zichao Li; Tao Ge; Irwin King; Michael R. Lyu; |
61 | Unified Named Entity Recognition As Word-Word Relation Classification Highlight: In this work, we present a novel alternative by modeling the unified NER as word-word relation classification, namely W^2NER. |
Jingye Li; Hao Fei; Jiang Liu; Shengqiong Wu; Meishan Zhang; Chong Teng; Donghong Ji; Fei Li; |
62 | Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation Highlight: In this paper, we combine the pros and alleviate the cons of both models by proposing a novel Sequence-to-Action (S2A) module. |
Jiquan Li; Junliang Guo; Yongxin Zhu; Xin Sheng; Deqiang Jiang; Bo Ren; Linli Xu; |
63 | Dynamic Key-Value Memory Enhanced Multi-Step Graph Reasoning for Knowledge-Based Visual Question Answering Highlight: In this paper, we propose a novel model named dynamic knowledge memory enhanced multi-step graph reasoning (DMMGR), which performs explicit and implicit reasoning over a key-value knowledge memory module and a spatial-aware image graph, respectively. |
Mingxiao Li; Marie-Francine Moens; |
64 | Knowledge Bridging for Empathetic Dialogue Generation Highlight: Lack of external knowledge makes empathetic dialogue systems difficult to perceive implicit emotions and learn emotional interactions from limited dialogue history. To address the above problems, we propose to leverage external knowledge, including commonsense knowledge and emotional lexical knowledge, to explicitly understand and express emotions in empathetic dialogue generation. |
Qintong Li; Piji Li; Zhaochun Ren; Pengjie Ren; Zhumin Chen; |
65 | Contrast and Generation Make BART A Good Dialogue Emotion Recognizer Highlight: Meanwhile, we utilize an auxiliary response generation task to enhance the model’s ability of handling context information, thereby forcing the model to recognize emotions with similar semantics in diverse contexts. To achieve these objectives, we use the pre-trained encoder-decoder model BART as our backbone model since it is very suitable for both understanding and generation tasks. |
Shimin Li; Hang Yan; Xipeng Qiu; |
66 | A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues Highlight: While quality labeled dialogue data requires human annotation and is usually expensive to obtain, unlabeled data is easier to collect from various sources. In this paper, we propose a novel semi-supervised teacher-student learning framework to tackle this task. |
Qian Lin; Hwee Tou Ng; |
67 | DiffSinger: Singing Voice Synthesis Via Shallow Diffusion Mechanism Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. |
Jinglin Liu; Chengxi Li; Yi Ren; Feiyang Chen; Zhou Zhao; |
68 | KGR4: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, inspired by the process of humans creating sentences, we propose a novel Knowledge-enhanced Commonsense Generation framework, termed KGR4, consisting of four stages: Retrieval, Retrospect, Refine, Rethink. |
Xin Liu; Dayiheng Liu; Baosong Yang; Haibo Zhang; Junwei Ding; Wenqing Yao; Weihua Luo; Haiying Zhang; Jinsong Su; |
69 | Improving Biomedical Information Retrieval with Neural Retrievers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although neural retrievers have surpassed traditional IR approaches such as TF-IDF and BM25 in standard open-domain question answering tasks, they are still found lacking in the biomedical domain. In this paper, we seek to improve information retrieval (IR) using neural retrievers (NR) in the biomedical domain, and achieve this goal using a three-pronged approach. |
Man Luo; Arindam Mitra; Tejas Gokhale; Chitta Baral; |
70 | The King Is Naked: On The Notion of Robustness for Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we argue for semantic robustness, which is better aligned with the human concept of linguistic fidelity. |
Emanuele La Malfa; Marta Kwiatkowska; |
71 | Selecting Optimal Context Sentences for Event-Event Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce a novel method to better model document-level context with important context sentences for event-event relation extraction. |
Hieu Man; Nghia Trung Ngo; Linh Ngo Van; Thien Huu Nguyen; |
72 | Semantic Parsing in Task-Oriented Dialog with Recursive Insertion-Based Encoder Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a Recursive INsertion-based Encoder (RINE), a novel approach for semantic parsing in task-oriented dialog. |
Elman Mansimov; Yi Zhang; |
73 | CINS: Comprehensive Instruction for Few-Shot Learning in Task-Oriented Dialog Systems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To better utilize the power of PLMs, this paper proposes Comprehensive Instruction (CINS) that exploits PLMs with extra task-specific instructions. |
Fei Mi; Yasheng Wang; Yitong Li; |
74 | Semantic Self-Segmentation for Abstractive Summarization of Long Documents in Low-Resource Regimes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel semantic self-segmentation (Se3) approach for long document summarization to address the critical problems of low-resource regimes, namely to process inputs longer than the GPU memory capacity and produce accurate summaries despite the availability of only a few dozen training instances. |
Gianluca Moro; Luca Ragazzi; |
75 | Eye of The Beholder: Improved Relation Generalization for Text-Based Reinforcement Learning Agents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While the recent use of text-based resources for increasing an agent’s knowledge and improving its generalization has shown promise, we posit in this paper that there is much yet to be learned from visual representations of these same worlds. Specifically, we propose to retrieve images that represent specific instances of text observations from the world and train our agents on such images. |
Keerthiram Murugesan; Subhajit Chaudhury; Kartik Talamadupula; |
76 | Improving Neural Cross-Lingual Abstractive Summarization Via Employing Optimal Transport Distance for Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The matter worsens when performing on languages with separate morphological or structural features, making cross-lingual alignment more challenging and resulting in a performance drop. To overcome this problem, we propose a novel Knowledge-Distillation-based framework for Cross-Lingual Summarization, seeking to explicitly construct cross-lingual correlation by distilling the knowledge of the monolingual summarization teacher into the cross-lingual summarization student. |
Thong Thanh Nguyen; Anh Tuan Luu; |
77 | HiTKG: Towards Goal-Oriented Conversations Via Multi-Hierarchy Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present HiTKG, a hierarchical transformer-based graph walker that leverages multiscale inputs to make precise and flexible predictions on KG paths. |
Jinjie Ni; Vlad Pandelea; Tom Young; Haicang Zhou; Erik Cambria; |
78 | Is Discourse Role Important for Emotion Recognition in Conversation? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel method to exploit latent discourse role information of an utterance to determine the emotion it conveys in a conversation. |
Donovan Ong; Jian Su; Bin Chen; Anh Tuan Luu; Ashok Narendranath; Yue Li; Shuqi Sun; Yingzhan Lin; Haifeng Wang; |
79 | Improved Text Classification Via Contrastive Adversarial Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders for text classification tasks. |
Lin Pan; Chung-Wei Hang; Avirup Sil; Saloni Potdar; |
80 | LeSICiN: A Heterogeneous Graph-Based Approach for Automatic Legal Statute Identification from Indian Legal Documents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we take the first step towards utilising both the text and the legal citation network for the LSI task. |
Shounak Paul; Pawan Goyal; Saptarshi Ghosh; |
81 | Transformer Uncertainty Estimation with Hierarchical Stochastic Attention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel way to enable transformers to have the capability of uncertainty estimation and, meanwhile, retain the original predictive performance. |
Jiahuan Pei; Cheng Wang; György Szarvas; |
82 | STEPS: Semantic Typing of Event Processes with A Sequence-to-Sequence Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we advance the field by reformulating the free-form event process typing task as a sequence generation problem and put forward STEPS, an end-to-end approach for producing user intent in terms of actions and objects only, dispensing with the need for their definitions. |
Sveva Pepe; Edoardo Barba; Rexhina Blloshmi; Roberto Navigli; |
83 | Sparse Structure Learning Via Graph Neural Networks for Inductive Document Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most existing methods are based on static word co-occurrence graphs without sentence-level information, which poses three challenges: (1) word ambiguity, (2) word synonymity, and (3) dynamic contextual dependency. To address these challenges, we propose a novel GNN-based sparse structure learning model for inductive document classification. |
Yinhua Piao; Sangseon Lee; Dohoon Lee; Sun Kim; |
84 | STEM: Unsupervised STructural EMbedding for Stance Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel framework for stance detection. |
Ron Korenblum Pick; Vladyslav Kozhukhov; Dan Vilenchik; Oren Tsur; |
85 | ValueNet: A New Dataset for Human Value Driven Dialogue System Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a new large-scale human value dataset called ValueNet, which contains human attitudes on 21,374 text scenarios. |
Liang Qiu; Yizhou Zhao; Jinchao Li; Pan Lu; Baolin Peng; Jianfeng Gao; Song-Chun Zhu; |
86 | Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method to extend sequence-to-sequence models to accurately process sequences much longer than the ones used during training while being sample- and resource-efficient, supported by thorough experimentation. |
Juan Antonio Ramirez-Orta; Eduardo Xamena; Ana Maguitman; Evangelos Milios; Axel J. Soto; |
87 | MuMuQA: Multimedia Multi-Hop News Question Answering Via Cross-Media Knowledge Extraction and Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a new QA evaluation benchmark with 1,384 questions over news articles that require cross-media grounding of objects in images onto text. |
Revant Gangi Reddy; Xilin Rui; Manling Li; Xudong Lin; Haoyang Wen; Jaemin Cho; Lifu Huang; Mohit Bansal; Avirup Sil; Shih-Fu Chang; Alexander Schwing; Heng Ji; |
88 | Pushing The Limits of Rule Reasoning in Transformers Through Natural Language Satisfiability Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The key idea is to draw insights from empirical sampling of hard propositional SAT problems and from complexity-theoretic studies of language. |
Kyle Richardson; Ashish Sabharwal; |
89 | SFSRNet: Super-resolution for Single-Channel Audio Source Separation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The problem with downsampling is that it usually results in information loss. In this paper, we tackle this problem by introducing SFSRNet, which contains a super-resolution (SR) network. |
Joel Rixen; Matthias Renz; |
90 | CEM: Commonsense-Aware Empathetic Response Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, since empathy includes both aspects of affection and cognition, we argue that in addition to identifying the user’s emotion, cognitive understanding of the user’s situation should also be considered. To this end, we propose a novel approach for empathetic response generation, which leverages commonsense to draw more information about the user’s situation and uses this additional information to further enhance the empathy expression in generated responses. |
Sahand Sabour; Chujie Zheng; Minlie Huang; |
91 | Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning Over Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we propose Weakly Supervised Neuro-Symbolic Module Network (WNSMN) trained with answers as the sole supervision for numerical reasoning based MRC. |
Amrita Saha; Shafiq Joty; Steven C.H. Hoi; |
92 | Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we compare pre-trained and fine-tuned representations at a vision, language and multimodal level. |
Emmanuelle Salin; Badreddine Farah; Stéphane Ayache; Benoit Favre; |
93 | Entailment Relation Aware Paraphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a new task of entailment relation aware paraphrase generation which aims at generating a paraphrase conforming to a given entailment relation (e.g. equivalent, forward entailing, or reverse entailing) with respect to a given input. |
Abhilasha Sancheti; Balaji Vasan Srinivasan; Rachel Rudinger; |
94 | Visual Definition Modeling: Challenging Vision & Language Models to Define Words and Objects Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we draw on the established Definition Modeling paradigm and enhance it by grounding, for the first time, textual definitions to visual representations. |
Bianca Scarlini; Tommaso Pasini; Roberto Navigli; |
95 | Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a task-independent batch acquisition method using triplet loss to distinguish hard samples in an unlabeled data pool that have similar features but difficult-to-identify labels. |
Seungmin Seo; Donghyun Kim; Youbin Ahn; Kyong-Ho Lee; |
96 | OneRel: Joint Entity and Relation Extraction with One Module in One Step Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, previous joint methods suffer from the problems of cascading errors and redundant information. To address these issues, in this paper, we propose a novel joint entity and relation extraction model, named OneRel, which casts joint extraction as a fine-grained triple classification problem. |
Yu-Ming Shang; Heyan Huang; Xianling Mao; |
97 | KATG: Keyword-Bias-Aware Adversarial Text Generation for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Keyword-bias-aware Adversarial Text Generation model (KATG) that implicitly generates adversarial sentences using a generator-discriminator structure. |
Lingfeng Shen; Shoushan Li; Ying Chen; |
98 | Unsupervised Deep Keyphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any annotated doc-keyphrase pairs. |
Xianjie Shen; Yinghan Wang; Rui Meng; Jingbo Shang; |
99 | Generation-Focused Table-Based Intermediate Pre-training for Free-Form Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these pre-trained language models have weaker encoding abilities over table cells and schema. To mitigate this issue, in this work, we present an intermediate pre-training framework, Generation-focused Table-based Intermediate Pre-training (GENTAP), that jointly learns representations of natural language questions and tables. |
Peng Shi; Patrick Ng; Feng Nan; Henghui Zhu; Jun Wang; Jiarong Jiang; Alexander Hanbo Li; Rishav Chakravarti; Donald Weidner; Bing Xiang; Zhiguo Wang; |
100 | StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a new Question-Answering dataset called StepGame for robust multi-step spatial reasoning in texts. |
Zhengxiang Shi; Qiang Zhang; Aldo Lipani; |
101 | MINIMAL: Mining Models for Universal Adversarial Triggers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a novel data-free approach, MINIMAL, to mine input-agnostic adversarial triggers from models. |
Yaman Kumar Singla; Swapnil Parekh; Somesh Singh; Changyou Chen; Balaji Krishnamurthy; Rajiv Ratn Shah; |
102 | Hierarchical Heterogeneous Graph Attention Network for Syntax-Aware Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we essentially incorporate the constituent structure into the single document summarization via the Graph Neural Networks to learn the semantic meaning of tokens. |
Zixing Song; Irwin King; |
103 | Supervising Model Attention with Human Explanations for Robust Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Natural Language Inference (NLI) models are known to learn from biases and artefacts within their training data, impacting how well they generalise to other unseen datasets. Existing de-biasing approaches focus on preventing the models from learning these biases, which can result in restrictive models and lower performance. |
Joe Stacey; Yonatan Belinkov; Marek Rei; |
104 | Hyperbolic Disentangled Representation for Fine-Grained Aspect Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we propose HDAE, a hyperbolic disentangled aspect extractor in which a hyperbolic aspect classifier captures words’ latent hierarchies, and an aspect-disentangled representation models the distinct latent semantics of each seed word. |
Chang-Yu Tai; Ming-Yao Li; Lun-Wei Ku; |
105 | Procedural Text Understanding Via Scene-Wise Evolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new scene-wise paradigm for procedural text understanding, which jointly tracks states of all entities in a scene-by-scene manner. |
Jialong Tang; Hongyu Lin; Meng Liao; Yaojie Lu; Xianpei Han; Le Sun; Weijian Xie; Jin Xu; |
106 | Debiasing NLU Models Via Causal Intervention and Counterfactual Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we provide a new perspective with causal inference to find out the bias. |
Bing Tian; Yixin Cao; Yong Zhang; Chunxiao Xing; |
107 | Chess As A Testbed for Language Model State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Approximating this full attention results in a significant performance drop. We propose this testbed as a benchmark for future work on the development and analysis of transformer language models. |
Shubham Toshniwal; Sam Wiseman; Karen Livescu; Kevin Gimpel; |
108 | Contrast-Enhanced Semi-supervised Text Classification with Few Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a certainty-driven sample selection method and a contrast-enhanced similarity graph to utilize data more efficiently in self-training, alleviating the annotation-starving problem. |
Austin Cheng-Yun Tsai; Sheng-Ya Lin; Li-Chen Fu; |
109 | Hybrid Autoregressive Inference for Scalable Multi-Hop Explanation Regeneration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present SCAR (for Scalable Autoregressive Inference), a hybrid framework that iteratively combines a Transformer-based bi-encoder with a sparse model of explanatory power, designed to leverage explicit inference patterns in the explanations. |
Marco Valentino; Mokanarangan Thayaparan; Deborah Ferreira; André Freitas; |
110 | DetIE: Multilingual Open Information Extraction Inspired By Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a different approach to the problem that can be equally or more successful. |
Michael Vasilkovsky; Anton Alekseev; Valentin Malykh; Ilya Shenbin; Elena Tutubalina; Dmitriy Salikhov; Mikhail Stepnov; Andrey Chertok; Sergey Nikolenko; |
111 | Hybrid Neural Networks for On-Device Directional Hearing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present DeepBeam, a hybrid model that combines traditional beamformers with a custom lightweight neural net. |
Anran Wang; Maruchi Kim; Hao Zhang; Shyamnath Gollakota; |
112 | Non-parametric Online Learning from Human Feedback for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel non-parametric online learning method without changing the model structure. |
Dongqi Wang; Haoran Wei; Zhirui Zhang; Shujian Huang; Jun Xie; Jiajun Chen; |
113 | Parameter Differentiation Based Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel parameter differentiation based method that allows the model to determine which parameters should be language-specific during training. |
Qian Wang; Jiajun Zhang; |
114 | DisenCite: Graph-Based Disentangled Representation Learning for Context-Specific Citation Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel disentangled representation based model DisenCite to automatically generate the citation text through integrating paper text and citation graph. |
Yifan Wang; Yiping Song; Shuai Li; Chaoran Cheng; Wei Ju; Ming Zhang; Sheng Wang; |
115 | HEAL: A Knowledge Graph for Distress Management Conversations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, such resources are limited in the context of emotional distress. To address this, we introduce HEAL, a knowledge graph developed based on 1M distress narratives and their corresponding consoling responses curated from Reddit. |
Anuradha Welivita; Pearl Pu; |
116 | Deep Fusing Pre-trained Models Into Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework to deep fuse the pre-trained representation into NMT, fully exploring the potential of PTMs in NMT. |
Rongxiang Weng; Heng Yu; Weihua Luo; Min Zhang; |
117 | VAST: The Valence-Assessing Semantics Test for Contextualizing Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce VAST, the Valence-Assessing Semantics Test, a novel intrinsic evaluation task for contextualized word embeddings (CWEs). |
Robert Wolfe; Aylin Caliskan; |
118 | A Label Dependence-Aware Sequence Generation Model for Multi-Level Implicit Discourse Relation Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider multi-level IDRR as a conditional label sequence generation task and propose a Label Dependence-aware Sequence Generation Model (LDSGM) for it. |
Changxing Wu; Liuwen Cao; Yubin Ge; Yang Liu; Min Zhang; Jinsong Su; |
119 | Fast and Constrained Absent Keyphrase Generation By Prompt-Based Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a constrained absent keyphrase generation method in a prompt-based learning fashion. |
Huanqin Wu; Baijiaxin Ma; Wei Liu; Tao Chen; Dan Nie; |
120 | GraphMemDialog: Optimizing End-to-End Task-Oriented Dialog Systems Using Graph Memory Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Graph Memory Network (GMN) based Seq2Seq model, GraphMemDialog, to effectively learn the inherent structural information hidden in dialog history, and to model the dynamic interaction between dialog history and KBs. |
Jie Wu; Ian G Harris; Hongzhi Zhao; |
121 | Mastering The Explicit Opinion-Role Interaction: Syntax-Aided Neural Transition System for Unified Opinion Role Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate a novel solution by revisiting the transition architecture, and augmenting it with a pointer network (PointNet). |
Shengqiong Wu; Hao Fei; Fei Li; Meishan Zhang; Yijiang Liu; Chong Teng; Donghong Ji; |
122 | A Graph Convolutional Network with Adaptive Graph Generation and Channel Selection for Event Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: With this work, we propose a novel graph convolutional method that combines an adaptive graph generation technique and a multi-channel selection strategy. |
Zhipeng Xie; Yumin Tu; |
123 | Leashing The Inner Demons: Self-Detoxification for Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on our findings, we propose a simple yet effective unsupervised method for language models to “detoxify” themselves without an additional large corpus or external discriminator. |
Canwen Xu; Zexue He; Zhankui He; Julian McAuley; |
124 | Zero-Shot Cross-Lingual Machine Reading Comprehension Via Inter-sentence Dependency Graph Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In our approach, we build the Inter-Sentence Dependency Graph (ISDG) connecting dependency trees to form global syntactic relations across sentences. |
Liyan Xu; Xuchao Zhang; Bo Zong; Yanchi Liu; Wei Cheng; Jingchao Ni; Haifeng Chen; Liang Zhao; Jinho D. Choi; |
125 | From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. |
Runxin Xu; Fuli Luo; Chengyu Wang; Baobao Chang; Jun Huang; Songfang Huang; Fei Huang; |
126 | Sequence Level Contrastive Learning for Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In text summarization, the output summary is a shorter form of the input document and they have similar meanings. In this paper, we propose a contrastive learning model for supervised abstractive text summarization, where we view a document, its gold summary and its model generated summaries as different views of the same mean representation and maximize the similarities between them during training. |
Shusheng Xu; Xingxing Zhang; Yi Wu; Furu Wei; |
127 | Self-Supervised Knowledge Assimilation for Expert-Layman Text Style Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To mitigate the first issue, we propose a novel language model (LM) pretraining task, Knowledge Base Assimilation, to synthesize pretraining data from the edges of a graph of expert- and layman-style medical terminology terms into an LM during self-supervised learning. |
Wenda Xu; Michael Saxon; Misha Sra; William Yang Wang; |
128 | Text Is No More Enough! A Benchmark for Profile-Based Spoken Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Abstract: Current research on spoken language understanding (SLU) is heavily limited to a simple setting: the plain text-based SLU that takes the user utterance as input and generates … |
Xiao Xu; Libo Qin; Kaiji Chen; Guoxing Wu; Linlin Li; Wanxiang Che; |
129 | SAS: Self-Augmentation Strategy for Language Model Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a self-augmentation strategy (SAS) where a single network is utilized for both regular pre-training and contextualized data augmentation for the training in later epochs. |
Yifei Xu; Jingqiao Zhang; Ru He; Liangzhu Ge; Chao Yang; Cheng Yang; Ying Nian Wu; |
130 | Hybrid Curriculum Learning for Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. |
Lin Yang; Yi Shen; Yue Mao; Longjun Cai; |
131 | NumHTML: Numeric-Oriented Hierarchical Transformer Model for Multi-Task Financial Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper describes a numeric-oriented hierarchical transformer model (NumHTML) to predict stock returns and financial risk using multi-modal aligned earnings calls data by taking advantage of the different categories of numbers (monetary, temporal, percentages, etc.) and their magnitude. |
Linyi Yang; Jiazheng Li; Ruihai Dong; Yue Zhang; Barry Smyth; |
132 | Tracing Text Provenance Via Context-Aware Lexical Substitution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the limitations mentioned above, we propose a natural language watermarking scheme based on context-aware lexical substitution (LS). |
Xi Yang; Jie Zhang; Kejiang Chen; Weiming Zhang; Zehua Ma; Feng Wang; Nenghai Yu; |
133 | Fusing Task-Oriented and Open-Domain Dialogues in Conversational Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such models would better mimic human-level conversation capabilities. We evaluate two baseline models on this task, including the classification-based two-stage models and the two-in-one fused models. |
Tom Young; Frank Xing; Vlad Pandelea; Jinjie Ni; Erik Cambria; |
134 | JAKET: Joint Pre-training of Knowledge Graph and Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel joint pre-training framework, JAKET, to model both the knowledge graph and language. |
Donghan Yu; Chenguang Zhu; Yiming Yang; Michael Zeng; |
135 | KID-Review: Knowledge-Guided Scientific Review Generation with Oracle Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we present an end-to-end knowledge-guided review generation framework for scientific papers grounded in cognitive psychology research that a better understanding of text requires different types of knowledge. |
Weizhe Yuan; Pengfei Liu; |
136 | Reference-Based Speech Enhancement Via Feature Alignment and Fusion Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Different from them, we observe that the speeches of the same speaker are correlated in terms of frame-level short-time Fourier Transform (STFT) spectrogram. Therefore, we propose reference-based speech enhancement via a feature alignment and fusion network (FAF-Net). |
Huanjing Yue; Wenxin Duo; Xiulian Peng; Jingyu Yang; |
137 | MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We are motivated to design a general and robust framework, MDD-Eval, to address the problem. |
Chen Zhang; Luis Fernando D’Haro; Thomas Friedrichs; Haizhou Li; |
138 | Efficient Dialog Policy Learning By Reasoning with Contextual Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a deep reinforcement learning framework for goal-oriented dialog policy learning that learns user preferences from user goal data, while leveraging commonsense knowledge from people. |
Haodi Zhang; Zhichao Zeng; Keting Lu; Kaishun Wu; Shiqi Zhang; |
139 | Hierarchical Cross-Modality Semantic Correlation Learning Model for Multimodal Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a hierarchical cross-modality semantic correlation learning model (HCSCL) to learn the intra- and inter-modal correlation existing in the multimodal data. |
Litian Zhang; Xiaoming Zhang; Junshu Pan; |
140 | Adversarial Data Augmentation for Task-Specific Knowledge Distillation of Pre-trained Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose AD^2, a novel and effective data augmentation approach to improving the task-specific knowledge transfer when compressing large pre-trained transformer models. |
Minjia Zhang; Niranjan Uma Naresh; Yuxiong He; |
141 | Text-Based Interactive Recommendation Via Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A direct application of policy learning with such fixed experience suffers from the distribution shift. To tackle this issue, we develop a behavior-agnostic off-policy correction framework to make offline interactive recommendation possible. |
Ruiyi Zhang; Tong Yu; Yilin Shen; Hongxia Jin; |
142 | DKPLM: Decomposable Knowledge-Enhanced Pre-trained Language Model for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel KEPLM named DKPLM that decomposes the knowledge injection process of pre-trained language models across the pre-training, fine-tuning and inference stages, which facilitates the application of KEPLMs in real-world scenarios. |
Taolin Zhang; Chengyu Wang; Nan Hu; Minghui Qiu; Chengguang Tang; Xiaofeng He; Jun Huang; |
143 | Frequency-Aware Contrastive Learning for Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we propose a frequency-aware token-level contrastive learning method, in which the hidden state of each decoding step is pushed away from the counterparts of other target words, in a soft contrastive way based on the corresponding word frequencies. |
Tong Zhang; Wei Ye; Baosong Yang; Long Zhang; Xingzhang Ren; Dayiheng Liu; Jinan Sun; Shikun Zhang; Haibo Zhang; Wen Zhao; |
144 | Probing Word Syntactic Representations in The Brain By A Feature Elimination Method Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes an alternative framework to study how different word syntactic features are represented in the brain. |
Xiaohan Zhang; Shaonan Wang; Nan Lin; Jiajun Zhang; Chengqing Zong; |
145 | Unsupervised Sentence Representation Via Contrastive Learning with Mixing Negatives Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we prove that hard negatives are essential for maintaining strong gradient signals in the training process, while randomly sampling negative examples is ineffective for sentence representation. |
Yanzhao Zhang; Richong Zhang; Samuel Mensah; Xudong Liu; Yongyi Mao; |
146 | RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Grounded generation models appear to offer remedies, but their training typically relies on rarely-available parallel data where information-relevant documents are provided for context. We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal. |
Yizhe Zhang; Siqi Sun; Xiang Gao; Yuwei Fang; Chris Brockett; Michel Galley; Jianfeng Gao; Bill Dolan; |
147 | BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce BiRdQA, a bilingual multiple-choice question answering dataset with 6614 English riddles and 8751 Chinese riddles. |
Yunxiang Zhang; Xiaojun Wan; |
148 | UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides, we introduce a visual guided decoder to better integrate textual and visual modalities in guiding abstractive text generation. |
Zhengkun Zhang; Xiaojun Meng; Yasheng Wang; Xin Jiang; Qun Liu; Zhenglu Yang; |
149 | DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: There is still a lack of corresponding research and powerful tools to understand and process such long dialogues. Therefore, in this work, we present a pre-training framework for long dialogue understanding and summarization. |
Ming Zhong; Yang Liu; Yichong Xu; Chenguang Zhu; Michael Zeng; |
150 | Idiomatic Expression Paraphrasing Without Strong Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Idiomatic expressions (IEs) play an essential role in natural language. In this paper, we study the task of idiomatic sentence paraphrasing (ISP), which aims to paraphrase a sentence with an IE by replacing the IE with its literal paraphrase. |
Jianing Zhou; Ziheng Zeng; Hongyu Gong; Suma Bhat; |
151 | Multilingual Code Snippets Training for Program Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce CoST, a new multilingual Code Snippet Translation dataset that contains parallel data from 7 commonly used programming languages. |
Ming Zhu; Karthik Suresh; Chandan K Reddy; |
152 | Learning Unseen Emotions from Gestures Via Semantically-Conditioned Zero-Shot Perception with Adversarial Autoencoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel generalized zero-shot algorithm to recognize perceived emotions from gestures. |
Abhishek Banerjee; Uttaran Bhattacharya; Aniket Bera; |
153 | Optimized Potential Initialization for Low-Latency Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to achieve high-performance converted SNNs with extremely low latency (fewer than 32 time-steps). |
Tong Bu; Jianhao Ding; Zhaofei Yu; Tiejun Huang; |
154 | Planning with Biological Neurons and Synapses Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A program in this framework essentially sets up a dynamical system of neurons and synapses that eventually, with high probability, accomplishes the task. The purpose of this work is to establish empirically that reasonably large programs in the Assembly Calculus can execute correctly and reliably; and that rather realistic — if idealized — higher cognitive functions, such as planning in the blocks world, can be implemented successfully by such programs. |
Francesco d’Amore; Daniel Mitropolsky; Pierluigi Crescenzi; Emanuele Natale; Christos H. Papadimitriou; |
155 | Backprop-Free Reinforcement Learning with Active Neural Generative Coding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose active neural generative coding, a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments. |
Alexander G. Ororbia; Ankur Mali; |
156 | VECA: A New Benchmark and Toolkit for General Cognitive Development Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present VECA (Virtual Environment for Cognitive Assessment), which consists of two main components: (i) a first benchmark to assess the overall cognitive development of an AI agent, and (ii) a novel toolkit to generate diverse and distinct cognitive tasks. |
Kwanyoung Park; Hyunseok Oh; Youngki Lee; |
157 | Bridging Between Cognitive Processing Signals and Linguistic Features Via A Unified Attentional Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a data-driven method to investigate the relationship between cognitive processing signals and linguistic features. |
Yuqi Ren; Deyi Xiong; |
158 | Multi-Sacle Dynamic Coding Improved Spiking Actor Network for Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Inspired by the efficient computation of cell assembly in the biological brain, whereby memory-based coding is much more complex than readout, we propose a multiscale dynamic coding improved spiking actor network (MDC-SAN) for reinforcement learning to achieve effective decision-making. |
Duzhen Zhang; Tielin Zhang; Shuncheng Jia; Bo Xu; |
159 | Joint Human Pose Estimation and Instance Segmentation with PosePlusSeg Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents PosePlusSeg, a joint model designed for both human pose estimation and instance segmentation. |
Niaz Ahmad; Jawad Khan; Jeremy Yuhyun Kim; Youngmoon Lee; |
160 | Logic Rule Guided Attribution with Dynamic Ablation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we construct the ‘if-then’ logic rules that are sufficiently precise locally. |
Jianqiao An; Yuandu Lai; Yahong Han; |
161 | Neural Marionette: Unsupervised Learning of Motion Skeleton and Latent Dynamics from Volumetric Video Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Neural Marionette, an unsupervised approach that discovers the skeletal structure from a dynamic sequence and learns to generate diverse motions that are consistent with the observed motion dynamics. |
Jinseok Bae; Hojun Jang; Cheol-Hui Min; Hyungun Choi; Young Min Kim; |
162 | Deformable Part Region Learning for Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a deformable part region learning in order to allow decomposed part regions to be deformable according to geometric transformation of an object. |
Seung-Hwan Bae; |
163 | Towards End-to-End Image Compression and Analysis with Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose an end-to-end image compression and analysis model with Transformers, targeting the cloud-based image classification application. |
Yuanchao Bai; Xu Yang; Xianming Liu; Junjun Jiang; Yaowei Wang; Xiangyang Ji; Wen Gao; |
164 | Handwritten Mathematical Expression Recognition Via Attention Aggregation Based Bi-directional Mutual Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an Attention aggregation based Bi-directional Mutual learning Network (ABM) which consists of one shared encoder and two parallel inverse decoders (L2R and R2L). |
Xiaohang Bian; Bo Qin; Xiaozhe Xin; Jianwu Li; Xuefeng Su; Yanfeng Wang; |
165 | ADD: Frequency Attention and Multi-View Based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we apply frequency domain learning and optimal transport theory in knowledge distillation (KD) to specifically improve the detection of low-quality compressed deepfake images. |
Le Minh Binh; Simon Woo; |
166 | LUNA: Localizing Unfamiliarity Near Acquaintance for Open-Set Long-Tailed Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we discuss a promising solution to the Open-set Long-Tailed Recognition (OLTR) task utilizing metric learning. |
Jiarui Cai; Yizhou Wang; Hung-Min Hsu; Jenq-Neng Hwang; Kelsey Magrane; Craig S Rose; |
167 | Prior Gradient Mask Guided Pruning-Aware Fine-Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We proposed a Prior Gradient Mask Guided Pruning-aware Fine-Tuning (PGMPF) framework to accelerate deep Convolutional Neural Networks (CNNs). |
Linhang Cai; Zhulin An; Chuanguang Yang; Yangchun Yan; Yongjun Xu; |
168 | Context-Aware Transfer Attacks for Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a new approach to generate context-aware attacks for object detectors. |
Zikui Cai; Xinxin Xie; Shasha Li; Mingjun Yin; Chengyu Song; Srikanth V. Krishnamurthy; Amit K. Roy-Chowdhury; M. Salman Asif; |
169 | OoDHDR-Codec: Out-of-Distribution Generalization for HDR Image Compression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Herein, we propose a novel out-of-distribution (OoD) HDR image compression framework (OoDHDR-codec). |
Linfeng Cao; Aofan Jiang; Wei Li; Huaying Wu; Nanyang Ye; |
170 | Visual Consensus Modeling for Video-Text Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method to mine the commonsense knowledge shared between the video and text modalities for video-text retrieval, namely visual consensus modeling. |
Shuqiang Cao; Bairui Wang; Wei Zhang; Lin Ma; |
171 | Proximal PanNet: A Model-Based Deep Network for Pansharpening Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These network architectures always lack sufficient interpretability, which limits further performance improvements. To alleviate this issue, we propose a novel deep network for pansharpening by combining the model-based methodology with the deep learning method. |
Xiangyong Cao; Yang Chen; Wenfei Cao; |
172 | CF-DETR: Coarse-to-Fine Transformers for End-to-End Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper observed that DETR performs surprisingly well even on small objects when measuring Average Precision (AP) at decreased Intersection-over-Union (IoU) thresholds. Motivated by this observation, we propose a simple way to improve DETR by refining the coarse features and predicted locations. |
Xipeng Cao; Peng Yuan; Bailan Feng; Kun Niu; |
173 | A Random CNN Sees Objects: One Inductive Bias of CNN and Its Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: That is, a CNN has an inductive bias to naturally focus on objects, named as Tobias ("The object is at sight") in this paper. |
Yun-Hao Cao; Jianxin Wu; |
174 | Texture Generation Using Dual-Domain Feature Flow with Multi-View Hallucinations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a dual-domain generative model to estimate a texture map from a single image for colorizing a 3D human model. |
Seunggyu Chang; Jungchan Cho; Songhwai Oh; |
175 | Resistance Training Using Prior Bias: Toward Unbiased Scene Graph Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current SGG methods usually suffer from sub-optimal scene graph generation because of the long-tailed distribution of training data. To address this problem, we propose Resistance Training using Prior Bias (RTPB) for the scene graph generation. |
Chao Chen; Yibing Zhan; Baosheng Yu; Liu Liu; Yong Luo; Bo Du; |
176 | SASA: Semantics-Augmented Set Abstraction for Point-Based 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We observe that the prevailing set abstraction design for down-sampling points may maintain too much unimportant background information that can affect feature learning for detecting objects. To tackle this issue, we propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA). |
Chen Chen; Zhe Chen; Jing Zhang; Dacheng Tao; |
177 | Comprehensive Regularization in A Bi-directional Predictive Network for Video Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As such, we propose a novel bi-directional architecture with three consistency constraints to comprehensively regularize the prediction task from pixel-wise, cross-modal, and temporal-sequence levels. |
Chengwei Chen; Yuan Xie; Shaohui Lin; Angela Yao; Guannan Jiang; Wei Zhang; Yanyun Qu; Ruizhi Qiao; Bo Ren; Lizhuang Ma; |
178 | Keypoint Message Passing for Video-Based Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. |
Di Chen; Andreas Doering; Shanshan Zhang; Jian Yang; Juergen Gall; Bernt Schiele; |
179 | DCAN: Improving Temporal Action Detection Via Dual Context Aggregation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the end-to-end proposal generation method named Dual Context Aggregation Network (DCAN) to aggregate context on two levels, namely, boundary level and proposal level, for generating high-quality action proposals, thereby improving the performance of temporal action detection. |
Guo Chen; Yin-Dong Zheng; Limin Wang; Tong Lu; |
180 | Geometry-Contrastive Transformer for Generalized 3D Pose Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a customized 3D mesh Transformer model for the pose transfer task. |
Haoyu Chen; Hao Tang; Zitong Yu; Nicu Sebe; Guoying Zhao; |
181 | Explore Inter-contrast Between Videos Via Composition for Weakly Supervised Temporal Sentence Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Most existing methods use the fused visual-linguistic feature to reconstruct the query, where the least reconstruction error determines the target segment. This work introduces a novel approach that explores the inter-contrast between videos in a composed video by selecting components from two different videos and fusing them into a single video. |
Jiaming Chen; Weixin Luo; Wei Zhang; Lin Ma; |
182 | Adaptive Image-to-Video Scene Graph Generation Via Knowledge Reasoning and Adversarial Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We tackle the second challenge by hierarchical adversarial learning to reduce the data distribution discrepancy between images and video frames. |
Jin Chen; Xiaofeng Ji; Xinxiao Wu; |
183 | Text Gestalt: Stroke-Aware Scene Text Image Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we attempt to design rules for decomposing English characters and digits at stroke-level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues with the purpose of controlling the consistency between the generated super-resolution image and high-resolution ground truth. |
Jingye Chen; Haiyang Yu; Jianqi Ma; Bin Li; Xiangyang Xue; |
184 | Towards High-Fidelity Face Self-Occlusion Recovery Via Multi-View Residual-Based GAN Inversion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While recovering face self-occlusions based on 3D face reconstruction, e.g., 3D Morphable Model (3DMM) and its variants provides an effective solution, most of the existing methods show apparent limitations in expressing high-fidelity, natural, and diverse facial details. To overcome these limitations, we propose in this paper a new generative adversarial network (MvInvert) for natural face self-occlusion recovery without using paired image-texture data. |
Jinsong Chen; Hu Han; Shiguang Shan; |
185 | ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based Motion Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Particularly, the existing filter-based denoising methods cannot be directly applied to suppress the noise in event stream, since there is no spatial correlation. To address this issue, this paper presents a novel progressive framework, in which a Motion Estimation (ME) module and an Event Denoising (ED) module are jointly optimized in a mutually reinforced manner. |
Jinze Chen; Yang Wang; Yang Cao; Feng Wu; Zheng-Jun Zha; |
186 | Attacking Video Recognition Models with Bullet-Screen Comments Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Compared with images, attacking videos is much more challenging as it needs to consider not only spatial cues but also temporal cues. To close this gap, we introduce a novel adversarial attack in this paper, the bullet-screen comment (BSC) attack, which attacks video recognition models with BSCs. |
Kai Chen; Zhipeng Wei; Jingjing Chen; Zuxuan Wu; Yu-Gang Jiang; |
187 | VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, performance is skewed toward certain types of corruption. To address this issue, we propose a multi-source vicinal transfer augmentation (VITA) method for generating diverse on-manifold samples. |
Minghui Chen; Cheng Wen; Feng Zheng; Fengxiang He; Ling Shao; |
188 | TransZero: Attribute-Guided Transformer for Zero-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an attribute-guided Transformer network to learn the attribute localization for discriminative visual-semantic embedding representations in ZSL, termed TransZero. |
Shiming Chen; Ziming Hong; Yang Liu; Guo-Sen Xie; Baigui Sun; Hao Li; Qinmu Peng; Ke Lu; Xinge You; |
189 | Structured Semantic Transfer for Multi-Label Recognition with Partial Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., merely some labels are known while other labels are missing (also called unknown labels) per image. |
Tianshui Chen; Tao Pu; Hefeng Wu; Yuan Xie; Liang Lin; |
190 | SJDL-Vehicle: Semi-supervised Joint Defogging Learning for Foggy Vehicle Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To our knowledge, this problem is still not well-addressed so far. In this paper, to address this problem, we propose a novel training framework called Semi-supervised Joint Defogging Learning (SJDL) framework. |
Wei-Ting Chen; I-Hsiang Chen; Chih-Yuan Yeh; Hao-Hsiang Yang; Jian-Jiun Ding; Sy-Yen Kuo; |
191 | Imagine By Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Humans can imagine a sample in new poses, scenes and view angles with their prior knowledge even if it is the first time to see this category. Inspired by this, we propose a novel reasoning-based implicit semantic data augmentation method to borrow transformation directions from other classes. |
Xiaohua Chen; Yucan Zhou; Dayan Wu; Wanqian Zhang; Yu Zhou; Bo Li; Weiping Wang; |
192 | Guide Local Feature Matching By Overlap Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel Overlap Estimation method conditioned on image pairs with TRansformer, named OETR, to constrain local feature matching in the commonly visible region. |
Ying Chen; Dihe Huang; Shang Xu; Jianlin Liu; Yong Liu; |
193 | Causal Intervention for Subject-Deconfounded Facial Action Unit Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a causal inference framework for subject-invariant facial action unit recognition. |
Yingjie Chen; Diqi Chen; Tao Wang; Yizhou Wang; Yun Liang; |
194 | Deep One-Class Classification Via Interpolated Gaussian Descriptor Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current state-of-the-art OCC models learn a compact normality description by hyper-sphere minimisation, but they often suffer from overfitting the training data, especially when the training set is small or contaminated with anomalous samples. To address this issue, we introduce the interpolated Gaussian descriptor (IGD) method, a novel OCC model that learns a one-class Gaussian anomaly classifier trained with adversarially interpolated training samples. |
Yuanhong Chen; Yu Tian; Guansong Pang; Gustavo Carneiro; |
195 | Towards Ultra-Resolution Neural Style Transfer Via Thumbnail Instance Normalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present an extremely simple Ultra-Resolution Style Transfer framework, termed URST, to flexibly process arbitrary high-resolution images (e.g., 10000×10000 pixels) style transfer for the first time. |
Zhe Chen; Wenhai Wang; Enze Xie; Tong Lu; Ping Luo; |
196 | DeTarNet: Decoupling Translation and Rotation By Siamese Network for Point Cloud Registration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a neural network named DetarNet to decouple the translation t and rotation R, so as to overcome the performance degradation due to their mutual interference in point cloud registration. |
Zhi Chen; Fan Yang; Wenbing Tao; |
197 | LCTR: On Awakening The Local Continuity of Transformer for Weakly Supervised Object Localization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies. |
Zhiwei Chen; Changan Wang; Yabiao Wang; Guannan Jiang; Yunhang Shen; Ying Tai; Chengjie Wang; Wei Zhang; Liujuan Cao; |
198 | Efficient Virtual View Selection for 3D Hand Pose Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new virtual view selection and fusion module for 3D hand pose estimation from single depth. |
Jian Cheng; Yanguang Wan; Dexin Zuo; Cuixia Ma; Jian Gu; Ping Tan; Hongan Wang; Xiaoming Deng; Yinda Zhang; |
199 | Pose Adaptive Dual Mixup for Few-Shot Single-View 3D Reconstruction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a pose adaptive few-shot learning procedure and a two-stage data interpolation regularization, termed Pose Adaptive Dual Mixup (PADMix), for single-image 3D reconstruction. |
Ta-Ying Cheng; Hsuan-Ru Yang; Niki Trigoni; Hwann-Tzong Chen; Tyng-Luh Liu; |
200 | PureGaze: Purifying Gaze Feature for Generalizable Gaze Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we tackle the cross-domain problem in gaze estimation. |
Yihua Cheng; Yiwei Bao; Feng Lu; |
201 | (2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These approaches often ignore the fact that videos are essentially sequences of 2D “views” of events happening in a 3D space, and that the semantics of the 3D scene can thus be carried over from frame to frame. Leveraging this insight, we propose a (2.5+1)D scene graph representation to better capture the spatio-temporal information flows inside the videos. |
Anoop Cherian; Chiori Hori; Tim K. Marks; Jonathan Le Roux; |
202 | Event-Image Fusion Stereo Using Cross-Modality Feature Propagation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a deep network that combines the features of an image with the features of an event to generate a dense disparity map. |
Hoonhee Cho; Kuk-Jin Yoon; |
203 | Style-Guided and Disentangled Representation for Robust Image-to-Image Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on two ideas, this paper proposes Style-Guided and Disentangled Representation for Robust Image-to-Image Translation (SRIT). |
Jaewoong Choi; Daeha Kim; Byung Cheol Song; |
204 | Denoised Maximum Classifier Discrepancy for Source-Free Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the self-training strategy may also suffer from sample selection bias and be impacted by the label noise of the pseudo-labeled samples. In this work, we provide a rigorous theoretical analysis on how these two issues affect the model generalization ability when applying the self-training strategy for the SFUDA problem. |
Tong Chu; Yahao Liu; Jinhong Deng; Wen Li; Lixin Duan; |
205 | Model-Based Image Signal Processors Via Learnable Dictionaries Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent approaches have attempted to bridge this gap by estimating the RGB to RAW mapping: handcrafted model-based methods that are interpretable and controllable usually require manual parameter fine-tuning, while end-to-end learnable neural networks require large amounts of training data, at times with complex training procedures, and generally lack interpretability and parametric control. Towards addressing these existing limitations, we present a novel hybrid model-based and data-driven ISP that builds on canonical ISP operations and is both learnable and interpretable. |
Marcos V. Conde; Steven McDonagh; Matteo Maggioni; Ales Leonardis; Eduardo Pérez-Pellitero; |
206 | MMA: Multi-Camera Based Global Motion Averaging Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a tailor-made multi-camera based motion averaging system, where the fixed relative poses are utilized to improve the accuracy and robustness of SfM. |
Hainan Cui; Shuhan Shen; |
207 | GenCo: Generative Co-training for Generative Adversarial Networks with Limited Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Training effective Generative Adversarial Networks (GANs) requires large amounts of training data, without which the trained models are usually sub-optimal with discriminator over-fitting. Several prior studies address this issue by expanding the distribution of the limited training data via massive and hand-crafted data augmentation. |
Kaiwen Cui; Jiaxing Huang; Zhipeng Luo; Gongjie Zhang; Fangneng Zhan; Shijian Lu; |
208 | Unbiased IoU for Spherical Image Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an unbiased IoU as a novel evaluation criterion for spherical image object detection, which is based on unbiased representations and utilizes an unbiased analytical method for IoU calculation. |
Feng Dai; Bin Chen; Hang Xu; Yike Ma; Xiaodong Li; Bailan Feng; Peng Yuan; Chenggang Yan; Qiang Zhao; |
209 | InsCLR: Improving Instance Retrieval with Self-Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we identify that the learnt representations for instance retrieval should be invariant to large variations in viewpoint and background etc., whereas self-augmented positives applied by the current SSL methods can not provide strong enough signals for learning robust instance-level representations. To overcome this problem, we propose InsCLR, a new SSL method that builds on the instance-level contrast, to learn the intra-class invariance by dynamically mining meaningful pseudo positive samples from both mini-batches and a memory bank during training. |
Zelu Deng; Yujie Zhong; Sheng Guo; Weilin Huang; |
210 | Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, many deep learning methods have shown great success in providing model-free solutions to many event-based problems, such as optical flow estimation. |
Ziluo Ding; Rui Zhao; Jiyuan Zhang; Tianxiao Gao; Ruiqin Xiong; Zhaofei Yu; Tiejun Huang; |
211 | Construct Effective Geometry Aware Feature Pyramid Network for Multi-Scale Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Geometry-aware Feature Pyramid Network (GaFPN), which mainly consists of the novel Geometry-aware Mapping Module and Geometry-aware Predictor Head. The Geometry-aware Mapping Module is proposed to make full use of all pyramid features to obtain better proposal features by the weight-generation subnetwork. |
Jinpeng Dong; Yuhao Huang; Songyi Zhang; Shitao Chen; Nanning Zheng; |
212 | Complementary Attention Gated Network for Pedestrian Trajectory Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a complementary attention gated network (CAGN) for pedestrian trajectory prediction, in which a dual-path architecture including normal and inverse attention is proposed to capture both frequent and peculiar modals in spatial and temporal patterns, respectively. |
Jinghai Duan; Le Wang; Chengjiang Long; Sanping Zhou; Fang Zheng; Liushuai Shi; Gang Hua; |
213 | SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, the model size also becomes a serious shackle for their wide applications. To overcome these challenges, we propose a super light-weight network model termed SVT-Net. |
Zhaoxin Fan; Zhenbo Song; Hongyan Liu; Zhiwu Lu; Jun He; Xiaoyong Du; |
214 | Backdoor Attacks on The DNN Interpretation System Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we design a backdoor attack that alters the saliency map produced by the network for an input image with a specific trigger pattern while not losing the prediction performance significantly. |
Shihong Fang; Anna Choromanska; |
215 | Learning to Learn Transferable Attack Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation. |
Shuman Fang; Jie Li; Xianming Lin; Rongrong Ji; |
216 | Perceptual Quality Assessment of Omnidirectional Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct a comprehensive study on the perceptual quality of omnidirectional images from both subjective and objective perspectives. |
Yuming Fang; Liping Huang; Jiebin Yan; Xuelin Liu; Yang Liu; |
217 | PatchUp: A Feature-Space Block-Level Regularization Technique for Convolutional Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose PatchUp, a hidden state block-level regularization technique for Convolutional Neural Networks (CNNs), that is applied on selected contiguous blocks of feature maps from a random pair of samples. |
Mojtaba Faramarzi; Mohammad Amini; Akilesh Badrinaaraayanan; Vikas Verma; Sarath Chandar; |
218 | DuMLP-Pin: A Dual-MLP-Dot-Product Permutation-Invariant Network for Set Feature Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel global aggregation permutation-invariant network based on dual MLP dot-product, called DuMLP-Pin, which is capable of being employed to extract features for set inputs, including unordered or unstructured pixel, attribute, and point cloud data sets. |
Jiajun Fei; Ziyu Zhu; Wenlei Liu; Zhidong Deng; Mingyang Li; Huanjun Deng; Shuo Zhang; |
219 | Attention-Aligned Transformer for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present A2 – an attention-aligned Transformer for image captioning, which guides attention learning in a perturbation-based self-supervised manner, without any annotation overhead. |
Zhengcong Fei; |
220 | Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the first completely automatic model diagnosing and treating tool, termed as Model Doctor. |
Zunlei Feng; Jiacong Hu; Sai Wu; XiaoTian Yu; Jie Song; Mingli Song; |
221 | OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the contexts gathered by the previous voxel-based methods decrease when handling sparse point clouds. To address this problem, we propose a multiple-contexts deep learning framework called OctAttention employing the octree structure, a memory-efficient representation for point clouds. |
Chunyang Fu; Ge Li; Rui Song; Wei Gao; Shan Liu; |
222 | DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel task and approach for document-to-slide generation. |
Tsu-Jui Fu; William Yang Wang; Daniel McDuff; Yale Song; |
223 | Unsupervised Underwater Image Restoration: From A Homology Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an UnSupervised Underwater Image Restoration method (USUIR) by leveraging the homology property between a raw underwater image and a re-degraded image. |
Zhenqi Fu; Huangxing Lin; Yan Yang; Shu Chai; Liyan Sun; Yue Huang; Xinghao Ding; |
224 | Playing Lottery Tickets with Vision and Language Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In parallel, work on the lottery ticket hypothesis (LTH) has shown that deep neural networks contain small matching subnetworks that can achieve on par or even better performance than the dense networks when trained in isolation. In this work, we perform the first empirical study to assess whether such trainable subnetworks also exist in pre-trained VL models. |
Zhe Gan; Yen-Chun Chen; Linjie Li; Tianlong Chen; Yu Cheng; Shuohang Wang; Jingjing Liu; Lijuan Wang; Zicheng Liu; |
225 | Feature Distillation Interaction Weighting Network for Lightweight Image Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Meanwhile, how to take full advantage of the intermediate features under the constraints of limited parameters and calculations is also a huge challenge. To alleviate these issues, we propose a lightweight yet efficient Feature Distillation Interaction Weighted Network (FDIWN). |
Guangwei Gao; Wenjie Li; Juncheng Li; Fei Wu; Huimin Lu; Yi Yu; |
226 | Weakly-Supervised Salient Object Detection Using Point Supervision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel weakly-supervised salient object detection method using point supervision. |
Shuyong Gao; Wei Zhang; Yan Wang; Qianyu Guo; Chenglong Zhang; Yangji He; Wenqiang Zhang; |
227 | Latent Space Explanation By Intervention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This study aims to reveal hidden concepts by employing an intervention mechanism that shifts the predicted class based on discrete variational autoencoders. |
Itai Gat; Guy Lorberbom; Idan Schwartz; Tamir Hazan; |
228 | Lifelong Person Re-identification By Pseudo Task Knowledge Preservation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The model tends to learn task-specific knowledge with a task-wise domain gap, which results in a stability-plasticity dilemma. To overcome this problem, we cast LReID as a domain adaptation problem and propose a pseudo task knowledge preservation framework to alleviate the domain gap. |
Wenhang Ge; Junlong Du; Ancong Wu; Yuqiao Xian; Ke Yan; Feiyue Huang; Wei-Shi Zheng; |
229 | Adversarial Robustness in Multi-Task Learning: Promises and Illusions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we evaluate the design choices that impact the robustness of multi-task deep learning networks. |
Salah Ghamizi; Maxime Cordy; Mike Papadakis; Yves Le Traon; |
230 | Deep Confidence Guided Distance for 3D Partial Shape Registration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel non-iterative learnable method for partial-to-partial 3D shape registration. |
Dvir Ginzburg; Dan Raviv; |
231 | Predicting Physical World Destinations for Commands Given to Self-Driving Cars Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In an attempt to alleviate this issue, recent works have taken a natural language-oriented approach by allowing the passenger to give commands that refer to specific objects in the visual scene. Nevertheless, this is only half the task as the car should also understand the physical destination of the command, which is what we focus on in this paper. |
Dusan Grujicic; Thierry Deruyttere; Marie-Francine Moens; Matthew B. Blaschko; |
232 | Towards Light-Weight and Real-Time Line Segment Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a real-time and light-weight line segment detector for resource-constrained environments named Mobile LSD (M-LSD). |
Geonmo Gu; Byungsoo Ko; SeoungHyun Go; Sung-Hyun Lee; Jingeun Lee; Minchul Shin; |
233 | Exploiting Fine-Grained Face Forgery Clues Via Progressive Enhancement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the exploitation of frequency information is coarse-grained, and more importantly, their vanilla learning process struggles to extract fine-grained forgery traces. To address this issue, we propose a progressive enhancement learning framework to exploit both the RGB and fine-grained frequency clues. |
Qiqi Gu; Shen Chen; Taiping Yao; Yang Chen; Shouhong Ding; Ran Yi; |
234 | Delving Into The Local: Dynamic Inconsistency Learning for DeepFake Video Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these works impose supervisions on sparsely sampled video frames but overlook the local motions among adjacent frames, which instead encode rich inconsistency information that can serve as an efficient indicator for DeepFake video detection. To mitigate this issue, we delve into the local motion and propose a novel sampling unit named snippet, which contains a few successive video frames for local temporal inconsistency learning. |
Zhihao Gu; Yang Chen; Taiping Yao; Shouhong Ding; Jilin Li; Lizhuang Ma; |
235 | Assessing A Single Image in Reference-Guided Image Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a general learning-based framework, Reference-guided Image Synthesis Assessment (RISA) to quantitatively evaluate the quality of a single generated image. |
Jiayi Guo; Chaoqun Du; Jiangshan Wang; Huijuan Huang; Pengfei Wan; Gao Huang; |
236 | Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-Supervised Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed. |
Tianyu Guo; Hong Liu; Zhan Chen; Mengyuan Liu; Tao Wang; Runwei Ding; |
237 | Convolutional Neural Network Compression Through Generalized Kronecker Product Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we reduce memory usage and floating-point operations required by convolutional layers in CNNs. |
Marawan Gamal Abdel Hameed; Marzieh S. Tahaei; Ali Mosleh; Vahid Partovi Nia; |
238 | Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To improve the fine-grained few-shot proposal classification, we propose a novel attentive feature alignment method to address the spatial misalignment between the noisy proposals and few-shot classes, thus improving the performance of few-shot object detection. |
Guangxing Han; Shiyuan Huang; Jiawei Ma; Yicheng He; Shih-Fu Chang; |
239 | Delving Into Probabilistic Uncertainty for Unsupervised Domain Adaptive Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an approach named probabilistic uncertainty guided progressive label refinery (P2LR) for domain adaptive person re-identification. |
Jian Han; Ya-Li Li; Shengjin Wang; |
240 | Laneformer: Object-Aware Row-Column Transformers for Lane Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Laneformer, a conceptually simple yet powerful transformer-based architecture tailored for lane detection, a long-standing research topic for visual perception in autonomous driving. |
Jianhua Han; Xiajun Deng; Xinyue Cai; Zhen Yang; Hang Xu; Chunjing Xu; Xiaodan Liang; |
241 | Modify Self-Attention Via Skeleton Decomposition for Effective Point Cloud Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel skeleton decomposition-based self-attention (SD-SA) which has no sequence length limit and exhibits favorable scalability in long-sequence models. |
Jiayi Han; Longbin Zeng; Liang Du; Xiaoqing Ye; Weiyang Ding; Jianfeng Feng; |
242 | Generalizable Person Re-identification Via Self-Supervised Batch Norm Test-Time Adaption Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the generalization problem of person re-identification (re-id), whose major challenge is the distribution shift on an unseen domain. |
Ke Han; Chenyang Si; Yan Huang; Liang Wang; Tieniu Tan; |
243 | RRL: Regional Rotate Layer in Convolutional Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These methods either increase the workload of training or increase the number of model parameters. To address this problem, this paper proposes a module that can be inserted into the existing networks, and directly incorporates the rotation invariance into the feature extraction layers of the CNNs. |
Zongbo Hao; Tao Zhang; Mingwang Chen; Zou Kaixu; |
244 | QueryProp: Object Query Propagation for High-Performance Video Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper argues that with a more effective and efficient feature propagation framework, video object detectors can gain improvement in terms of both accuracy and speed. |
Fei He; Naiyu Gao; Jian Jia; Xin Zhao; Kaiqi Huang; |
245 | Flow-Based Unconstrained Lip to Speech Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although these methods have achieved promising performance, they are prone to bring issues including high inference latency and mel-spectrogram over-smoothness. To tackle these problems, we propose a novel flow-based non-autoregressive lip-to-speech model (GlowLTS) to break autoregressive constraints and achieve faster inference. |
Jinzheng He; Zhou Zhao; Yi Ren; Jinglin Liu; Baoxing Huai; Nicholas Yuan; |
246 | TransFG: A Transformer Architecture for Fine-Grained Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Fine-grained visual classification (FGVC) which aims at recognizing objects from subcategories is a very challenging task due to the inherently subtle inter-class differences. Most existing works mainly tackle this problem by reusing the backbone network to extract features of detected discriminative regions. |
Ju He; Jie-Neng Chen; Shuai Liu; Adam Kortylewski; Cheng Yang; Yutong Bai; Changhu Wang; |
247 | Self-Supervised Robust Scene Flow Estimation Via The Alignment of Probability Density Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a new self-supervised scene flow estimation approach for a pair of consecutive point clouds. |
Pan He; Patrick Emami; Sanjay Ranka; Anand Rangarajan; |
248 | SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Sparse Voxel-Graph Attention Network (SVGA-Net), a novel end-to-end trainable network which mainly contains voxel-graph module and sparse-to-dense regression module to achieve comparable 3D detection tasks from raw LIDAR data. |
Qingdong He; Zhengning Wang; Hao Zeng; Yi Zeng; Yijun Liu; |
249 | SECRET: Self-Consistent Pseudo Label Refinement for Unsupervised Domain Adaptive Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We argue that the consistency between different feature spaces is the key to the pseudo labels’ quality. |
Tao He; Leqi Shen; Yuchen Guo; Guiguang Ding; Zhenhua Guo; |
250 | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing Scene Text Recognition (STR) methods typically use a language model to optimize the joint probability of the 1D character sequence predicted by a visual recognition (VR) model, which ignores the 2D spatial context of visual semantics within and between character instances, so they do not generalize well to arbitrarily shaped scene text. To address this issue, we make the first attempt to perform textual reasoning based on visual semantics in this paper. |
Yue He; Chen Chen; Jing Zhang; Juhua Liu; Fengxiang He; Chaoyue Wang; Bo Du; |
251 | Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning Via Ranked Positives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples. |
David T. Hoffmann; Nadine Behrmann; Juergen Gall; Thomas Brox; Mehdi Noroozi; |
252 | Uncertainty-Driven Dehazing Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel uncertainty-driven dehazing network (UDN) that improves the dehazing results by exploiting the relationship between the uncertain and confident representations. |
Ming Hong; Jianzhuang Liu; Cuihua Li; Yanyun Qu; |
253 | Shadow Generation for Composite Image in Real-World Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on generating plausible shadow for the foreground object in the composite image. |
Yan Hong; Li Niu; Jianfu Zhang; |
254 | Shape-Adaptive Selection and Measurement for Oriented Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose novel flexible shape-adaptive selection (SA-S) and shape-adaptive measurement (SA-M) strategies for oriented object detection, which comprise an SA-S strategy for sample selection and SA-M strategy for the quality estimation of positive samples. |
Liping Hou; Ke Lu; Jian Xue; Yuqiu Li; |
255 | H^2-MIL: Exploring Hierarchical Representation with Heterogeneous Multiple Instance Learning for Whole Slide Image Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a novel graph neural network-based multiple instance learning framework (i.e., H^2-MIL) to learn hierarchical representation from a heterogeneous graph with different resolutions for WSI analysis. |
Wentai Hou; Lequan Yu; Chengxuan Lin; Helong Huang; Rongshan Yu; Jing Qin; Liansheng Wang; |
256 | Elastic-Link for Binarized Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an “Elastic-Link” (EL) module to enrich information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features. |
Jie Hu; Ziheng Wu; Vince Tan; Zhilin Lu; Mengze Zeng; Enhua Wu; |
257 | FInfer: Frame Inference-Based Deepfake Detection for High-Visual-Quality Videos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a frame inference-based detection framework (FInfer) to solve the problem of high-visual-quality Deepfake detection. |
Juan Hu; Xin Liao; Jinwen Liang; Wenbo Zhou; Zheng Qin; |
258 | Bi-volution: A Static and Dynamic Coupled Filter Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, SOTA dynamic convolution operators are sensitive to input noise (e.g., Gaussian noise, shot noise, etc.) and lack sufficient spatial contextual information in filter generation. To alleviate this inherent weakness, we propose a lightweight and heterogeneous-structure (i.e., static and dynamic) operator, named Bi-volution. |
Xiwei Hu; Xuanhong Chen; Bingbing Ni; Teng Li; Yutian Liu; |
259 | AFDetV2: Rethinking The Necessity of The Second Stage for Object Detection from Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this scenario, the second stage mainly rescores the boxes such that the boxes with better localization get selected. From this observation, we have devised a single-stage anchor-free network that can fulfill these requirements. |
Yihan Hu; Zhuangzhuang Ding; Runzhou Ge; Wenxin Shao; Li Huang; Kun Li; Qiang Liu; |
260 | Divide-and-Regroup Clustering for Domain Adaptive Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, the temporal continuity prior is beneficial, because it offers clues for distinguishing look-alike persons (who are temporally far away from each other). These two insights motivate us to propose a novel Divide-And-Regroup Clustering (DARC) pipeline for re-ID UDA. |
Zhengdong Hu; Yifan Sun; Yi Yang; Jianguang Zhou; |
261 | CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite achieving impressive results, these adversarial watermarks have low image-level and model-level transferability, meaning that they can protect only one facial image from one specific deepfake model. To address these issues, we propose a novel solution that can generate a Cross-Model Universal Adversarial Watermark (CMUA-Watermark), protecting a large number of facial images from multiple deepfake models. |
Hao Huang; Yongtao Wang; Zhaoyu Chen; Yuze Zhang; Yuheng Li; Zhi Tang; Wei Chu; Jingdong Chen; Weisi Lin; Kai-Kuang Ma; |
262 | Deconfounded Visual Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We focus on the confounding bias between language and location in the visual grounding pipeline, where we find that the bias is the major visual reasoning bottleneck. |
Jianqiang Huang; Yu Qin; Jiaxin Qi; Qianru Sun; Hanwang Zhang; |
263 | Learning to Model Pixel-Embedded Affinity for Homogeneous Instance Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a pixel-embedded affinity modeling method for homogeneous instance segmentation, which is able to preserve the semantic information of instances and improve the distinguishability of adjacent instances. |
Wei Huang; Shiyu Deng; Chang Chen; Xueyang Fu; Zhiwei Xiong; |
264 | Channelized Axial Attention – Considering Channel Relation Within Spatial Attention for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Channelized Axial Attention (CAA) to seamlessly integrate channel attention and spatial attention into a single operation with negligible computation overhead. |
Ye Huang; Di Kang; Wenjing Jia; Liu Liu; Xiangjian He; |
265 | UFPMP-Det: Toward Accurate and Efficient Object Detection on Drone Imagery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel approach to object detection on drone imagery, namely Multi-Proxy Detection Network with Unified Foreground Packing (UFPMP-Det). |
Yecheng Huang; Jiaxin Chen; Di Huang; |
266 | Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification towards learning modality-invariant and discriminative representations. |
Zhipeng Huang; Jiawei Liu; Liang Li; Kecheng Zheng; Zheng-Jun Zha; |
267 | MuMu: Cooperative Multitask Learning-Based Guided Multimodal Fusion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a cooperative multitask learning-based guided multimodal fusion approach, MuMu, to extract robust multimodal representations for human activity recognition (HAR). |
Md Mofijul Islam; Tariq Iqbal; |
268 | An Unsupervised Way to Understand Artifact Generating Internal Units in Generative Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose the concept of local activation, and devise a metric on the local activation to detect artifact generations without additional supervision. |
Haedong Jeong; Jiyeon Han; Jaesik Choi; |
269 | FrePGAN: Robust Deepfake Detection Using Frequency-Level Perturbations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Thus, we design a framework to generalize the deepfake detector for both the known and unseen GAN models. |
Yonghyun Jeong; Doyeon Kim; Youngmin Ro; Jongwon Choi; |
270 | Learning Disentangled Attribute Representations for Robust Pedestrian Attribute Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this mechanism leads to low-confidence predictions and non-robustness of the model in the inference stage. In this paper, we investigate why this is the case. |
Jian Jia; Naiyu Gao; Fei He; Xiaotang Chen; Kaiqi Huang; |
271 | Degrade Is Upgrade: Learning Degradation for Low-Light Image Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the color image formulation (diffuse illumination color plus environment illumination color), we first estimate the degradation from low-light inputs to simulate the distortion of environment illumination color, and then refine the content to recover the loss of diffuse illumination color. To this end, we propose a novel Degradation-to-Refinement Generation Network (DRGN). |
Kui Jiang; Zhongyuan Wang; Zheng Wang; Chen Chen; Peng Yi; Tao Lu; Chia-Wen Lin; |
272 | HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we concentrate on handling both local and global drifts and introduce a new harmonizing framework called HarmoFL. |
Meirui Jiang; Zirui Wang; Qi Dou; |
273 | Coarse-to-Fine Generative Modeling for Graphic Layouts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we seek to improve the performance of layout generation by incorporating the concept of regions, which consist of a smaller number of elements and appear like simple layouts, into the generation process. |
Zhaoyun Jiang; Shizhao Sun; Jihua Zhu; Jian-Guang Lou; Dongmei Zhang; |
274 | DarkVisionNet: Low-Light Imaging Via RGB-NIR Fusion with Deep Inconsistency Prior Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, high-intensity noise in low-light images amplifies the effect of structure inconsistency between RGB-NIR images, which causes existing algorithms to fail. To handle this, we propose a new RGB-NIR fusion algorithm called Dark Vision Net (DVN) with two technical novelties: Deep Structure and Deep Inconsistency Prior (DIP). |
Shuangping Jin; Bingbing Yu; Minhao Jing; Yi Zhou; Jiajun Liang; Renhe Ji; |
275 | LAGConv: Local-Context Adaptive Convolution Kernels with Global Harmonic Bias for Pansharpening Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel strategy to generate local-context adaptive (LCA) convolution kernels and introduce a new global harmonic (GH) bias mechanism, exploiting image local specificity as well as integrating global information, dubbed LAGConv. |
Zi-Rong Jin; Tian-Jing Zhang; Tai-Xiang Jiang; Gemine Vivone; Liang-Jian Deng; |
276 | Learning The Dynamics of Visual Relational Reasoning Via Reinforced Path Routing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to learn the reasoning dynamics of visual relational reasoning by casting it as a path routing task. |
Chenchen Jing; Yunde Jia; Yuwei Wu; Chuanhao Li; Qi Wu; |
277 | Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Most existing approaches do not explicitly embed the high-order spatio-temporal importance to joints’ spatial connection topology and intensity, and they do not have direct objectives on their attention module to jointly learn when and where to focus on in the action sequence. To address these problems, we propose the To-a-T Spatio-Temporal Focus (STF), a skeleton-based action recognition framework that utilizes the spatio-temporal gradient to focus on relevant spatio-temporal features. |
Lipeng Ke; Kuan-Chuan Peng; Siwei Lyu; |
278 | MODNet: Real-Time Trimap-Free Portrait Matting Via Objective Decomposition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a light-weight matting objective decomposition network (MODNet) for portrait matting in real-time with a single input image. |
Zhanghan Ke; Jiayu Sun; Kaican Li; Qiong Yan; Rynson W.H. Lau; |
279 | Learning Mixture of Domain-Specific Experts Via Disentangled Factors for Autonomous Driving Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, the problem in behavior cloning is divided into several domain-specific subspaces, with experts becoming specialized on each domain-specific policy. |
Inhan Kim; Joonyeong Lee; Daijin Kim; |
280 | Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a versatile pedestrian detector that shows robust detection performance in any single modality. |
Jung Uk Kim; Sungjune Park; Yong Man Ro; |
281 | Semantic Feature Extraction for Generalized Zero-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we put forth a new GZSL technique that improves the GZSL classification performance greatly. |
Junhan Kim; Kyuhong Shim; Byonghyo Shim; |
282 | Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we try to alleviate the aforementioned two challenges in lip reading by proposing a Multi-head Visual-audio Memory (MVM). |
Minsu Kim; Jeong Hun Yeo; Yong Man Ro; |
283 | Deep Translation Prior: Test-Time Training for Photorealistic Style Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent techniques to solve photorealistic style transfer within deep convolutional neural networks (CNNs) generally require intensive training from large-scale datasets, thus having limited applicability and poor generalization ability to unseen images or styles. To overcome this, we propose a novel framework, dubbed Deep Translation Prior (DTP), to accomplish photorealistic style transfer through test-time training on a given input image pair with untrained networks, which learns an image pair-specific translation prior and thus yields better performance and generalization. |
Sunwoo Kim; Soohyun Kim; Seungryong Kim; |
284 | PrivateSNN: Privacy-Preserving Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose PrivateSNN, which aims to build low-power Spiking Neural Networks (SNNs) from a pre-trained ANN model without leaking sensitive information contained in a dataset. |
Youngeun Kim; Yeshwanth Venkatesha; Priyadarshini Panda; |
285 | NaturalInversion: Data-Free Image Synthesis Improving Real-World Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce NaturalInversion, a novel model inversion-based method to synthesize images that agree well with the original data distribution without using real data. |
Yujin Kim; Dogyun Park; Dohee Kim; Suhyun Kim; |
286 | Joint 3D Object Detection and Tracking Using Spatio-Temporal Representation of Camera Image and LiDAR Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new joint object detection and tracking (JoDT) framework for 3D object detection and tracking based on camera and LiDAR sensors. |
Junho Koh; Jaekyum Kim; Jin Hyeok Yoo; Yecheol Kim; Dongsuk Kum; Jun Won Choi; |
287 | Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, we find that latent features derived from the Fourier-based amplitude spectrum of deep CNN features hold a more tractable mapping with domain discrimination. Motivated by this, we propose a novel feature space Amplitude Spectrum Transformation (AST). |
Jogendra Nath Kundu; Akshay R Kulkarni; Suvaansh Bhambri; Varun Jampani; Venkatesh Babu Radhakrishnan; |
288 | Siamese Network with Interactive Transformer for Video Object Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Siamese network with a specifically designed interactive transformer, called SITVOS, to enable effective context propagation from historical to current frames. |
Meng Lan; Jing Zhang; Fengxiang He; Lefei Zhang; |
289 | Adversarial Attack for Asynchronous Event-Based Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our algorithm achieves an attack success rate of 97.95% on the N-Caltech101 dataset. |
Wooju Lee; Hyun Myung; |
290 | Iteratively Selecting An Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In our paper, we propose Easy Frame Selector (EFS). |
Youngjo Lee; Hongje Seong; Euntai Kim; |
291 | SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds. |
Bing Li; Cheng Zheng; Silvio Giancola; Bernard Ghanem; |
292 | Shrinking Temporal Attention in Transformers for Video Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a Shrinking Temporal Attention Transformer (STAT), which efficiently builds spatiotemporal attention maps considering the attenuation of spatial attention in short and long temporal sequences. |
Bonan Li; Pengfei Xiong; Congying Han; Tiande Guo; |
293 | DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we reformulate it as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction, where the key poses are easier to synchronize with the music beats and the parametric curves can be efficiently regressed to render fluent rhythm-aligned movements. |
Buyu Li; Yongchi Zhao; Shi Zhelun; Lu Sheng; |
294 | Interpretable Generative Adversarial Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a generic method to modify a traditional GAN into an interpretable GAN, which ensures that filters in an intermediate layer of the generator encode disentangled localized visual concepts. |
Chao Li; Kelu Yao; Jin Wang; Boyu Diao; Yongjun Xu; Quanshi Zhang; |
295 | Cross-Modal Object Tracking: Modality-Aware Representations and A Unified Benchmark Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we address the cross-modal object tracking problem and contribute a new video dataset, including 654 cross-modal image sequences with over 481K frames in total, and the average video length is more than 735 frames. |
Chenglong Li; Tianhao Zhu; Lei Liu; Xiaonan Si; Zilin Fan; Sulan Zhai; |
296 | You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present YOFO (You Only inFer Once), a new paradigm for referring video object segmentation (RVOS) that operates in a one-stage manner. |
Dezhuang Li; Ruoqi Li; Lijun Wang; Yifan Wang; Jinqing Qi; Lu Zhang; Ting Liu; Qingquan Xu; Huchuan Lu; |
297 | Knowledge Distillation for Object Detection Via Rank Mimicking and Prediction-Guided Feature Imitation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we elaborately study the behaviour difference between the teacher and student detection models, and obtain two intriguing observations: First, the teacher and student rank their detected candidate boxes quite differently, which results in their precision discrepancy. |
Gang Li; Xiang Li; Yujie Wang; Shanshan Zhang; Yichao Wu; Ding Liang; |
298 | Rethinking Pseudo Labels for Semi-supervised Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels. |
Hengduo Li; Zuxuan Wu; Abhinav Shrivastava; Larry S. Davis; |
299 | Action-Aware Embedding Enhancement for Image-Text Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Action-aware Memory-Enhanced embedding (AME) method for image-text retrieval, which aims to emphasize the action information when mapping the images and texts into a shared embedding space. |
Jiangtong Li; Li Niu; Liqing Zhang; |
300 | Retinomorphic Object Detection in Asynchronous Visual Streams Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel problem setting, retinomorphic object detection, which is the first trial that integrates foveal-like and peripheral-like visual streams. |
Jianing Li; Xiao Wang; Lin Zhu; Jia Li; Tiejun Huang; Yonghong Tian; |
301 | Learning from Weakly-Labeled Web Videos Via Exploring Sub-concepts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, for video action recognition, the action of interest might only exist in arbitrary clips of untrimmed web videos, resulting in high label noise in the temporal space. To address this challenge, we introduce a new method for pre-training video action recognition models using queried web videos. |
Kunpeng Li; Zizhao Zhang; Guanhang Wu; Xuehan Xiong; Chen-Yu Lee; Zhichao Lu; Yun Fu; Tomas Pfister; |
302 | Learning Universal Adversarial Perturbation By Adversarial Example Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The existing universal attack methods fail to exploit the differences and connections between the instance and universal levels to produce dominant perturbations. To address this challenge, we propose a new universal attack method that unifies instance-specific and universal attacks from a feature perspective to generate a more dominant UAP. |
Maosen Li; Yanhua Yang; Kun Wei; Xu Yang; Heng Huang; |
303 | Logit Perturbation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on a unified viewpoint between positive/negative data augmentation and loss variations incurred by logit perturbation, a new method is proposed to explicitly learn to perturb logits. |
Mengyang Li; Fengguang Su; Ou Wu; Ji Zhang; |
304 | Neighborhood-Adaptive Structure Augmented Metric Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By exploiting the heterogeneity of local structures in the embedding space, we propose a Neighborhood-Adaptive Structure Augmented metric learning framework (NASA), where the neighborhood structure is realized as a structure embedding, and learned along with the sample embedding in a self-supervised manner. |
Pandeng Li; Yan Li; Hongtao Xie; Lei Zhang; |
305 | Stereo Neural Vernier Caliper Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new object-centric framework for learning-based stereo 3D object detection. |
Shichao Li; Zechun Liu; Zhiqiang Shen; Kwang-Ting Cheng; |
306 | EditVAE: Unsupervised Parts-Aware Controllable 3D Point Cloud Shape Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In particular, we introduce a latent representation of the point cloud which can be decomposed into a disentangled representation for each part of the shape. |
Shidi Li; Miaomiao Liu; Christian Walder; |
307 | Self-Training Multi-Sequence Learning with Transformer for Weakly Supervised Video Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the inference stage, we propose to use the video-level anomaly probability to suppress the fluctuation of snippet-level anomaly scores. |
Shuo Li; Fang Liu; Licheng Jiao; |
308 | TA2N: Two-Stage Action Alignment Network for Few-Shot Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, it has been observed that directly measuring this similarity is not ideal since different action instances may show distinctive temporal distribution, resulting in severe misalignment issues across query and support videos. In this paper, we address this problem from two distinct aspects: action duration misalignment and action evolution misalignment. |
Shuyuan Li; Huabin Liu; Rui Qian; Yuxi Li; John See; Mengjuan Fei; Xiaoyuan Yu; Weiyao Lin; |
309 | Best-Buddy GANs for Highly Detailed Image Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Besides, we propose a region-aware adversarial learning strategy that directs our model to focus on generating details for textured areas adaptively. |
Wenbo Li; Kun Zhou; Lu Qi; Liying Lu; Jiangbo Lu; |
310 | SCAN: Cross Domain Object Detection with Semantic Conditioned Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides, the category-agnostic alignment leads to the disagreement of class-specific distributions in the two domains, further causing inevitable classification errors. To overcome these two challenges, we propose a novel Semantic Conditioned AdaptatioN (SCAN) framework such that well-modeled unbiased semantics can support semantic conditioned adaptation for precise domain adaptive object detection. |
Wuyang Li; Xinyu Liu; Xiwen Yao; Yixuan Yuan; |
311 | Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an online video instance segmentation framework with a novel instance-aware temporal fusion method. |
Xiang Li; Jinglu Wang; Xiao Li; Yan Lu; |
312 | Close The Loop: A Unified Bottom-Up and Top-Down Paradigm for Joint Image Deraining and Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on a very practical problem: image segmentation under rain conditions. |
Yi Li; Yi Chang; Changfeng Yu; Luxin Yan; |
313 | Uncertainty Estimation Via Response Scaling for Pseudo-Mask Noise Mitigation in Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, in this paper, we simulate noisy variations of response by scaling the prediction map multiple times for uncertainty estimation. |
Yi Li; Yiqun Duan; Zhanghui Kuang; Yimin Chen; Wayne Zhang; Xiaomeng Li; |
314 | Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. |
Yidi Li; Hong Liu; Hao Tang; |
315 | Defending Against Model Stealing Via Verifying Embedded External Features Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features. |
Yiming Li; Linghui Zhu; Xiaojun Jia; Yong Jiang; Shu-Tao Xia; Xiaochun Cao; |
316 | Towards An Effective Orthogonal Dictionary Convolution Strategy Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here, we propose a novel Orthogonal Dictionary Convolution Strategy (ODCS) on CNNs to improve orthogonality effect by optimizing the network architecture and changing the regularized object. |
Yishi Li; Kunran Xu; Rui Lai; Lin Gu; |
317 | ELMA: Energy-Based Learning for Multi-Agent Activity Forecasting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper describes an energy-based learning method that predicts the activities of multiple agents simultaneously. |
Yuke Li; Pin Wang; Lixiong Chen; Zheng Wang; Ching-Yao Chan; |
318 | Equal Bits: Enforcing Equally Distributed Binary Network Weights Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we show that quantizing using optimal transport can guarantee any bit ratio, including equal ratios. |
Yunqiang Li; Silvia-Laura Pintea; Jan C van Gemert; |
319 | SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-training for Spatial-Aware Visual Representations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Due to the discrepancy between the two-dimensional image plane and the three-dimensional space, such pre-trained models fail to perceive spatial information and serve as sub-optimal solutions for 3D-related tasks. To bridge this gap, we aim to learn a spatial-aware visual representation that can describe the three-dimensional space and is more suitable and effective for these tasks. |
Zhenyu Li; Zehui Chen; Ang Li; Liangji Fang; Qinhong Jiang; Xianming Liu; Junjun Jiang; Bolei Zhou; Hang Zhao; |
320 | Improving Human-Object Interaction Detection Via Phrase Learning and Label Composition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose PhraseHOI, containing a HOI branch and a novel phrase branch, to leverage language prior and improve relation expression. |
Zhimin Li; Cheng Zou; Yu Zhao; Boxun Li; Sheng Zhong; |
321 | Rethinking The Optimization of Average Precision: Only Penalizing Negative Instances Before Positive Ones Is Enough Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we claim that only penalizing negative instances before positive ones is enough, because the loss only comes from these negative instances. To this end, we propose a novel loss, namely Penalizing Negative instances before Positive ones (PNP), which can directly minimize the number of negative instances before each positive one. |
Zhuo Li; Weiqing Min; Jiajun Song; Yaohui Zhu; Liping Kang; Xiaoming Wei; Xiaolin Wei; Shuqiang Jiang; |
322 | Reliability Exploration with Self-Ensemble Learning for Domain Adaptive Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Reliability Exploration with Self-ensemble Learning (RESL) framework for domain adaptive person ReID. |
Zongyi Li; Yuxuan Shi; Hefei Ling; Jiazhong Chen; Qian Wang; Fengfan Zhou; |
323 | Deconfounding Physical Dynamics with Global Causal Relation and Confounder Transmission for Counterfactual Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we work on the confounders that affect the physical dynamics, including masses, friction coefficients, etc., to bridge relations between the intervened variable and the affected variable whose future state may be altered. |
Zongzhao Li; Xiangyu Zhu; Zhen Lei; Zhaoxiang Zhang; |
324 | One More Check: Making “Fake Background” Be Tracked Again Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we set out to restore the bounding boxes misclassified as “fake background” by proposing a re-check network. |
Chao Liang; Zhipeng Zhang; Xue Zhou; Bing Li; Weiming Hu; |
325 | Semantically Contrastive Learning for Low-Light Image Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we respond to an intriguing learning-related question: can leveraging both accessible unpaired over/underexposed images and high-level semantic guidance improve the performance of cutting-edge LLE models? |
Dong Liang; Ling Li; Mingqiang Wei; Shuo Yang; Liyan Zhang; Wenhan Yang; Yun Du; Huiyu Zhou; |
326 | Self-Supervised Spatiotemporal Representation Learning By Exploiting Video Continuity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recent self-supervised video representation learning methods have found significant success by exploring essential properties of videos, e.g. speed, temporal order, etc. |
Hanwen Liang; Niamul Quader; Zhixiang Chi; Lizhe Chen; Peng Dai; Juwei Lu; Yang Wang; |
327 | Inharmonious Region Localization By Magnifying Domain Discrepancy Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to transform the input image to another color space to magnify the domain discrepancy between the inharmonious region and the background, so that the model can identify the inharmonious region more easily. |
Jing Liang; Li Niu; Penghao Wu; Fengjun Guo; Teng Long; |
328 | Distribution Aware VoteNet for 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we revise the common regression method by predicting the distribution of the 3D box and then present a distribution-aware regression (DAR) module for box refinement and localization quality estimation. |
Junxiong Liang; Pei An; Jie Ma; |
329 | Contrastive Instruction-Trajectory Learning for Vision-Language Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Contrastive Instruction-Trajectory Learning (CITL) framework that explores invariance across similar data samples and variance across different ones to learn distinctive representations for robust navigation. |
Xiwen Liang; Fengda Zhu; Yi Zhu; Bingqian Lin; Bing Wang; Xiaodan Liang; |
330 | Interventional Multi-Instance Learning with Deconfounded Instance-Level Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: From the viewpoint of causal inference, such bag contextual prior works as a confounder and may result in model robustness and interpretability issues. Focusing on this problem, we propose a novel interventional multi-instance learning (IMIL) framework to achieve deconfounded instance-level prediction. |
Tiancheng Lin; Hongteng Xu; Canqian Yang; Yi Xu; |
331 | A Causal Debiasing Framework for Unsupervised Salient Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Spatial distribution bias means that the position distribution of all salient objects in a dataset is concentrated on the center of the image plane, which could be harmful to the prediction of off-center objects. This paper proposes a causal-based debiasing framework to disentangle the model from the impact of such biases. |
Xiangru Lin; Ziyi Wu; Guanqi Chen; Guanbin Li; Yizhou Yu; |
332 | A Causal Inference Look at Unsupervised Video Anomaly Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Abstract: Unsupervised video anomaly detection, a task that requires no labeled normal/abnormal training data in any form, is challenging yet of great importance to both industrial … |
Xiangru Lin; Yuyang Chen; Guanbin Li; Yizhou Yu; |
333 | Unpaired Multi-Domain Stain Transfer for Kidney Histopathological Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, due to the interference of colors among multiple stains, it is not easy to perform multiple staining simultaneously on one biological tissue. To address this problem, we propose a network based on unpaired training data to virtually generate multiple types of staining from one staining. |
Yiyang Lin; Bowei Zeng; Yifeng Wang; Yang Chen; Zijie Fang; Jian Zhang; Xiangyang Ji; Haoqian Wang; Yongbing Zhang; |
334 | Dynamic Spatial Propagation Network for Depth Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our solution is to estimate independent affinity matrices in each SPN iteration, but it is over-parameterized and computationally heavy. This paper introduces an efficient model that learns the affinity among neighboring pixels with an attention-based, dynamic approach. |
Yuankai Lin; Tao Cheng; Qi Zhong; Wending Zhou; Hua Yang; |
335 | Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to the static filters, current convolution-based disparity refinement modules often produce over-smooth results. In this paper, we present two schemes to address these issues, where some traditional wisdom is integrated. |
Biyang Liu; Huimin Yu; Yangqi Long; |
336 | FedFR: Joint Optimization Federated Framework for Generic and Personalized Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose an FL-based framework called FedFR to improve the generic face representation in a privacy-aware manner. |
Chih-Ting Liu; Chien-Yi Wang; Shao-Yi Chien; Shang-Hong Lai; |
337 | Memory-Guided Semantic Learning Network for Temporal Sentence Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Although existing methods train well-designed deep networks with large amounts of data, we find that they can easily forget rarely appearing cases during training due to the off-balance data distribution, which influences the model generalization and leads to unsatisfactory performance. To tackle this issue, we propose a memory-augmented network, called Memory-Guided Semantic Learning Network (MGSL-Net), that learns and memorizes the rarely appearing content in the TSG task. |
Daizong Liu; Xiaoye Qu; Xing Di; Yu Cheng; Zichuan Xu; Pan Zhou; |
338 | Exploring Motion and Appearance Information for Temporal Sentence Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the object-level features extracted by Faster R-CNN suffer from missing motion analysis since the object detection model lacks temporal modeling. To solve this issue, we propose a novel Motion-Appearance Reasoning Network (MARN), which incorporates both motion-aware and appearance-aware object features to better reason object relations for modeling the activity among successive frames. |
Daizong Liu; Xiaoye Qu; Pan Zhou; Yang Liu; |
339 | Unsupervised Temporal Video Grounding with Deep Semantic Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore whether a video grounding model can be learned without any paired annotations. |
Daizong Liu; Xiaoye Qu; Yinzhen Wang; Xing Di; Kai Zou; Yu Cheng; Zichuan Xu; Pan Zhou; |
340 | SpikeConverter: An Efficient Conversion Framework Zipping The Gap Between Artificial Neural Networks and Spiking Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To better correlate ANN and SNN for better performance, we propose a conversion framework to mitigate the gap between the activation value of source ANN and the generated spike train of target SNN. |
Fangxin Liu; Wenbo Zhao; Yongbiao Chen; Zongwu Wang; Li Jiang; |
341 | Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Perceiving Stroke-Semantic Context (PerSec), a new approach to self-supervised representation learning tailored for the Scene Text Recognition (STR) task. |
Hao Liu; Bin Wang; Zhimin Bao; Mobai Xue; Sheng Kang; Deqiang Jiang; Yinsong Liu; Bo Ren; |
342 | AnchorFace: Boosting TAR@FAR for Practical Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we call the predefined FAR the Anchor FAR, and we argue that the existing FR loss functions cannot guarantee the optimal TAR under the Anchor FAR, which impedes further improvements of FR systems. |
Jiaheng Liu; Haoyu Qin; Yichao Wu; Ding Liang; |
343 | Memory-Based Jitter: Improving Visual Recognition on Long-Tailed Data with Diversity in Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A radical solution is to augment the tail classes with higher diversity. To this end, we introduce a simple and reliable method named Memory-based Jitter (MBJ). |
Jialun Liu; Wenhui Li; Yifan Sun; |
344 | Debiased Batch Normalization Via Gaussian Process for Generalizable Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Debiased Batch Normalization via Gaussian Process approach (GDNorm) for generalizable person re-identification, which models the feature statistic estimation from BN layers as a dynamically self-refining Gaussian process to alleviate the bias toward unseen domains and improve generalization. |
Jiawei Liu; Zhipeng Huang; Liang Li; Kecheng Zheng; Zheng-Jun Zha; |
345 | Parallel and High-Fidelity Text-to-Lip Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a parallel decoding model for fast and high-fidelity text-to-lip generation (ParaLip). |
Jinglin Liu; Zhiying Zhu; Yi Ren; Wencan Huang; Baoxing Huai; Nicholas Yuan; Zhou Zhao; |
346 | SiamTrans: Zero-Shot Multi-Frame Image Restoration with Pre-trained Siamese Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel zero-shot multi-frame image restoration method for removing unwanted obstruction elements (such as rain, snow, and moiré patterns) that vary in successive frames. |
Lin Liu; Shanxin Yuan; Jianzhuang Liu; Xin Guo; Youliang Yan; Qi Tian; |
347 | Single-Domain Generalization in Medical Image Segmentation Via Test-Time Adaptation from Shape Dictionary Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper studies the important yet challenging single domain generalization problem, in which a model is learned under the worst-case scenario with only one source domain to directly generalize to different unseen target domains. We present a novel approach to address this problem in medical image segmentation, which extracts and integrates the semantic shape prior information of segmentation that is invariant across domains and can be well-captured even from single domain data to facilitate segmentation under distribution shifts. |
Quande Liu; Cheng Chen; Qi Dou; Pheng-Ann Heng; |
348 | Learning to Predict 3D Lane Shape and Camera Pose from A Single Image Via Geometry Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, mainstream 3D lane detectors rely on perfect camera poses provided by other sensors, which is expensive and encounters multi-sensor calibration issues. To overcome this problem, we propose to predict 3D lanes by estimating camera pose from a single image with a two-stage framework. |
Ruijin Liu; Dapeng Chen; Tie Liu; Zhiliang Xiong; Zejian Yuan; |
349 | OVIS: Open-Vocabulary Visual Instance Search Via Visual-Semantic Aligned Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce the task of open-vocabulary visual instance search (OVIS). |
Sheng Liu; Kevin Lin; Lijuan Wang; Junsong Yuan; Zicheng Liu; |
350 | Feature Generation and Hypothesis Verification for Reliable Face Anti-spoofing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Feature Generation and Hypothesis Verification framework to alleviate the two issues. |
Shice Liu; Shitao Lu; Hongyi Xu; Jing Yang; Shouhong Ding; Lizhuang Ma; |
351 | Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The existing methods either have difficulties in balancing the tasks of image enhancement and object detection, or often ignore the latent information beneficial for detection. To alleviate this problem, we propose a novel Image-Adaptive YOLO (IA-YOLO) framework, where each image can be adaptively enhanced for better detection performance. |
Wenyu Liu; Gaofeng Ren; Runsheng Yu; Shi Guo; Jianke Zhu; Lei Zhang; |
352 | Visual Sound Localization in The Wild By Cross-Modal Interference Erasing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose the Interference Eraser (IEr) framework, which tackles the problem of audiovisual sound source localization in the wild. |
Xian Liu; Rui Qian; Hang Zhou; Di Hu; Weiyao Lin; Ziwei Liu; Bolei Zhou; Xiaowei Zhou; |
353 | Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a simple yet effective formulation for monocular 3D object detection without exploiting any extra information. |
Xianpeng Liu; Nan Xue; Tianfu Wu; |
354 | Highlighting Object Category Immunity for The Generalization of Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: With mPD as a cue, we propose Object Category (OC) Immunity to boost HOI generalization. |
Xinpeng Liu; Yong-Lu Li; Cewu Lu; |
355 | DMN4: Few-Shot Learning Via Discriminative Mutual Nearest Neighbor Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we argue that a Mutual Nearest Neighbor (MNN) relation should be established to explicitly select the query descriptors that are most relevant to each task and discard less relevant ones from aggregative clutters in FSL. |
Yang Liu; Tu Zheng; Jie Song; Deng Cai; Xiaofei He; |
356 | Multi-Knowledge Aggregation and Transfer for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel multi-knowledge aggregation and transfer (MKAT) framework to comprehensively distill knowledge within an intermediate layer for semantic segmentation. |
Yuang Liu; Wei Zhang; Jun Wang; |
357 | Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an unsupervised manner. |
Zhenhuan Liu; Liang Li; Huajie Jiang; Xin Jin; Dandan Tu; Shuhui Wang; Zheng-Jun Zha; |
358 | Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: On the other hand, for existing SSL methods, it is burdensome and infeasible to use different downstream-task-customized datasets in pre-training for different tasks. To address this issue, we propose a novel SSL paradigm called Scalable Dynamic Routing (SDR), which can be trained once and deployed efficiently to different downstream tasks with task-customized pre-trained models. |
Zhili Liu; Jianhua Han; Lanqing Hong; Hang Xu; Kai Chen; Chunjing Xu; Zhenguo Li; |
359 | Pose Guided Image Generation from Misaligned Sources Via Residual Flow Based Correction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, as source images are often misaligned due to the large disparities among the camera settings, strong assumptions have been made in the past with respect to the camera(s) or/and the object in interest, limiting the application of such techniques. Therefore, we propose a new general approach which models multiple types of variations among sources, such as view angles, poses, facial expressions, in a unified framework, so that it can be employed on datasets of vastly different nature. |
Jiawei Lu; He Wang; Tianjia Shao; Yin Yang; Kun Zhou; |
360 | PMAL: Open Set Recognition Via Robust Prototype Mining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel Prototype Mining And Learning (PMAL) framework. |
Jing Lu; Yunlu Xu; Hao Li; Zhanzhan Cheng; Yi Niu; |
361 | Barely-Supervised Learning: Semi-supervised Learning with Very Few Labeled Images Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a method to leverage self-supervised methods that provide training signal in the absence of confident pseudo-labels. |
Thomas Lucas; Philippe Weinzaepfel; Gregory Rogez; |
362 | Learning Optical Flow with Adaptive Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, taking a fresh perspective, we introduce a novel graph-based approach, called adaptive graph reasoning for optical flow (AGFlow), to emphasize the value of scene/context information in optical flow. |
Ao Luo; Fan Yang; Kunming Luo; Xin Li; Haoqiang Fan; Shuaicheng Liu; |
363 | A Fusion-Denoising Attack on InstaHide with Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This leads to a natural question: is InstaHide with data augmentation secure? In this paper, we provide a negative answer to this question, by devising an attack for recovering private images from the outputs of InstaHide even when data augmentation is present. |
Xinjian Luo; Xiaokui Xiao; Yuncheng Wu; Juncheng Liu; Beng Chin Ooi; |
364 | Deep Neural Networks Learn Meta-Structures from Noisy Labels in Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: So far, our understanding of the learning behavior of DNNs trained by noisy segmentation labels remains limited. In this study, we address this deficiency in both binary segmentation of biological microscopy images and multi-class segmentation of natural images. |
Yaoru Luo; Guole Liu; Yuanhao Guo; Ge Yang; |
365 | Stochastic Planner-Actor-Critic for Unsupervised Deformable Image Registration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: While deep learning-based methods can learn the complex mapping from input images to their respective deformation field, they are regression-based and prone to getting stuck at local minima, particularly when large deformations are involved. To this end, we present Stochastic Planner-Actor-Critic (SPAC), a novel reinforcement learning-based framework that performs step-wise registration. |
Ziwei Luo; Jing Hu; Xin Wang; Shu Hu; Bin Kong; Youbing Yin; Qi Song; Xi Wu; Siwei Lyu; |
366 | Adaptive Poincaré Point to Set Distance for Few-Shot Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to learn a context-aware hyperbolic metric to characterize the distance between a point and a set associated with a learned set to set distance. |
Rongkai Ma; Pengfei Fang; Tom Drummond; Mehrtash Harandi; |
367 | Generative Adaptive Convolutions for Real-World Noisy Image Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This will induce problems of most deep denoisers for the overfitting or degrading performance due to the noise discrepancy between the training and test sets. To remedy this issue, we propose a novel flexible and adaptive denoising network, coined as FADNet. |
Ruijun Ma; Shuyi Li; Bob Zhang; Zhengming Li; |
368 | REMOTE: Reinforced Motion Transformation Network for Semi-supervised 2D Pose Estimation in Videos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a semi-supervised REinforced MOtion Transformation nEtwork (REMOTE) to leverage a few labeled frames and temporal pose variations in videos, which enables effective learning of 2D pose estimation in sparsely annotated videos. |
Xianzheng Ma; Hossein Rahmani; Zhipeng Fan; Bin Yang; Jun Chen; Jun Liu; |
369 | Learning from The Target: Dual Prototype Network for Few Shot Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Along with the prototype extracted from the support set, we propose to build the pseudo-prototype based on foreground features in the query image. |
Binjie Mao; Xinbang Zhang; Lingfeng Wang; Qian Zhang; Shiming Xiang; Chunhong Pan; |
370 | MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, we propose a framework that a priori models physical attributes of the face such as 3D shape, albedo, pose, and lighting explicitly, thus providing disentanglement by design. |
Safa C. Medin; Bernhard Egger; Anoop Cherian; Ye Wang; Joshua B. Tenenbaum; Xiaoming Liu; Tim K. Marks; |
371 | Towards Bridging Sample Complexity and Model Capacity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Besides, we introduce a simple indicator to evaluate the sample complexity based on continuous mapping. |
Shibin Mei; Chenglong Zhao; Shengchao Yuan; Bingbing Ni; |
372 | Towards Accurate Facial Motion Retargeting with Identity-Consistent and Expression-Exclusive Constraints Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a result, these methods may not achieve promising performance. To address this, we propose an identity-consistent constraint to learn accurate identities by encouraging consistent identity prediction across multiple frames. |
Langyuan Mo; Haokun Li; Chaoyang Zou; Yubing Zhang; Ming Yang; Yihong Yang; Mingkui Tan; |
373 | Can Vision Transformers Learn Without Natural Images? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we experimentally verify that the results of formula-driven supervised learning (FDSL) framework are comparable with, and can even partially outperform, sophisticated self-supervised learning (SSL) methods like SimCLRv2 and MoCov2 without using any natural images in the pre-training phase. |
Kodai Nakashima; Hirokatsu Kataoka; Asato Matsumoto; Kenji Iwata; Nakamasa Inoue; Yutaka Satoh; |
374 | Federated Learning for Face Recognition with Gradient Correction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a framework, FedGC, to tackle federated learning for face recognition and guarantee higher privacy. |
Yifan Niu; Weihong Deng; |
375 | Restorable Image Operators with Quasi-Invertible Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, most image operators often smooth out details or generate textures after the processing, which removes the original content and raises challenges for restoring the original image. To resolve this issue, we propose a quasi-invertible model that learns common image processing operators in a restorable fashion: the learned image operators can generate visually pleasing results with the original content embedded. |
Hao Ouyang; Tengfei Wang; Qifeng Chen; |
376 | TEACh: Task-Driven Embodied Agents That Chat Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Robots operating in human spaces must be able to engage in natural language interaction, both understanding and executing instructions, and using conversation to resolve ambiguity and correct mistakes. To study this, we introduce TEACh, a dataset of over 3,000 human-human, interactive dialogues to complete household tasks in simulation. |
Aishwarya Padmakumar; Jesse Thomason; Ayush Shrivastava; Patrick Lange; Anjali Narayan-Chen; Spandana Gella; Robinson Piramuthu; Gokhan Tur; Dilek Hakkani-Tur; |
377 | Label-Efficient Hybrid-Supervised Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these approaches only concentrate on the supervision inconsistency between strongly- and weakly-annotated instances but ignore the instance inconsistency inside the weakly-annotated instances, which inevitably leads to performance degradation. To address this problem, we propose a novel label-efficient hybrid-supervised framework, which considers each weakly-annotated instance individually and learns its weight guided by the gradient direction of the strongly-annotated instances, so that the high-quality prior in the strongly-annotated instances is better exploited and the weakly-annotated instances are depicted more precisely. |
Junwen Pan; Qi Bi; Yanzhan Yang; Pengfei Zhu; Cheng Bian; |
378 | Less Is More: Pay Less Attention in Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose a hierarchical Transformer where we use pure multi-layer perceptrons (MLPs) to encode rich local patterns in the early stages while applying self-attention modules to capture longer dependencies in deeper layers. |
Zizheng Pan; Bohan Zhuang; Haoyu He; Jing Liu; Jianfei Cai; |
379 | Unsupervised Representation for Semantic Segmentation By Implicit Cycle-Attention Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We first explore and present two factors that have significant effects on segmentation under the contrastive learning framework: 1) the difficulty and diversity of the positive contrastive pairs, 2) the balance of global and local features. With the intention of optimizing these factors, we propose the cycle-attention contrastive learning (CACL). |
Bo Pang; Yizhuo Li; Yifan Zhang; Gao Peng; Jiajun Tang; Kaiwen Zha; Jiefeng Li; Cewu Lu; |
380 | Graph-Based Point Tracker for 3D Object Tracking in Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, a new deep learning network named as graph-based point tracker (GPT) is proposed for 3D object tracking in point clouds. |
Minseong Park; Hongje Seong; Wonje Jang; Euntai Kim; |
381 | SyncTalkFace: Talking Face Generation with Precise Lip-Syncing Via Audio-Lip Memory Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they struggle to synthesize fine details of the lips varying at the phoneme level as they do not sufficiently provide visual information of the lips at the video synthesis step. To overcome this limitation, our work proposes Audio-Lip Memory that brings in visual information of the mouth region corresponding to input audio and enforces fine-grained audio-visual coherence. |
Se Jin Park; Minsu Kim; Joanna Hong; Jeongsoo Choi; Yong Man Ro; |
382 | Vision Transformers Are Robust Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study the robustness of the Vision Transformer (ViT) (Dosovitskiy et al. 2021) against common corruptions and perturbations, distribution shifts, and natural adversarial examples. |
Sayak Paul; Pin-Yu Chen; |
383 | Self-Supervised Category-Level 6D Object Pose Estimation with Deep Implicit Shape Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing category-level 6D pose estimation methods usually require supervised training with a sufficient number of 6D pose annotations of objects which makes them difficult to be applied in real scenarios. To address this problem, we propose a self-supervised framework for category-level 6D pose estimation in this paper. |
Wanli Peng; Jianhang Yan; Hongtao Wen; Yi Sun; |
384 | Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to blend category-specific representation across different images to transfer information of known labels to complement unknown labels, which can get rid of pre-training models and thus does not depend on sufficient annotations. |
Tao Pu; Tianshui Chen; Hefeng Wu; Liang Lin; |
385 | ReX: An Efficient Approach to Reducing Memory Cost in Image Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel approach named recurrent aggregation operator (ReX), which uses recurrent neural networks (RNNs) to effectively aggregate intra-patch features within a large receptive field to get delicate local representations, while bypassing large early activations. |
Xuwei Qian; Renlong Hang; Qingshan Liu; |
386 | CPRAL: Collaborative Panoptic-Regional Active Learning for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Collaborative Panoptic-Regional Active Learning framework (CPRAL) to address the semantic segmentation task. |
Yu Qiao; Jincheng Zhu; Chengjiang Long; Zeyao Zhang; Yuxin Wang; Zhenjun Du; Xin Yang; |
387 | Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most existing methods resort to classification-based Class Activation Maps (CAMs) to play as the initial pseudo labels, which tend to focus on the discriminative image regions and lack customized characteristics for the segmentation task. To alleviate this issue, we propose a novel activation modulation and recalibration (AMR) scheme, which leverages a spotlight branch and a compensation branch to obtain weighted CAMs that can provide recalibration supervision and task-specific concepts. |
Jie Qin; Jie Wu; Xuefeng Xiao; Lujun Li; Xingang Wang; |
388 | TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework Using Self-Supervised Multi-Task Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose TransMEF, a transformer-based multi-exposure image fusion framework that uses self-supervised multi-task learning. |
Linhao Qu; Shaolei Liu; Manning Wang; Zhijian Song; |
389 | Deep Implicit Statistical Shape Models for 3D Medical Image Delineation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present deep implicit statistical shape models (DISSMs), a new approach that marries the representation power of deep networks with the benefits of SSMs. |
Ashwin Raju; Shun Miao; Dakai Jin; Le Lu; Junzhou Huang; Adam P. Harrison; |
390 | Decompose The Sounds and Pixels, Recompose The Events Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a framework centering around a novel architecture called the Event Decomposition Recomposition Network (EDRNet) to tackle the Audio-Visual Event (AVE) localization problem in the supervised and weakly supervised settings. |
Varshanth R. Rao; Md Ibrahim Khalil; Haoda Li; Peng Dai; Juwei Lu; |
391 | Learning from Label Proportions with Prototypical Contrastive Clustering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a new model that jointly uses prototypical contrastive learning and bag-level cluster proportions to implement efficient LLP classification. |
Laura Elena Cué La Rosa; Dário Augusto Borges Oliveira; |
392 | Beyond Learning Features: Training A Fully-Functional Classifier with ZERO Instance-Level Labels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We attempt to train deep neural networks for classification without using any labeled data. |
Deepak Babu Sam; Abhinav Agarwalla; Venkatesh Babu Radhakrishnan; |
393 | Reference-Guided Pseudo-Label Generation for Medical Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Producing densely annotated data is a difficult and tedious task for medical imaging applications. To address this problem, we propose a novel approach to generate supervision for semi-supervised semantic segmentation. |
Constantin Marc Seibold; Simon Reiß; Jens Kleesiek; Rainer Stiefelhagen; |
394 | Information-Theoretic Bias Reduction Via Causal View of Spurious Correlation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation, which is effective to identify the feature-level algorithmic bias by taking advantage of conditional mutual information. |
Seonguk Seo; Joon-Young Lee; Bohyung Han; |
395 | Improving Scene Graph Classification By Exploiting Knowledge from Texts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate whether textual scene descriptions can substitute for annotated image data. |
Sahand Sharifzadeh; Sina Moayed Baharlou; Martin Schmitt; Hinrich Schütze; Volker Tresp; |
396 | Reliable Inlier Evaluation for Unsupervised Point Cloud Registration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a neighborhood consensus based reliable inlier evaluation method for robust unsupervised point cloud registration. |
Yaqi Shen; Le Hui; Haobo Jiang; Jin Xie; Jian Yang; |
397 | Explainable Survival Analysis with Convolution-Involved Vision Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to develop a novel survival analysis model to fully utilize the complete WSI information. |
Yifan Shen; Li Liu; Zhihao Tang; Zongyi Chen; Guixiang Ma; Jiyan Dong; Xi Zhang; Lin Yang; Qingfeng Zheng; |
398 | Un-mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This drawback hinders the model from learning subtle variance and fine-grained information. To address this, in this work we aim to involve the soft distance concept on label space in the contrastive-based unsupervised learning task and let the model be aware of the soft degree of similarity between positive or negative pairs through mixing the input data space, to further work collaboratively for the input and loss spaces. |
Zhiqiang Shen; Zechun Liu; Zhuang Liu; Marios Savvides; Trevor Darrell; Eric Xing; |
399 | On The Efficacy of Small Self-Supervised Contrastive Models Without Distillation Signals Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the issue of training self-supervised small models without distillation signals. |
Haizhou Shi; Youcai Zhang; Siliang Tang; Wenjie Zhu; Yaqian Li; Yandong Guo; Yueting Zhuang; |
400 | Social Interpretable Tree for Pedestrian Trajectory Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Understanding the multiple socially-acceptable future behaviors is an essential task for many vision applications. In this paper, we propose a tree-based method, termed as Social Interpretable Tree (SIT), to address this multi-modal prediction task, where a hand-crafted tree is built depending on the prior information of observed trajectory to model multiple future trajectories. |
Liushuai Shi; Le Wang; Chengjiang Long; Sanping Zhou; Fang Zheng; Nanning Zheng; Gang Hua; |
401 | P^3-Net: Part Mobility Parsing from Point Cloud Sequences Via Learning Explicit Point Correspondence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a novel approach to parse 3D part mobility from point cloud sequences. |
Yahao Shi; Xinyu Cao; Feixiang Lu; Bin Zhou; |
402 | Improving Zero-Shot Phrase Grounding Via Reasoning on External Knowledge and Spatial Relations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we design a novel phrase grounding architecture that builds multi-modal knowledge graphs using external knowledge and then performs graph reasoning and spatial relation reasoning to localize the referred noun phrases. |
Zhan Shi; Yilin Shen; Hongxia Jin; Xiaodan Zhu; |
403 | Iterative Contrast-Classify for Semi-supervised Temporal Action Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Due to the high cost of frame-wise labeling, we propose the first semi-supervised method for temporal action segmentation. |
Dipika Singhania; Rahul Rahaman; Angela Yao; |
404 | JPV-Net: Joint Point-Voxel Representations for Accurate 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we aim to exploit the strengths of both representations, and present a novel two-stage detector, named Joint Point-Voxel Network (JPV-Net). |
Nan Song; Tianyuan Jiang; Jian Yao; |
405 | Fully Attentional Network for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such practices tend to condense feature dependencies along the other dimensions, hence causing attention missing, which might lead to inferior results for small/thin categories or inconsistent segmentation inside large objects. To address this problem, we propose a new approach, namely Fully Attentional Network (FLANet), to encode both spatial and channel attentions in a single similarity map while maintaining high computational efficiency. |
Qi Song; Jie Li; Chenghong Li; Hao Guo; Rui Huang; |
406 | Self-Supervised Object Localization with Joint Graph Partition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since collecting bounding-box labels is time-consuming and laborious, many researchers focus on weakly supervised object localization (WSOL). As the recent appealing self-supervised learning technique shows its powerful function in visual tasks, in this paper, we take the early attempt to explore unsupervised object localization by self-supervision. |
Yukun Su; Guosheng Lin; Yun Hao; Yiwen Cao; Wenjun Wang; Qingyao Wu; |
407 | Correlation Field for Boosting 3D Object Detection in Structured Scenes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple but effective online crop-and-paste data augmentation pipeline for structured 3D point cloud scenes, named CorrelaBoost. |
Jianhua Sun; Hao-Shu Fang; Xianghui Zhu; Jiefeng Li; Cewu Lu; |
408 | Boost Supervised Pretraining for Visual Transfer Learning: Implications of Self-Supervised Contrastive Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our core finding is that it is the amount of information effectively perceived by the learning model that is crucial to transfer learning, instead of absolute size of the dataset. Based on this finding, we propose Classification Activation Map guided contrastive (CAMtrast) learning which better utilizes the label supervision to strengthen supervised pretraining, by making the networks perceive more information from the training images. |
Jinghan Sun; Dong Wei; Kai Ma; Liansheng Wang; Yefeng Zheng; |
409 | Dual Contrastive Learning for General Face Forgery Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous works always formulate face forgery detection as a classification problem based on cross-entropy loss, which emphasizes category-level differences rather than the essential discrepancies between real and fake faces, limiting model generalization in unseen domains. To address this issue, we propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which specially constructs positive and negative paired data and performs designed contrastive learning at different granularities to learn generalized feature representation. |
Ke Sun; Taiping Yao; Shen Chen; Shouhong Ding; Jilin Li; Rongrong Ji; |
410 | SSAT: A Symmetric Semantic-Aware Transformer Network for Makeup Transfer and Removal Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, most existing methods focus on the former and ignore the latter, resulting in a failure to achieve desired results. To solve the above problems, we propose a unified Symmetric Semantic-Aware Transformer (SSAT) network, which incorporates semantic correspondence learning to realize makeup transfer and removal simultaneously. |
Zhaoyang Sun; Yaxiong Chen; Shengwu Xiong; |
411 | Adversarial Bone Length Attack on Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that adversarial attacks can be performed on skeleton-based action recognition models, even in a significantly low-dimensional setting without any temporal manipulation. |
Nariki Tanaka; Hiroshi Kera; Kazuhiko Kawamoto; |
412 | Sparse MLP for Image Recognition: Is Self-Attention Really Necessary? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore whether the core self-attention module in Transformer is the key to achieving excellent performance in image recognition. |
Chuanxin Tang; Yucheng Zhao; Guangting Wang; Chong Luo; Wenxuan Xie; Wenjun Zeng; |
413 | Not All Voxels Are Equal: Semantic Scene Completion from The Point-Voxel Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We revisit Semantic Scene Completion (SSC), a useful task to predict the semantic and occupancy representation of 3D scenes, in this paper. |
Jiaxiang Tang; Xiaokang Chen; Jingbo Wang; Gang Zeng; |
414 | Transfer Learning for Color Constancy Via Statistic Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Recently, although the deep learning approaches have remarkably improved on single-camera data, these models still suffer from the seriously insufficient data problem, resulting in shallow model capacity and degradation in multi-camera settings. In this paper, to alleviate this problem, we present a Transfer Learning Color Constancy (TLCC) method that leverages cross-camera RAW data and massive unlabeled sRGB data to support training. |
Yuxiang Tang; Xuejing Kang; Chunxiao Li; Zhaowen Lin; Anlong Ming; |
415 | TVT: Three-Way Vision Transformer Through Multi-Modal Hypersphere Learning for Zero-Shot Sketch-Based Image Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the zero-shot sketch-based image retrieval (ZS-SBIR) task, which retrieves natural images related to sketch queries from unseen categories. |
Jialin Tian; Xing Xu; Fumin Shen; Yang Yang; Heng Tao Shen; |
416 | GuidedMix-Net: Semi-supervised Semantic Segmentation By Using Labeled Images As Reference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net, by leveraging labeled information to guide the learning of unlabeled instances. |
Peng Tu; Yawen Huang; Feng Zheng; Zhenyu He; Liujuan Cao; Ling Shao; |
417 | MTLDesc: Looking Wider to Describe Better Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we focus on making local descriptors “look wider to describe better” by learning local Descriptors with More Than Local information (MTLDesc). |
Changwei Wang; Rongtao Xu; Yuyang Zhang; Shibiao Xu; Weiliang Meng; Bin Fan; Xiaopeng Zhang; |
418 | Active Boundary Loss for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a novel active boundary loss for semantic segmentation. |
Chi Wang; Yunke Zhang; Miaomiao Cui; Peiran Ren; Yin Yang; Xuansong Xie; Xian-Sheng Hua; Hujun Bao; Weiwei Xu; |
419 | Online-Updated High-Order Collaborative Networks for Single Image Deraining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a high-order collaborative network with multi-scale compact constraints and a bidirectional scale-content similarity mining module to exploit features from deep networks externally and internally for rain streaks removal. |
Cong Wang; Jinshan Pan; Xiao-Ming Wu; |
420 | FCA: Learning A 3D Full-Coverage Vehicle Camouflage for Multi-View Physical Adversarial Attack Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To bridge the gap between digital attacks and physical attacks, we exploit the full 3D vehicle surface to propose a robust Full-coverage Camouflage Attack (FCA) to fool detectors. |
Donghua Wang; Tingsong Jiang; Jialiang Sun; Weien Zhou; Zhiqiang Gong; Xiaoya Zhang; Wen Yao; Xiaoqian Chen; |
421 | When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We should pay more attention to the remaining parts of ViT in future work. |
Guangting Wang; Yucheng Zhao; Chuanxin Tang; Chong Luo; Wenjun Zeng; |
422 | Self-Supervised Representation Learning Framework for Remote Physiological Measurement Using Spatiotemporal Augmentation Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, existing data augmentation techniques for contrastive learning are not designed to learn physiological signals from videos and often fail when there are complicated noise and subtle and periodic colour/shape variations between video frames. To address these problems, we present a novel self-supervised spatiotemporal learning framework for remote physiological signal representation learning, where there is a lack of labelled training data. |
Hao Wang; Euijoon Ahn; Jinman Kim; |
423 | UCTransNet: Rethinking The Skip Connections in U-Net from A Channel-Wise Perspective with Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on our findings, we propose a new segmentation framework, named UCTransNet (with a proposed CTrans module in U-Net), from the channel perspective with attention mechanism. |
Haonan Wang; Peng Cao; Jiaqi Wang; Osmar R. Zaiane; |
424 | Renovate Yourself: Calibrating Feature Representation of Misclassified Pixels for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These methods usually treat the misclassified and correctly classified pixels equally, hence misleading the optimization process and causing inconsistent intra-class pixel feature representations in the embedding space during learning. In this paper, we propose the auxiliary representation calibration head (RCH), which consists of the image decoupling, prototype clustering, error calibration modules and a metric loss function, to calibrate these error-prone feature representations for better intra-class consistency and segmentation performance. |
Hualiang Wang; Huanpeng Chu; Siming FU; Zuozhu Liu; Haoji Hu; |
425 | Separated Contrastive Learning for Organ-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, pixels in the same map inevitably share semantics to be closer than they actually are, which may affect the discrimination of pixels in the same map and lead to the unfair comparison to pixels in other maps. To address these issues, we propose a separated region-level contrastive learning scheme, namely SepaReg, the core of which is to separate each image into regions and encode each region separately. |
Jiacheng Wang; Xiaomeng Li; Yiming Han; Jing Qin; Liansheng Wang; Zhou Qichao; |
426 | Contrastive Quantization with Code Memory for Unsupervised Image Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper provides a novel solution to unsupervised deep quantization, namely Contrastive Quantization with Code Memory (MeCoQ). |
Jinpeng Wang; Ziyun Zeng; Bin Chen; Tao Dai; Shu-Tao Xia; |
427 | Learning Temporally and Semantically Consistent Unpaired Video-to-Video Translation Through Pseudo-Supervision from Synthetic Optical Flow Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the inaccuracies in the estimation of motion deteriorate the quality of the guidance towards spatiotemporal consistency, which leads to unstable translation. In this work, we propose a novel paradigm that regularizes the spatiotemporal consistency by synthesizing motions in input videos with the generated optical flow instead of estimating them. |
Kaihong Wang; Kumar Akash; Teruhisa Misu; |
428 | Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL). |
Li Wang; Dong Li; Han Liu; JinZhang Peng; Lu Tian; Yi Shan; |
429 | Scaled ReLU Matters for Training Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We verify, both theoretically and empirically, that scaled ReLU in conv-stem not only improves training stabilization, but also increases the diversity of patch tokens, thus boosting peak performance with a large margin via adding few parameters and flops. |
Pichao Wang; Xue Wang; Hao Luo; Jingkai Zhou; Zhipeng Zhou; Fan Wang; Hao Li; Rong Jin; |
430 | CQA-Face: Contrastive Quality-Aware Attentions for Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This can cause performance drops when emphasized facial parts are invisible under heavy occlusions (e.g. face masks) or large pose variations; 2) Different facial parts may appear at various quality caused by occlusion, blur, or illumination changes. In this paper, we propose contrastive quality-aware attentions, called CQA-Face, to address these two issues. |
Qiangchang Wang; Guodong Guo; |
431 | Category-Specific Nuance Exploration Network for Fine-Grained Object Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: A potential limitation of these methods is that they only focus on common parts across the dataset (e.g. head, body or even leg) by introducing additional prior knowledge, but the retrieval of a fine-grained object may rely on category-specific nuances that contribute to category prediction. To handle this limitation, we propose an end-to-end Category-specific Nuance Exploration Network (CNENet) that elaborately discovers category-specific nuances that contribute to category prediction, and semantically aligns these nuances grouped by subcategory without any additional prior knowledge, to directly emphasize the discrepancy among subcategories. |
Shijie Wang; Zhihui Wang; Haojie Li; Wanli Ouyang; |
432 | Detail-Preserving Transformer for Light Field Image Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. |
Shunzhou Wang; Tianfei Zhou; Yao Lu; Huijun Di; |
433 | One-Shot Talking Face Generation from Single-Speaker Audio-Visual Correlation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we propose a novel one-shot talking face generation framework by exploring consistent correlations between audio and visual motions from a specific speaker and then transferring audio-driven motion fields to a reference image. |
Suzhen Wang; Lincheng Li; Yu Ding; Xin Yu; |
434 | Pose-Guided Feature Disentangling for Occluded Person Re-identification Based on Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Some existing pose-guided methods solve this problem by aligning body parts according to graph matching, but these graph-based methods are unintuitive and complicated. Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e.g. human body or joint parts) and selectively match non-occluded parts correspondingly. |
Tao Wang; Hong Liu; Pinhao Song; Tianyu Guo; Wei Shi; |
435 | FFNet: Frequency Fusion Network for Semantic Scene Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet, they ignore the large discrepancy of RGB-D data and the uncertainty measurements of depth data. To solve this problem, we propose the Frequency Fusion Network (FFNet), a novel method for boosting semantic scene completion by better utilizing RGB-D data. |
Xuzhi Wang; Di Lin; Liang Wan; |
436 | Privacy-Preserving Face Recognition in The Frequency Domain Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In order to further protect the remaining frequency components, we propose a fast masking method. |
Yinggui Wang; Jian Liu; Man Luo; Le Yang; Li Wang; |
437 | Anchor DETR: Query Design for Transformer-Based Detector Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel query design for the transformer-based object detection. |
Yingming Wang; Xiangyu Zhang; Tong Yang; Jian Sun; |
438 | Panini-Net: GAN Prior Based Degradation-Aware Feature Interpolation for Face Restoration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel GAN Prior based degradation-aware feature interpolation network, dubbed Panini-Net, for FR tasks by explicitly learning the abstract representations to distinguish various degradations. |
Yinhuai Wang; Yujie Hu; Jian Zhang; |
439 | End-to-End Transformer Based Model for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we build a pure Transformer-based model, which integrates image captioning into one stage and realizes end-to-end training. |
Yiyu Wang; Jungang Xu; Yingfei Sun; |
440 | Learning to Detect 3D Facial Landmarks Via Heatmap Regression with Graph Convolutional Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel 3D facial landmark detection method, which directly locates the coordinates of landmarks from 3D point cloud with a well-customized graph convolutional network. |
Yuan Wang; Min Cao; Zhenfeng Fan; Silong Peng; |
441 | Low-Light Image Enhancement with Normalizing Flow Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Previous works based on the pixel-wise reconstruction losses and deterministic processes fail to capture the complex conditional distribution of normally exposed images, which results in improper brightness, residual noise, and artifacts. In this paper, we investigate modeling this one-to-many relationship via a proposed normalizing flow model. |
Yufei Wang; Renjie Wan; Wenhan Yang; Haoliang Li; Lap-Pui Chau; Alex Kot; |
442 | Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Based on MMN, we present a winning solution for the HC-STVG challenge of the 3rd PIC workshop. |
Zhenzhi Wang; Limin Wang; Tao Wu; Tianhao Li; Gangshan Wu; |
443 | Texture Reformer: Towards Fast and Universal Interactive Texture Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the texture reformer, a fast and universal neural-based framework for interactive texture transfer with user-specified guidance. |
Zhizhong Wang; Lei Zhao; Haibo Chen; Ailin Li; Zhiwen Zuo; Wei Xing; Dongming Lu; |
444 | Interact, Embed, and EnlargE: Boosting Modality-Specific Representations for Multi-Modal Person Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing multi-modal methods ignore the importance of modality-specific information in the feature fusion stage. To this end, we propose a novel method to boost modality-specific representations for multi-modal person Re-ID: Interact, Embed, and EnlargE (IEEE). |
Zi Wang; Chenglong Li; Aihua Zheng; Ran He; Jin Tang; |
445 | Can Semantic Labels Assist Self-Supervised Visual Representation Learning? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we defend the usefulness of semantic labels but point out that fully-supervised and self-supervised methods are pursuing different kinds of features. |
Longhui Wei; Lingxi Xie; Jianzhong He; Xiaopeng Zhang; Qi Tian; |
446 | Rethinking The Two-Stage Framework for Grounded Situation Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: 2) All semantic roles are detected in an autoregressive manner, which fails to model the complex semantic relations between different roles. To this end, we propose a novel SituFormer for GSR which consists of a Coarse-to-Fine Verb Model (CFVM) and a Transformer-based Noun Model (TNM). |
Meng Wei; Long Chen; Wei Ji; Xiaoyu Yue; Tat-Seng Chua; |
447 | Boosting The Transferability of Video Adversarial Examples Via Temporal Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, most existing adversarial attack methods have poor transferability when attacking other video models and transfer-based attacks on video models are still unexplored. To this end, we propose to boost the transferability of video adversarial examples for black-box attacks on video recognition models. |
Zhipeng Wei; Jingjing Chen; Zuxuan Wu; Yu-Gang Jiang; |
448 | Towards Transferable Adversarial Attacks on Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we posit that adversarial attacks on transformers should be specially tailored for their architecture, jointly considering both patches and self-attention, in order to achieve high transferability. |
Zhipeng Wei; Jingjing Chen; Micah Goldblum; Zuxuan Wu; Tom Goldstein; Yu-Gang Jiang; |
449 | L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose L-CoDe, a Language-based Colorization network using color-object Decoupled conditions. |
Shuchen Weng; Hao Wu; Zheng Chang; Jiajun Tang; Si Li; Boxin Shi; |
450 | Neural Interferometry: Image Reconstruction from Astronomical Interferometers Using Transformer-Conditioned Neural Fields Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel deep learning approach in which the representation in the Fourier domain of an astronomical source is learned implicitly using a neural field representation. |
Benjamin Wu; Chao Liu; Benjamin Eckart; Jan Kautz; |
451 | TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a novel tree decoder (TDv2) to fully utilize the tree structure labels. |
Changjie Wu; Jun Du; Yunqing Li; Jianshu Zhang; Chen Yang; Bo Ren; Yiqing Hu; |
452 | Learning Token-Based Representation for Image Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To generate compact global representations while maintaining regional matching capability, we propose a unified framework to jointly learn local feature representation and aggregation. |
Hui Wu; Min Wang; Wengang Zhou; Yang Hu; Houqiang Li; |
453 | Multi-Modal Answer Validation for Knowledge-Based VQA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Using more knowledge sources increases the chance of retrieving more irrelevant or noisy facts, making it challenging to comprehend the facts and find the answer. To address this challenge, we propose Multi-modal Answer Validation using External knowledge (MAVEx), where the idea is to validate a set of promising answer candidates based on answer-specific knowledge retrieval. |
Jialin Wu; Jiasen Lu; Ashish Sabharwal; Roozbeh Mottaghi; |
454 | Neighborhood Consensus Contrastive Learning for Backward-Compatible Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a Neighborhood Consensus Contrastive Learning (NCCL) method. |
Shengsen Wu; Liang Chen; Yihang Lou; Yan Bai; Tao Bai; Minghua Deng; Ling-Yu Duan; |
455 | Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Consequently, their receptive fields in a single attention layer are not large enough, resulting in insufficient context modeling. To address this issue, we propose a Pale-Shaped self-Attention (PS-Attention), which performs self-attention within a pale-shaped region. |
Sitong Wu; Tianyi Wu; Haoru Tan; Guodong Guo; |
456 | Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we tackle the problem of one-shot unsupervised domain adaptation (OSUDA) for semantic segmentation where the segmentors only see one unlabeled target image during training. |
Xinyi Wu; Zhenyao Wu; Yuhang Lu; Lili Ju; Song Wang; |
457 | Multi-Centroid Representation Network for Domain Adaptive Person Re-ID Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a novel Multi-Centroid Memory (MCM) to adaptively capture different identity information within the cluster. |
Yuhang Wu; Tengteng Huang; Haotian Yao; Chi Zhang; Yuanjie Shao; Chuchu Han; Changxin Gao; Nong Sang; |
458 | Efficient Non-local Contrastive Attention for Image Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Efficient Non-Local Contrastive Attention (ENLCA) to perform long-range visual modeling and leverage more relevant non-local features. |
Bin Xia; Yucheng Hang; Yapeng Tian; Wenming Yang; Qingmin Liao; Jie Zhou; |
459 | Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-Based Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an Accelerated Multi-Scale Aggregation network (AMSA) for Reference-based Super-Resolution, including Coarse-to-Fine Embedded PatchMatch (CFE-PatchMatch) and Multi-Scale Dynamic Aggregation (MSDA) module. |
Bin Xia; Yapeng Tian; Yucheng Hang; Wenming Yang; Qingmin Liao; Jie Zhou; |
460 | Cross-Domain Collaborative Normalization Via Structural Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we develop a novel normalization technique, named Collaborative Normalization (CoN), for eliminating domain discrepancy and accelerating the model training of neural networks for UDA. |
Haifeng Xia; Zhengming Ding; |
461 | ReMoNet: Recurrent Multi-Output Network for Efficient Video Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper aims to develop a lightweight deep video denoising method that is friendly to resource-constrained mobile devices. |
Liuyu Xiang; Jundong Zhou; Jirui Liu; Zerun Wang; Haidong Huang; Jie Hu; Jungong Han; Yuchen Guo; Guiguang Ding; |
462 | Transfer Learning from Synthetic to Real LiDAR Point Cloud for Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the study focused on 2D images and its counterpart in 3D point clouds segmentation lags far behind due to the lack of large-scale synthetic datasets and effective transfer methods. We address this issue by collecting SynLiDAR, a large-scale synthetic LiDAR dataset that contains point-wise annotated point clouds with accurate geometric shapes and comprehensive semantic classes. |
Aoran Xiao; Jiaxing Huang; Dayan Guan; Fangneng Zhan; Shijian Lu; |
463 | Video As Conditional Graph Hierarchy for Multi-Granular Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To align with the multi-granular essence of linguistic concepts in language queries, we propose to model video as a conditional graph hierarchy which weaves together visual facts of different granularity in a level-wise manner, with the guidance of corresponding textual cues. |
Junbin Xiao; Angela Yao; Zhiyuan Liu; Yicong Li; Wei Ji; Tat-Seng Chua; |
464 | AdaptivePose: Human Parts As Adaptive Points Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Multi-person pose estimation methods generally follow top-down and bottom-up paradigms, both of which can be considered as two-stage approaches thus leading to the high computation cost and low efficiency. |
Yabo Xiao; Xiao Juan Wang; Dongdong Yu; Guoli Wang; Qian Zhang; Mingshu HE; |
465 | Learning Quality-Aware Representation for Multi-Person Pose Regression Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address the aforementioned issues, we propose to learn the pose regression quality-aware representation. |
Yabo Xiao; Dongdong Yu; Xiao Juan Wang; Lei Jin; Guoli Wang; Qian Zhang; |
466 | Attribute-Based Progressive Fusion Network for RGBT Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we disentangle the fusion process via the challenge attributes, and thus propose a novel Attribute-based Progressive Fusion Network (APFNet) to increase the fusion capacity with a small number of parameters while reducing the dependence on large-scale training data. |
Yun Xiao; MengMeng Yang; Chenglong Li; Lei Liu; Jin Tang; |
467 | Detailed Facial Geometry Recovery from Multi-View Images By Learning An Implicit Function Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel architecture to recover extremely detailed 3D faces within dozens of seconds. |
Yunze Xiao; Hao Zhu; Haotian Yang; Zhengyu Diao; Xiangju Lu; Xun Cao; |
468 | FINet: Dual Branches Feature Interaction for Partial-to-Partial Point Cloud Registration Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Data association is important in the point cloud registration. In this work, we propose to solve the partial-to-partial registration from a new perspective, by introducing multi-level feature interactions between the source and the reference clouds at the feature extraction stage, such that the registration can be realized without the attentions or explicit mask estimation for the overlapping detection as adopted previously. |
Hao Xu; Nianjin Ye; Guanghui Liu; Bing Zeng; Shuaicheng Liu; |
469 | Rendering-Aware HDR Environment Map Prediction from A Single Image Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a two-stage deep learning-based method to predict an HDR environment map from a single narrow field-of-view LDR image. |
Jun-Peng Xu; Chenyu Zuo; Fang-Lue Zhang; Miao Wang; |
470 | Topology-Aware Convolutional Neural Network for Efficient Skeleton-Based Action Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: One reason is that CNNs are considered poor in modeling the irregular skeleton topology. To alleviate this limitation, we propose a pure CNN architecture named Topology-aware CNN (Ta-CNN) in this paper. |
Kailin Xu; Fanfan Ye; Qiaoyong Zhong; Di Xie; |
471 | Transcoded Video Restoration By Temporal Spatial Auxiliary Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new method, temporal spatial auxiliary network (TSAN), for transcoded video restoration. |
Li Xu; Gang He; Jinjia Zhou; Jie Lei; Weiying Xie; Yunsong Li; Yu-Wing Tai; |
472 | DIRL: Domain-Invariant Representation Learning for Generalizable Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, most existing works learn the shared feature space within multi-source domains but ignore the characteristic of the feature itself (e.g., the feature sensitivity to the domain-specific style). Therefore, we propose the Domain-invariant Representation Learning (DIRL) for domain generalization which utilizes the feature sensitivity as the feature prior to guide the enhancement of the model generalization capability. |
Qi Xu; Liang Yao; Zhengkai Jiang; Guannan Jiang; Wenqing Chu; Wenhui Han; Wei Zhang; Chengjie Wang; Ying Tai; |
473 | Behind The Curtain: Learning Occluded Shapes for 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To tackle the challenge, we present a novel LiDAR-based 3D object detection model, dubbed Behind the Curtain Detector (BtcDet), which learns the object shape priors and estimates the complete object shapes that are partially occluded (curtained) in point clouds. |
Qiangeng Xu; Yiqi Zhong; Ulrich Neumann; |
474 | Domain Disentangled Generative Adversarial Network for Zero-Shot Sketch-Based 3D Shape Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel domain disentangled generative adversarial network (DD-GAN) for zero-shot sketch-based 3D retrieval, which can retrieve the unseen categories that are not accessed during training. |
Rui Xu; Zongyan Han; Le Hui; Jianjun Qian; Jin Xie; |
475 | Dual Attention Networks for Few-Shot Fine-Grained Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, to generate fine-grained tailored representations for few-shot recognition, we propose a Dual Attention Network (Dual Att-Net) consisting of two dual branches of both hard- and soft-attentions. |
Shu-Lin Xu; Faen Zhang; Xiu-Shen Wei; Jianhua Wang; |
476 | Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the long-range geometry relationship has not been sufficiently modeled by local feature learning from the above methods. To this end, we present SCAN, a novel sparse cross-scale attention network to first align multi-scale sparse features with global voxel-encoded attention to capture the long-range relationship of instance context, which is able to boost the regression accuracy of the over-segmented large objects. |
Shuangjie Xu; Rui Wan; Maosheng Ye; Xiaoyi Zou; Tongyi Cao; |
477 | Towards Fully Sparse Training: Information Restoration with Spatial Similarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the observation of spatial similarity among activations, we propose pruning activations with fixed 2:4 masks. |
Weixiang Xu; Xiangyu He; Ke Cheng; Peisong Wang; Jian Cheng; |
478 | Hierarchical Image Generation Via Transformer-Based Sequential Patch Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To synthesize images with preferred objects and interactions, a controllable way is to generate the image from a scene graph and a large pool of object crops, where the spatial arrangements of the objects in the image are defined by the scene graph while their appearances are determined by the retrieved crops from the pool. In this paper, we propose a novel framework with such a semi-parametric generation strategy. |
Xiaogang Xu; Ning Xu; |
479 | Reliable Propagation-Correction Modulation for Video Object Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We aim to suppress error propagation through a correction mechanism with high reliability. |
Xiaohao Xu; Jinglu Wang; Xiao Li; Yan Lu; |
480 | Adaptive Hypergraph Neural Network for Multi-Person Pose Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper proposes a novel two-stage hypergraph-based framework, dubbed ADaptive Hypergraph Neural Network (AD-HNN) to estimate multiple human poses from a single image, with a keypoint localization network and an Adaptive-Pose Hypergraph Neural Network (AP-HNN) added onto the former network. |
Xixia Xu; Qi Zou; Xue Lin; |
481 | Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To tackle the limitations and expand the applicable scenario of token pruning, we present Evo-ViT, a self-motivated slow-fast token evolution approach for vision transformers. |
Yifan Xu; Zhijie Zhang; Mengdan Zhang; Kekai Sheng; Ke Li; Weiming Dong; Liqing Zhang; Changsheng Xu; Xing Sun; |
482 | MobileFaceSwap: A Lightweight Framework for Video Face Swapping Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a lightweight Identity-aware Dynamic Network (IDN) for subject-agnostic face swapping by dynamically adjusting the model parameters according to the identity information. |
Zhiliang Xu; Zhibin Hong; Changxing Ding; Zhen Zhu; Junyu Han; Jingtuo Liu; Errui Ding; |
483 | Clinical-BERT: Vision-Language Pre-training for Radiograph Diagnosis and Reports Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a vision-language pre-training model, Clinical-BERT, for the medical domain, and devise three domain-specific tasks: Clinical Diagnosis (CD), Masked MeSH Modeling (MMM), Image-MeSH Matching (IMM), together with one general pre-training task: Masked Language Modeling (MLM), to pre-train the model. |
Bin Yan; Mingtao Pei; |
484 | Inferring Prototypes for Multi-Label Few-Shot Image Classification with Word Vector Guided Attention Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a solution, in this paper we propose to use word embeddings as a form of prior knowledge about the meaning of the labels. |
Kun Yan; Chenbin Zhang; Jun Hou; Ping Wang; Zied Bouraoui; Shoaib Jameel; Steven Schockaert; |
485 | Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Considering the large appearance differences between the synthetic and real-world scenarios, directly training with synthetic data will lead to performance degradation on real-world scenarios. To mitigate this problem, we propose a novel unsupervised domain adaptive SOD method to adapt between these two domains by uncertainty-aware self-training. |
Pengxiang Yan; Ziyi Wu; Mengmeng Liu; Kun Zeng; Liang Lin; Guanbin Li; |
486 | Transmission-Guided Bayesian Generative Model for Smoke Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is caused by both knowledge level uncertainty due to limited training data for accurate smoke segmentation and labeling level uncertainty representing the difficulty in labeling ground-truth. To effectively model the two types of uncertainty, we introduce a Bayesian generative model to simultaneously estimate the posterior distribution of model parameters and its predictions. |
Siyuan Yan; Jing Zhang; Nick Barnes; |
487 | Cross-Species 3D Face Morphing Via Alignment-Aware Controller Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It remains challenging how to preserve target structural information and source fine-grained facial details simultaneously. To this end, we propose an Alignment-aware 3D Face Morphing (AFM) framework, which builds semantic-adaptive correspondence between source and target faces across species, via an alignment-aware controller mesh (Explicit Controller, EC) with explicit source/target mesh binding. |
Xirui Yan; Zhenbo Yu; Bingbing Ni; Hang Wang; |
488 | Exploring Visual Context for Weakly Supervised Person Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper inventively considers weakly supervised person search with only bounding box annotations. We propose to address this novel task by investigating three levels of context clues (i.e., detection, memory and scene) in unconstrained natural images. |
Yichao Yan; Jinpeng Li; Shengcai Liao; Jie Qin; Bingbing Ni; Ke Lu; Xiaokang Yang; |
489 | Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As a key characteristic in audio-visual speech recognition (AVSR), relating linguistic information observed across visual and audio data has been a challenge, benefiting not only audio/visual speech recognition (ASR/VSR) but also for manipulating data within/across modalities. In this paper, we present a feature disentanglement-based framework for jointly addressing the above tasks. |
Chih-Chun Yang; Wan-Cyuan Fan; Cheng-Fu Yang; Yu-Chiang Frank Wang; |
490 | Mutual Contrastive Learning for Visual Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present a collaborative learning method called Mutual Contrastive Learning (MCL) for general visual representation learning. |
Chuanguang Yang; Zhulin An; Linhang Cai; Yongjun Xu; |
491 | Temporal Action Proposal Generation with Background Constraint Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we innovatively propose a general auxiliary Background Constraint idea to further suppress low-quality proposals, by utilizing the background prediction score to restrict the confidence of proposals. |
Haosen Yang; Wenhao Wu; Lining Wang; Sheng Jin; Boyang Xia; Hongxun Yao; Hujie Huang; |
492 | Cross-Modal Federated Human Activity Recognition Via Modality-Agnostic and Modality-Specific Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new task of cross-modal federated human activity recognition (CMF-HAR), which is conducive to promoting the large-scale use of the HAR model on more local devices. |
Xiaoshan Yang; Baochen Xiong; Yi Huang; Changsheng Xu; |
493 | Polygon-to-Polygon Distance Loss for Rotated Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we define a new distance formulation between two convex polygons describing the overlapping degree and non-overlapping degree. |
Yang Yang; Jifeng Chen; Xiaopin Zhong; Yuanlong Deng; |
494 | An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: For example, the retrieved knowledge might be noisy and irrelevant to the question, and the re-embedded knowledge features during reasoning might deviate from their original meanings in the knowledge base (KB). To address this challenge, we propose PICa, a simple yet effective method that Prompts GPT3 via the use of Image Captions, for knowledge-based VQA. |
Zhengyuan Yang; Zhe Gan; Jianfeng Wang; Xiaowei Hu; Yumao Lu; Zicheng Liu; Lijuan Wang; |
495 | ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing approaches typically leverage off-the-shelf segment-level features, which suffer from spatial incompleteness and temporal incoherence, thus limiting their performance. In this paper, we tackle this problem from a new perspective by enhancing segment-level representations with a simple yet effective graph convolutional network, namely action complement graph network (ACGNet). |
Zichen Yang; Jie Qin; Di Huang; |
496 | Enhancing Pseudo Label Quality for Semi-supervised Domain-Generalized Medical Image Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a novel confidence-aware cross pseudo supervision algorithm for semi-supervised domain generalized medical image segmentation. |
Huifeng Yao; Xiaowei Hu; Xiaomeng Li; |
497 | Image Difference Captioning with Pre-training and Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The major challenges of this task lie in two aspects: 1) fine-grained visual differences that require learning stronger vision and language association and 2) high-cost of manual annotations that leads to limited supervised data. To address these challenges, we propose a new modeling framework following the pre-training-finetuning paradigm. |
Linli Yao; Weiying Wang; Qin Jin; |
498 | Safe Distillation Box Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework, termed Safe Distillation Box (SDB), that allows us to wrap a pre-trained model in a virtual box for intellectual property protection. |
Jingwen Ye; Yining Mao; Jie Song; Xinchao Wang; Cheng Jin; Mingli Song; |
499 | Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While these approaches mainly focus on learning node and edge attributes, they completely ignore the 3D geometry of the underlying 3D objects depicted in the 2D images. We fill this gap by proposing a trainable framework that takes advantage of graph neural networks for learning a deformable 3D geometry model from inhomogeneous image collections, i.e. a set of images that depict different instances of objects from the same category. |
Zhenzhang Ye; Tarun Yenamandra; Florian Bernard; Daniel Cremers; |
500 | Content-Variant Reference Image Quality Assessment Via Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although recent no-reference (NR-IQA) methods have made great progress to predict image quality free from the reference image, they still have the potential to achieve better performance since HQ image information is not fully exploited. In contrast, full-reference (FR-IQA) methods tend to provide more reliable quality evaluation, but its practicability is affected by the requirement for pixel-level aligned reference images. |
Guanghao Yin; Wei Wang; Zehuan Yuan; Chuchu Han; Wei Ji; Shouqian Sun; Changhu Wang; |
501 | Width & Depth Pruning for Vision Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Despite achieving remarkable results, these methods take one dimension of network width into consideration and ignore network depth, which is another important dimension for pruning vision transformers. Therefore, we propose a Width & Depth Pruning (WDPruning) framework that reduces both width and depth dimensions simultaneously. |
Fang Yu; Kun Huang; Meng Wang; Yuan Cheng; Wei Chu; Li Cui; |
502 | Anisotropic Fourier Features for Neural Image-Based Rendering and Relighting Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an anisotropic random Fourier features (RFF) mapping scheme to tackle spectral biases. |
Huangjie Yu; Anpei Chen; Xin Chen; Lan Xu; Ziyu Shao; Jingyi Yu; |
503 | Self-Labeling Framework for Novel Category Discovery Over Domains Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a self-labeling framework to cluster all target samples, including those in the “unknown” categories. |
Qing Yu; Daiki Ikami; Go Irie; Kiyoharu Aizawa; |
504 | Efficient Compact Bilinear Pooling Via Kronecker Product Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose an efficient compact bilinear pooling method to thoroughly solve the inefficiency problem inherent in bilinear pooling. |
Tan Yu; Yunfeng Cai; Ping Li; |
505 | Hybrid Graph Neural Networks for Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This design is problematic because the classifier needs to adapt quickly to new tasks while the embedding does not. To overcome this problem, in this paper we propose a novel hybrid GNN (HGNN) model consisting of two GNNs, an instance GNN and a prototype GNN. |
Tianyuan Yu; Sen He; Yi-Zhe Song; Tao Xiang; |
506 | SOIT: Segmenting Objects with Instance-Aware Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. |
Xiaodong Yu; Dahu Shi; Xing Wei; Ye Ren; Tingqun Ye; Wenming Tan; |
507 | MSML: Enhancing Occlusion-Robustness By Multi-Scale Segmentation-Based Mask Learning for Face Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing methods generalize poorly due to the distribution distortion induced by unpredictable occlusions. To tackle this problem, we propose a hierarchical segmentation-based mask learning strategy for face recognition, enhancing occlusion-robustness by integrating segmentation representations of occlusion into face recognition in the latent space. |
Ge Yuan; Huicheng Zheng; Jiayu Dong; |
508 | Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we aim to boost end-to-end models with object-guided statistical priors. Specifically, we propose to utilize a Verb Semantic Model (VSM) and use semantic aggregation to profit from this object-guided hierarchy. |
Hangjie Yuan; Mang Wang; Dong Ni; Liangpeng Xu; |
509 | Task-Level Self-Supervision for Cross-Domain Few-Shot Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Among various solutions, episodic training progressively classifies a series of few-shot tasks and thereby is assumed to be beneficial for improving the model’s generalization ability. |
Wang Yuan; Zhizhong Zhang; Cong Wang; Haichuan Song; Yuan Xie; Lizhuang Ma; |
510 | Improving 360 Monocular Depth Estimation Via Non-local Dense Prediction Transformer and Joint Supervised and Self-Supervised Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose 360 monocular depth estimation methods which improve on the areas that limited previous studies. |
Ilwi Yun; Hyuk-Jae Lee; Chae Eun Rhee; |
511 | Homography Decomposition Networks for Planar Object Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The essential reason behind this problem is that the condition number of such a non-linear system changes unstably when the searching range of the homography parameter space becomes larger. To this end, we propose a novel Homography Decomposition Networks (HDN) approach that drastically reduces and stabilizes the condition number by decomposing the homography transformation into two groups. |
Xinrui Zhan; Yueran Liu; Jianke Zhu; Yang Li; |
512 | Patch Diffusion: A General Module for Face Manipulation Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Patch Diffusion (PD) module which can be integrated into the existing face manipulation detection networks to boost the performance. |
Baogen Zhang; Sheng Li; Guorui Feng; Zhenxing Qian; Xinpeng Zhang; |
513 | Semi-supervised Object Detection with Adaptive Class-Rebalancing Self-Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: While self-training achieves state-of-the-art results in semi-supervised object detection (SSOD), it severely suffers from foreground-background and foreground-foreground imbalances in SSOD. In this paper, we propose an Adaptive Class-Rebalancing Self-Training (ACRST) with a novel memory module called CropBank to alleviate these imbalances and generate unbiased pseudo-labels. |
Fangyuan Zhang; Tianxiang Pan; Bin Wang; |
514 | Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel Cross-Modal Confidence-Aware Network to infer the matching confidence that indicates the reliability of matched region-word pairs, which is combined with the local semantic similarities to refine the relevance measurement. |
Huatian Zhang; Zhendong Mao; Kun Zhang; Yongdong Zhang; |
515 | SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-resolution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, this pipeline is redundant and inefficient for the independent processes, and some inner features could have been shared. Therefore, we present an efficient paradigm to perform Simultaneously Image Colorization and Super-resolution (SCS) and propose an end-to-end SCSNet to achieve this goal. |
Jiangning Zhang; Chao Xu; Jian Li; Yue Han; Yabiao Wang; Ying Tai; Yong Liu; |
516 | Energy-Based Generative Cooperative Saliency Prediction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating the saliency prediction as a sampling process from the learned distribution. |
Jing Zhang; Jianwen Xie; Zilong Zheng; Nick Barnes; |
517 | Attention-Based Transformation from Latent Features to Point Clouds Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose AXform, an attention-based method to transform latent features to point clouds. |
Kaiyi Zhang; Ximing Yang; Yuan Wu; Cheng Jin; |
518 | Suppressing Static Visual Cues Via Normalizing Flows for Self-Supervised Video Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite the great progress in video understanding made by deep convolutional neural networks, feature representation learned by existing methods may be biased to static visual cues. To address this issue, we propose a novel method to suppress static visual cues (SSVC) based on probabilistic analysis for self-supervised video representation learning. |
Manlin Zhang; Jinpeng Wang; Andy J. Ma; |
519 | LGD: Label-Guided Self-Distillation for Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the first self-distillation framework for general object detection, termed LGD (Label-Guided self-Distillation). |
Peizhen Zhang; Zijian Kang; Tong Yang; Xiangyu Zhang; Nanning Zheng; Jian Sun; |
520 | Uncertainty Modeling with Second-Order Transformer for Group Re-identification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The key challenge of G-ReID is that all the cases of the intra-group member and layout variations are hard to exhaust. To this end, we propose a novel uncertainty modeling, which treats each image as a distribution depending on the current member and layout, then digs out potential group features by random samplings. |
Quan Zhang; Jian-Huang Lai; Zhanxiang Feng; Xiaohua Xie; |
521 | Deep Spatial Adaptive Network for Real Image Demosaicing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a deep spatial adaptive network (SANet) for real image demosaicing, which can adaptively learn the nonlinear mapping function for different locations. |
Tao Zhang; Ying Fu; Cheng Li; |
522 | MAGIC: Multimodal RelAtional Graph AdversarIal InferenCe for Diverse and Unpaired Text-Based Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose the Multimodal relAtional Graph adversarIal InferenCe (MAGIC) framework for diverse and unpaired TextCap. |
Wenqiao Zhang; Haochen Shi; Jiannan Guo; Shengyu Zhang; Qingpeng Cai; Juncheng Li; Sihui Luo; Yueting Zhuang; |
523 | Class Guided Channel Weighting Network for Fine-Grained Semantic Segmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to the high similarity of different sub-categories and large variations in poses, scales, rotations, and color of the same sub-category in the fine-grained image set, the performance of traditional semantic segmentation methods will decline sharply. To alleviate these dilemmas, a new approach, named Class Guided Channel Weighting Network (CGCWNet), is developed in this paper to enable fine-grained semantic segmentation. |
Xiang Zhang; Wanqing Zhao; Hangzai Luo; Jinye Peng; Jianping Fan; |
524 | Context-Based Contrastive Learning for Scene Text Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast to their superior accuracy on seen text, models are prone to misrecognize unseen text even with good image quality. We propose a novel framework, Context-based contrastive learning (ConCLR), to alleviate this issue. |
Xinyun Zhang; Binwu Zhu; Xufeng Yao; Qi Sun; Ruiyu Li; Bei Yu; |
525 | Learning Network Architecture for Open-Set Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we make the first attempt to tackle OSR by searching the architecture of a Neural Network (NN) under the open-set assumption. |
Xuelin Zhang; Xuelian Cheng; Donghao Zhang; Paul Bonnington; Zongyuan Ge; |
526 | An Adversarial Framework for Generating Unseen Images By Activation Maximization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we aim to tackle the case where information about the target class is completely removed from the image set. |
Yang Zhang; Wang Zhou; Gaoyuan Zhang; David Cox; Shiyu Chang; |
527 | Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, taking into account the degree of similarity of sampled instances as the intermediate state, we propose a novel pretext task – spatio-temporal overlap rate (STOR) prediction. |
Yujia Zhang; Lai-Man Po; Xuyuan Xu; Mengyang Liu; Yexin Wang; Weifeng Ou; Yuzhi Zhao; Wing-Yin Yu; |
528 | Pose-Invariant Face Recognition Via Adaptive Angular Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces a novel method to learn pose-invariant feature representation without normalizing profile faces to frontal ones or learning disentangled features. |
Zhenduo Zhang; Yongru Chen; Wenming Yang; Guijin Wang; Qingmin Liao; |
529 | End-to-End Learning The Partial Permutation Matrix for Robust 3D Point Cloud Registration Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Alternatively, the soft matching-based methods have been proposed to learn the matching probability rather than hard assignment. |
Zhiyuan Zhang; Jiadai Sun; Yuchao Dai; Dingfu Zhou; Xibin Song; Mingyi He; |
530 | PetsGAN: Rethinking Priors for Single Image Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The main contributions of this paper include: 1) We interpret single image generation from the perspective of the general generative task, that is, to learn a diverse distribution from the Dirac distribution composed of a single image. |
Zicheng Zhang; Yinglu Liu; Congying Han; Hailin Shi; Tiande Guo; Bowen Zhou; |
531 | Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Hierarchical structures are popular in recent vision transformers; however, they require sophisticated designs and massive datasets to work well. In this paper, we explore the idea of nesting basic local transformers on non-overlapping image blocks and aggregating them in a hierarchical way. |
Zizhao Zhang; Han Zhang; Long Zhao; Ting Chen; Sercan Ö. Arik; Tomas Pfister; |
532 | OA-FSUI2IT: A Novel Few-Shot Cross Domain Object Detection Framework with Object-Aware Few-Shot Unsupervised Image-to-Image Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unsupervised image-to-image (UI2I) translation methods aim to learn a mapping between different visual domains with well-preserved content and consistent structure. |
Lifan Zhao; Yunlong Meng; Lin Xu; |
533 | Static-Dynamic Co-teaching for Class-Incremental 3D Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we study the unexplored yet important class-incremental 3D object detection problem and present the first solution – SDCoT, a novel static-dynamic co-teaching method. |
Na Zhao; Gim Hee Lee; |
534 | Local Surface Descriptor for Geometry and Feature Preserved Mesh Denoising Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, due to the irregular structure of meshes, CNN-based denoising strategies cannot be trivially applied to them. To circumvent this limitation, in this paper, we propose the local surface descriptor (LSD), which is able to transform the local deformable surface around a face into a 2D grid representation and thus facilitates the deployment of CNNs to generate denoised face normals. |
Wenbo Zhao; Xianming Liu; Junjun Jiang; Debin Zhao; Ge Li; Xiangyang Ji; |
535 | Boosting Generative Zero-Shot Learning By Synthesizing Diverse Features with Attribute Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, the generated data from attribute could have incomplete semantics. Based on this fact, we propose a novel framework to boost ZSL by synthesizing diverse features. |
Xiaojie Zhao; Yuming Shen; Shidong Wang; Haofeng Zhang; |
536 | Self-Supervised Pretraining for RGB-D Salient Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we utilize self-supervised representation learning (SSL) to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation. |
Xiaoqi Zhao; Youwei Pang; Lihe Zhang; Huchuan Lu; Xiang Ruan; |
537 | Adaptive Logit Adjustment Loss for Long-Tailed Visual Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: For some tail classes, the features of their instances are distinct and discriminative, which can also bring satisfactory accuracy; for some head classes, although with sufficient samples, the high semantic similarity with other classes and lack of discriminative features will bring bad accuracy. Based on these observations, we propose Adaptive Logit Adjustment Loss (ALA Loss) to apply an adaptive adjusting term to the logit. |
Yan Zhao; Weicong Chen; Xu Tan; Kai Huang; Jihong Zhu; |
538 | CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-Based Autonomous Urban Driving Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a novel CAscade Deep REinforcement learning framework, CADRE, to achieve model-free vision-based autonomous urban driving. |
Yinuo Zhao; Kun Wu; Zhiyuan Xu; Zhengping Che; Qi Lu; Jian Tang; Chi Harold Liu; |
539 | Learning from The Tangram to Solve Mini Visual Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By recording human experience in solving tangram puzzles, we present the Tangram dataset and show that a pre-trained neural model on the Tangram helps solve some mini visual tasks based on low-resolution vision. |
Yizhou Zhao; Liang Qiu; Pan Lu; Feng Shi; Tian Han; Song-Chun Zhu; |
540 | Handling Slice Permutations Variability in Tensor Recovery Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we discuss SPV of several key tensor recovery problems theoretically and experimentally. |
Jingjing Zheng; Xiaoqin Zhang; Wenzhe Wang; Xianta Jiang; |
541 | Boosting Contrastive Learning with Relation Knowledge Distillation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We delve into this problem and find that the lightweight model is prone to collapse in semantic space when simply performing instance-wise contrast. To address this issue, we propose a relation-wise contrastive paradigm with Relation Knowledge Distillation (ReKD). |
Kai Zheng; Yuanjiang Wang; Ye Yuan; |
542 | Weakly Supervised Video Moment Localization with Contrastive Negative Sample Mining Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel weakly supervised solution by introducing Contrastive Negative sample Mining (CNM). |
Minghang Zheng; Yanjie Huang; Qingchao Chen; Yang Liu; |
543 | Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To cope with that, a dual decoupling training framework is proposed in the present study, i.e. clean and noisy data decoupling, and classification and localization task decoupling. |
Shida Zheng; Chenshu Chen; Xiaowei Cai; Tingqun Ye; Wenming Tan; |
544 | SCALoss: Side and Corner Aligned Loss for Bounding Box Regression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Side Overlap (SO) loss by maximizing the side overlap of two bounding boxes, which puts more penalty for low overlapping bounding box cases. |
Tu Zheng; Shuai Zhao; Yang Liu; Zili Liu; Deng Cai; |
545 | SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose SepFusion, a novel framework that can smoothly produce optimal fusion structures for visual-sound separation. |
Dongzhan Zhou; Xinchi Zhou; Di Hu; Hang Zhou; Lei Bai; Ziwei Liu; Wanli Ouyang; |
546 | Pan-Sharpening with Customized Transformer and Invertible Neural Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, due to the limitation of the convolution operator, long-range spatial features are often not accurately obtained, thus limiting the overall performance. To this end, we propose a novel and effective method by exploiting a customized transformer architecture and information-lossless invertible neural module for long-range dependencies modeling and effective feature fusion in this paper. |
Man Zhou; Jie Huang; Yanchi Fang; Xueyang Fu; Aiping Liu; |
547 | Promoting Single-Modal Optical Flow Network for Diverse Cross-Modal Flow Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To verify our hypothesis, we design a self-supervised framework to promote the single-modal optical flow networks for diverse cross-modal flow estimation. |
Shili Zhou; Weimin Tan; Bo Yan; |
548 | Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In addition, these methods simply fuse the features from RGB and thermal modalities but are unable to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB–thermal scene parsing. |
Wujie Zhou; Shaohua Dong; Caie Xu; Yaguan Qian; |
549 | TiGAN: Text-Based Interactive Image Generation and Manipulation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework for Text-based Interactive image generation and manipulation (TiGAN) that responds to users’ natural-language feedback. |
Yufan Zhou; Ruiyi Zhang; Jiuxiang Gu; Chris Tensmeyer; Tong Yu; Changyou Chen; Jinhui Xu; Tong Sun; |
550 | Cross-Domain Empirical Risk Minimization for Unbiased Long-Tailed Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Cross-Domain Empirical Risk Minimization (xERM) for training an unbiased test-agnostic model to achieve strong performance on both test distributions; experiments show that xERM fundamentally improves classification by learning better feature representations rather than by playing the "head vs. tail" game. |
Beier Zhu; Yulei Niu; Xian-Sheng Hua; Hanwang Zhang; |
551 | Deep Recurrent Neural Network with Multi-Scale Bi-directional Propagation for Video Deblurring Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead of estimating alignment information, we propose a simple and effective deep Recurrent Neural Network with Multi-scale Bi-directional Propagation (RNN-MBP) to effectively propagate and gather the information from unaligned neighboring frames for better video deblurring. |
Chao Zhu; Hang Dong; Jinshan Pan; Boyang Liang; Yuhao Huang; Lean Fu; Fei Wang; |
552 | I Can Find You! Boundary-Guided Separated Attention Network for Camouflaged Object Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: By simulating how humans discover so-called ‘perfectly’ camouflaged objects, we present a novel boundary-guided separated attention network (called BSA-Net). |
Hongwei Zhu; Peng Li; Haoran Xie; Xuefeng Yan; Dong Liang; Dapeng Chen; Mingqiang Wei; Jing Qin; |
553 | MoCaNet: Motion Retargeting In-the-Wild Via Canonicalization Networks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a novel framework that brings the 3D motion retargeting task from controlled environments to in-the-wild scenarios. |
Wentao Zhu; Zhuoqian Yang; Ziang Di; Wayne Wu; Yizhou Wang; Chen Change Loy; |
554 | Robust Depth Completion with Uncertainty-Driven Loss Functions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce uncertainty-driven loss functions to improve the robustness of depth completion and handle the uncertainty in depth completion. |
Yufan Zhu; Weisheng Dong; Leida Li; Jinjian Wu; Xin Li; Guangming Shi; |
555 | Efficient Model-Driven Network for Shadow Removal Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To effectively solve the variational problem, we design an iterative algorithm and unfold it into a deep network, naturally increasing the interpretability of the deep model. |
Yurui Zhu; Zeyu Xiao; Yanchi Fang; Xueyang Fu; Zhiwei Xiong; Zheng-Jun Zha; |
556 | Learning Disentangled Classification and Localization Representations for Temporal Action Localization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We evaluate our proposed method on two popular benchmarks for TAL, where it outperforms all state-of-the-art methods. |
Zixin Zhu; Le Wang; Wei Tang; Ziyi Liu; Nanning Zheng; Gang Hua; |
557 | ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an ACDNet based on the adaptively combined dilated convolution to predict the dense depth map for a monocular panoramic image. |
Chuanqing Zhuang; Zhengda Lu; Yiqun Wang; Jun Xiao; Ying Wang; |
558 | Making Adversarial Examples More Transferable and Indistinguishable Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most of the approaches based on fast gradient sign attack series cannot balance the indistinguishability and transferability due to the limitations of the basic sign structure. To address this problem, we propose a method, called Adam Iterative Fast Gradient Tanh Method (AI-FGTM), to generate indistinguishable adversarial examples with high transferability. |
Junhua Zou; Yexin Duan; Boyu Li; Wu Zhang; Yu Pan; Zhisong Pan; |
559 | Undercover Boolean Matrix Factorization with MaxSAT Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: The k-undercover Boolean matrix factorization problem aims to approximate an m×n Boolean matrix X as the Boolean product A◦B of an m×k matrix A and a k×n matrix B such that X is a cover of A◦B, i.e., no representation error is allowed on the 0 entries of the matrix X. To infer an optimal and “block-optimal” k-undercover, we propose two exact methods based on MaxSAT encodings. |
Florent Avellaneda; Roger Villemaire; |
560 | Achieving Zero Constraint Violation for Constrained Reinforcement Learning Via Primal-Dual Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To achieve that, we advocate the use of a randomized primal-dual approach to solve the CMDP problems and propose a conservative stochastic primal-dual algorithm (CSPDA) which is shown to exhibit O(1/epsilon^2) sample complexity to achieve epsilon-optimal cumulative reward with zero constraint violations. |
Qinbo Bai; Amrit Singh Bedi; Mridul Agarwal; Alec Koppel; Vaneet Aggarwal; |
561 | GEQCA: Generic Qualitative Constraint Acquisition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose GEQCA, which stands for Generic Qualitative Constraint Acquisition, an active CA method that learns qualitative constraints via the concept of qualitative queries. |
Mohamed-Bachir Belaid; Nassim Belmecheri; Arnaud Gotlieb; Nadjib Lazaar; Helge Spieker; |
562 | Certified Symmetry and Dominance Breaking for Combinatorial Optimisation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on the cutting planes proof system, we develop a certification method for optimisation problems in which symmetry and dominance breaking are easily expressible. |
Bart Bogaerts; Stephan Gocht; Ciaran McCreesh; Jakob Nordström; |
563 | The Perils of Learning Before Optimizing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Typically, learning the prediction model used to generate the optimization problem and solving that problem are performed in two separate stages. |
Chris Cameron; Jason Hartford; Taylor Lundy; Kevin Leyton-Brown; |
564 | A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In a wide variety of applications including online advertising, contractual hiring, and wireless scheduling, the controller is constrained by a stringent budget constraint on the available resources, which are consumed in a random amount by each action, and a stochastic feasibility constraint that may impose important operational limitations on decision-making. In this work, we consider a general model to address such problems, where each action returns a random reward, cost, and penalty from an unknown joint distribution, and the decision-maker aims to maximize the total reward under a budget constraint B on the total cost and a stochastic constraint on the time-average penalty. |
Semih Cayci; Yilin Zheng; Atilla Eryilmaz; |
565 | Resolving Inconsistencies in Simple Temporal Problems: A Parameterized Approach Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of resolving inconsistency of data encoded in the STP. |
Konrad K. Dabrowski; Peter Jonsson; Sebastian Ordyniak; George Osipov; |
566 | Efficient Riemannian Meta-Optimization By Implicit Differentiation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an efficient Riemannian meta-optimization method that decouples the complex computation scheme from the meta-gradient. |
Xiaomeng Fan; Yuwei Wu; Zhi Gao; Yunde Jia; Mehrtash Harandi; |
567 | Faster Algorithms for Weak Backdoors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We design a new algorithm for WB(3CNF, 0-Val) by reducing it to a local search variant of 3-SAT. |
Serge Gaspers; Andrew Kaploun; |
568 | A Divide and Conquer Algorithm for Predict+Optimize with Non-convex Problems Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work we propose a novel divide and conquer algorithm based on transition points to reason over exact optimization problems and predict the coefficients using the optimization loss. |
Ali Ugur Guler; Emir Demirović; Jeffrey Chan; James Bailey; Christopher Leckie; Peter J. Stuckey; |
569 | Computing Diverse Shortest Paths Efficiently: A Theoretical and Experimental Study Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Finding diverse solutions in combinatorial problems recently has received considerable attention (Baste et al. 2020; Fomin et al. 2020; Hanaka et al. 2021). In this paper we study the following type of problems: given an integer k, the problem asks for k solutions such that the sum of pairwise (weighted) Hamming distances between these solutions is maximized. |
Tesshu Hanaka; Yasuaki Kobayashi; Kazuhiro Kurita; See Woo Lee; Yota Otachi; |
570 | Optimizing Binary Decision Diagrams with MaxSAT for Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In fact, due to their structure (especially with small sizes), these models are inherently understandable by humans. Recently, several exact methods for computing such models have been proposed to overcome weaknesses of traditional heuristic methods by providing more compact models or better prediction quality. |
Hao Hu; Marie-José Huguet; Mohamed Siala; |
571 | Using MaxSAT for Efficient Explanations of Tree Ensembles Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the inherent propositional nature of TEs, this paper proposes to circumvent the need for linear constraints and instead employ an optimization engine for pure propositional logic to efficiently handle the prediction. |
Alexey Ignatiev; Yacine Izza; Peter J. Stuckey; Joao Marques-Silva; |
572 | Finding Backdoors to Integer Programs: A Monte Carlo Tree Search Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose BaMCTS, a Monte Carlo Tree Search framework for finding backdoors to MIPs. |
Elias B. Khalil; Pashootan Vaezipoor; Bistra Dilkina; |
573 | Learning to Search in Local Branching Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate the relation between the size of the search neighborhood and the behavior of the underlying LB algorithm, and we devise a learning-based framework for guiding the neighborhood search of the LB heuristic. |
Defeng Liu; Matteo Fischetti; Andrea Lodi; |
574 | Analysis of Pure Literal Elimination Rule for Non-uniform Random (MAX) K-SAT Problem with An Arbitrary Degree Distribution Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we analyse the performance of the pure literal elimination rule. |
Oleksii Omelchenko; Andrei A. Bulatov; |
575 | The SoftCumulative Constraint with Quadratic Penalty Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a checker and a filtering algorithm for the SoftCumulative, which are inspired by the powerful energetic reasoning rule for the Cumulative. |
Yanick Ouellet; Claude-Guy Quimper; |
576 | Efficient Vertex-Oriented Polytopic Projection for Web-Scale Applications Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We develop an intuition |