Paper Digest: ACL 2020 Highlights
Readers can also choose to read this highlight article on our console, which allows users to filter papers by keywords and find related papers.
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2020, it is being held online due to the COVID-19 pandemic. There were 3,429 paper submissions, of which 778 were accepted. ~90 papers also published their code (code download link).
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2020 Papers
No. | Title | Authors | Highlight | Related Papers | Related Patents |
---|---|---|---|---|---|
1 | Learning to Understand Child-directed and Adult-directed Speech | Lieke Gelderloos, Grzegorz Chrupała, Afra Alishahi | This study explores the effect of child-directed speech when learning to extract semantic information from speech directly. | related papers | related patents |
2 | Predicting Depression in Screening Interviews from Latent Categorization of Interview Prompts | Alex Rinaldi, Jean Fox Tree, Snigdha Chaturvedi | We propose JLPC, a model that analyzes interview transcripts to identify depression while jointly categorizing interview prompts into latent categories. | related papers | related patents |
3 | Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling | Zihan Liu, Genta Indra Winata, Peng Xu, Pascale Fung | In this paper, we propose a Coarse-to-fine approach (Coach) for cross-domain slot filling. | related papers | related patents |
4 | Designing Precise and Robust Dialogue Response Evaluators | Tianyu Zhao, Divesh Lala, Tatsuya Kawahara | In this work, we propose to build a reference-free evaluator and exploit the power of semi-supervised training and pretrained (masked) language models. | related papers | related patents |
5 | Dialogue State Tracking with Explicit Slot Connection Modeling | Yawen Ouyang, Moxin Chen, Xinyu Dai, Yinggong Zhao, Shujian Huang, Jiajun CHEN | To handle these phenomena, we propose a Dialogue State Tracking with Slot Connections (DST-SC) model to explicitly consider slot correlations across different domains. | related papers | related patents |
6 | Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy | Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu | To address this issue, this paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge. | related papers | related patents |
7 | Guiding Variational Response Generator to Exploit Persona | Bowen Wu, MengYuan Li, Zongsheng Wang, Yifu Chen, Derek F. Wong, qihang feng, Junhong Huang, Baoxun Wang | This paper proposes to adopt the personality-related characteristics of human conversations into variational response generators, by designing a specific conditional variational autoencoder based deep model with two new regularization terms employed to the loss function, so as to guide the optimization towards the direction of generating both persona-aware and relevant responses. | related papers | related patents |
8 | Large Scale Multi-Actor Generative Dialog Modeling | Alex Boyd, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro | This work introduces the Generative Conversation Control model, an augmented and fine-tuned GPT-2 language model that conditions on past reference conversations to probabilistically model multi-turn conversations in the actor’s persona. | related papers | related patents |
9 | PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable | Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang | Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge grounded dialogues, and conversational question answering. | related papers | related patents |
10 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network | Yangming Li, Kaisheng Yao, Libo Qin, Wanxiang Che, Xiaolong Li, Ting Liu | In this paper, we study slot consistency for building reliable NLG systems with all slot values of input dialogue act (DA) properly generated in output sentences. | related papers | related patents |
11 | Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations | Samuel Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson | We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task. | related papers | related patents |
12 | Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking | Giovanni Campagna, Agata Foryciarz, Mehrad Moradshahi, Monica Lam | This paper proposes a new zero-shot transfer learning technique for dialogue state tracking where the in-domain training data are all synthesized from an abstract dialogue model and the ontology of the domain. | related papers | related patents |
13 | A Complete Shift-Reduce Chinese Discourse Parser with Robust Dynamic Oracle | Shyh-Shiun Hung, Hen-Hsen Huang, Hsin-Hsi Chen | This work proposes a standalone, complete Chinese discourse parser for practical applications. | related papers | related patents |
14 | TransS-Driven Joint Learning Architecture for Implicit Discourse Relation Recognition | Ruifang He, Jian Wang, Fengyu Guo, Yugui Han | Therefore, we propose a novel TransS-driven joint learning architecture to address the issues. | related papers | related patents |
15 | A Study of Non-autoregressive Model for Sequence Generation | Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, sheng zhao, Tie-Yan Liu | To quantify such dependency, we propose an analysis model called CoMMA to characterize the difficulty of different NAR sequence generation tasks. | related papers | related patents |
16 | Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage | Ashish V. Thapliyal, Radu Soricut | We describe an approach called Pivot-Language Generation Stabilization (PLuGS), which leverages directly at training time both existing English annotations (gold data) as well as their machine-translated versions (silver data); at run-time, it generates first an English caption and then a corresponding target-language caption. | related papers | related patents |
17 | Fact-based Text Editing | Hayate Iso, Chao Qiao, Hang Li | We propose a novel text editing task, referred to as *fact-based text editing*, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples). | related papers | related patents |
18 | Few-Shot NLG with Pre-Trained Language Model | Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang | In this work, we propose the new task of few-shot natural language generation. | related papers | related patents |
19 | Fluent Response Generation for Conversational Question Answering | Ashutosh Baheti, Alan Ritter, Kevin Small | In this work, we propose a method for situating QA responses within a SEQ2SEQ NLG approach to generate fluent grammatical answer responses while maintaining correctness. | related papers | related patents |
20 | Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs | Dong Bok Lee, Seanie Lee, Woo Tae Jeong, Donghwan Kim, Sung Ju Hwang | In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. | related papers | related patents |
21 | Learning to Ask More: Semi-Autoregressive Sequential Question Generation under Dual-Graph Interaction | Zi Chai, Xiaojun Wan | To this end, we generate questions in a semi-autoregressive way. Our model divides questions into different groups and generates each group of them in parallel. | related papers | related patents |
22 | Neural Syntactic Preordering for Controlled Paraphrase Generation | Tanya Goyal, Greg Durrett | Our work, inspired by pre-ordering literature in machine translation, uses syntactic transformations to softly “reorder” the source sentence and guide our neural paraphrasing model. | related papers | related patents |
23 | Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders | Yu Duan, Canwen Xu, Jiaxin Pei, Jialong Han, Chenliang Li | In this paper, we present a new framework named Pre-train and Plug-in Variational Auto-Encoder (PPVAE) towards flexible conditional text generation. | related papers | related patents |
24 | Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order | Yi Liao, Xin Jiang, Qun Liu | In this paper, we propose a probabilistic masking scheme for the masked language model, which we call probabilistically masked language model (PMLM). | related papers | related patents |
25 | Reverse Engineering Configurations of Neural Text Generation Models | Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins | In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated some piece of text. | related papers | related patents |
26 | Review-based Question Generation with Adaptive Instance Transfer and Augmentation | Qian Yu, Lidong Bing, Qiong Zhang, Wai Lam, Luo Si | To obtain proper training instances for the generation model, we propose an iterative learning framework with adaptive instance transfer and augmentation. | related papers | related patents |
27 | TAG : Type Auxiliary Guiding for Code Comment Generation | Ruichu Cai, Zhihao Liang, Boyan Xu, zijian li, Yuexing Hao, Yao Chen | In order to address the issues above, we propose a Type Auxiliary Guiding encoder-decoder framework for the code comment generation task which considers the source code as an N-ary tree with type information associated with each node. | related papers | related patents |
28 | Unsupervised Paraphrasing by Simulated Annealing | Xianggen Liu, Lili Mou, Fandong Meng, Hao Zhou, Jie Zhou, Sen Song | We propose UPSA, a novel approach that accomplishes Unsupervised Paraphrasing by Simulated Annealing. | related papers | related patents |
29 | A Joint Model for Document Segmentation and Segment Labeling | Joe Barrow, Rajiv Jain, Vlad Morariu, Varun Manjunatha, Douglas Oard, Philip Resnik | We introduce Segment Pooling LSTM (S-LSTM), which is capable of jointly segmenting a document and labeling segments. | related papers | related patents |
30 | Contextualized Weak Supervision for Text Classification | Dheeraj Mekala, Jingbo Shang | In this paper, we propose a novel framework ConWea, providing contextualized weak supervision for text classification. | related papers | related patents |
31 | Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks | Yufeng Zhang, Xueli Yu, Zeyu Cui, Shu Wu, Zhongzhen Wen, Liang Wang | Therefore in this work, to overcome such problems, we propose TextING for inductive text classification via GNN. | related papers | related patents |
32 | Neural Topic Modeling with Bidirectional Adversarial Training | Rui Wang, Xuemeng Hu, Deyu Zhou, Yulan He, Yuxuan Xiong, Chenchen Ye, Haiyang Xu | To address these limitations, we propose a neural topic modeling approach, called Bidirectional Adversarial Topic (BAT) model, which represents the first attempt of applying bidirectional adversarial training for neural topic modeling. | related papers | related patents |
33 | Text Classification with Negative Supervision | Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Chenhui Chu, Yuki Arase | To address this problem, we propose a simple multitask learning model that uses negative supervision. | related papers | related patents |
34 | Content Word Aware Neural Machine Translation | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita | To address this limitation, we first utilize word frequency information to distinguish between content and function words in a sentence, and then design a content word-aware NMT to improve translation performance. | related papers | related patents |
35 | Evaluating Explanation Methods for Neural Machine Translation | Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi | To this end, this paper proposes a principled metric based on fidelity with regard to the predictive behavior of the NMT model. | related papers | related patents |
36 | Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation | Junliang Guo, Linli Xu, Enhong Chen | In this work, we introduce a jointly masked sequence-to-sequence model and explore its application on non-autoregressive neural machine translation (NAT). | related papers | related patents |
37 | Learning Source Phrase Representations for Neural Machine Translation | Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang | In this paper, we first propose an attentive phrase representation generation mechanism which is able to generate phrase representations from corresponding token representations. In addition, we incorporate the generated phrase representations into the Transformer translation model to enhance its ability to capture long-distance relationships. | related papers | related patents |
38 | Lipschitz Constrained Parameter Initialization for Deep Transformers | Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang | In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers. | related papers | related patents |
39 | Location Attention for Extrapolation to Longer Sequences | Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni | In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackle it. We then focus on a specific type of extrapolation which is especially useful for natural language processing: generalization to sequences that are longer than the training ones. | related papers | related patents |
40 | Multiscale Collaborative Deep Models for Neural Machine Translation | Xiangpeng Wei, Heng Yu, Yue Hu, Yue Zhang, Rongxiang Weng, Weihua Luo | In this paper, we present a MultiScale Collaborative (MSC) framework to ease the training of NMT models that are substantially deeper than those used previously. | related papers | related patents |
41 | Norm-Based Curriculum Learning for Neural Machine Translation | Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao | In this paper, we aim to improve the efficiency of training an NMT by introducing a novel norm-based curriculum learning method. | related papers | related patents |
42 | Opportunistic Decoding with Timely Correction for Simultaneous Translation | Renjie Zheng, Mingbo Ma, Baigong Zheng, Kaibo Liu, Liang Huang | We propose an opportunistic decoding technique with timely correction ability, which always (over-)generates a certain amount of extra words at each step to keep the audience on track with the latest information. | related papers | related patents |
43 | A Formal Hierarchy of RNN Architectures | William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav | We develop a formal hierarchy of the expressive capacity of RNN architectures. | related papers | related patents |
44 | A Three-Parameter Rank-Frequency Relation in Natural Languages | Chenchen Ding, Masao Utiyama, Eiichiro Sumita | We show that the rank-frequency relation in textual data follows $f \propto r^{-\alpha}(r+\gamma)^{-\beta}$, where $f$ is the token frequency and $r$ is the rank by frequency, with ($\alpha$, $\beta$, $\gamma$) as parameters. | related papers | related patents |
45 | Dice Loss for Data-imbalanced NLP Tasks | Xiaoya Li, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, Jiwei Li | In this paper, we propose to use dice loss in replacement of the standard cross-entropy objective for data-imbalanced NLP tasks. | related papers | related patents |
46 | Emergence of Syntax Needs Minimal Supervision | Raphaël Bailly, Kata Gábor | This paper is a theoretical contribution to the debate on the learnability of syntax from a corpus without explicit syntax-specific guidance. | related papers | related patents |
47 | Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese | Tatsuki Kuribayashi, Takumi Ito, Jun Suzuki, Kentaro Inui | In this study, we explore whether the LM-based method is valid for analyzing the word order. | related papers | related patents |
48 | GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media | Yi-Ju Lu, Cheng-Te Li | Given the source short-text tweet and the corresponding sequence of retweet users without text comments, we aim to predict whether the source tweet is fake and to generate an explanation by highlighting the evidence on suspicious retweeters and the words they concern. | related papers | related patents |
49 | Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection | Lei Zhong, Juan Cao, Qiang Sheng, Junbo Guo, Ziang Wang | To overcome the first two limitations, we propose Topic-Post-Comment Graph Convolutional Network (TPC-GCN), which integrates the information from the graph structure and content of topics, posts, and comments for post-level controversy detection. | related papers | related patents |
50 | Predicting the Topical Stance and Political Leaning of Media using Tweets | Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov | In this paper, we propose a cascaded method that uses unsupervised learning to ascertain the stance of Twitter users with respect to a polarizing topic by leveraging their retweet behavior; then, it uses supervised learning based on user labels to characterize both the general political leaning of online media and of popular Twitter users, as well as their stance with respect to the target polarizing topic. | related papers | related patents |
51 | Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora | Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg | We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word. | related papers | related patents |
52 | CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation | Lei Shen, Yang Feng | To alleviate these problems, we propose a novel framework named Curriculum Dual Learning (CDL) which extends the emotion-controllable response generation to a dual task to generate emotional responses and emotional queries alternatively. | related papers | related patents |
53 | Efficient Dialogue State Tracking by Selectively Overwriting Memory | Sungdong Kim, Sohee Yang, Gyuwan Kim, Sang-Woo Lee | Here, we consider dialogue state as an explicit fixed-sized memory and propose a selectively overwriting mechanism for more efficient DST. | related papers | related patents |
54 | End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2 | Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim | In this paper, we present an end-to-end neural architecture for dialogue systems that addresses both challenges above. | related papers | related patents |
55 | Evaluating Dialogue Generation Systems via Response Selection | Shiki Sato, Reina Akama, Hiroki Ouchi, Jun Suzuki, Kentaro Inui | Specifically, we propose to construct test sets filtering out some types of false candidates: (i) those unrelated to the ground-truth response and (ii) those acceptable as appropriate responses. | related papers | related patents |
56 | Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection | Yefei Zha, Ruobing Li, Hui Lin | In this paper, we propose a novel approach for off-topic spoken response detection with high off-topic recall on both seen and unseen prompts. | related papers | related patents |
57 | Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment | Yinpei Dai, Hangyu Li, Chengguang Tang, Yongbin Li, Jian Sun, Xiaodan Zhu | In this paper, we propose the Meta-Dialog System (MDS), which combines the advantages of both meta-learning approaches and human-machine collaboration. | related papers | related patents |
58 | Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge | Keqing He, Yuanmeng Yan, Weiran XU | In this paper, we propose a novel knowledge-enhanced slot tagging model to integrate contextual representation of input text and the large-scale lexical background knowledge. | related papers | related patents |
59 | Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition | Ryuichi Takanobu, Runze Liang, Minlie Huang | To avoid explicitly building a user simulator beforehand, we propose Multi-Agent Dialog Policy Learning, which regards both the system and the user as the dialog agents. | related papers | related patents |
60 | Paraphrase Augmented Task-Oriented Dialog Generation | Silin Gao, Yichi Zhang, Zhijian Ou, Zhou Yu | We propose a paraphrase augmented response generation (PARG) framework that jointly trains a paraphrase model and a response generation model to improve the dialog generation performance. | related papers | related patents |
61 | Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation | Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, YIPING SONG, Xiaojiang Liu, Nevin L. Zhang | In this paper, we propose to create the document memory with some anticipated responses in mind. | related papers | related patents |
62 | Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation | Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang | To overcome this limitation, we propose a novel reward learning approach for semi-supervised policy learning. | related papers | related patents |
63 | Towards Unsupervised Language Understanding and Generation by Joint Dual Learning | Shang-Yu Su, Chao-Wei Huang, Yun-Nung Chen | However, the prior work still learned both components in a supervised manner; instead, this paper introduces a general learning framework to effectively exploit such duality, providing flexibility of incorporating both supervised and unsupervised learning algorithms to train language understanding and generation models in a joint fashion. | related papers | related patents |
64 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | Shikib Mehri, Maxine Eskenazi | To this end, this paper presents USR, an UnSupervised and Reference-free evaluation metric for dialog. | related papers | related patents |
65 | Explicit Semantic Decomposition for Definition Generation | Jiahuan Li, Yu Bao, Shujian Huang, Xinyu Dai, Jiajun CHEN | In this paper, we propose ESD, namely Explicit Semantic Decomposition for definition Generation, which explicitly decomposes the meaning of words into semantic components, and models them with discrete latent variables for definition generation. | related papers | related patents |
66 | Improved Natural Language Generation via Loss Truncation | Daniel Kang, Tatsunori Hashimoto | We propose loss truncation: a simple and scalable procedure which adaptively removes high log loss examples as a way to optimize for distinguishability. | related papers | related patents |
67 | Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks | Yanbin Zhao, Lu Chen, Zhi Chen, Ruisheng Cao, Su Zhu, Kai Yu | In this work, we propose a novel graph encoding framework which can effectively explore the edge relations. | related papers | related patents |
68 | Rigid Formats Controlled Text Generation | Piji Li, Haisong Zhang, Xiaojiang Liu, Shuming Shi | Therefore, we propose a simple and elegant framework named SongNet to tackle this problem. | related papers | related patents |
69 | Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation | Kaustubh Dhole, Christopher D. Manning | We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules which transform declarative sentences into question-answer pairs. | related papers | related patents |
70 | An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering | Jay Kumar, Junming Shao, Salah Uddin, Wazir Ali | Therefore, in this paper, we propose an Online Semantic-enhanced Dirichlet Model for short text stream clustering, called OSDM, which integrates the word-occurrence semantic information (i.e., context) into a new graphical model and clusters each arriving short text automatically in an online way. | related papers | related patents |
71 | Generative Semantic Hashing Enhanced via Boltzmann Machines | Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen | In this paper, to introduce correlations among the bits of hash codes, we propose to employ the distribution of Boltzmann machine as the variational posterior. | related papers | related patents |
72 | Interactive Construction of User-Centric Dictionary for Text Analytics | Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa | To optimize the interaction, we propose a new algorithm that effectively captures an analyst’s intention starting from only a small number of sample terms. | related papers | related patents |
73 | Tree-Structured Neural Topic Model | Masaru Isonuma, Junichiro Mori, Danushka Bollegala, Ichiro Sakata | This paper presents a tree-structured neural topic model, which has a topic distribution over a tree with an infinite number of branches. | related papers | related patents |
74 | Unsupervised FAQ Retrieval with Question Generation and BERT | Yosi Mass, Boaz Carmeli, Haggai Roitman, David Konopnicki | We present a fully unsupervised method that exploits the FAQ pairs to train two BERT models. | related papers | related patents |
75 | “The Boating Store Had Its Best Sail Ever”: Pronunciation-attentive Contextualized Pun Recognition | Yichao Zhou, Jyun-Yu Jiang, Jieyu Zhao, Kai-Wei Chang, Wei Wang | In this paper, we propose Pronunciation-attentive Contextualized Pun Recognition (PCPR) to perceive human humor, detect if a sentence contains puns and locate them in the sentence. | related papers | related patents |
76 | Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning | Joongbo Shin, Yoonhyung Lee, Seunghyun Yoon, Kyomin Jung | To resolve this limitation, we propose a novel deep bidirectional language model called a Transformer-based Text Autoencoder (T-TA). | related papers | related patents |
77 | Fine-grained Interest Matching for Neural News Recommendation | Heyuan Wang, Fangzhao Wu, Zheng Liu, Xing Xie | In this paper, we propose FIM, a Fine-grained Interest Matching method for neural news recommendation. | related papers | related patents |
78 | Interpretable Operational Risk Classification with Semi-Supervised Variational Autoencoder | Fan Zhou, Shengming Zhang, Yi Yang | To tackle these challenges, we present a semi-supervised text classification framework that integrates multi-head attention mechanism with Semi-supervised variational inference for Operational Risk Classification (SemiORC). | related papers | related patents |
79 | Interpreting Twitter User Geolocation | Ting Zhong, Tianliang Wang, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Yi Yang | In this work, we adopt influence functions to interpret the behavior of GNN-based models by identifying the importance of training users when predicting the locations of the testing users. | related papers | related patents |
80 | Modeling Code-Switch Languages Using Bilingual Parallel Corpus | Grandee Lee, Haizhou Li | We propose a bilingual attention language model (BALM) that simultaneously performs language modeling objective with a quasi-translation objective to model both the monolingual as well as the cross-lingual sequential dependency. | related papers | related patents |
81 | SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check | Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi | This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). | related papers | related patents |
82 | Spelling Error Correction with Soft-Masked BERT | Shaohua Zhang, Haoran Huang, Jicong Liu, Hang Li | In this work, we propose a novel neural architecture to address the aforementioned issue, which consists of a network for error detection and a network for error correction based on BERT, with the former being connected to the latter with what we call soft-masking technique. | related papers | related patents |
83 | A Frame-based Sentence Representation for Machine Reading Comprehension | Shaoru Guo, Ru Li, Hongye Tan, Xiaoli Li, Yong Guan, Hongyan Zhao, Yueping Zhang | To bridge the gap, we proposed a novel Frame-based Sentence Representation (FSR) method, which employs frame semantic knowledge to facilitate sentence modelling. | related papers | related patents |
84 | A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation | Jan Deriu, Katsiaryna Mlynchyk, Philippe Schläpfer, Alvaro Rodrigo, Dirk von Grünigen, Nicolas Kaiser, Kurt Stockinger, Eneko Agirre, Mark Cieliebak | In this paper, we introduce a novel methodology to efficiently construct a corpus for question answering over structured data. | related papers | related patents |
85 | Contextualized Sparse Representations for Real-Time Open-Domain Question Answering | Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang | In this paper, we aim to improve the quality of each phrase embedding by augmenting it with a contextualized sparse representation (Sparc). | related papers | related patents |
86 | Dynamic Sampling Strategies for Multi-Task Reading Comprehension | Ananth Gottumukkala, Dheeru Dua, Sameer Singh, Matt Gardner | We show that a simple dynamic sampling strategy, selecting instances for training proportional to the multi-task model’s current performance on a dataset relative to its single task performance, gives substantive gains over prior multi-task sampling strategies, mitigating the catastrophic forgetting that is common in multi-task learning. | related papers | related patents |
87 | Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension | Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang | In this paper, we propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision: (1) A mixed MRC task, which translates the question or passage to other languages and builds cross-lingual question-passage pairs; (2) A language-agnostic knowledge masking task by leveraging knowledge phrases mined from web. | related papers | related patents |
88 | Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading | Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael Lyu, Steven C.H. Hoi | In this paper, we present a new framework of conversational machine reading that comprises a novel Explicit Memory Tracker (EMT) to track whether conditions listed in the rule text have already been satisfied to make a decision. | related papers | related patents |
89 | Injecting Numerical Reasoning Skills into Language Models | Mor Geva, Ankit Gupta, Jonathan Berant | In this work, we show that numerical reasoning is amenable to automatic data generation, and thus one can inject this skill into pre-trained LMs, by generating large amounts of data, and training in a multi-task setup. | related papers | related patents |
90 | Learning to Identify Follow-Up Questions in Conversational Question Answering | Souvik Kundu, Qian Lin, Hwee Tou Ng | In this paper, we introduce a new follow-up question identification task. | related papers | related patents |
91 | Query Graph Generation for Answering Multi-hop Complex Questions from Knowledge Bases | Yunshi Lan, Jing Jiang | In this paper, we handle both types of complexity at the same time. | related papers | related patents |
92 | A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers | Shen-yun Miao, Chao-Chun Liang, Keh-Yih Su | We present ASDiv (Academia Sinica Diverse MWP Dataset), a diverse (in terms of both language patterns and problem types) English math word problem (MWP) corpus for evaluating the capability of various MWP solvers. | related papers | related patents |
93 | Improving Image Captioning Evaluation by Considering Inter References Variance | Yanzhi Yi, Hangyu Deng, Jinglu Hu | In this paper, we propose a novel metric based on BERTScore that could handle such a challenge and extend BERTScore with a few new features appropriately for image captioning evaluation. | related papers | related patents |
94 | Revisiting the Context Window for Cross-lingual Word Embeddings | Ryokan Ri, Yoshimasa Tsuruoka | In this work, we provide a thorough evaluation, in various languages, domains, and tasks, of bilingual embeddings trained with different context windows. | related papers | related patents |
95 | Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders | Terra Blevins, Luke Zettlemoyer | We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense. | related papers | related patents |
96 | Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection | Srijan Bansal, Vishal Garimella, Ayush Suhane, Jasabanta Patro, Animesh Mukherjee | In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. | related papers | related patents |
97 | DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification | Lianwei Wu, Yuan Rao, yongqiang zhao, Hao Liang, Ambreen Nazir | In this paper, we propose a Decision Tree-based Co-Attention model (DTCA) to discover evidence for explainable claim verification. | related papers | related patents |
98 | Towards Conversational Recommendation over Multi-Type Dialogs | Zeming Liu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu | We focus on the study of conversational recommendation in the context of multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user’s interests and feedback. | related papers | related patents |
99 | Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification | Guangfeng Yan, Lu Fan, Qimai Li, Han Liu, Xiaotong Zhang, Xiao-Ming Wu, Albert Y.S. Lam | This paper proposes a semantic-enhanced Gaussian mixture model (SEG) for unknown intent detection. | related papers | related patents |
100 | Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen | Yixin Cao, Ruihao Shui, Liangming Pan, Min-Yen Kan, Zhiyuan Liu, Tat-Seng Chua | We propose a new task of expertise style transfer and contribute a manually annotated dataset with the goal of alleviating such cognitive biases. | related papers | related patents |
101 | Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints | Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, Changyou Chen | In this paper, for the first time, we propose a novel Transformer-based generation framework to achieve the goal. | related papers | related patents |
102 | Dynamic Memory Induction Networks for Few-Shot Text Classification | Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu | This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification. | related papers | related patents |
103 | Exclusive Hierarchical Decoding for Deep Keyphrase Generation | Wang Chen, Hou Pong Chan, Piji Li, Irwin King | To overcome these limitations, we propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism. | related papers | related patents |
104 | Hierarchy-Aware Global Model for Hierarchical Text Classification | Jie Zhou, Chunping Ma, Dingkun Long, Guangwei Xu, Ning Ding, Haoyu Zhang, Pengjun Xie, Gongshen Liu | In this paper, we formulate the hierarchy as a directed graph and introduce hierarchy-aware structure encoders for modeling label dependencies. | related papers | related patents |
105 | Keyphrase Generation for Scientific Document Retrieval | Florian Boudin, Ygor Gallina, Akiko Aizawa | This study provides empirical evidence that such models can significantly improve retrieval performance, and introduces a new extrinsic evaluation framework that allows for a better understanding of the limitations of keyphrase generation models. | related papers | related patents |
106 | A Graph Auto-encoder Model of Derivational Morphology | Valentin Hofmann, Hinrich Schütze, Janet Pierrehumbert | We present a graph auto-encoder that learns embeddings capturing information about the compatibility of affixes and stems in derivation. | related papers | related patents |
107 | Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell | Djamé Seddah, Farah Essaidi, Amal Fethi, Matthieu Futeral, Benjamin Muller, Pedro Javier Ortiz Suárez, Benoît Sagot, Abhishek Srivastava | We introduce the first treebank for a romanized user-generated content variety of Algerian, a North-African Arabic dialect known for its frequent usage of code-switching. | related papers | related patents |
108 | Crawling and Preprocessing Mailing Lists At Scale for Dialog Analysis | Janek Bevendorff, Khalid Al Khatib, Martin Potthast, Benno Stein | This paper introduces the Webis Gmane Email Corpus 2019, the largest publicly available and fully preprocessed email corpus to date. | related papers | related patents |
109 | Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences | Dmitry Nikolaev, Ofir Arviv, Taelin Karidi, Neta Kenneth, Veronika Mitnik, Lilja Maria Saeboe, Omri Abend | We propose a framework for extracting divergence patterns for any language pair from a parallel corpus, building on Universal Dependencies. | related papers | related patents |
110 | Generating Counter Narratives against Online Hate Speech: Data and Strategies | Serra Sinem Tekiroğlu, Yi-Ling Chung, Marco Guerini | Being aware of the aforementioned limitations, we present a study on how to collect responses to hate effectively, employing large scale unsupervised language models such as GPT-2 for the generation of silver data, and the best annotation strategies/neural architectures that can be used for data filtering before expert validation/post-editing. | related papers | related patents |
111 | KLEJ: Comprehensive Benchmark for Polish Language Understanding | Piotr Rybak, Robert Mroczkowski, Janusz Tracz, Ireneusz Gawlik | To alleviate this issue, we introduce a comprehensive multi-task benchmark for Polish language understanding, accompanied by an online leaderboard. | related papers | related patents |
112 | Learning and Evaluating Emotion Lexicons for 91 Languages | Sven Buechel, Susanna Rücker, Udo Hahn | In order to break this bottleneck, we here introduce a methodology for creating almost arbitrarily large emotion lexicons for any target language. | related papers | related patents |
113 | Multi-Hypothesis Machine Translation Evaluation | Marina Fomicheva, Lucia Specia, Francisco Guzmán | In this paper, we propose an alternative approach: instead of modelling linguistic variation in human reference we exploit the MT model uncertainty to generate multiple diverse translations and use these: (i) as surrogates to reference translations; (ii) to obtain a quantification of translation variability to either complement existing metric scores or (iii) replace references altogether. | related papers | related patents |
114 | Multimodal Quality Estimation for Machine Translation | Shu Okabe, Frédéric Blain, Lucia Specia | We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE. | related papers | related patents |
115 | PuzzLing Machines: A Challenge on Learning From Small Data | Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych | To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students. | related papers | related patents |
116 | The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain | Annemarie Friedrich, Heike Adel, Federico Tomazic, Johannes Hingerl, Renou Benteau, Anika Marusczyk, Lukas Lange | This paper presents a new challenging information extraction task in the domain of materials science. | related papers | related patents |
117 | The TechQA Dataset | Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Michael McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avi Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang | We introduce TECHQA, a domain-adaptation question answering dataset for the technical support domain. | related papers | related patents |
118 | iSarcasm: A Dataset of Intended Sarcasm | Silviu Oprea, Walid Magdy | We show the limitations of previous labelling methods in capturing intended sarcasm and introduce the iSarcasm dataset of tweets labeled for sarcasm directly by their authors. | related papers | related patents |
119 | AMR Parsing via Graph-Sequence Iterative Inference | Deng Cai, Wai Lam | We propose a new end-to-end model that treats AMR parsing as a series of dual decisions on the input sequence and the incrementally constructed graph. | related papers | related patents |
120 | A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal | Demian Gholipour Ghalandari, Chris Hokamp, Nghia The Pham, John Glover, Georgiana Ifrim | This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters. | related papers | related patents |
121 | Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization | Junnan Zhu, Yu Zhou, Jiajun Zhang, Chengqing Zong | In this paper, we propose a novel method inspired by the translation pattern in the process of obtaining a cross-lingual summary. | related papers | related patents |
122 | Examining the State-of-the-Art in News Timeline Summarization | Demian Gholipour Ghalandari, Georgiana Ifrim | In this paper, we compare different TLS strategies using appropriate evaluation frameworks, and propose a simple and effective combination of methods that improves over the state-of-the-art on all tested benchmarks. | related papers | related patents |
123 | Improving Truthfulness of Headline Generation | Kazuki Matsumaru, Sho Takase, Naoaki Okazaki | This paper explores improving the truthfulness in headline generation on two popular datasets. | related papers | related patents |
124 | SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization | Yang Gao, Wei Zhao, Steffen Eger | We propose SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques. | related papers | related patents |
125 | Self-Attention Guided Copy Mechanism for Abstractive Summarization | Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He, Bowen Zhou | In this work, we propose a Transformer-based model to enhance the copy mechanism. | related papers | related patents |
126 | Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation | Weixin Liang, James Zou, Zhou Yu | To alleviate this problem, we formulate dialog evaluation as a comparison task. | related papers | related patents |
127 | Conversational Word Embedding for Retrieval-Based Dialog System | Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu | In this paper, we propose a conversational word embedding method named PR-Embedding, which utilizes the conversation pairs <post, reply> to learn word embedding. | related papers | related patents |
128 | Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network | Yutai Hou, Wanxiang Che, Yongkui Lai, Zhihan Zhou, Yijia Liu, Han Liu, Ting Liu | In this paper, we explore the slot tagging with only a few labeled support sentences (a.k.a. few-shot). | related papers | related patents |
129 | Learning Dialog Policies from Weak Demonstrations | Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen | We introduce Reinforced Fine-tune Learning, an extension to DQfD, enabling us to overcome the domain gap between the datasets and the environment. | related papers | related patents |
130 | MuTual: A Dataset for Multi-Turn Dialogue Reasoning | Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang, Ming Zhou | To facilitate the conversation reasoning research, we introduce MuTual, a novel dataset for Multi-Turn dialogue Reasoning, consisting of 8,860 manually annotated dialogues based on Chinese student English listening comprehension exams. | related papers | related patents |
131 | You Impress Me: Dialogue Generation via Mutual Persona Perception | Qian Liu, Yihong Chen, Bei Chen, Jian-Guang LOU, Zixuan Chen, Bin Zhou, Dongmei Zhang | Motivated by this, we propose P^2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding. | related papers | related patents |
132 | Bridging Anaphora Resolution as Question Answering | Yufang Hou | In this paper, we cast bridging anaphora resolution as question answering based on context. | related papers | related patents |
133 | Dialogue Coherence Assessment Without Explicit Dialogue Act Labels | Mohsen Mesgar, Sebastian Bücker, Iryna Gurevych | We address these issues by introducing a novel approach to dialogue coherence assessment. | related papers | related patents |
134 | Fast and Accurate Non-Projective Dependency Tree Linearization | Xiang Yu, Simon Tannert, Ngoc Thang Vu, Jonas Kuhn | We propose a graph-based method to tackle the dependency tree linearization task. | related papers | related patents |
135 | Semantic Graphs for Generating Deep Questions | Liangming Pan, Yuxi Xie, Yansong Feng, Tat-Seng Chua, Min-Yen Kan | This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information about the input passage. | related papers | related patents |
136 | A Novel Cascade Binary Tagging Framework for Relational Triple Extraction | Zhepei Wei, Jianlin Su, Yue Wang, Yuan Tian, Yi Chang | In this work, we introduce a fresh perspective to revisit the relational triple extraction task and propose a novel cascade binary tagging framework (CasRel) derived from a principled problem formulation. | related papers | related patents |
137 | In Layman’s Terms: Semi-Open Relation Extraction from Scientific Texts | Ruben Kruiper, Julian Vincent, Jessica Chen-Burger, Marc Desmulliez, Ioannis Konstas | In this work we combine the output of both types of systems to achieve Semi-Open Relation Extraction, a new task that we explore in the Biology domain. | related papers | related patents |
138 | NAT: Noise-Aware Training for Robust Neural Sequence Labeling | Marcin Namysl, Sven Behnke, Joachim Köhler | To this end, we formulate the noisy sequence labeling problem, where the input may undergo an unknown noising process and propose two Noise-Aware Training (NAT) objectives that improve robustness of sequence labeling performed on perturbed input: Our data augmentation method trains a neural model using a mixture of clean and noisy samples, whereas our stability training algorithm encourages the model to create a noise-invariant latent representation. | related papers | related patents |
139 | Named Entity Recognition without Labelled Data: A Weak Supervision Approach | Pierre Lison, Jeremy Barnes, Aliaksandr Hubin, Samia Touileb | This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. | related papers | related patents |
140 | Probing Linguistic Features of Sentence-Level Representations in Relation Extraction | Christoph Alt, Aleksandra Gabryszak, Leonhard Hennig | We introduce 14 probing tasks targeting linguistic properties relevant to RE, and we use them to study representations learned by more than 40 different encoder architecture and linguistic feature combinations trained on two datasets, TACRED and SemEval 2010 Task 8. | related papers | related patents |
141 | Reasoning with Latent Structure Refinement for Document-Level Relation Extraction | Guoshun Nan, Zhijiang Guo, Ivan Sekulic, Wei Lu | Unlike previous methods that may not be able to capture rich non-local interactions for inference, we propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph. | related papers | related patents |
142 | TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task | Christoph Alt, Aleksandra Gabryszak, Leonhard Hennig | In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement? | related papers | related patents |
143 | Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences | Xiangyu Duan, Baijun Ji, Hao Jia, Min Tan, Min Zhang, Boxing Chen, Weihua Luo, Yue Zhang | In this paper, we propose a new task of machine translation (MT), which is based on no parallel sentences but can refer to a ground-truth bilingual dictionary. | related papers | related patents |
144 | Boosting Neural Machine Translation with Similar Translations | Jitao XU, Josep Crego, Jean Senellart | This paper explores data augmentation methods for training Neural Machine Translation to make use of similar translations, in a comparable way a human translator employs fuzzy matches. | related papers | related patents |
145 | Character-Level Translation with Self-attention | Yingqiang Gao, Nikola I. Nikolov, Yuhuang Hu, Richard H.R. Hahnloser | We explore the suitability of self-attention models for character-level neural machine translation. | related papers | related patents |
146 | End-to-End Neural Word Alignment Outperforms GIZA++ | Thomas Zenkel, Joern Wuebker, John DeNero | We present the first end-to-end neural word alignment method that consistently outperforms GIZA++ on three data sets. | related papers | related patents |
147 | Enhancing Machine Translation with Dependency-Aware Self-Attention | Emanuele Bugliarello, Naoaki Okazaki | In this work, we investigate different approaches to incorporate syntactic knowledge in the Transformer model and also propose a novel, parameter-free, dependency-aware self-attention mechanism that improves its translation quality, especially for long sentences and in low-resource scenarios. | related papers | related patents |
148 | Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation | Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich | We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. | related papers | related patents |
149 | It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information | Emanuele Bugliarello, Sabrina J. Mielke, Antonios Anastasopoulos, Ryan Cotterell, Naoaki Okazaki | In this paper, we propose cross-mutual information (XMI): an asymmetric information-theoretic metric of machine translation difficulty that exploits the probabilistic nature of most neural machine translation models. | related papers | related patents |
150 | Language-aware Interlingua for Multilingual Neural Machine Translation | Changfeng Zhu, Heng Yu, Shanbo Cheng, Weihua Luo | In this paper, we incorporate a language-aware interlingua into the Encoder-Decoder architecture. | related papers | related patents |
151 | On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation | Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger | In this paper, we concern ourselves with reference-free machine translation (MT) evaluation where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. | related papers | related patents |
152 | Parallel Sentence Mining by Constrained Decoding | Pinzhen Chen, Nikolay Bogoychev, Kenneth Heafield, Faheem Kirefu | We present a novel method to extract parallel sentences from two monolingual corpora, using neural machine translation. | related papers | related patents |
153 | Self-Attention with Cross-Lingual Position Representation | Liang Ding, Longyue Wang, Dacheng Tao | In this paper, we augment SANs with *cross-lingual position representations* to model the bilingually aware latent structure for the input sentence. | related papers | related patents |
154 | “You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases | Dirk Hovy, Federico Bianchi, Tommaso Fornaciari | We show that, as a consequence, the output of three commercial machine translation systems (Bing, DeepL, Google) makes demographically diverse samples from five languages “sound” older and more male than the original. | related papers | related patents |
155 | MMPE: A Multi-Modal Interface for Post-Editing Machine Translation | Nico Herbig, Tim Düwel, Santanu Pal, Kalliopi Meladaki, Mahsa Monshizadeh, Antonio Krüger, Josef van Genabith | Since this paradigm shift offers potential for modalities other than mouse and keyboard, we present MMPE, the first prototype to combine traditional input modes with pen, touch, and speech modalities for PE of MT. | related papers | related patents |
156 | A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages | Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot | We use the multilingual OSCAR corpus, extracted from Common Crawl via language classification, filtering and cleaning, to train monolingual contextualized word embeddings (ELMo) for five mid-resource languages. | related papers | related patents |
157 | Will-They-Won’t-They: A Very Large Dataset for Stance Detection on Twitter | Costanza Conforti, Jakob Berndt, Mohammad Taher Pilehvar, Chryssi Giannitsarou, Flavio Toxvaerd, Nigel Collier | We present a new challenging stance detection dataset, called Will-They-Won’t-They (WT–WT), which contains 51,284 tweets in English, making it by far the largest available dataset of the type. | related papers | related patents |
158 | A Systematic Assessment of Syntactic Generalization in Neural Language Models | Jennifer Hu, Jon Gauthier, Peng Qian, Ethan Wilcox, Roger Levy | We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. | related papers | related patents |
159 | Inflecting When There’s No Majority: Limitations of Encoder-Decoder Neural Networks as Cognitive Models for German Plurals | Kate McCurdy, Sharon Goldwater, Adam Lopez | We conclude that modern neural models may still struggle with minority-class generalization. | related papers | related patents |
160 | Overestimation of Syntactic Representation in Neural Language Models | Jordan Kodner, Nitish Gupta | We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models: an n-gram model and an LSTM model trained on scrambled inputs. | related papers | related patents |
161 | Suspense in Short Stories is Predicted By Uncertainty Reduction over Neural Story Representation | David Wilmot, Frank Keller | We propose a hierarchical language model that encodes stories and computes surprise and uncertainty reduction. | related papers | related patents |
162 | You Don’t Have Time to Read This: An Exploration of Document Reading Time Prediction | Orion Weller, Jordan Hildebrandt, Ilya Reznik, Christopher Challis, E. Shannon Tass, Quinn Snell, Kevin Seppi | We seek to extend these works by examining whether or not document level predictions are effective, given additional information such as subject matter, font characteristics, and readability metrics. | related papers | related patents |
163 | A Generative Model for Joint Natural Language Understanding and Generation | Bo-Hsiang Tseng, Jianpeng Cheng, Yimai Fang, David Vandyke | In this work, we propose a generative model which couples NLU and NLG through a shared latent variable. | related papers | related patents |
164 | Automatic Detection of Generated Text is Easiest when Humans are Fooled | Daphne Ippolito, Daniel Duckworth, Chris Callison-Burch, Douglas Eck | Here, we perform careful benchmarking and analysis of three popular sampling-based decoding strategies (top-k, nucleus sampling, and untruncated random sampling) and show that improvements in decoding methods have primarily optimized for fooling humans. | related papers | related patents |
165 | Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing | Haoming Jiang, Chen Liang, Chong Wang, Tuo Zhao | To overcome this limitation, we propose a novel multi-domain NMT model using individual modules for each domain, on which we apply word-level, adaptive and layer-wise domain mixing. | related papers | related patents |
166 | Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation | Jun Xu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu | To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. | related papers | related patents |
167 | GPT-too: A Language-Model-First Approach for AMR-to-Text Generation | Manuel Mager, Ramón Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, Salim Roukos | In this paper, we propose an alternative approach that combines a strong pre-trained language model with cycle consistency-based re-scoring. | related papers | related patents |
168 | Learning to Update Natural Language Comments Based on Code Changes | Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li, Raymond Mooney | We propose an approach that learns to correlate changes across two distinct language representations, to generate a sequence of edits that are applied to the existing comment to reflect the source code modifications. | related papers | related patents |
169 | Politeness Transfer: A Tag and Generate Approach | Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye | This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. | related papers | related patents |
170 | BPE-Dropout: Simple and Effective Subword Regularization | Ivan Provilkov, Dmitrii Emelianenko, Elena Voita | We introduce BPE-dropout – a simple and effective subword regularization method based on and compatible with conventional BPE. | related papers | related patents |
171 | Improving Non-autoregressive Neural Machine Translation with Monolingual Data | Jiawei Zhou, Phillip Keung | Under this framework, we leverage large monolingual corpora to improve the NAR model’s performance, with the goal of transferring the AR model’s generalization ability while preventing overfitting. | related papers | related patents |
172 | Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization | Sajad Sotudeh Gharebagh, Nazli Goharian, Ross Filice | In this paper, we approach the content selection problem for clinical abstractive summarization by augmenting salient ontological terms into the summarizer. | related papers | related patents |
173 | On Faithfulness and Factuality in Abstractive Summarization | Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan McDonald | In this paper, we analyze the limitations of these models for abstractive document summarization and find that they are highly prone to hallucinate content that is unfaithful to the input document. | related papers | related patents |
174 | Screenplay Summarization Using Latent Narrative Structure | Pinelopi Papalampidi, Frank Keller, Lea Frermann, Mirella Lapata | In this paper, we propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models. | related papers | related patents |
175 | Unsupervised Opinion Summarization with Noising and Denoising | Reinald Kim Amplayo, Mirella Lapata | In this paper we enable the use of supervised learning for the setting where there are only documents available (e.g., product or business reviews) without ground truth summaries. | related papers | related patents |
176 | A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type | Trevor Cohen, Serguei Pakhomov | In this paper, we interrogate neural LMs trained on participants with and without dementia by using synthetic narratives previously developed to simulate progressive semantic dementia by manipulating lexical frequency. | related papers | related patents |
177 | Probing Linguistic Systematicity | Emily Goodwin, Koustuv Sinha, Timothy J. O’Donnell | We examine the notion of systematicity from a linguistic perspective, defining a set of probing tasks and a set of metrics to measure systematic behaviour. We also identify ways in which network architectures can generalize non-systematically, and discuss why such forms of generalization may be unsatisfying. | related papers | related patents |
178 | Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models | Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James Pennebaker | We introduce a measure of narrative flow and use this to examine the narratives for imagined and recalled events. | related papers | related patents |
179 | Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment | Forrest Davis, Marten van Schijndel | Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. | related papers | related patents |
180 | Speakers enhance contextually confusable words | Eric Meinhardt, Eric Bakovic, Leon Bergen | We develop a measure of contextual confusability during word recognition based on psychoacoustic data. Applying this measure to naturalistic speech corpora, we find evidence suggesting that speakers alter their productions to make contextually more confusable words easier to understand. | related papers | related patents |
181 | What determines the order of adjectives in English? Comparing efficiency-based theories using dependency treebanks | Richard Futrell, William Dyer, Greg Scontras | The four theories we test are subjectivity (Scontras et al., 2017), information locality (Futrell, 2019), integration cost (Dyer, 2017), and information gain, which we introduce. | related papers | related patents |
182 | “None of the Above”: Measure Uncertainty in Dialog Response Retrieval | Yulan Feng, Shikib Mehri, Maxine Eskenazi, Tiancheng Zhao | This paper discusses the importance of uncovering uncertainty in end-to-end dialog tasks and presents our experimental results on uncertainty classification on the processed Ubuntu Dialog Corpus. | related papers | related patents |
183 | Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills | Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau | In this work, we investigate several ways to combine models trained towards isolated capabilities, ranging from simple model aggregation schemes that require minimal additional training, to various forms of multi-task training that encompass several skills at all training stages. | related papers | related patents |
184 | Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs | Houyu Zhang, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu | This paper presents a new conversation generation model, ConceptFlow, which leverages commonsense knowledge graphs to explicitly model conversation flows. | related papers | related patents |
185 | Negative Training for Neural Dialogue Response Generation | Tianxing He, James Glass | In this work, we propose a framework named “Negative Training” to minimize such behaviors. | related papers | related patents |
186 | Recursive Template-based Frame Generation for Task Oriented Dialog | Rashmi Gangadharaiah, Balakrishnan Narayanaswamy | We propose a recursive, hierarchical frame-based representation and show how to learn it from data. | related papers | related patents |
187 | Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback | Ahmed Elgohary, Saghar Hosseini, Ahmed Hassan Awadallah | In this paper, we investigate a more interactive scenario where humans can further interact with the system by providing free-form natural language feedback to correct the system when it generates an inaccurate interpretation of an initial utterance. | related papers | related patents |
188 | Calibrating Structured Output Predictors for Natural Language Processing | Abhyuday Jagannatha, Hong Yu | In this study, we propose a general calibration scheme for output entities of interest in neural network-based structured prediction models. | related papers | related patents |
189 | Active Imitation Learning with Noisy Guidance | Kianté Brantley, Hal Daumé III, Amr Sharaf | To combat this query complexity, we consider an active learning setting in which the learning algorithm has additional access to a much cheaper noisy heuristic that provides noisy guidance. | related papers | related patents |
190 | ExpBERT: Representation Engineering with Natural Language Explanations | Shikhar Murty, Pang Wei Koh, Percy Liang | In this paper, we allow model developers to specify these types of inductive biases as natural language explanations. | related papers | related patents |
191 | GAN-BERT: Generative Adversarial Learning for Robust Text Classification with a Bunch of Labeled Examples | Danilo Croce, Giuseppe Castellucci, Roberto Basili | In this paper, we propose GAN-BERT that extends the fine-tuning of BERT-like architectures with unlabeled data in a generative adversarial setting. | related papers | related patents |
192 | Generalizing Natural Language Analysis through Span-relation Representations | Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig | In this paper, we provide the simple insight that a great variety of tasks can be represented in a single unified format consisting of labeling spans and relations between spans, and thus a single task-independent model can be used across different tasks. | related papers | related patents |
193 | Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling | Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren | In this paper, we propose a novel framework Consensus Network (ConNet) that can be trained on annotations from multiple sources (e.g., crowd annotation, cross-domain data). | related papers | related patents |
194 | MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification | Jiaao Chen, Zichao Yang, Diyi Yang | This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. | related papers | related patents |
195 | MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou | In this paper, we propose MobileBERT for compressing and accelerating the popular BERT model. | related papers | related patents |
196 | On Importance Sampling-Based Evaluation of Latent Language Models | Robert L Logan IV, Matt Gardner, Sameer Singh | In this paper, we carry out this analysis for three models: RNNG, EntityNLM, and KGLM. | related papers | related patents |
197 | SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao | To address such an issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning for pre-trained models to attain better generalization performance. | related papers | related patents |
198 | Stolen Probability: A Structural Weakness of Neural Language Models | David Demeter, Gregory Kimmel, Doug Downey | We present numerical, theoretical and empirical analyses which show that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull. | related papers | related patents |
199 | Taxonomy Construction of Unseen Domains via Graph-based Cross-Domain Knowledge Transfer | Chao Shang, Sarthak Dash, Md. Faisal Mahbub Chowdhury, Nandana Mihindukulasooriya, Alfio Gliozzo | In this paper, we propose Graph2Taxo, a GNN-based cross-domain transfer framework for the taxonomy construction task. | related papers | related patents |
200 | To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks | Sinong Wang, Madian Khabsa, Hao Ma | This paper examines the benefits of pretrained models as a function of the number of training samples used in the downstream task. | related papers | related patents |
201 | Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries | Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber | We address this limitation by retrofitting CLWE to the training dictionary, which pulls training translation pairs closer in the embedding space and overfits the training dictionary. | related papers | related patents |
202 | XtremeDistil: Multi-stage Distillation for Massive Multilingual Models | Subhabrata Mukherjee, Ahmed Hassan Awadallah | In this work we study knowledge distillation with a focus on multilingual Named Entity Recognition (NER). | related papers | related patents |
203 | A Girl Has A Name: Detecting Authorship Obfuscation | Asad Mahmood, Zubair Shafiq, Padmini Srinivasan | In this paper, we evaluate the stealthiness of state-of-the-art authorship obfuscation methods under an adversarial threat model. | related papers | related patents |
204 | DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference | Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin | We propose a simple but effective method, DeeBERT, to accelerate BERT inference. | related papers | related patents |
205 | Efficient Strategies for Hierarchical Text Classification: External Knowledge and Auxiliary Tasks | Kervy Rivas Rojas, Gina Bustamante, Arturo Oncevay, Marco Antonio Sobrevilla Cabezudo | In hierarchical text classification, we perform a sequence of inference steps to predict the category of a document from top to bottom of a given class taxonomy. | related papers | related patents |
206 | Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions | Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis | Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language identification (L1). | related papers | related patents |
207 | SPECTER: Document-level Representation Learning using Citation-informed Transformers | Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel Weld | We propose SPECTER, a new method to generate document-level embedding of scientific papers based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. | related papers | related patents |
208 | Semantic Scaffolds for Pseudocode-to-Code Generation | Ruiqi Zhong, Mitchell Stern, Dan Klein | We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program. | related papers | related patents |
209 | Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction | Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla | In this paper, we investigate whether it is possible to infer new facts directly from the open knowledge graph without any canonicalization or any supervision from curated knowledge. | related papers | related patents |
210 | INFOTABS: Inference on Tables as Semi-structured Data | Vivek Gupta, Maitrey Mehta, Pegah Nokhiz, Vivek Srikumar | In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding them requires not only comprehending the meaning of text fragments, but also implicit relationships between them. | related papers | related patents |
211 | Interactive Machine Comprehension with Information Seeking Agents | Xingdi Yuan, Jie Fu, Marc-Alexandre Côté, Yi Tay, Chris Pal, Adam Trischler | In this paper, we propose a simple method that reframes existing MRC datasets as interactive, partially observable environments. | related papers | related patents |
212 | Syntactic Data Augmentation Increases Robustness to Inference Heuristics | Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen | We explore several methods to augment standard training sets with syntactically informative examples, generated by applying syntactic transformations to sentences from the MNLI corpus. | related papers | related patents |
213 | Improved Speech Representations with Multi-Target Autoregressive Predictive Coding | Yu-An Chung, James Glass | In this paper we extend this hypothesis and aim to enrich the information encoded in the hidden states by training the model to make more accurate future predictions. | related papers | related patents |
214 | Integrating Multimodal Information in Large Pretrained Transformers | Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee, AmirAli Bagher Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque | In this paper, we propose an attachment to BERT and XLNet called Multimodal Adaptation Gate (MAG). | related papers | related patents |
215 | MultiQT: Multimodal learning for real-time question tracking in speech | Jakob D. Havtorn, Jan Latko, Joakim Edin, Lars Maaløe, Lasse Borgholt, Lorenzo Belgrano, Nicolai Jacobsen, Regitze Sdun, Željko Agić | We propose a novel multimodal approach to real-time sequence labeling in speech. | related papers | related patents |
216 | Multimodal and Multiresolution Speech Recognition with Transformers | Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram | This paper presents an audio-visual automatic speech recognition (AV-ASR) system using a Transformer-based architecture. | related papers | related patents |
217 | Phone Features Improve Speech Translation | Elizabeth Salesky, Alan W Black | We compare cascaded and end-to-end models across high, medium, and low-resource conditions, and show that cascades remain stronger baselines. | related papers | related patents |
218 | Grounding Conversations with Improvised Dialogues | Hyundong Cho, Jonathan May | We collect a corpus of more than 26,000 yes-and turns, transcribing them from improv dialogues and extracting them from larger, but more sparsely populated movie script dialogue corpora, via a bootstrapped classifier. | related papers | related patents |
219 | Image-Chat: Engaging Grounded Conversations | Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston | In this work we study large-scale architectures and datasets for this goal. | related papers | related patents |
220 | Learning an Unreferenced Metric for Online Dialogue Evaluation | Koustuv Sinha, Prasanna Parthasarathi, Jasmine Wang, Ryan Lowe, William L. Hamilton, Joelle Pineau | Here, we propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances, and leverages the temporal transitions that exist between them. | related papers | related patents |
221 | Neural Generation of Dialogue Response Timings | Matthew Roddy, Naomi Harte | We propose neural models that simulate the distributions of these response offsets, taking into account the response turn as well as the preceding turn. | related papers | related patents |
222 | The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents | Kurt Shuster, Da JU, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston | We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images. | related papers | related patents |
223 | Automatic Poetry Generation from Prosaic Text | Tim Van de Cruys | In this paper, we will explore how these approaches can be adapted and combined to model the linguistic and literary aspects needed for poetry generation. | related papers | related patents |
224 | Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation | Chao Zhao, Marilyn Walker, Snigdha Chaturvedi | To narrow this gap, we propose DualEnc, a dual encoding model that can not only incorporate the graph structure, but can also cater to the linear structure of the output text. | related papers | related patents |
225 | Enabling Language Models to Fill in the Blanks | Chris Donahue, Mina Lee, Percy Liang | We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document. | related papers | related patents |
226 | INSET: Sentence Infilling with INter-SEntential Transformer | Yichen Huang, Yizhe Zhang, Oussama Elachqar, Yu Cheng | In this paper, we propose a framework to decouple the challenge and address these three aspects respectively, leveraging the power of existing large-scale pre-trained models such as BERT and GPT-2. | related papers | related patents |
227 | Improving Adversarial Text Generation by Modeling the Distant Future | Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin | We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues. | related papers | related patents |
228 | Simple and Effective Retrieve-Edit-Rerank Text Generation | Nabil Hossain, Marjan Ghazvininejad, Luke Zettlemoyer | We propose to extend this framework with a simple and effective post-generation ranking approach. | related papers | related patents |
229 | BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps | Wang Zhu, Hexiang Hu, Jiacheng Chen, Zhiwei Deng, Vihan Jain, Eugene Ie, Fei Sha | In this paper, we study how an agent can navigate long paths when learning from a corpus that consists of shorter ones. | related papers | related patents |
230 | Cross-media Structured Common Space for Multimedia Event Extraction | Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang | We propose a novel method, Weakly Aligned Structured Embedding (WASE), that encodes structured representations of semantic information from textual and visual data into a common embedding space. | related papers | related patents |
231 | Learning to Segment Actions from Observation and Narration | Daniel Fried, Jean-Baptiste Alayrac, Phil Blunsom, Chris Dyer, Stephen Clark, Aida Nematzadeh | We apply a generative segmental model of task structure, guided by narration, to action segmentation in video. | related papers | related patents |
232 | Learning to execute instructions in a Minecraft dialogue | Prashant Jayannavar, Anjali Narayan-Chen, Julia Hockenmaier | We define the subtask of predicting correct action sequences (block placements and removals) in a given game context, and show that capturing the Builder B’s past actions as well as B’s perspective leads to a significant improvement in performance on this challenging language understanding problem. | related papers | related patents |
233 | MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning | Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara Berg, Mohit Bansal | Towards this goal, we propose a new approach called Memory-Augmented Recurrent Transformer (MART), which uses a memory module to augment the transformer architecture. | related papers | related patents |
234 | What is Learned in Visually Grounded Neural Syntax Acquisition | Noriyuki Kojima, Hadar Averbuch-Elor, Alexander Rush, Yoav Artzi | We also find that a simple lexical signal of noun concreteness plays the main role in the model’s predictions as opposed to more complex syntactic reasoning. | related papers | related patents |
235 | A Batch Normalized Inference Network Keeps the KL Vanishing Away | Qile Zhu, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li, Dapeng Wu | We propose to let the KL follow a distribution across the whole dataset, and analyze that it is sufficient to prevent posterior collapse by keeping the expectation of the KL’s distribution positive. | related papers | related patents |
236 | Contextual Embeddings: When Are They Worth It? | Simran Arora, Avner May, Jian Zhang, Christopher Ré | We study the settings for which deep contextual embeddings (e.g., BERT) give large improvements in performance relative to classic pretrained embeddings (e.g., GloVe), and an even simpler baseline (random word embeddings), focusing on the impact of the training set size and the linguistic properties of the task. | related papers | related patents |
237 | Interactive Classification by Asking Informative Questions | Lili Yu, Howard Chen, Sida I. Wang, Tao Lei, Yoav Artzi | We study the potential for interaction in natural language classification. | related papers | related patents |
238 | Knowledge Graph Embedding Compression | Mrinmaya Sachan | Thus, we propose an approach that compresses the KG embedding layer by representing each entity in the KG as a vector of discrete codes and then composes the embeddings from these codes. | related papers | related patents |
239 | Low Resource Sequence Tagging using Sentence Reconstruction | Tal Perl, Sriram Chaudhury, Raja Giryes | Specifically, our method demonstrates how by adding a decoding layer for sentence reconstruction, we can improve the performance of various baselines. | related papers | related patents |
240 | Masked Language Model Scoring | Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff | We release our library for language model scoring at https://github.com/awslabs/mlm-scoring. | related papers | related patents |
241 | Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding | Yun Tang, Jing Huang, Guangtao Wang, Xiaodong He, Bowen Zhou | In this work, we propose a novel distance-based approach for knowledge graph link prediction. | related papers | related patents |
242 | Posterior Calibrated Training on Sentence Classification Tasks | Taehee Jung, Dongyeop Kang, Hua Cheng, Lucas Mentch, Thomas Schaaf | Here we propose an end-to-end training procedure called posterior calibrated (PosCal) training that directly optimizes the objective while minimizing the difference between the predicted and empirical posterior probabilities. | related papers | related patents |
243 | Posterior Control of Blackbox Generation | Xiang Lisa Li, Alexander Rush | In this work, we consider augmenting neural generation models with discrete control states learned through a structured latent-variable approach. | related papers | related patents |
244 | Pretrained Transformers Improve Out-of-Distribution Robustness | Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song | We examine which factors affect robustness, finding that larger models are not necessarily more robust, distillation can be harmful, and more diverse pretraining data can enhance robustness. | related papers | related patents |
245 | Robust Encodings: A Framework for Combating Adversarial Typos | Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang | In this work, we introduce robust encodings (RobEn): a simple framework that confers guaranteed robustness, without making compromises on model architecture. | related papers | related patents |
246 | Showing Your Work Doesn’t Always Work | Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, Jimmy Lin | One exemplar publication, titled “Show Your Work: Improved Reporting of Experimental Results” (Dodge et al., 2019), advocates for reporting the expected validation effectiveness of the best-tuned model, with respect to the computational budget. In the present work, we critically examine this paper. As far as statistical generalizability is concerned, we find unspoken pitfalls and caveats with this approach. | related papers | related patents |
247 | Span Selection Pre-training for Question Answering | Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avi Sil | In this paper we introduce a new pre-training task inspired by reading comprehension to better align the pre-training from memorization to understanding. | related papers | related patents |
248 | Topological Sort for Sentence Ordering | Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black | In this paper, we propose a new framing of this task as a constraint solving problem and introduce a new technique to solve it. | related papers | related patents |
249 | Weight Poisoning Attacks on Pretrained Models | Keita Kurita, Paul Michel, Graham Neubig | In this paper, we show that it is possible to construct “weight poisoning” attacks where pre-trained weights are injected with vulnerabilities that expose “backdoors” after fine-tuning, enabling the attacker to manipulate the model prediction simply by injecting an arbitrary keyword. | related papers | related patents |
250 | schuBERT: Optimizing Elements of BERT | Ashish Khetan, Zohar Karnin | In this work we revisit the architecture choices of BERT in efforts to obtain a lighter model. | related papers | related patents |
251 | ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation | Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel | We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model. | related papers | related patents |
252 | Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation | Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu | In this work, we join these two lines of research and demonstrate the efficacy of monolingual data with self-supervision in multilingual NMT. | related papers | related patents |
253 | On The Evaluation of Machine Translation Systems Trained With Back-Translation | Sergey Edunov, Myle Ott, Marc’Aurelio Ranzato, Michael Auli | In this work, we show that this conjecture is not empirically supported and that back-translation improves the translation quality of both naturally occurring text and translationese according to professional human translators. | related papers | related patents |
254 | Simultaneous Translation Policies: From Fixed to Adaptive | Baigong Zheng, Kaibo Liu, Renjie Zheng, Mingbo Ma, Hairong Liu, Liang Huang | We design an algorithm to achieve adaptive policies via a simple heuristic composition of a set of fixed policies. | related papers | related patents |
255 | Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information | Michele Bevilacqua, Roberto Navigli | We present Enhanced WSD Integrating Synset Embeddings and Relations (EWISER), a neural supervised architecture that is able to tap into this wealth of knowledge by embedding information from the LKB graph within the neural architecture, and to exploit pretrained synset embeddings, enabling the network to predict synsets that are not in the training set. | related papers | related patents |
256 | Glyph2Vec: Learning Chinese Out-of-Vocabulary Word Embedding from Glyphs | Hong-You Chen, SZ-HAN YU, Shou-de Lin | We present a multi-modal model, Glyph2Vec, to tackle the Chinese out-of-vocabulary word embedding problem. | related papers | related patents |
257 | Multidirectional Associative Optimization of Function-Specific Word Representations | Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen | We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures. | related papers | related patents |
258 | Predicting Degrees of Technicality in Automatic Terminology Extraction | Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde | We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. | related papers | related patents |
259 | Verbal Multiword Expressions for Identification of Metaphor | Omid Rohanian, Marek Rei, Shiva Taslimipoor, Le An Ha | This work is the first attempt at analysing the interplay of metaphor and MWEs processing through the design of a neural architecture whereby classification of metaphors is enhanced by informing the model of the presence of MWEs. | related papers | related patents |
260 | Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer | Jieyu Zhao, Subhabrata Mukherjee, Saghar Hosseini, Kai-Wei Chang, Ahmed Hassan Awadallah | In this paper, we study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications. | related papers | related patents |
261 | Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? | Kobi Leins, Jey Han Lau, Timothy Baldwin | We examine this question with respect to a paper on automatic legal sentencing from EMNLP 2019 which was a source of some debate, in asking whether the paper should have been allowed to be published, who should have been charged with making such a decision, and on what basis. | related papers | related patents |
262 | Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds | Kawin Ethayarajh | Instead of annotating all the examples, can we annotate a subset of them and use that sample to estimate the bias? In this work, we propose using Bernstein bounds to represent this uncertainty about the bias estimate as a confidence interval. | related papers | related patents |
263 | It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations | Samson Tan, Shafiq Joty, Min-Yen Kan, Richard Socher | We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples that expose these biases in popular NLP models, e.g., BERT and Transformer, and show that adversarially fine-tuning them for a single epoch significantly improves robustness without sacrificing performance on clean data. | related papers | related patents |
264 | Mitigating Gender Bias Amplification in Distribution by Posterior Regularization | Shengyu Jia, Tao Meng, Jieyu Zhao, Kai-Wei Chang | In this paper, we investigate the gender bias amplification issue from the distribution perspective and demonstrate that the bias is amplified in the view of predicted probability distribution over labels. | related papers | related patents |
265 | Towards Understanding Gender Bias in Relation Extraction | Andrew Gaut, Tony Sun, Shirlyn Tang, Yuxin Huang, Jing Qian, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, William Yang Wang | In this paper, we create WikiGenderBias, a distantly supervised dataset composed of over 45,000 sentences including a 10% human annotated test set for the purpose of analyzing gender bias in relation extraction systems. | related papers | related patents |
266 | A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing | Kartik Goyal, Chris Dyer, Christopher Warren, Maxwell G’Sell, Taylor Berg-Kirkpatrick | We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents. | related papers | related patents |
267 | Attentive Pooling with Learnable Norms for Text Representation | Chuhan Wu, Fangzhao Wu, Tao Qi, Xiaohui Cui, Yongfeng Huang | In this paper, we propose an Attentive Pooling with Learnable Norms (APLN) approach for text representation. | related papers | related patents |
268 | Estimating the influence of auxiliary tasks for multi-task learning of sequence tagging tasks | Fynn Schröder, Chris Biemann | We propose new methods to automatically assess the similarity of sequence tagging datasets to identify beneficial auxiliary data for MTL or TL setups. | related papers | related patents |
269 | How Does Selective Mechanism Improve Self-Attention Networks? | Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu | In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. | related papers | related patents |
270 | Improving Transformer Models by Reordering their Sublayers | Ofir Press, Noah A. Smith, Omer Levy | We propose a new transformer pattern that adheres to this property, the sandwich transformer, and show that it improves perplexity on multiple word-level and character-level language modeling benchmarks, at no cost in parameters, memory, or training time. | related papers | related patents |
271 | Single Model Ensemble using Pseudo-Tags and Distinct Vectors | Ryosuke Kuwabara, Jun Suzuki, Hideki Nakayama | In this study, we propose a novel method that replicates the effects of a model ensemble with a single model. | related papers | related patents |
272 | Zero-shot Text Classification via Reinforced Self-training | Zhiquan Ye, Yuxia Geng, Jiaoyan Chen, Jingmin Chen, Xiaoxiao Xu, SuHang Zheng, Feng Wang, Jun Zhang, Huajun Chen | To tackle this problem, in this paper we propose a self-training based method to efficiently leverage unlabeled data. | related papers | related patents |
273 | A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation | Yongjing Yin, Fandong Meng, Jinsong Su, Chulun Zhou, Zhengyuan Yang, Jie Zhou, Jiebo Luo | To deal with this issue, in this paper, we propose a novel graph-based multi-modal fusion encoder for NMT. | related papers | related patents |
274 | A Relaxed Matching Procedure for Unsupervised BLI | Xu Zhao, Zihao Wang, Yong Zhang, Hao Wu | Thus, we propose a relaxed matching procedure to find a more precise matching between two languages. | related papers | related patents |
275 | Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation | Xuanli He, Gholamreza Haffari, Mohammad Norouzi | This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. | related papers | related patents |
276 | Geometry-aware domain adaptation for unsupervised alignment of word embeddings | Pratik Jawanpuria, Mayank Meghwanshi, Bamdev Mishra | We propose a novel manifold based geometric approach for learning unsupervised alignment of word embeddings between the source and the target languages. | related papers | related patents |
277 | Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation | Qiu Ran, Yankai Lin, Peng Li, Jie Zhou | To alleviate this problem, we propose a novel semi-autoregressive model RecoverSAT in this work, which generates a translation as a sequence of segments. | related papers | related patents |
278 | On the Inference Calibration of Neural Machine Translation | Shuo Wang, Zhaopeng Tu, Shuming Shi, Yang Liu | By carefully designing experiments on three language pairs, our work provides in-depth analyses of the correlation between calibration and translation performance as well as linguistic properties of miscalibration and reports a number of interesting findings that might help humans better analyze, understand and improve NMT models. | related papers | related patents |
279 | Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning | Zhuoren Jiang, Zhe Gao, Yu Duan, Yangyang Kang, Changlong Sun, Qiong Zhang, Xiaozhong Liu | We propose a Semi-supervIsed GeNerative Active Learning (SIGNAL) model to address the imbalance, efficiency, and text camouflage problems of Chinese text spam detection task. | related papers | related patents |
280 | Distinguish Confusing Law Articles for Legal Judgment Prediction | Nuo Xu, Pinghui Wang, Long Chen, Li Pan, Xiaoyan Wang, Junzhou Zhao | In this paper, we present an end-to-end model, LADAN, to solve the task of LJP. | related papers | related patents |
281 | Hiring Now: A Skill-Aware Multi-Attention Model for Job Posting Generation | Liting Liu, Jie Liu, Wenzheng Zhang, Ziming Chi, Wenxuan Shi, Yalou Huang | To this end, we propose a novel task of Job Posting Generation (JPG) which is cast as a conditional text generation problem to generate job requirements according to the job descriptions. | related papers | related patents |
282 | HyperCore: Hyperbolic and Co-graph Representation for Automatic ICD Coding | Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao, Shengping Liu, Weifeng Chong | In this paper, we propose a Hyperbolic and Co-graph Representation method (HyperCore) to address the above problem. | related papers | related patents |
283 | Hyperbolic Capsule Networks for Multi-Label Classification | Boli Chen, Xin Huang, Lin Xiao, Liping Jing | Thus, we propose Hyperbolic Capsule Networks (HyperCaps) for Multi-Label Classification (MLC), which have two merits. | related papers | related patents |
284 | Improving Segmentation for Technical Support Problems | Kushal Chauhan, Abhirut Gupta | In this paper, we address the problem of segmentation for technical support questions. | related papers | related patents |
285 | MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs | Jifan Yu, Gan Luo, Tong Xiao, Qingyang Zhong, Yuquan Wang, Wenzheng Feng, Junyi Luo, Chenyu Wang, Lei Hou, Juanzi Li, Zhiyuan Liu, Jie Tang | Therefore, we present MOOCCube, a large-scale data repository of over 700 MOOC courses, 100k concepts, and 8 million student behaviors, together with external resources. | related papers | related patents |
286 | Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs | Jun Chen, Xiaoya Dai, Quan Yuan, Chao Lu, Haifeng Huang | In this paper, we attempt to propose a solution by introducing a novel framework that stacks Bayesian Network Ensembles on top of Entity-Aware Convolutional Neural Networks (CNN) towards building an accurate yet interpretable diagnosis system. | related papers | related patents |
287 | Analyzing the Persuasive Effect of Style in News Editorial Argumentation | Roxanne El Baff, Henning Wachsmuth, Khalid Al Khatib, Benno Stein | In contrast, this paper studies how important the style of news editorials is to achieve persuasion. | related papers | related patents |
288 | ECPE-2D: Emotion-Cause Pair Extraction based on Joint Two-Dimensional Representation, Interaction and Prediction | Zixiang Ding, Rui Xia, Jianfei Yu | To address these shortcomings, in this paper we propose a new end-to-end approach, called ECPE-Two-Dimensional (ECPE-2D), to represent the emotion-cause pairs by a 2D representation scheme. | related papers | related patents |
289 | Effective Inter-Clause Modeling for End-to-End Emotion-Cause Pair Extraction | Penghui Wei, Jiahao Zhao, Wenji Mao | In this paper, we tackle emotion-cause pair extraction from a ranking perspective, i.e., ranking clause pair candidates in a document, and propose a one-step neural approach which emphasizes inter-clause modeling to perform end-to-end extraction. | related papers | related patents |
290 | Embarrassingly Simple Unsupervised Aspect Extraction | Stéphan Tulkens, Andreas van Cranenburgh | We present a simple but effective method for aspect identification in sentiment analysis. | related papers | related patents |
291 | Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge | Bowen Zhang, Min Yang, Xutao Li, Yunming Ye, Xiaofei Xu, Kuai Dai | In this paper, we propose a Semantic-Emotion Knowledge Transferring (SEKT) model for cross-target stance detection, which uses external knowledge (semantic and emotion lexicons) as a bridge to enable knowledge transfer across different targets. | related papers | related patents |
292 | KinGDOM: Knowledge-Guided DOMain Adaptation for Sentiment Analysis | Deepanway Ghosal, Devamanyu Hazarika, Abhinaba Roy, Navonil Majumder, Rada Mihalcea, Soujanya Poria | In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. | related papers | related patents |
293 | Modelling Context and Syntactical Features for Aspect-based Sentiment Analysis | Minh Hieu Phan, Philip O. Ogunbona | This paper explores the grammatical aspect of the sentence and employs the self-attention mechanism for syntactical learning. | related papers | related patents |
294 | Parallel Data Augmentation for Formality Style Transfer | Yi Zhang, Tao Ge, Xu SUN | In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task to obtain useful sentence pairs with easily accessible models and systems. | related papers | related patents |
295 | Relational Graph Attention Network for Aspect-based Sentiment Analysis | Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, Rui Wang | In this paper, we address this problem by means of effective encoding of syntax information. | related papers | related patents |
296 | SpanMlt: A Span-based Multi-Task Learning Framework for Pair-wise Aspect and Opinion Terms Extraction | He Zhao, Longtao Huang, Rong Zhang, Quan Lu, Hui Xue | To this end, this paper proposes an end-to-end method to solve the task of Pair-wise Aspect and Opinion Terms Extraction (PAOTE). | related papers | related patents |
297 | Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks | Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang | In this work, we try to enhance neural ORL models with syntactic knowledge by comparing and integrating different representations. | related papers | related patents |
298 | Towards Better Non-Tree Argument Mining: Proposition-Level Biaffine Parsing with Task-Specific Parameterization | Gaku Morio, Hiroaki Ozaki, Terufumi Morishita, Yuta Koreeda, Kohsuke Yanai | In this paper, we focus on non-tree argument mining with a neural model. | related papers | related patents |
299 | A Span-based Linearization for Constituent Trees | Yang Wei, Yuanbin Wu, Man Lan | We propose a novel linearization of a constituent tree, together with a new locally normalized model. | related papers | related patents |
300 | An Empirical Comparison of Unsupervised Constituency Parsing Methods | Jun Li, Yifan Cao, Jiong Cai, Yong Jiang, Kewei Tu | In this paper, we first examine experimental settings used in previous work and propose to standardize the settings for better comparability between methods. | related papers | related patents |
301 | Efficient Constituency Parsing by Pointing | Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li | We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks. | related papers | related patents |
302 | Efficient Second-Order TreeCRF for Neural Dependency Parsing | Yu Zhang, Zhenghua Li, Min Zhang | To address this issue, we propose an effective way to batchify the inside and Viterbi algorithms for direct large matrix operation on GPUs, and to avoid the complex outside algorithm via efficient back-propagation. | related papers | related patents |
303 | Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs | Michael Lepori, Tal Linzen, R. Thomas McCoy | We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure that increase performance on the subject-verb agreement prediction task. | related papers | related patents |
304 | Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu | In this paper, we propose to reduce the gap between monolingual models and the unified multilingual model by distilling the structural knowledge of several monolingual models (teachers) to the unified multilingual model (student). | related papers | related patents |
305 | Dynamic Online Conversation Recommendation | Xingshan Zeng, Jing Li, Lu Wang, Zhiming Mao, Kam-Fai Wong | Concretely, we propose a neural architecture to exploit changes of user interactions and interests over time, to predict which discussions they are likely to enter. | related papers | related patents |
306 | Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer | Jianfei Yu, Jing Jiang, Li Yang, Rui Xia | In this paper, we study Multimodal Named Entity Recognition (MNER) for social media posts. | related papers | related patents |
307 | Stock Embeddings Acquired from News Articles and Price History, and an Application to Portfolio Optimization | Xin Du, Kumiko Tanaka-Ishii | In contrast, this paper presents a method to encode the influence of news articles through a vector representation of stocks called a stock embedding. | related papers | related patents |
308 | What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context | Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov | Here, we study the impact of both, namely (i) what was written (i.e., what was published by the target medium, and how it describes itself in Twitter) vs. (ii) who reads it (i.e., analyzing the target medium’s audience on social media). | related papers | related patents |
309 | An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models | Hiroshi Noji, Hiroya Takamura | We explore the utilities of explicit negative examples in training neural language models. | related papers | related patents |
310 | On the Robustness of Language Encoders against Grammatical Errors | Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang | We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors. | related papers | related patents |
311 | Roles and Utilization of Attention Heads in Transformer-based Neural Language Models | Jae-young Jo, Sung-Hyon Myaeng | Meaningful insights are shown through the lens of heat map visualization and utilized to propose a relatively simple sentence representation method that takes advantage of the most influential attention heads, resulting in additional performance improvements on the downstream tasks. | related papers | related patents |
312 | Understanding Attention for Text Classification | Xiaobing Sun, Wei Lu | In this work, we present a study on understanding the internal mechanism of attention by looking into the gradient update process, checking its behavior when approaching a local minimum during training. | related papers | related patents |
313 | A Relational Memory-based Embedding Model for Triple Classification and Search Personalization | Dai Quoc Nguyen, Tu Nguyen, Dinh Phung | To this end, we introduce a novel embedding model, named R-MeN, that explores a relational memory network to encode potential dependencies in relationship triples. | related papers | related patents |
314 | Do you have the right scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods | Ning Miao, Yuxuan Song, Hao Zhou, Lei Li | In this paper, we propose MC-Tailor, a novel method to alleviate the above issue in text generation tasks by truncating and transferring the probability mass from over-estimated regions to under-estimated ones. | related papers | related patents |
315 | Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention | Yanzeng Li, Bowen Yu, Xue Mengge, Tingwen Liu | Hence, we propose a novel word-aligned attention to exploit explicit word information, which is complementary to various character-based Chinese pre-trained language models. | related papers | related patents |
316 | On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond | Chen Wu, Prince Zizhuang Wang, William Yang Wang | To this end, we propose Coupled-VAE, which couples a VAE model with a deterministic autoencoder with the same structure and improves the encoder and decoder parameterizations via encoder weight sharing and decoder signal matching. | related papers | related patents |
317 | SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions | Mao Ye, Chengyue Gong, Qiang Liu | In this work, we propose a certified robust method based on a new randomized smoothing technique, which constructs a stochastic ensemble by applying random word substitutions on the input sentences, and leverage the statistical properties of the ensemble to provably certify the robustness. | related papers | related patents |
318 | A Graph-based Coarse-to-fine Method for Unsupervised Bilingual Lexicon Induction | Shuo Ren, Shujie Liu, Ming Zhou, Shuai Ma | To deal with those issues, in this paper, we propose a novel graph-based paradigm to induce bilingual lexicons in a coarse-to-fine way. | related papers | related patents |
319 | A Reinforced Generation of Adversarial Examples for Neural Machine Translation | Wei Zou, Shujian Huang, Jun Xie, Xinyu Dai, Jiajun CHEN | Instead of collecting and analyzing bad cases using limited handcrafted error features, here we investigate this issue by generating adversarial examples via a new paradigm based on reinforcement learning. | related papers | related patents |
320 | A Retrieve-and-Rewrite Initialization Method for Unsupervised Machine Translation | Shuo Ren, Yu Wu, Shujie Liu, Ming Zhou, Shuai Ma | In this paper, we propose a novel retrieval and rewriting based method to better initialize unsupervised translation models. | related papers | related patents |
321 | A Simple and Effective Unified Encoder for Document-Level Machine Translation | Shuming Ma, Dongdong Zhang, Ming Zhou | In this work, we propose a simple and effective unified encoder that can outperform the baseline models of dual-encoder models in terms of BLEU and METEOR scores. | related papers | related patents |
322 | Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation | Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, changliang li | In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). | related papers | related patents |
323 | Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change | Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu | To improve the efficiency of our approach for large models, we propose a sampling approach to select gradients of parameters sensitive to the batch size. | related papers | related patents |
324 | Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation | Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao | In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. | related papers | related patents |
325 | Lexically Constrained Neural Machine Translation with Levenshtein Transformer | Raymond Hendy Susanto, Shamil Chollampatt, Liling Tan | This paper proposes a simple and effective algorithm for incorporating lexical constraints in neural machine translation. | related papers | related patents |
326 | On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation | Chaojun Wang, Rico Sennrich | In this paper, we link exposure bias to another well-known problem in NMT, namely the tendency to generate hallucinations under domain shift. | related papers | related patents |
327 | Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model | Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura | We propose an automatic evaluation method of machine translation that uses source language sentences regarded as additional pseudo references. | related papers | related patents |
328 | ChartDialogs: Plotting from Natural Language Instructions | Yutong Shao, Ndapa Nakashole | This paper presents the problem of conversational plotting agents that carry out plotting actions from natural language instructions. | related papers | related patents |
329 | GLUECoS: An Evaluation Benchmark for Code-Switched NLP | Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury | We present an evaluation benchmark, GLUECoS, for code-switched languages, that spans several NLP tasks in English-Hindi and English-Spanish. | related papers | related patents |
330 | MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization | Canwen Xu, Jiaxin Pei, Hongtao Wu, Yiyu Liu, Chenliang Li | We propose MATINF, the first jointly labeled large-scale dataset for classification, question answering and summarization. | related papers | related patents |
331 | MIND: A Large-scale Dataset for News Recommendation | Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou | In this paper, we present a large-scale dataset named MIND for news recommendation. | related papers | related patents |
332 | That is a Known Lie: Detecting Previously Fact-Checked Claims | Shaden Shaar, Nikolay Babulkov, Giovanni Da San Martino, Preslav Nakov | Interestingly, despite the importance of the task, it has been largely ignored by the research community so far. Here, we aim to bridge this gap. | related papers | related patents |
333 | Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation | Bo Pang, Erik Nijkamp, Wenjuan Han, Linqi Zhou, Yixian Liu, Kewei Tu | In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. | related papers | related patents |
334 | BiRRE: Learning Bidirectional Residual Relation Embeddings for Supervised Hypernymy Detection | Chengyu Wang, XIAOFENG HE | In this work, we revisit supervised distributional models for hypernymy detection. | related papers | related patents |
335 | Biomedical Entity Representations with Synonym Marginalization | Mujeen Sung, Hwisang Jeon, Jinhyuk Lee, Jaewoo Kang | In this paper, we focus on learning representations of biomedical entities solely based on the synonyms of entities. | related papers | related patents |
336 | Hypernymy Detection for Low-Resource Languages via Meta Learning | Changlong Yu, Jialong Han, Haisong Zhang, Wilfred Ng | This paper addresses the problem of low-resource hypernymy detection by combining high-resource languages. | related papers | related patents |
337 | Investigating Word-Class Distributions in Word Vector Spaces | Ryohei Sasano, Anna Korhonen | This paper presents an investigation on the distribution of word vectors belonging to a certain word class in a pre-trained word vector space. | related papers | related patents |
338 | Aspect Sentiment Classification with Document-level Sentiment Preference Modeling | Xiao Chen, Changlong Sun, Jingjing Wang, Shoushan Li, Luo Si, Min Zhang, Guodong Zhou | In this paper, we explore two kinds of sentiment preference information inside a document, i.e., contextual sentiment consistency w.r.t. the same aspect (namely intra-aspect sentiment consistency) and contextual sentiment tendency w.r.t. all the related aspects (namely inter-aspect sentiment tendency). | related papers | related patents |
339 | Don’t Eclipse Your Arts Due to Small Discrepancies: Boundary Repositioning with a Pointer Network for Aspect Extraction | Zhenkai Wei, Yu Hong, Bowei Zou, Meng Cheng, Jianmin YAO | In this paper, we propose to utilize a pointer network for repositioning the boundaries. | related papers | related patents |
340 | Relation-Aware Collaborative Learning for Unified Aspect-Based Sentiment Analysis | Zhuang Chen, Tieyun Qian | In order to fully exploit these relations, we propose a Relation-Aware Collaborative Learning (RACL) framework which allows the subtasks to work coordinately via the multi-task learning and relation propagation mechanisms in a stacked multi-layer network. | related papers | related patents |
341 | SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics | Da Yin, Tao Meng, Kai-Wei Chang | We propose SentiBERT, a variant of BERT that effectively captures compositional sentiment semantics. | related papers | related patents |
342 | Transition-based Directed Graph Construction for Emotion-Cause Pair Extraction | Chuang Fan, Chaofa Yuan, Jiachen Du, Lin Gui, Min Yang, Ruifeng Xu | Towards this issue, we propose a transition-based model to transform the task into a procedure of parsing-like directed graph construction. | related papers | related patents |
343 | CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality | Wenmeng Yu, Hua Xu, Fanyang Meng, Yilin Zhu, Yixiao Ma, Jiele Wu, Jiyun Zou, Kaicheng Yang | In this paper, we introduce a Chinese single- and multi-modal sentiment analysis dataset, CH-SIMS, which contains 2,281 refined video segments in the wild with both multimodal and independent unimodal annotations. | related papers | related patents |
344 | Curriculum Pre-training for End-to-End Speech Translation | Chengyi Wang, Yu Wu, Shujie Liu, Ming Zhou, Zhenglu Yang | Inspired by this, we propose a curriculum pre-training method that includes an elementary course for transcription learning and two advanced courses for understanding the utterance and mapping words in two languages. | related papers | related patents |
345 | How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems | Archiki Prasad, Preethi Jyothi | In this work, we present a detailed analysis of how accent information is reflected in the internal representation of speech in an end-to-end automatic speech recognition (ASR) system. | related papers | related patents |
346 | Improving Disfluency Detection by Self-Training a Self-Attentive Model | Paria Jamshid Lou, Mark Johnson | However, we show that self-training (a semi-supervised technique for incorporating unlabeled data) sets a new state-of-the-art for the self-attentive parser on disfluency detection, demonstrating that self-training provides benefits orthogonal to the pre-trained contextualized word representations. | related papers | related patents |
347 | Learning Spoken Language Representations with Neural Lattice Language Modeling | Chao-Wei Huang, Yun-Nung Chen | We propose a framework that trains neural lattice language models to provide contextualized representations for spoken language understanding tasks. | related papers | related patents |
348 | Meta-Transfer Learning for Code-Switched Speech Recognition | Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, Pascale Fung | We therefore propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting by judiciously extracting information from high-resource monolingual datasets. | related papers | related patents |
349 | Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association | Nan Xu, Zhixiong Zeng, Wenji Mao | To reason with multimodal sarcastic tweets, in this paper, we propose a novel method for modeling cross-modality contrast in the associated context. | related papers | related patents |
350 | SimulSpeech: End-to-End Simultaneous Speech to Text Translation | Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao QIN, Zhou Zhao, Tie-Yan Liu | In this work, we develop SimulSpeech, an end-to-end simultaneous speech to text translation system which translates speech in source language to text in target language concurrently. | related papers | related patents |
351 | Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations | Karan Singla, Zhuohao Chen, David Atkins, Shrikanth Narayanan | We propose a novel framework for predicting utterance-level labels directly from speech features, thus removing the dependency on first generating transcripts and enabling transcription-free behavioral coding. | related papers | related patents |
352 | Neural Temporal Opinion Modelling for Opinion Prediction on Twitter | Lixing Zhu, Yulan He, Deyu Zhou | In this paper, we model users’ tweet posting behaviour as a temporal point process to jointly predict the posting time and the stance label of the next tweet given a user’s historical tweet sequence and tweets posted by their neighbours. | related papers | related patents |
353 | It Takes Two to Lie: One to Lie, and One to Listen | Denis Peskov, Benny Cheng, Ahmed Elgohary, Joe Barrow, Cristian Danescu-Niculescu-Mizil, Jordan Boyd-Graber | We study the language and dynamics of deception in the negotiation-based game Diplomacy, where seven players compete for world domination by forging and breaking alliances with each other. | related papers | related patents |
354 | Learning Implicit Text Generation via Feature Matching | Inkit Padhi, Pierre Dognin, Ke Bai, Cícero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das | In this paper, we present new GFMN formulations that are effective for sequential data. | related papers | related patents |
355 | Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data | Hamidreza Shahidi, Ming Li, Jimmy Lin | In this work, we show that this is also the case for text generation from structured and unstructured data. | related papers | related patents |
356 | Bayesian Hierarchical Words Representation Learning | Oren Barkan, Idan Rejwan, Avi Caciularu, Noam Koenigstein | This paper presents the Bayesian Hierarchical Words Representation (BHWR) learning algorithm. | related papers | related patents |
357 | Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | Alexandre Tamborrino, Nicola Pellicanò, Baptiste Pannier, Pascal Voitot, Louise Naudin | In this paper, we introduce a new scoring method that casts a plausibility ranking task in a full-text format and leverages the masked language modeling head tuned during the pre-training phase. | related papers | related patents |
358 | SEEK: Segmented Embedding of Knowledge Graphs | Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu | To mitigate this problem, we propose a lightweight modeling framework that can achieve highly competitive relational expressiveness without increasing the model complexity. | related papers | related patents |
359 | Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation | Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way | In this work we analyse the impact that data translated with rule-based, phrase-based statistical and neural MT systems has on new MT systems. | related papers | related patents |
360 | Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture | Christopher Brix, Parnia Bahar, Hermann Ney | On the transformer architecture and the WMT 2014 English-to-German and English-to-French tasks, we show that stabilized lottery ticket pruning performs similar to magnitude pruning for sparsity levels of up to 85%, and propose a new combination of pruning techniques that outperforms all other techniques for even higher levels of sparsity. | related papers | related patents |
361 | A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction | Yilin Niu, Fangkai Jiao, Mantong Zhou, Ting Yao, jingfang xu, Minlie Huang | To address this problem, we present a Self-Training method (STM), which supervises the evidence extractor with auto-generated evidence labels in an iterative process. | related papers | related patents |
362 | Graph-to-Tree Learning for Solving Math Word Problems | Jipeng Zhang, Lei Wang, Roy Ka-Wei Lee, Yi Bin, Yan Wang, Jie Shao, Ee-Peng Lim | In this paper, we propose Graph2Tree, a novel deep learning architecture that combines the merits of the graph-based encoder and tree-based decoder to generate better solution expressions. | related papers | related patents |
363 | An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results | Enrique Amigo, Julio Gonzalo, Stefano Mizzaro, Jorge Carrillo-de-Albornoz | In this paper, we propose a new metric for Ordinal Classification, the Closeness Evaluation Measure, which is rooted in Measurement Theory and Information Theory. | related papers | related patents |
364 | Adaptive Compression of Word Embeddings | Yeachan Kim, Kang-Min Kim, SangKeun Lee | In this paper, we propose a novel method to adaptively compress word embeddings. | related papers | related patents |
365 | Analysing Lexical Semantic Change with Contextualised Word Representations | Mario Giulianelli, Marco Del Tredici, Raquel Fernández | This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. | related papers | related patents |
366 | Autoencoding Keyword Correlation Graph for Document Clustering | Billy Chiu, Sunil Kumar Sahu, Derek Thomas, Neha Sengupta, Mohammady Mahdy | To address this, we present a novel graph-based representation for document clustering that builds a graph autoencoder (GAE) on a Keyword Correlation Graph. | related papers | related patents |
367 | Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics | Guy Emerson | In this paper, I introduce the Pixie Autoencoder, which augments the generative model of Functional Distributional Semantics with a graph-convolutional neural network to perform amortised variational inference. | related papers | related patents |
368 | BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance | Timo Schick, Hinrich Schütze | In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models. | related papers | related patents |
369 | CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages | Tommaso Pasini, Federico Scozzafava, Bianca Scarlini | To address this issue, in this paper we present CluBERT, an automatic and multilingual approach for inducing the distributions of word senses from a corpus of raw sentences. | related papers | related patents |
370 | Adversarial and Domain-Aware BERT for Cross-Domain Sentiment Analysis | Chunning Du, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao | In this paper, we investigate how to efficiently apply the pre-trained language model BERT to unsupervised domain adaptation. | related papers | related patents |
371 | From Arguments to Key Points: Towards Automatic Argument Summarization | Roy Bar-Haim, Lilach Eden, Roni Friedman, Yoav Kantor, Dan Lahav, Noam Slonim | We propose to represent such summaries as a small set of talking points, termed key points, each scored according to its salience. We study the task of argument-to-key point mapping, and introduce a novel large-scale dataset for this task. | related papers | related patents |
372 | GoEmotions: A Dataset of Fine-Grained Emotions | Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Nemade, Sujith Ravi | We introduce GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral. | related papers | related patents |
373 | He said “who’s gonna take care of your children when you are at ACL?”: Reported Sexist Acts are Not Sexist | Patricia Chiril, Véronique MORICEAU, Farah Benamara, Alda Mari, Gloria Origgi, Marlène Coulomb-Gully | We propose: (1) a new characterization of sexist content inspired by speech acts theory and discourse analysis studies, (2) the first French dataset annotated for sexism detection, and (3) a set of deep learning experiments trained on top of a combination of several vectorial representations of tweets (word embeddings, linguistic features, and various generalization strategies). | related papers | related patents |
374 | SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis | Hao Tian, Can Gao, Xinyan Xiao, Hao Liu, Bolei He, Hua Wu, Haifeng Wang, feng wu | In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. | related papers | related patents |
375 | Do Neural Language Models Show Preferences for Syntactic Formalisms? | Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre | In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep syntactic style of analysis, and whether the patterns are consistent across different languages. | related papers | related patents |
376 | Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | In this paper, we show that these results can be improved by using an in-order linearization instead. | related papers | related patents |
377 | Exact yet Efficient Graph Parsing, Bi-directional Locality and the Constructivist Hypothesis | Yajie Ye, Weiwei Sun | We demonstrate, for the first time, that exact graph parsing can be efficient for large graphs and with large Hyperedge Replacement Grammars (HRGs). | related papers | related patents |
378 | Max-Margin Incremental CCG Parsing | Miloš Stanojević, Mark Steedman | Instead, we tackle all of these three biases at the same time using an improved version of beam search optimisation that minimises all beam search violations instead of minimising only the biggest violation. | related papers | related patents |
379 | Neural Reranking for Dependency Parsing: An Evaluation | Bich-Ngoc Do, Ines Rehbein | In the paper, we re-assess the potential of successful neural reranking models from the literature on English and on two morphologically rich(er) languages, German and Czech. | related papers | related patents |
380 | Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting | Guanhua Zhang, Bing Bai, Junqi Zhang, Kun Bai, Conghui Zhu, Tiejun Zhao | In this paper, we formalize the unintended biases in text classification datasets as a kind of selection bias from the non-discrimination distribution to the discrimination distribution. | related papers | related patents |
381 | Analyzing analytical methods: The case of phonology in neural models of spoken language | Grzegorz Chrupała, Bertrand Higy, Afra Alishahi | As a step in this direction we study the case of representations of phonology in neural network models of spoken language. | related papers | related patents |
382 | Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations | Oana-Maria Camburu, Brendan Shillingford, Pasquale Minervini, Thomas Lukasiewicz, Phil Blunsom | In this work, we show that such models are nonetheless prone to generating mutually inconsistent explanations, such as “Because there is a dog in the image.” | related papers | related patents |
383 | Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT | Zhiyong Wu, Yun Chen, Ben Kao, Qun Liu | Complementary to those works, we propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT). | related papers | related patents |
384 | Probing for Referential Information in Language Models | Ionut-Teodor Sorodoc, Kristina Gulordava, Gemma Boleda | We analyze two state of the art models with LSTM and Transformer architectures, via probe tasks and analysis on a coreference annotated corpus. | related papers | related patents |
385 | Quantifying Attention Flow in Transformers | Samira Abnar, Willem Zuidema | In this paper, we consider the problem of quantifying this flow of information through self-attention. | related papers | related patents |
386 | Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness? | Alon Jacovi, Yoav Goldberg | We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus on the faithfulness criteria. | related papers | related patents |
387 | Towards Transparent and Explainable Attention Models | Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran | To make attention mechanisms more faithful and plausible, we propose a modified LSTM cell with a diversity-driven training objective that ensures that the hidden representations learned at different time steps are diverse. | related papers | related patents |
388 | Tchebycheff Procedure for Multi-task Text Classification | Yuren Mao, Shuang Yun, Weiwei Liu, Bo Du | To address this issue, this paper presents a novel Tchebycheff procedure to optimize multi-task classification problems without a convexity assumption. | related papers | related patents |
389 | Modeling Word Formation in English–German Neural Machine Translation | Marion Weller-Di Marco, Alexander Fraser | This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology. | related papers | related patents |
390 | Empowering Active Learning to Jointly Optimize System and User Demands | Ji-Ung Lee, Christian M. Meyer, Iryna Gurevych | In this paper, we propose a new active learning approach that jointly optimizes the seemingly counteracting objectives of the active learning system (training efficiently) and the user (receiving useful instances). | related papers | related patents |
391 | Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction | Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui | This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). | related papers | related patents |
392 | Graph Neural News Recommendation with Unsupervised Preference Disentanglement | Linmei Hu, Siyong Xu, Chen Li, Cheng Yang, Chuan Shi, Nan Duan, Xing Xie, Ming Zhou | In this paper, we model the user-news interactions as a bipartite graph and propose a novel Graph Neural News Recommendation model with Unsupervised Preference Disentanglement, named GNUD. | related papers | related patents |
393 | Identifying Principals and Accessories in a Complex Case based on the Comprehension of Fact Description | Yakun Hu, Zhunchen Luo, Wenhan Chao | In this paper, we study the problem of identifying the principals and accessories from the fact description with multiple defendants in a criminal case. | related papers | related patents |
394 | Joint Modelling of Emotion and Abusive Language Detection | Santhosh Rajamanickam, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova | In this paper, we present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework that allows one task to inform the other. | related papers | related patents |
395 | Programming in Natural Language with fuSE: Synthesizing Methods from Spoken Utterances Using Deep Natural Language Understanding | Sebastian Weigelt, Vanessa Steurer, Tobias Hey, Walter F. Tichy | We examine how to teach intelligent systems new functions, expressed in natural language. | related papers | related patents |
396 | Toxicity Detection: Does Context Really Matter? | John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon, Nithum Thain, Ion Androutsopoulos | We investigate this assumption by focusing on two questions: (a) does context affect the human judgement, and (b) does conditioning on context improve performance of toxicity detection systems? | related papers | related patents |
397 | AMR Parsing with Latent Structural Information | Qiji Zhou, Yue Zhang, Donghong Ji, Hao Tang | We investigate parsing AMR with explicit dependency structures and interpretable latent structures. | related papers | related patents |
398 | TaPas: Weakly Supervised Table Parsing via Pre-training | Jonathan Herzig, Pawel Krzysztof Nowak, Thomas Müller, Francesco Piccinno, Julian Eisenschlos | In this paper, we present TaPas, an approach to question answering over tables without generating logical forms. | related papers | related patents |
399 | Target Inference in Argument Conclusion Generation | Milad Alshomary, Shahbaz Syed, Martin Potthast, Henning Wachsmuth | We develop two complementary target inference approaches: one ranks premise targets and selects the top-ranked target as the conclusion target, the other finds a new conclusion target in a learned embedding space using a triplet neural network. | related papers | related patents |
400 | Multimodal Transformer for Multimodal Machine Translation | Shaowei Yao, Xiaojun Wan | In this paper, we introduce the multimodal self-attention in Transformer to solve the issues above in MMT. | related papers | related patents |
401 | Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis | Dushyant Singh Chauhan, Dhanush S R, Asif Ekbal, Pushpak Bhattacharyya | In this paper, we hypothesize that sarcasm is closely related to sentiment and emotion, and thereby propose a multi-task deep learning framework to solve all these three problems simultaneously in a multi-modal conversational scenario. | related papers | related patents |
402 | Towards Emotion-aided Multi-modal Dialogue Act Classification | Tulika Saha, Aditya Patra, Sriparna Saha, Pushpak Bhattacharyya | In this work, we address the role of both multi-modality and emotion recognition (ER) in DAC. | related papers | related patents |
403 | Analyzing Political Parody in Social Media | Antonios Maronikolakis, Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras | In this paper, we present the first computational study of parody. | related papers | related patents |
404 | Masking Actor Information Leads to Fairer Political Claims Detection | Erenay Dayanik, Sebastian Padó | We propose two simple debiasing methods which mask proper names and pronouns during training of the model, thus removing personal information bias. | related papers | related patents |
405 | When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? | Kenneth Joseph, Jonathan Morgan | Here, we investigate the extent to which publicly-available word embeddings accurately reflect beliefs about certain kinds of people as measured via traditional survey methods. | related papers | related patents |
406 | “Who said it, and Why?” Provenance for Natural Language Claims | Yi Zhang, Zachary Ives, Dan Roth | This paper suggests that the key to a longer-term, holistic, and systematic approach to navigating this information pollution is capturing the provenance of claims. | related papers | related patents |
407 | Compositionality and Generalization In Emergent Languages | Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, Marco Baroni | In this paper, we study whether the language emerging in deep multi-agent simulations possesses a similar ability to refer to novel primitive combinations, and whether it accomplishes this feat by strategies akin to human-language compositionality. | related papers | related patents |
408 | ERASER: A Benchmark to Evaluate Rationalized NLP Models | Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace | We propose the Evaluating Rationales And Simple English Reasoning (ERASER) benchmark to advance research on interpretable models in NLP. | related papers | related patents |
409 | Learning to Faithfully Rationalize by Construction | Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace | We propose a simpler variant of this approach that provides faithful explanations by construction. In our scheme, named FRESH, arbitrary feature importance scores (e.g., gradients from a trained model) are used to induce binary labels over token inputs, which an extractor can be trained to predict. An independent classifier module is then trained exclusively on snippets provided by the extractor; these snippets thus constitute faithful explanations, even if the classifier is arbitrarily complex. | related papers | related patents |
410 | Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset | Xiang Yue, Bernal Jimenez Gutierrez, Huan Sun | In this paper, we provide an in-depth analysis of this dataset and the clinical reading comprehension (CliniRC) task. | related papers | related patents |
411 | DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering | Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian | We introduce DeFormer, a decomposed transformer, which substitutes the full self-attention with question-wide and passage-wide self-attentions in the lower layers. | related papers | related patents |
412 | Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings | Apoorv Saxena, Aditay Tripathi, Partha Talukdar | We fill this gap in this paper and propose EmbedKGQA. EmbedKGQA is particularly effective in performing multi-hop KGQA over sparse KGs. | related papers | related patents |
413 | Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering | Alexander Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang | We propose an unsupervised approach to training QA models with generated pseudo-training data. | related papers | related patents |
414 | Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering | Vikas Yadav, Steven Bethard, Mihai Surdeanu | We introduce a simple, fast, and unsupervised iterative evidence retrieval method, which relies on three ideas: (a) an unsupervised alignment approach to soft-align questions and answers with justification sentences using only GloVe embeddings, (b) an iterative process that reformulates queries focusing on terms that are not covered by existing justifications, which (c) stops when the terms in the given question and candidate answers are covered by the retrieved justifications. | related papers | related patents |
415 | A Corpus for Large-Scale Phonetic Typology | Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W Black, Jason Eisner | We present VoxClamantis v1.0, the first large-scale corpus for phonetic typology, with aligned segments and estimated phoneme-level labels in 690 readings spanning 635 languages, along with acoustic-phonetic measures of vowels and sibilants. | related papers | related patents |
416 | Dscorer: A Fast Evaluation Metric for Discourse Representation Structure Parsing | Jiangming Liu, Shay B. Cohen, Mirella Lapata | We introduce Dscorer, an efficient new metric which converts box-style DRSs to graphs and then measures the overlap of n-grams. | related papers | related patents |
417 | ParaCrawl: Web-Scale Acquisition of Parallel Corpora | Marta Bañón, Pinzhen Chen, Barry Haddow, Kenneth Heafield, Hieu Hoang, Miquel Esplà-Gomis, Mikel L. Forcada, Amir Kamran, Faheem Kirefu, Philipp Koehn, Sergio Ortiz Rojas, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Elsa Sarrías, Marek Strelec, Brian Thompson, William Waites, Dion Wiggins, Jaume Zaragoza | We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. | related papers | related patents |
418 | Toward Gender-Inclusive Coreference Resolution | Yang Trista Cao, Hal Daumé III | Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we build systems that lead to many potential harms. | related papers | related patents |
419 | Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? | Cansu Sen, Thomas Hartvigsen, Biao Yin, Xiangnan Kong, Elke Rundensteiner | In this work, we conduct the first quantitative assessment of human versus computational attention mechanisms for the text classification task. | related papers | related patents |
420 | Information-Theoretic Probing for Linguistic Structure | Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, Ryan Cotterell | We propose an information-theoretic operationalization of probing as estimating mutual information that contradicts this received wisdom: one should always select the highest performing probe one can, even if it is more complex, since it will result in a tighter estimate, and thus reveal more of the linguistic information inherent in the representation. | related papers | related patents |
421 | On the Cross-lingual Transferability of Monolingual Representations | Mikel Artetxe, Sebastian Ruder, Dani Yogatama | More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing parameters of all other layers. | related papers | related patents |
422 | Similarity Analysis of Contextual Word Representation Models | John Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass | This paper investigates contextual word representation models from the lens of similarity analysis. | related papers | related patents |
423 | SenseBERT: Driving Some Sense into BERT | Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham | This paper proposes a method to employ weak-supervision directly at the word sense level. | related papers | related patents |
424 | ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations | Fernando Alva-Manchego, Louis Martin, Antoine Bordes, Carolina Scarton, Benoît Sagot, Lucia Specia | To alleviate this limitation, this paper introduces ASSET, a new dataset for assessing sentence simplification in English. | related papers | related patents |
425 | Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts | Agostina Calabrese, Michele Bevilacqua, Roberto Navigli | We fill this gap by presenting BabelPic, a hand-labeled dataset built by cleaning the image-synset association found within the BabelNet Lexical Knowledge Base (LKB). | related papers | related patents |
426 | Modeling Label Semantics for Predicting Emotional Reactions | Radhika Gaonkar, Heeyoung Kwon, Mohaddeseh Bastan, Niranjan Balasubramanian, Nathanael Chambers | In this work, we explicitly model label classes via label embeddings, and add mechanisms that track label-label correlations both during training and inference. | related papers | related patents |
427 | CraftAssist Instruction Parsing: Semantic Parsing for a Voxel-World Assistant | Kavya Srinet, Yacine Jernite, Jonathan Gray, arthur szlam | We propose a semantic parsing dataset focused on instruction-driven communication with an agent in the game Minecraft. | related papers | related patents |
428 | Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training | Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston | We show that appropriate loss functions which regularize generated outputs to match human distributions are effective for the first three issues. For the last important general issue, we show applying unlikelihood to collected data of what a model should not do is effective for improving logical consistency, potentially paving the way to generative models with greater reasoning ability. | related papers | related patents |
429 | How does BERT’s attention change when you fine-tune? An analysis methodology and a case study in negation scope | Yiyun Zhao, Steven Bethard | We propose a procedure and analysis methods that take a hypothesis of how a transformer-based model might encode a linguistic phenomenon, and test the validity of that hypothesis based on a comparison between knowledge-related downstream tasks with downstream control tasks, and measurement of cross-dataset consistency. | related papers | related patents |
430 | Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models | Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta | We introduce influence paths, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. | related papers | related patents |
431 | Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings | Rishi Bommasani, Kelly Davis, Claire Cardie | Consequently, we introduce simple and fully general methods for converting from contextualized representations to static lookup-table embeddings which we apply to 5 popular pretrained models and 9 sets of pretrained weights. | related papers | related patents |
432 | Learning to Deceive with Attention-Based Explanations | Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton | We call the latter use of attention mechanisms into question by demonstrating a simple method for training models to produce deceptive attention masks. | related papers | related patents |
433 | On the Spontaneous Emergence of Discrete and Compositional Signals | Nur Geffen Lan, Emmanuel Chemla, Shane Steinert-Threlkeld | We propose a general framework to study language emergence through signaling games with neural agents. | related papers | related patents |
434 | Spying on Your Neighbors: Fine-grained Probing of Contextual Embeddings for Information about Surrounding Words | Josef Klafka, Allyson Ettinger | To address this question, we introduce a suite of probing tasks that enable fine-grained testing of contextual embeddings for encoding of information about surrounding words. | related papers | related patents |
435 | Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | Hyounghun Kim, Zineng Tang, Mohit Bansal | In this paper, we propose a video question answering model which effectively integrates multi-modal input sources and finds the temporally relevant information to answer questions. | related papers | related patents |
436 | Shaping Visual Representations with Language for Few-Shot Classification | Jesse Mu, Percy Liang, Noah Goodman | Instead, we propose language-shaped learning (LSL), an end-to-end model that regularizes visual representations to predict language. | related papers | related patents |
437 | Discrete Latent Variable Representations for Low-Resource Text Classification | Shuning Jin, Sam Wiseman, Karl Stratos, Karen Livescu | We consider several approaches to learning discrete latent variable models for text in the case where exact marginalization over these variables is intractable. | related papers | related patents |
438 | Learning Constraints for Structured Prediction Using Rectifier Networks | Xingyuan Pan, Maitrey Mehta, Vivek Srikumar | We frame the problem as that of training a two-layer rectifier network to identify valid structures or substructures, and show a construction for converting a trained network into a system of linear constraints over the inference variables. | related papers | related patents |
439 | Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models | Dan Iter, Kelvin Guu, Larry Lansing, Dan Jurafsky | We propose Conpono, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences. | related papers | related patents |
440 | A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks | Angela Lin, Sudha Rao, Asli Celikyilmaz, Elnaz Nouri, Chris Brockett, Debadeepta Dey, Bill Dolan | To address these challenges, we use an unsupervised alignment algorithm that learns pairwise alignments between instructions of different recipes for the same dish. | related papers | related patents |
441 | Adversarial NLI: A New Benchmark for Natural Language Understanding | Yixin Nie, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, Douwe Kiela | We introduce a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure. | related papers | related patents |
442 | Beyond Accuracy: Behavioral Testing of NLP Models with CheckList | Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh | Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. | related papers | related patents |
443 | Code and Named Entity Recognition in StackOverflow | Jeniya Tabassum, Mounica Maddela, Wei Xu, Alan Ritter | In this paper, we introduce a new named entity recognition (NER) corpus for the computer programming domain, consisting of 15,372 sentences annotated with 20 fine-grained entity types. | related papers | related patents |
444 | Dialogue-Based Relation Extraction | Dian Yu, Kai Sun, Claire Cardie, Dong Yu | We present the first human-annotated dialogue-based relation extraction (RE) dataset DialogRE, aiming to support the prediction of relation(s) between two arguments that appear in a dialogue. | related papers | related patents |
445 | Facet-Aware Evaluation for Extractive Summarization | Yuning Mao, Liyuan Liu, Qi Zhu, Xiang Ren, Jiawei Han | In this paper, we present a facet-aware evaluation setup for better assessment of the information coverage in extracted summaries. | related papers | related patents |
446 | More Diverse Dialogue Datasets via Diversity-Informed Data Collection | Katherine Stasaski, Grace Hui Yang, Marti A. Hearst | We introduce a new strategy to address this problem, called Diversity-Informed Data Collection. | related papers | related patents |
447 | S2ORC: The Semantic Scholar Open Research Corpus | Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, Daniel Weld | We introduce S2ORC, a large corpus of 81.1M English-language academic papers spanning many academic disciplines. | related papers | related patents |
448 | Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics | Nitika Mathur, Timothy Baldwin, Trevor Cohn | We show that current methods for judging metrics are highly sensitive to the translations used for assessment, particularly the presence of outliers, which often leads to falsely confident conclusions about a metric’s efficacy. | related papers | related patents |
449 | A Transformer-based Approach for Source Code Summarization | Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang | To learn code representation for summarization, we explore the Transformer model that uses a self-attention mechanism and has shown to be effective in capturing long-range dependencies. | related papers | related patents |
450 | Asking and Answering Questions to Evaluate the Factual Consistency of Summaries | Alex Wang, Kyunghyun Cho, Mike Lewis | We propose QAGS (pronounced “kags”), an automatic evaluation protocol that is designed to identify factual inconsistencies in a generated summary. | related papers | related patents |
451 | Discourse-Aware Neural Extractive Text Summarization | Jiacheng Xu, Zhe Gan, Yu Cheng, Jingjing Liu | To address these issues, we present a discourse-aware neural summarization model – DiscoBert. | related papers | related patents |
452 | Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction | Raphael Schumann, Lili Mou, Yao Lu, Olga Vechtomova, Katja Markert | A good summary is characterized by language fluency and high information overlap with the source sentence. We model these two aspects in an unsupervised objective function, consisting of language modeling and semantic similarity metrics. | related papers | related patents |
453 | Exploring Content Selection in Summarization of Novel Chapters | Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown | We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides. | related papers | related patents |
454 | FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization | Esin Durmus, He He, Mona Diab | We tackle the problem of evaluating faithfulness of a generated summary given its source document. | related papers | related patents |
455 | Fact-based Content Weighting for Evaluating Abstractive Summarisation | Xinnuo Xu, Ondřej Dušek, Jingyi Li, Verena Rieser, Ioannis Konstas | We introduce a new evaluation metric which is based on fact-level content weighting, i.e. relating the facts of the document to the facts of the summary. | related papers | related patents |
456 | Hooks in the Headline: Learning to Generate Headlines with Controlled Styles | Di Jin, Zhijing Jin, Joey Tianyi Zhou, Lisa Orii, Peter Szolovits | We propose a new task, Stylistic Headline Generation (SHG), to enrich the headlines with three style options (humor, romance and clickbait), thus attracting more readers. | related papers | related patents |
457 | Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward | Luyang Huang, Lingfei Wu, Lu Wang | In this paper, we present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD. | related papers | related patents |
458 | Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports | Yuhao Zhang, Derek Merck, Emily Tsai, Christopher D. Manning, Curtis Langlotz | In this work, we develop a general framework where we evaluate the factual correctness of a generated summary by fact-checking it automatically against its reference using an information extraction module. | related papers | related patents |
459 | Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset | Revanth Rameshkumar, Peter Bailey | This paper describes the Critical Role Dungeons and Dragons Dataset (CRD3) and related analyses. | related papers | related patents |
460 | The Summary Loop: Learning to Write Abstractive Summaries Without Examples | Philippe Laban, Andrew Hsi, John Canny, Marti A. Hearst | This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. | related papers | related patents |
461 | Unsupervised Opinion Summarization as Copycat-Review Generation | Arthur Bražinskas, Mirella Lapata, Ivan Titov | We define a generative model for a review collection which capitalizes on the intuition that when generating a new review given a set of other reviews of a product, we should be able to control the “amount of novelty” going into the new review or, equivalently, vary the extent to which it deviates from the input. | related papers | related patents |
462 | (Re)construing Meaning in NLP | Sean Trott, Tiago Timponi Torrent, Nancy Chang, Nathan Schneider | In this paper, we engage with an idea largely absent from discussions of meaning in natural language understanding, namely that the way something is expressed reflects different ways of conceptualizing or construing the information being conveyed. | related papers | related patents |
463 | Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data | Emily M. Bender, Alexander Koller | In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. | related papers | related patents |
464 | Examining Citations of Natural Language Processing Literature | Saif M. Mohammad | We extracted information from the ACL Anthology (AA) and Google Scholar (GS) to examine trends in citations of NLP papers. | related papers | related patents |
465 | How Can We Accelerate Progress Towards Human-like Linguistic Generalization? | Tal Linzen | This position paper describes and critiques the Pretraining-Agnostic Identically Distributed (PAID) evaluation paradigm, which has become a central tool for measuring progress in natural language understanding. | related papers | related patents |
466 | How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence | Haoxi Zhong, Chaojun Xiao, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, Maosong Sun | In this paper, we introduce the history, the current state, and the future directions of research in LegalAI. | related papers | related patents |
467 | Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? | Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman | To investigate this, we perform a large-scale study on the pretrained RoBERTa model with 110 intermediate-target task combinations. | related papers | related patents |
468 | Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview | Deven Santosh Shah, H. Andrew Schwartz, Dirk Hovy | In this paper, we propose a unifying predictive bias framework for NLP. | related papers | related patents |
469 | What Does BERT with Vision Look At? | Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang | In this work, we demonstrate that certain attention heads of a visually grounded language model actively ground elements of language to image regions. | related papers | related patents |
470 | Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards | Justine Zhang, Cristian Danescu-Niculescu-Mizil | In this work, we develop an unsupervised methodology to quantify how counselors manage this balance. | related papers | related patents |
471 | Detecting Perceived Emotions in Hurricane Disasters | Shrey Desai, Cornelia Caragea, Junyi Jessy Li | In this paper, we introduce HurricaneEmo, an emotion dataset of 15,000 English tweets spanning three hurricanes: Harvey, Irma, and Maria. | related papers | related patents |
472 | Hierarchical Modeling for User Personality Prediction: The Role of Message-Level Attention | Veronica Lynn, Niranjan Balasubramanian, H. Andrew Schwartz | In this paper, we present a novel model that uses message-level attention to learn the relative weight of users’ social media posts for assessing their five factor personality traits. | related papers | related patents |
473 | Measuring Forecasting Skill from Text | Shi Zong, Alan Ritter, Eduard Hovy | In this paper we explore connections between the language people use to describe their predictions and their forecasting skill. | related papers | related patents |
474 | Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates | Katherine Keith, David Jensen, Brendan O’Connor | Despite increased attention on adjusting for confounding using text, there are still many open problems, which we highlight in this paper. | related papers | related patents |
475 | Text-Based Ideal Points | Keyon Vafa, Suresh Naidu, David Blei | In this paper, we introduce the text-based ideal point model (TBIP), an unsupervised probabilistic topic model that analyzes texts to quantify the political positions of their authors. | related papers | related patents |
476 | Understanding the Language of Political Agreement and Disagreement in Legislative Texts | Maryam Davoodi, Eric Waltenburg, Dan Goldwasser | In this paper, we take the first step towards a better understanding of these processes and the underlying dynamics that shape them, using data-driven methods. | related papers | related patents |
477 | Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences | Yi Tay, Donovan Ong, Jie Fu, Alvin Chan, Nancy Chen, Anh Tuan Luu, Chris Pal | Concretely, we present a new task and corpus for learning alignments between machine and human preferences. | related papers | related patents |
478 | Discourse as a Function of Event: Profiling Discourse Structure in News Articles around the Main Event | Prafulla Kumar Choubey, Aaron Lee, Ruihong Huang, Lu Wang | To enable computational modeling of news structures, we apply an existing theory of functional discourse structure for news articles that revolves around the main event and create a human-annotated corpus of 802 documents spanning over four domains and three media sources. | related papers | related patents |
479 | Harnessing the linguistic signal to predict scalar inferences | Sebastian Schuster, Yuxing Chen, Judith Degen | In this work, we explore to what extent neural network sentence encoders can learn to predict the strength of scalar inferences. | related papers | related patents |
480 | Implicit Discourse Relation Classification: We Need to Talk about Evaluation | Najoung Kim, Song Feng, Chulaka Gunasekara, Luis Lastras | In this work, we highlight these inconsistencies and propose an improved evaluation protocol. | related papers | related patents |
481 | PeTra: A Sparsely Supervised Memory Model for People Tracking | Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu | We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots. | related papers | related patents |
482 | ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT | Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu | We propose to better explore their interaction by solving both tasks together, whereas previous work treats them separately. | related papers | related patents |
483 | Contextualizing Hate Speech Classifiers with Post-hoc Explanation | Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, Xiang Ren | We extract post-hoc explanations from fine-tuned BERT classifiers to detect bias towards identity terms. Then, we propose a novel regularization technique based on these explanations that encourages models to learn from the context of group identifiers in addition to the identifiers themselves. | related papers | related patents |
484 | Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation | Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong | We propose a simple but effective technique, Double Hard Debias, which purifies the word embeddings against such corpus regularities prior to inferring and removing the gender subspace. | related papers | related patents |
485 | Language (Technology) is Power: A Critical Survey of “Bias” in NLP | Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna Wallach | Based on these findings, we describe the beginnings of a path forward by proposing three recommendations that should guide work analyzing “bias” in NLP systems. | related papers | related patents |
486 | Social Bias Frames: Reasoning about Social and Power Implications of Language | Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi | We introduce Social Bias Frames, a new conceptual formalism that aims to model the pragmatic frames in which people project social biases and stereotypes onto others. | related papers | related patents |
487 | Social Biases in NLP Models as Barriers for Persons with Disabilities | Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, Stephen Denuyl | In this paper, we present evidence of such undesirable biases towards mentions of disability in two different English language models: toxicity prediction and sentiment analysis. | related papers | related patents |
488 | Towards Debiasing Sentence Representations | Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, Louis-Philippe Morency | In this paper, we investigate the presence of social biases in sentence-level representations and propose a new method, Sent-Debias, to reduce these biases. | related papers | related patents |
489 | A Re-evaluation of Knowledge Graph Completion Methods | Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, Yiming Yang | In this paper, we find that this can be attributed to the inappropriate evaluation protocol used by them and propose a simple evaluation protocol to address this problem. | related papers | related patents |
490 | Cross-Linguistic Syntactic Evaluation of Word Prediction Models | Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen | To investigate how these models’ ability to learn syntax varies by language, we introduce CLAMS (Cross-Linguistic Assessment of Models on Syntax), a syntactic evaluation suite for monolingual and multilingual models. | related papers | related patents |
491 | Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? | Peter Hase, Mohit Bansal | Through two kinds of simulation tests involving text and tabular data, we evaluate five explanation methods: (1) LIME, (2) Anchor, (3) Decision Boundary, (4) a Prototype model, and (5) a Composite approach that combines explanations from each method. | related papers | related patents |
492 | Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions | Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov | In this work, we investigate the use of influence functions for NLP, providing an alternative approach to interpreting neural text classifiers. | related papers | related patents |
493 | Finding Universal Grammatical Relations in Multilingual BERT | Ethan A. Chi, John Hewitt, Christopher D. Manning | Motivated by these results, we present an unsupervised analysis method that provides evidence that mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. | related papers | related patents |
494 | Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection | Hanjie Chen, Guangtao Zheng, Yangfeng Ji | In this work, we build hierarchical explanations by detecting feature interactions. | related papers | related patents |
495 | Obtaining Faithful Interpretations from Compositional Neural Networks | Sanjay Subramanian, Ben Bogin, Nitish Gupta, Tomer Wolfson, Sameer Singh, Jonathan Berant, Matt Gardner | In this work, we propose and conduct a systematic evaluation of the intermediate outputs of NMNs on NLVR2 and DROP, two datasets which require composing multiple reasoning steps. | related papers | related patents |
496 | Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport | Kyle Swanson, Lili Yu, Tao Lei | In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. | related papers | related patents |
497 | Benefits of Intermediate Annotations in Reading Comprehension | Dheeru Dua, Sameer Singh, Matt Gardner | In this work, we study the benefits of collecting intermediate reasoning supervision along with the answer during data collection. | related papers | related patents |
498 | Crossing Variational Autoencoders for Answer Retrieval | Wenhao Yu, Lingfei Wu, Qingkai Zeng, Shu Tao, Yu Deng, Meng Jiang | In this work, we propose to cross variational auto-encoders by generating questions with aligned answers and generating answers with aligned questions. | related papers | related patents |
499 | Logic-Guided Data Augmentation and Regularization for Consistent Question Answering | Akari Asai, Hannaneh Hajishirzi | This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions by integrating logic rules and neural models. | related papers | related patents |
500 | On the Importance of Diversity in Question Generation for QA | Md Arafat Sultan, Shubham Chandel, Ramón Fernandez Astudillo, Vittorio Castelli | In this paper, we ask: Is textual diversity in QG beneficial for downstream QA? | related papers | related patents |
501 | Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering | Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings. | related papers | related patents |
502 | SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations | Xiang Kong, Varun Gangal, Eduard Hovy | We introduce SCDE, a dataset to evaluate the performance of computational models through sentence prediction. | related papers | related patents |
503 | Selective Question Answering under Domain Shift | Amita Kamath, Robin Jia, Percy Liang | In this work, we propose the setting of selective question answering under domain shift, in which a QA model is tested on a mixture of in-domain and out-of-domain data, and must answer (i.e., not abstain on) as many questions as possible while maintaining high accuracy. | related papers | related patents |
504 | The Cascade Transformer: an Application for Efficient Answer Sentence Selection | Luca Soldaini, Alessandro Moschitti | In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers. | related papers | related patents |
505 | Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering | Changmao Li, Jinho D. Choi | We introduce a novel approach to transformers that learns hierarchical representations in multiparty dialogue. | related papers | related patents |
506 | Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses | Erfan Sadeqi Azer, Daniel Khashabi, Ashish Sabharwal, Dan Roth | We address this gap by contrasting various hypothesis assessment techniques, especially those not commonly used in the field (such as evaluations based on Bayesian inference). | related papers | related patents |
507 | STARC: Structured Annotations for Reading Comprehension | Yevgeni Berzak, Jonathan Malmaud, Roger Levy | We present STARC (Structured Annotations for Reading Comprehension), a new annotation framework for assessing reading comprehension with multiple choice questions. | related papers | related patents |
508 | WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge | Hongming Zhang, Xinran Zhao, Yangqiu Song | In this paper, we present the first comprehensive categorization of essential commonsense knowledge for answering the Winograd Schema Challenge (WSC). | related papers | related patents |
509 | Agreement Prediction of Arguments in Cyber Argumentation for Detecting Stance Polarity and Intensity | Joseph Sirrianni, Xiaoqing Liu, Douglas Adams | We introduce a new research problem, stance polarity and intensity prediction in response relationships between posts. | related papers | related patents |
510 | Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning | Hongliang Fei, Ping Li | We propose an unsupervised cross-lingual sentiment classification model named multi-view encoder-classifier (MVEC) that leverages an unsupervised machine translation (UMT) system and a language discriminator. | related papers | related patents |
511 | Efficient Pairwise Annotation of Argument Quality | Lukas Gienapp, Benno Stein, Matthias Hagen, Martin Potthast | We present an efficient annotation framework for argument quality, a feature that previous work has found difficult to measure reliably. | related papers | related patents |
512 | Entity-Aware Dependency-Based Deep Graph Attention Network for Comparative Preference Classification | Nianzu Ma, Sahisnu Mazumder, Hao Wang, Bing Liu | This paper proposes a novel Entity-aware Dependency-based Deep Graph Attention Network (ED-GAT) that employs a multi-hop graph attention over a dependency graph sentence representation to leverage both the semantic information from word embeddings and the syntactic information from the dependency graph to solve the problem. | related papers | related patents |
513 | OpinionDigest: A Simple Framework for Opinion Summarization | Yoshihiko Suhara, Xiaolan Wang, Stefanos Angelidis, Wang-Chiew Tan | We present OpinionDigest, an abstractive opinion summarization framework, which does not rely on gold-standard summaries for training. | related papers | related patents |
514 | A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks | Nastaran Babanejad, Ameeta Agrawal, Aijun An, Manos Papagelis | To address this limitation, we conduct a comprehensive analysis of the role of preprocessing techniques in affective analysis based on word vector models. | related papers | related patents |
515 | Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness | Sixing Wu, Ying Li, Dawei Zhang, Yang Zhou, Zhonghai Wu | To this end, this paper proposes a novel commonsense knowledge-aware dialogue generation model, ConKADI. We collect and build a large-scale Chinese dataset aligned with the commonsense knowledge for dialogue generation. | related papers | related patents |
516 | Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation | Haoyu Song, Yan Wang, Wei-Nan Zhang, Xiaojiang Liu, Ting Liu | In this work, we introduce a three-stage framework that employs a generate-delete-rewrite mechanism to delete inconsistent words from a generated response prototype and further rewrite it to a personality-consistent one. | related papers | related patents |
517 | Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks | Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang | In this paper, we propose an algorithm that can customize a unique dialogue model for each task in the few-shot setting. | related papers | related patents |
518 | Video-Grounded Dialogues with Pretrained Generation Language Models | Hung Le, Steven C.H. Hoi | In this paper, we leverage the power of pre-trained language models for improving video-grounded dialogue, which is very challenging and involves complex features of different dynamics: (1) Video features which can extend across both spatial and temporal dimensions; and (2) Dialogue features which involve semantic dependencies over multiple dialogue turns. | related papers | related patents |
519 | A Unified MRC Framework for Named Entity Recognition | Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li | In this paper, we propose a unified framework that is capable of handling both flat and nested NER tasks. | related papers | related patents |
520 | An Effective Transition-based Model for Discontinuous NER | Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris | We propose a simple, effective transition-based model with generic neural encoding for discontinuous NER. | related papers | related patents |
521 | IMoJIE: Iterative Memory-Based Joint Open Information Extraction | Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti | We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples. | related papers | related patents |
522 | Improving Event Detection via Open-domain Trigger Knowledge | Meihan Tong, Bin Xu, Shuai Wang, Yixin Cao, Lei Hou, Juanzi Li, Jun Xie | To address the issue, we propose a novel Enrichment Knowledge Distillation (EKD) model to leverage external open-domain trigger knowledge to reduce the in-built biases to frequent trigger words in annotations. | related papers | related patents |
523 | Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling | Canasai Kruengkrai, Thien Hai Nguyen, Sharifah Mahani Aljunied, Lidong Bing | We present a joint model that supports multi-class classification and introduce a simple variant of self-attention that allows the model to learn scaling factors. | related papers | related patents |
524 | Multi-Cell Compositional LSTM for NER Domain Adaptation | Chen Jia, Yue Zhang | We investigate a multi-cell compositional LSTM structure for multi-task learning, modeling each entity type using a separate cell state. | related papers | related patents |
525 | Pyramid: A Layered Model for Nested Named Entity Recognition | Jue Wang, Lidan Shou, Ke Chen, Gang Chen | This paper presents Pyramid, a novel layered model for Nested Named Entity Recognition (nested NER). | related papers | related patents |
526 | ReInceptionE: Relation-Aware Inception Network with Joint Local-Global Structural Information for Knowledge Graph Embedding | Zhiwen Xie, Guangyou Zhou, Jin Liu, Jimmy Xiangji Huang | In this paper, we take the benefits of ConvE and KBGAT together and propose a Relation-aware Inception network with joint local-global structural information for knowledge graph Embedding (ReInceptionE). | related papers | related patents |
527 | Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents | Daoyuan Chen, Yaliang Li, Kai Lei, Ying Shen | We propose a joint extraction approach to address this problem by re-labeling noisy instances with a group of cooperative multiagents. | related papers | related patents |
528 | Simplify the Usage of Lexicon in Chinese NER | Ruotian Ma, Minlong Peng, Qi Zhang, Zhongyu Wei, Xuanjing Huang | In this work, we propose a simple but effective method for incorporating the word lexicon into the character representations. | related papers | related patents |
529 | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein | In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). | related papers | related patents |
530 | Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns | KayYen Wong, Sameen Maruf, Gholamreza Haffari | In this work, we investigate the effect of future sentences as context by comparing the performance of a contextual NMT model trained with the future context to the one trained with the past context. | related papers | related patents |
531 | Improving Neural Machine Translation with Soft Template Prediction | Jian Yang, Shuming Ma, Dongdong Zhang, Zhoujun Li, Ming Zhou | Inspired by the success of template-based and syntax-based approaches in other fields, we propose to use extracted templates from tree structures as soft target templates to guide the translation procedure. | related papers | related patents |
532 | Tagged Back-translation Revisited: Why Does It Really Work? | Benjamin Marie, Raphael Rubino, Atsushi Fujita | In this paper, we show that neural machine translation (NMT) systems trained on large back-translated data overfit some of the characteristics of machine-translated texts. | related papers | related patents |
533 | Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation | Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee | Since the semantic correctness of the recognition decoder’s output is more critical than its literal accuracy, we propose to improve the multitask ST model by utilizing word embeddings as an intermediate representation. | related papers | related patents |
534 | Neural-DINF: A Neural Network based Framework for Measuring Document Influence | Jie Tan, Changlin Yang, Ying Li, Siliang Tang, Chen Huang, Yueting Zhuang | In this paper, we use both frequency changes and word semantic shifts to measure document influence by developing a neural network framework. | related papers | related patents |
535 | Paraphrase Generation by Learning How to Edit from Samples | Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah | To address these problems, we propose a novel retrieval-based method for paraphrase generation. | related papers | related patents |
536 | Emerging Cross-lingual Structure in Pretrained Language Models | Alexis Conneau, Shijie Wu, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov | We study the problem of multilingual masked language modeling, i.e. the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer. | related papers | related patents |
537 | FastBERT: a Self-distilling BERT with Adaptive Inference Time | Weijie Liu, Peng Zhou, Zhiruo Wang, Zhe Zhao, Haotang Deng, Qi Ju | To improve their efficiency without sacrificing model performance, we propose a novel speed-tunable FastBERT with adaptive inference time. | related papers | related patents |
538 | Incorporating External Knowledge through Pre-training for Natural Language to Code Generation | Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig | Motivated by the intuition that developers usually retrieve resources on the web when writing code, we explore the effectiveness of incorporating two varieties of external knowledge into NL-to-code generation: automatically mined NL-code pairs from the online programming QA forum StackOverflow and programming language API documentation. | related papers | related patents |
539 | LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network | Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin | In this work, we propose LogicalFactChecker, a neural network approach capable of leveraging logical operations for fact checking. | related papers | related patents |
540 | Word-level Textual Adversarial Attacking as Combinatorial Optimization | Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun | In this paper, we propose a novel attack model, which incorporates the sememe-based word substitution method and particle swarm optimization-based search algorithm to solve the two problems separately. | related papers | related patents |
541 | Benchmarking Multimodal Regex Synthesis with Complex Structures | Xi Ye, Qiaochu Chen, Isil Dillig, Greg Durrett | We introduce StructuredRegex, a new regex synthesis dataset differing from prior ones in three aspects. | related papers | related patents |
542 | Curriculum Learning for Natural Language Understanding | Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang | However, examples in NLU tasks can vary greatly in difficulty, and similar to human learning procedure, language models can benefit from an easy-to-difficult curriculum. Based on this idea, we propose our Curriculum Learning approach. | related papers | related patents |
543 | Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? | Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui | In this paper, we introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language, namely, the regularity for performing arbitrary inferences with generalization on composition. | related papers | related patents |
544 | Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder | Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou | To address this, we propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts. | related papers | related patents |
545 | How to Ask Good Questions? Try to Leverage Paraphrases | Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu | Specifically, we present a two-hand hybrid model leveraging a self-built paraphrase resource, which is automatically constructed via a simple back-translation method. | related papers | related patents |
546 | NeuInfer: Knowledge Inference on N-ary Facts | Saiping Guan, Xiaolong Jin, Jiafeng Guo, Yuanzhuo Wang, Xueqi Cheng | We represent each n-ary fact as a primary triple coupled with a set of its auxiliary descriptive attribute-value pair(s). | related papers | related patents |
547 | Neural Graph Matching Networks for Chinese Short Text Matching | Lu Chen, Yanbin Zhao, Boer Lyu, Lesheng Jin, Zhi Chen, Su Zhu, Kai Yu | To address this problem, we propose neural graph matching networks, a novel sentence matching framework capable of dealing with multi-granular input information. | related papers | related patents |
548 | Neural Mixed Counting Models for Dispersed Topic Discovery | Jiemin Wu, Yanghui Rao, Zusheng Zhang, Haoran Xie, Qing Li, Fu Lee Wang, Ziye Chen | In this paper, we propose two efficient neural mixed counting models, i.e., the Negative Binomial-Neural Topic Model (NB-NTM) and the Gamma Negative Binomial-Neural Topic Model (GNB-NTM) for dispersed topic discovery. | related papers | related patents |
549 | Reasoning Over Semantic-Level Graph for Fact Checking | Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin | In this work, we present a method suitable for reasoning about the semantic-level structure of evidence. | related papers | related patents |
550 | Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study | Xinyu Xing, Xiaosheng Fan, Xiaojun Wan | In this paper, we study the challenging problem of automatic generation of citation texts in scholarly papers. | related papers | related patents |
551 | Composing Elementary Discourse Units in Abstractive Summarization | Zhenwen Li, Wenhao Wu, Sujian Li | In this paper, we argue that elementary discourse unit (EDU) is a more appropriate textual unit of content selection than the sentence unit in abstractive summarization. | related papers | related patents |
552 | Extractive Summarization as Text Matching | Ming Zhong, Pengfei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang | This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. | related papers | related patents |
553 | Heterogeneous Graph Neural Networks for Extractive Document Summarization | Danqing Wang, Pengfei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang | In this paper, we present a heterogeneous graph-based neural network for extractive summarization (HeterSumGraph), which contains semantic nodes of different granularity levels apart from sentences. | related papers | related patents |
554 | Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization | Yue Cao, Hui Liu, Xiaojun Wan | In this paper, we propose to ease the cross-lingual summarization training by jointly learning to align and summarize. | related papers | related patents |
555 | Leveraging Graph to Improve Abstractive Multi-Document Summarization | Wei Li, Xinyan Xiao, Jiachen Liu, Hua Wu, Haifeng Wang, Junping Du | In this paper, we develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents such as similarity graph and discourse graph, to more effectively process multiple input documents and produce abstractive summaries. | related papers | related patents |
556 | Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization | Hanqi Jin, Tianming Wang, Xiaojun Wan | In this paper, we propose a multi-granularity interaction network for extractive and abstractive multi-document summarization, which jointly learn semantic representations for words, sentences, and documents. | related papers | related patents |
557 | Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference | Nikita Kitaev, Dan Klein | We present a constituency parsing algorithm that, like a supertagger, works by assigning labels to each word in a sentence. | related papers | related patents |
558 | Are we Estimating or Guesstimating Translation Quality? | Shuo Sun, Francisco Guzmán, Lucia Specia | Our findings suggest that although QE models might capture fluency of translated sentences and complexity of source sentences, they cannot model adequacy of translations effectively. | related papers | related patents |
559 | Language (Re)modelling: Towards Embodied Language Understanding | Ronen Tamari, Chen Shani, Tom Hope, Miriam R L Petruck, Omri Abend, Dafna Shahaf | This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). | related papers | related patents |
560 | The State and Fate of Linguistic Diversity and Inclusion in the NLP World | Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, Monojit Choudhury | In this paper we look at the relation between the types of languages, resources, and their representation in NLP conferences to understand the trajectory that different languages have followed over time. | related papers | related patents |
561 | The Unstoppable Rise of Computational Linguistics in Deep Learning | James Henderson | In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures. | related papers | related patents |
562 | To Boldly Query What No One Has Annotated Before? The Frontiers of Corpus Querying | Markus Gärtner, Kerstin Jung | This paper offers a broad overview of the history of corpora and corpus query tools. | related papers | related patents |
563 | A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking | Yong Shan, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Cheng Niu, Jie Zhou | In this paper, we propose to enhance the DST through employing a contextual hierarchical attention network to not only discern relevant information at both word level and turn level but also learn contextual representations. | related papers | related patents |
564 | Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight | Hengyi Cai, Hongshen Chen, Yonghao Song, Cheng Zhang, Xiaofang Zhao, Dawei Yin | In this paper, we propose a data manipulation framework to proactively reshape the data distribution towards reliable samples by augmenting and highlighting effective learning samples as well as reducing the effect of inefficient samples simultaneously. | related papers | related patents |
565 | Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog | Libo Qin, Xiao Xu, Wanxiang Che, Yue Zhang, Ting Liu | To this end, we investigate methods that can make explicit use of domain knowledge and introduce a shared-private network to learn shared and specific knowledge. | related papers | related patents |
566 | Learning Efficient Dialogue Policy from Demonstrations through Shaping | Huimin Wang, Baolin Peng, Kam-Fai Wong | In this paper, we present S^2Agent, which efficiently learns dialogue policy from demonstrations through policy shaping and reward shaping. | related papers | related patents |
567 | SAS: Dialogue State Tracking via Slot Attention and Slot Information Sharing | Jiaying Hu, Yan Yang, Chencai Chen, Liang He, Zhou Yu | We propose a Dialogue State Tracker with Slot Attention and Slot Information Sharing (SAS) to reduce redundant information’s interference and improve long dialogue context tracking. | related papers | related patents |
568 | Speaker Sensitive Response Evaluation Model | JinYeong Bak, Alice Oh | In this paper, we propose an automatic evaluation model based on that idea and learn the model parameters from an unlabeled conversation corpus. | related papers | related patents |
569 | A Top-down Neural Architecture towards Text-level Parsing of Discourse Rhetorical Structure | Longyin Zhang, Yuqing Xing, Fang Kong, Peifeng Li, Guodong Zhou | In this paper, we justify from both computational and perceptive points-of-view that the top-down architecture is more suitable for text-level DRS parsing. | related papers | related patents |
570 | Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification | Pratik Dutta, Sriparna Saha | In this paper, we argue that incorporating multimodal cues can improve the automatic identification of PPI. | related papers | related patents |
571 | Bipartite Flat-Graph Network for Nested Named Entity Recognition | Ying Luo, Hai Zhao | In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for nested named entity recognition (NER), which contains two subgraph modules: a flat NER module for outermost entities and a graph module for all the entities located in inner layers. | related papers | related patents |
572 | Connecting Embeddings for Knowledge Graph Entity Typing | Yu Zhao, Anxiang Zhang, Ruobing Xie, Kang Liu, Xiaojie Wang | In this paper, we propose a novel approach for KG entity typing which is trained by jointly utilizing local typing knowledge from existing entity type assertions and global triple knowledge in KGs. | related papers | related patents |
573 | Continual Relation Learning via Episodic Memory Activation and Reconsolidation | Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou | Inspired by the mechanism in human long-term memory formation, we introduce episodic memory activation and reconsolidation (EMAR) to continual relation learning. | related papers | related patents |
574 | Handling Rare Entities for Neural Sequence Labeling | Yangming Li, Han Li, Kaisheng Yao, Xiaolong Li | Most test set entities appear only a few times or are even unseen in the training corpus, yielding a large number of out-of-vocabulary (OOV) and low-frequency (LF) entities during evaluation. In this work, we propose approaches to address this problem. | related papers | related patents |
575 | Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition | Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Ryuto Konno, Kentaro Inui | In this study, we develop models possessing interpretable inference process for structured prediction. | related papers | related patents |
576 | MIE: A Medical Information Extractor towards Medical Dialogues | Yuanzhe Zhang, Zhongtao Jiang, Tao Zhang, Shiwan Liu, Jiarun Cao, Kang Liu, Shengping Liu, Jun Zhao | We then propose a Medical Information Extractor (MIE) towards medical dialogues. MIE is able to extract mentioned symptoms, surgeries, tests, other information and their corresponding status. | related papers | related patents |
577 | Named Entity Recognition as Dependency Parsing | Juntao Yu, Bernd Bohnet, Massimo Poesio | In this paper, we use ideas from graph-based dependency parsing to provide our model a global view on the input via a biaffine model (Dozat and Manning, 2017). | related papers | related patents |
578 | Neighborhood Matching Network for Entity Alignment | Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao | This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge. | related papers | related patents |
579 | Relation Extraction with Explanation | Hamed Shahbazi, Xiaoli Fern, Reza Ghaeini, Prasad Tadepalli | In this work we annotate a test set with ground-truth sentence-level explanations to evaluate the quality of explanations afforded by the relation extraction models. | related papers | related patents |
580 | Representation Learning for Information Extraction from Form-like Documents | Bodhisattwa Prasad Majumder, Navneet Potti, Sandeep Tata, James Bradley Wendt, Qi Zhao, Marc Najork | We propose a novel approach using representation learning for tackling the problem of extracting structured information from form-like document images. | related papers | related patents |
581 | Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language | Qianhui Wu, Zijia Lin, Börje Karlsson, Jian-Guang Lou, Biqing Huang | In this paper, we propose a teacher-student learning method to address such limitations, where NER models in the source languages are used as teachers to train a student model on unlabeled data in the target language. | related papers | related patents |
582 | Synchronous Double-channel Recurrent Network for Aspect-Opinion Pair Extraction | Shaowei Chen, Jie Liu, Yu Wang, Wenzheng Zhang, Ziming Chi | In this paper, we explore the Aspect-Opinion Pair Extraction (AOPE) task, which aims at extracting aspects and opinion expressions in pairs. To verify the performance of SDRN, we manually build three datasets based on SemEval 2014 and 2015 benchmarks. | related papers | related patents |
583 | Cross-modal Coherence Modeling for Caption Generation | Malihe Alikhani, Piyush Sharma, Shengjie Li, Radu Soricut, Matthew Stone | We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. | related papers | related patents |
584 | Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms | Simeon Schüz, Sina Zarrieß | We go beyond previous studies on colour terms using isolated colour swatches and study visual grounding of colour terms in realistic objects. | related papers | related patents |
585 | Span-based Localizing Network for Natural Language Video Localization | Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou | In this work, we address the NLVL task with a span-based QA approach by treating the input video as a text passage. | related papers | related patents |
586 | Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions | Arjun Akula, Spandana Gella, Yaser Al-Onaizan, Song-Chun Zhu, Siva Reddy | Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. | related papers | related patents |
587 | A Mixture of h – 1 Heads is Better than h Heads | Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith | In this work, we instead “reallocate” them: the model learns to activate different heads on different inputs. | related papers | related patents |
588 | Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification | Hao Tang, Donghong Ji, Chenliang Li, Qiji Zhou | To this end, we propose a dependency graph enhanced dual-transformer network (named DGEDT) by jointly considering the flat representations learnt from Transformer and graph-based representations learnt from the corresponding dependency graph in an iterative interaction manner. | related papers | related patents |
589 | Differentiable Window for Dynamic Local Attention | Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li | We propose Differentiable Window, a new neural module and general-purpose component for dynamic window selection. | related papers | related patents |
590 | Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples | Xiaoqing Zheng, Jiehang Zeng, Yi Zhou, Cho-Jui Hsieh, Minhao Cheng, Xuanjing Huang | In this study, we show that adversarial examples also exist in dependency parsing: we propose two approaches to study where and how parsers make mistakes by searching over perturbations to existing texts at sentence and phrase levels, and design algorithms to construct such examples in both of the black-box and white-box settings. | related papers | related patents |
591 | Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach | Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O’Donnell, Yoshua Bengio, Yue Zhang | In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. | related papers | related patents |
592 | Learning Architectures from an Extended Search Space for Language Modeling | Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li | Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. | related papers | related patents |
593 | The Right Tool for the Job: Matching Model and Instance Complexities | Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith | To better respect a given inference budget, we propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) “exit” from neural network calculations for simple instances, and late (and accurate) exit for hard instances. | related papers | related patents |
594 | Bootstrapping Techniques for Polysynthetic Morphological Analysis | William Lane, Steven Bird | To address this challenge, we offer linguistically-informed approaches for bootstrapping a neural morphological analyzer, and demonstrate its application to Kunwinjku, a polysynthetic Australian language. | related papers | related patents |
595 | Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation | Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Haitao Zheng | In order to simultaneously alleviate the issues, this paper intuitively couples distant annotation and adversarial training for cross-domain CWS. | related papers | related patents |
596 | Modeling Morphological Typology for Unsupervised Learning of Language Morphology | Hongzhi Xu, Jordan Kodner, Mitchell Marcus, Charles Yang | This paper describes a language-independent model for fully unsupervised morphological analysis that exploits a universal framework leveraging morphological typology. | related papers | related patents |
597 | Predicting Declension Class from Form and Meaning | Adina Williams, Tiago Pimentel, Hagen Blix, Arya D. McCarthy, Eleanor Chodroff, Ryan Cotterell | More specifically, we operationalize this by measuring how much information, in bits, we can glean about declension class from knowing the form and/or meaning of nouns. | related papers | related patents |
598 | Unsupervised Morphological Paradigm Completion | Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya McCarthy, Katharina Kann | We propose the task of unsupervised morphological paradigm completion. | related papers | related patents |
599 | Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension | Bo Zheng, Haoyang Wen, Yaobo Liang, Nan Duan, Wanxiang Che, Daxin Jiang, Ming Zhou, Ting Liu | To address this issue, we present a novel multi-grained machine reading comprehension framework that models documents at their natural hierarchy of granularities: documents, paragraphs, sentences, and tokens. | related papers | related patents |
600 | Harvesting and Refining Question-Answer Pairs for Unsupervised QA | Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu | In this work, we introduce two approaches to improve unsupervised QA. | related papers | related patents |
601 | Low-Resource Generation of Multi-hop Reasoning Questions | Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin | Since the labeled data is limited and insufficient for training, we propose to learn the model with the help of a large amount of unlabeled data that is much easier to obtain. | related papers | related patents |
602 | R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason | Naoya Inoue, Pontus Stenetorp, Kentaro Inui | We present a reliable, crowdsourced framework for scalably annotating RC datasets with derivations. | related papers | related patents |
603 | Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension | Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu | In this paper, we study machine reading comprehension (MRC) on long texts: where a model takes as inputs a lengthy document and a query, extracts a text span from the document as an answer. | related papers | related patents |
604 | RikiNet: Reading Wikipedia Pages for Natural Question Answering | Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan | In this paper, we introduce a new model, called RikiNet, which reads Wikipedia pages for natural question answering. | related papers | related patents |
605 | Parsing into Variable-in-situ Logico-Semantic Graphs | Yufei Chen, Weiwei Sun | We propose variable-in-situ logico-semantic graphs to bridge the gap between semantic graph and logical form parsing. | related papers | related patents |
606 | Semantic Parsing for English as a Second Language | Yuanyuan Zhao, Weiwei Sun, Junjie Cao, Xiaojun Wan | Motivated by the theoretical emphasis on the learning challenges that occur at the syntax-semantics interface during second language acquisition, we formulate the task based on the divergence between literal and intended meanings. | related papers | related patents |
607 | Semi-Supervised Semantic Dependency Parsing Using CRF Autoencoders | Zixia Jia, Youmi Ma, Jiong Cai, Kewei Tu | We propose an approach to semi-supervised learning of semantic dependency parsers based on the CRF autoencoder framework. | related papers | related patents |
608 | Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing | Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen, Kai Yu | Aiming to reduce nontrivial human labor, we propose a two-stage semantic parsing framework, where the first stage utilizes an unsupervised paraphrase model to convert an unlabeled natural language utterance into the canonical utterance. | related papers | related patents |
609 | DRTS Parsing with Structure-Aware Encoding and Decoding | Qiankun Fu, Yue Zhang, Jiangming Liu, Meishan Zhang | In this work, we propose a structure-aware model at both the encoder and decoder phases to integrate structural information, where a graph attention network (GAT) is exploited for effective modeling. | related papers | related patents |
610 | A Two-Stage Masked LM Method for Term Set Expansion | Guy Kushilevitz, Shaul Markovitch, Yoav Goldberg | We harness the power of neural masked language models (MLM) and propose a novel TSE algorithm, which combines the pattern-based and distributional approaches. | related papers | related patents |
611 | FLAT: Chinese NER Using Flat-Lattice Transformer | Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang | In this paper, we propose FLAT: Flat-LAttice Transformer for Chinese NER, which converts the lattice structure into a flat structure consisting of spans. | related papers | related patents |
612 | Improving Entity Linking through Semantic Reinforced Entity Embeddings | Feng Hou, Ruili Wang, Jun He, Yi Zhou | We propose a simple yet effective method, FGS2EE, to inject fine-grained semantic information into entity embeddings to reduce the distinctiveness and facilitate the learning of contextual commonality. | related papers | related patents |
613 | Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain | Shadi Saleh, Pavel Pecina | We present a thorough comparison of two principal approaches to Cross-Lingual Information Retrieval: document translation (DT) and query translation (QT). | related papers | related patents |
614 | Learning Robust Models for e-Commerce Product Search | Thanh Nguyen, Nikhil Rao, Karthik Subbian | In this paper, we develop a deep, end-to-end model that learns to effectively classify mismatches and to generate hard mismatched examples to improve the classifier. | related papers | related patents |
615 | Generalized Entropy Regularization or: There’s Nothing Special about Label Smoothing | Clara Meister, Elizabeth Salesky, Ryan Cotterell | We introduce a parametric family of entropy regularizers, which includes label smoothing as a special case, and use it to gain a better understanding of the relationship between the entropy of a model and its performance on language generation tasks. | related papers | related patents |
616 | Highway Transformer: Self-Gating Enhanced Self-Attentive Networks | Yekun Chai, Shuo Jin, Xinwen Hou | Through a pseudo information highway, we introduce a gated component, self-dependency units (SDU), that incorporates LSTM-styled gating units to replenish internal semantic importance within the multi-dimensional latent space of individual representations. | related papers | related patents |
617 | Low-Dimensional Hyperbolic Knowledge Graph Embeddings | Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, Christopher Ré | In this work, we introduce a class of hyperbolic KG embedding models that simultaneously capture hierarchical and logical patterns. | related papers | related patents |
618 | Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction | Mladen Karan, Ivan Vulić, Anna Korhonen, Goran Glavaš | In this work, we present ClassyMap, a classification-based approach to self-learning, yielding a more robust and a more effective induction of projection-based CLWEs. | related papers | related patents |
619 | Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus | Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia A. Di Gangi, Roldano Cattoni, Marco Turchi | We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French). | related papers | related patents |
620 | Uncertainty-Aware Curriculum Learning for Neural Machine Translation | Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao | We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of current training stage. | related papers | related patents |
621 | Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain | Lukas Lange, Heike Adel, Jannik Strötgen | In this paper, we close this gap by reporting concept extraction performance on automatically anonymized data and investigating joint models for de-identification and concept extraction. | related papers | related patents |
622 | CorefQA: Coreference Resolution as Query-based Span Prediction | Wei Wu, Fei Wang, Arianna Yuan, Fei Wu, Jiwei Li | In this paper, we present CorefQA, an accurate and extensible approach for the coreference resolution task. | related papers | related patents |
623 | Estimating predictive uncertainty for rumour verification models | Elena Kochkina, Maria Liakata | We propose two methods for uncertainty-based instance rejection, supervised and unsupervised. | related papers | related patents |
624 | From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains | Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych | We therefore present a novel domain-agnostic Human-In-The-Loop annotation approach: we use recommenders that suggest potential concepts and adaptive candidate ranking, thereby speeding up the overall annotation process and making it less tedious for users. | related papers | related patents |
625 | Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions | Tian Jin, Zhun Liu, Shengjia Yan, Alexandre Eichenberger, Louis-Philippe Morency | In this paper, we propose N3 (Neural Networks from Natural Language), a new paradigm of synthesizing task-specific neural networks from language descriptions and a generic pre-trained model. | related papers | related patents |
626 | Controlled Crowdsourcing for High-Quality QA-SRL Annotation | Paul Roit, Ayal Klein, Daniela Stepanov, Jonathan Mamou, Julian Michael, Gabriel Stanovsky, Luke Zettlemoyer, Ido Dagan | In this paper, we present an improved crowdsourcing protocol for complex semantic annotation, involving worker selection and training, and a data consolidation phase. | related papers | related patents |
627 | Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus | Hao Fei, Meishan Zhang, Donghong Ji | In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. | related papers | related patents |
628 | Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity | Nina Poerner, Ulli Waltinger, Hinrich Schütze | We address the task of unsupervised Semantic Textual Similarity (STS) by ensembling diverse pre-trained sentence encoders into sentence meta-embeddings. | related papers | related patents |
629 | Transition-based Semantic Dependency Parsing with Pointer Networks | Daniel Fernández-González, Carlos Gómez-Rodríguez | In order to further test the capabilities of these powerful neural networks on a harder NLP problem, we propose a transition system that, thanks to Pointer Networks, can straightforwardly produce labelled directed acyclic graphs and perform semantic dependency parsing. | related papers | related patents |
630 | tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection | Nicole Peinelt, Dong Nguyen, Maria Liakata | We propose a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and show that our model improves performance over strong neural baselines across a variety of English language datasets. | related papers | related patents |
631 | Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation | Kun Li, Chengbo Chen, Xiaojun Quan, Qing Ling, Yan Song | In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels. | related papers | related patents |
632 | Exploiting Personal Characteristics of Debaters for Predicting Persuasiveness | Khalid Al Khatib, Michael Völske, Shahbaz Syed, Nikolay Kolyada, Benno Stein | In this paper, we model debaters’ prior beliefs, interests, and personality traits based on their previous activity, without dependence on explicit user profiles or questionnaires. | related papers | related patents |
633 | Out of the Echo Chamber: Detecting Countering Debate Speeches | Matan Orbach, Yonatan Bilu, Assaf Toledo, Dan Lahav, Michal Jacovi, Ranit Aharonov, Noam Slonim | Given such a speech, we aim to identify, from among a set of speeches on the same topic and with an opposing stance, the ones that directly counter it. | related papers | related patents |
634 | Diversifying Dialogue Generation with Non-Conversational Text | Hui Su, Xiaoyu Shen, Sanqiang Zhao, Zhou Xiao, Pengwei Hu, Randy Zhong, Cheng Niu, Jie Zhou | In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. | related papers | related patents |
635 | KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation | Hao Zhou, Chujie Zheng, Kaili Huang, Minlie Huang, Xiaoyan Zhu | In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. | related papers | related patents |
636 | Meta-Reinforced Multi-Domain State Generator for Dialogue Systems | Yi Huang, Junlan Feng, Min Hu, Xiaoting Wu, Xiaoyu Du, Shuo Ma | In this paper, we propose a Meta-Reinforced Multi-Domain State Generator (MERET). | related papers | related patents |
637 | Modeling Long Context for Task-Oriented Dialogue State Generation | Jun Quan, Deyi Xiong | Based on the recently proposed transferable dialogue state generator (TRADE) that predicts dialogue states from utterance-concatenated dialogue context, we propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model as an auxiliary task for task-oriented dialogue state generation. | related papers | related patents |
638 | Multi-Domain Dialogue Acts and Response Co-Generation | Kai Wang, Junfeng Tian, Rui Wang, Xiaojun Quan, Jianxing Yu | To address these issues, we propose a neural co-generation model that generates dialogue acts and responses concurrently. | related papers | related patents |
639 | Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer | Chulun Zhou, Liangyu Chen, Jiachen Liu, Xinyan Xiao, Jinsong Su, Sheng Guo, Hua Wu | In this paper, we propose a novel attentional sequence-to-sequence (Seq2seq) model that dynamically exploits the relevance of each output word to the target style for unsupervised style transfer. | related papers | related patents |
640 | Heterogeneous Graph Transformer for Graph-to-Sequence Learning | Shaowei Yao, Tianming Wang, Xiaojun Wan | In this paper, we propose the Heterogeneous Graph Transformer to independently model the different relations in the individual subgraphs of the original graph, including direct relations, indirect relations and multiple possible relations between nodes. | related papers | related patents |
641 | Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence | Xiaoyu Shen, Ernie Chang, Hui Su, Cheng Niu, Dietrich Klakow | To address this concern, we propose to explicitly segment target text into fragment units and align them with their data correspondences. | related papers | related patents |
642 | Aligned Dual Channel Graph Convolutional Network for Visual Question Answering | Qingbao Huang, Jielong Wei, Yi Cai, Changmeng Zheng, Junying Chen, Ho-fung Leung, Qing Li | To simultaneously capture the relations between objects in an image and the syntactic dependency relations between words in a question, we propose a novel dual channel graph convolutional network (DC-GCN) for better combining visual and textual advantages. | related papers | related patents |
643 | Multimodal Neural Graph Memory Networks for Visual Question Answering | Mahmoud Khademi | We introduce a new neural network architecture, Multimodal Neural Graph Memory Networks (MN-GMN), for visual question answering. | related papers | related patents |
644 | Refer360°: A Referring Expression Recognition Dataset in 360° Images | Volkan Cirik, Taylor Berg-Kirkpatrick, Louis-Philippe Morency | We propose a novel large-scale referring expression recognition dataset, Refer360°, consisting of 17,137 instruction sequences and ground-truth actions for completing these instructions in 360° scenes. | related papers | related patents |
645 | CamemBERT: a Tasty French Language Model | Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric de la Clergerie, Djamé Seddah, Benoît Sagot | In this paper, we investigate the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks. | related papers | related patents |
646 | Effective Estimation of Deep Generative Language Models | Tom Pelsmaeker, Wilker Aziz | We concentrate on one such model, the variational auto-encoder, which we argue is an important building block in hierarchical probabilistic models of language. | related papers | related patents |
647 | Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection | Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg | We present Iterative Null-space Projection (INLP), a novel method for removing information from neural representations. | related papers | related patents |
648 | 2kenize: Tying Subword Sequences for Chinese Script Conversion | Pranav A, Isabelle Augenstein | Here, we propose a model that can disambiguate between mappings and convert between the two scripts. | related papers | related patents |
649 | Predicting the Growth of Morphological Families from Social and Linguistic Factors | Valentin Hofmann, Janet Pierrehumbert, Hinrich Schütze | We present the first study that examines the evolution of morphological families, i.e., sets of morphologically related words such as “trump”, “antitrumpism”, and “detrumpify”, in social media. | related papers | related patents |
650 | Semi-supervised Contextual Historical Text Normalization | Peter Makarov, Simon Clematide | By utilizing a simple generative normalization model and obtaining powerful contextualization from the target-side language model, we train accurate models with unlabeled historical data. | related papers | related patents |
651 | ClarQ: A large-scale and diverse dataset for Clarification Question Generation | Vaibhav Kumar, Alan W Black | In order to overcome these limitations, we devise a novel bootstrapping framework (based on self-supervision) that assists in the creation of a diverse, large-scale dataset of clarification questions based on post-comment tuples extracted from stackexchange. | related papers | related patents |
652 | DoQA – Accessing Domain-Specific FAQs via Conversational QA | Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre | The goal of this work is to build conversational Question Answering (QA) interfaces for the large body of domain-specific information available in FAQ sites. | related papers | related patents |
653 | MLQA: Evaluating Cross-lingual Extractive Question Answering | Patrick Lewis, Barlas Oguz, Ruty Rinott, Sebastian Riedel, Holger Schwenk | We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area. | related papers | related patents |
654 | Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering | Ming Yan, Hao Zhang, Di Jin, Joey Tianyi Zhou | To address this challenge, we propose a multi-source meta transfer (MMT) for low-resource MCQA. | related papers | related patents |
655 | Fine-grained Fact Verification with Kernel Graph Attention Network | Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu | This paper presents Kernel Graph Attention Network (KGAT), which conducts more fine-grained fact verification with kernel-based attentions. | related papers | related patents |
656 | Generating Fact Checking Explanations | Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein | This paper provides the first study of how these explanations can be generated automatically based on available claim context, and how this task can be modelled jointly with veracity prediction. | related papers | related patents |
657 | Premise Selection in Natural Language Mathematical Texts | Deborah Ferreira, André Freitas | We propose an approach to solve this task as a link prediction problem, using Deep Convolutional Graph Neural Networks. | related papers | related patents |
658 | A Call for More Rigor in Unsupervised Cross-lingual Learning | Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre | We review motivations, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them. | related papers | related patents |
659 | A Tale of a Probe and a Parser | Rowan Hall Maudslay, Josef Valvoda, Tiago Pimentel, Adina Williams, Ryan Cotterell | To explore whether syntactic probes would do better to make use of existing techniques, we compare the structural probe to a more traditional parser with an identical lightweight parameterisation. | related papers | related patents |
660 | From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)? | Reut Tsarfaty, Dan Bareket, Stav Klein, Amit Seker | Here we reflect on parsing MRLs in that decade, highlight the solutions and lessons learned for the architectural, modeling and lexical challenges in the pre-neural era, and argue that similar challenges re-emerge in neural architectures for MRLs. | related papers | related patents |
661 | Speech Translation and the End-to-End Promise: Taking Stock of Where We Are | Matthias Sperber, Matthias Paulik | This paper provides a unifying categorization and nomenclature that covers both traditional and recent approaches and that may help researchers by highlighting both trade-offs and open research questions. | related papers | related patents |
662 | What Question Answering can Learn from Trivia Nerds | Jordan Boyd-Graber, Benjamin Börschinger | We argue that creating a QA dataset (and the ubiquitous leaderboard that goes with it) closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. | related papers | related patents |
663 | What are the Goals of Distributional Semantics? | Guy Emerson | In this paper, I take a broad linguistic perspective, looking at how well current models can deal with various semantic challenges. | related papers | related patents |
664 | Improving Image Captioning with Better Use of Caption | Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu | In this paper, we present a novel image captioning architecture to better explore semantics available in captions and leverage that to enhance both image representation and caption generation. | related papers | related patents |
665 | Shape of Synth to Come: Why We Should Use Synthetic Data for English Surface Realization | Henry Elder, Robert Burke, Alexander O’Connor, Jennifer Foster | We analyse the effects of synthetic data, and we argue that its use should be encouraged rather than prohibited so that future research efforts continue to explore systems that can take advantage of such data. | related papers | related patents |
666 | Toward Better Storylines with Sentence-Level Language Models | Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch | We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives. | related papers | related patents |
667 | A Two-Step Approach for Implicit Event Argument Detection | Zhisong Zhang, Xiang Kong, Zhengzhong Liu, Xuezhe Ma, Eduard Hovy | In this work, we explore the implicit event argument detection task, which studies event arguments beyond sentence boundaries. | related papers | related patents |
668 | Machine Reading of Historical Events | Or Honovich, Lucas Torroba Hennigen, Omri Abend, Shay B. Cohen | Within this broad framework, we address the task of machine reading the time of historical events, compile datasets for the task, and develop a model for tackling it. | related papers | related patents |
669 | Revisiting Unsupervised Relation Extraction | Thy Thy Tran, Phong Le, Sophia Ananiadou | However, we demonstrate that by using only named entities to induce relation types, we can outperform existing methods on two popular datasets. | related papers | related patents |
670 | SciREX: A Challenge Dataset for Document-Level Information Extraction | Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy | In this paper, we introduce SciREX, a document level IE dataset that encompasses multiple IE tasks, including salient entity identification and document level N-ary relation identification from scientific articles. | related papers | related patents |
671 | Contrastive Self-Supervised Learning for Commonsense Reasoning | Tassilo Klein, Moin Nabi | We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Schema Challenge problems. | related papers | related patents |
672 | Do Transformers Need Deep Long-Range Memory? | Jack Rae, Ali Razavi | We perform a set of interventions to show that comparable performance can be obtained with 6X fewer long range memories and better performance can be obtained by limiting the range of attention in lower layers of the network. | related papers | related patents |
673 | Improving Disentangled Text Representation Learning with Information-Theoretic Guidance | Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin | Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text, without any supervision on semantics. | related papers | related patents |
674 | Understanding Advertisements with BERT | Kanika Kalra, Bhargav Kurma, Silpa Vadakkeeveetil Sreelatha, Manasi Patwardhan, Shirish Karande | We consider a task based on the CVPR 2018 challenge dataset on advertisement (Ad) understanding. The task involves detecting the viewer's interpretation of an Ad image captured as text. | related papers | related patents |
675 | Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces | Goran Glavaš, Ivan Vulić | We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings. | related papers | related patents |
676 | Good-Enough Compositional Data Augmentation | Jacob Andreas | We propose a simple data augmentation protocol aimed at providing a compositional inductive bias in conditional and unconditional sequence models. | related papers | related patents |
677 | RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers | Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson | We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. | related papers | related patents |
678 | Temporal Common Sense Acquisition with Minimal Supervision | Ben Zhou, Qiang Ning, Daniel Khashabi, Dan Roth | This work proposes a novel sequence modeling approach that exploits explicit and implicit mentions of temporal common sense, extracted from a large corpus, to build TacoLM, a temporal common sense language model. | related papers | related patents |
679 | The Sensitivity of Language Models and Humans to Winograd Schema Perturbations | Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard | Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. | related papers | related patents |
680 | Temporally-Informed Analysis of Named Entity Recognition | Shruti Rijhwani, Daniel Preotiuc-Pietro | We analyze and propose methods that make better use of temporally-diverse training data, with a focus on the task of named entity recognition. To support these experiments, we introduce a novel data set of English tweets annotated with named entities. | related papers | related patents |
681 | Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation | Aakanksha Naik, Carolyn Rose | We tackle the task of building supervised event trigger identification models which can generalize better across domains. | related papers | related patents |
682 | CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning | Alessandro Suglia, Ioannis Konstas, Andrea Vanzo, Emanuele Bastianelli, Desmond Elliott, Stella Frank, Oliver Lemon | To remedy this, we present GroLLA, an evaluation framework for Grounded Language Learning with Attributes based on three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation. | related papers | related patents |
683 | Cross-Modality Relevance for Reasoning on Language and Vision | Chen Zheng, Quan Guo, Parisa Kordjamshidi | This work deals with the challenge of learning and reasoning over language and vision data for the related downstream tasks such as visual question answering (VQA) and natural language for visual reasoning (NLVR). | related papers | related patents |
684 | Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context | Shashank Srivastava, Oleksandr Polozov, Nebojsa Jojic, Christopher Meek | We explore learning web-based tasks from a human teacher through natural language explanations and a single demonstration. | related papers | related patents |
685 | Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning | Angeliki Lazaridou, Anna Potapenko, Olivier Tieleman | We present a method for combining multi-agent communication and traditional data-driven approaches to natural language learning, with an end goal of teaching agents to communicate with humans in natural language. | related papers | related patents |
686 | HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han | To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search. | related papers | related patents |
687 | Hard-Coded Gaussian Attention for Neural Machine Translation | Weiqiu You, Simeng Sun, Mohit Iyyer | We push further in this direction by developing a “hard-coded” attention variant without any learned parameters. | related papers | related patents |
688 | In Neural Machine Translation, What Does Transfer Learning Transfer? | Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich | We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to gain a black-box understanding of transfer learning. | related papers | related patents |
689 | Learning a Multi-Domain Curriculum for Neural Machine Translation | Wei Wang, Ye Tian, Jiquan Ngiam, Yinfei Yang, Isaac Caswell, Zarana Parekh | This is achieved by carefully introducing instance-level domain-relevance features and automatically constructing a training curriculum to gradually concentrate on multi-domain relevant and noise-reduced data batches. | related papers | related patents |
690 | Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem | Danielle Saunders, Bill Byrne | At inference time, we propose a lattice-rescoring scheme which outperforms all systems evaluated in Stanovsky et al., 2019 on WinoMT with no degradation of general test set BLEU. | related papers | related patents |
691 | Translationese as a Language in “Multilingual” NMT | Parker Riley, Isaac Caswell, Markus Freitag, David Grangier | Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target text? | related papers | related patents |
692 | Unsupervised Domain Clusters in Pretrained Language Models | Roee Aharoni, Yoav Goldberg | We harness this property and propose domain data selection methods based on such models, which require only a small set of in-domain monolingual data. | related papers | related patents |
693 | Using Context in Neural Machine Translation Training Objectives | Danielle Saunders, Felix Stahlberg, Bill Byrne | We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. | related papers | related patents |
694 | Variational Neural Machine Translation with Normalizing Flows | Hendra Setiawan, Matthias Sperber, Udhyakumar Nallasamy, Matthias Paulik | In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. | related papers | related patents |
695 | The Paradigm Discovery Problem | Alexander Erdmann, Micha Elsner, Shijie Wu, Ryan Cotterell, Nizar Habash | This work treats the paradigm discovery problem (PDP), the task of learning an inflectional morphological system from unannotated sentences. | related papers | related patents |
696 | Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi | Aryaman Arora, Luke Gessler, Nathan Schneider | We present the first statistical schwa deletion classifier for Hindi, which relies solely on the orthography as the input and outperforms previous approaches. | related papers | related patents |
697 | Automated Evaluation of Writing — 50 Years and Counting | Beata Beigman Klebanov, Nitin Madnani | In this theme paper, we focus on Automated Writing Evaluation (AWE), using Ellis Page’s seminal 1966 paper to frame the presentation. | related papers | related patents |
698 | Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly | Nora Kassner, Hinrich Schütze | Building on Petroni et al. 2019, we propose two new probing tasks analyzing factual knowledge stored in Pretrained Language Models (PLMs). | related papers | related patents |
699 | On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology | Marcel Bollmann, Desmond Elliott | In this paper, we address this question through bibliographic analysis. | related papers | related patents |
700 | Returning the N to NLP: Towards Contextually Personalized Classification Models | Lucie Flek | This paper surveys the landscape of personalization in natural language processing and related fields, and offers a path forward to mitigate the decades of deviation of NLP tools from sociolinguistic findings, allowing them to flexibly process the “natural” language of each user rather than enforcing a uniform NLP treatment. | related papers | related patents |
701 | To Test Machine Comprehension, Start by Defining Comprehension | Jesse Dunietz, Greg Burnham, Akash Bharadwaj, Owen Rambow, Jennifer Chu-Carroll, Dave Ferrucci | First, we argue that existing approaches do not adequately define comprehension; they are too unsystematic about what content is tested. Second, we present a detailed definition of comprehension, a “Template of Understanding”, for a widely useful class of texts, namely short narratives. | related papers | related patents |
702 | Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations | Saif M. Mohammad | In this work, we examine female first author percentages and the citations to their papers in Natural Language Processing. | related papers | related patents |
703 | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer | We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. | related papers | related patents |
704 | BLEURT: Learning Robust Metrics for Text Generation | Thibault Sellam, Dipanjan Das, Ankur Parikh | We propose BLEURT, a learned evaluation metric for English based on BERT. | related papers | related patents |
705 | Distilling Knowledge Learned in BERT for Text Generation | Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu | In this paper, we present a novel approach, Conditional Masked Language Modeling (C-MLM), to enable the finetuning of BERT on target generation tasks. | related papers | related patents |
706 | ESPRIT: Explaining Solutions to Physical Reasoning Tasks | Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev | We propose ESPRIT, a framework for commonsense reasoning about qualitative physics in natural language that generates interpretable descriptions of physical events. | related papers | related patents |
707 | Iterative Edit-Based Unsupervised Sentence Simplification | Dhruv Kumar, Lili Mou, Lukasz Golab, Olga Vechtomova | We present a novel iterative, edit-based approach to unsupervised sentence simplification. | related papers | related patents |
708 | Logical Natural Language Generation from Open-Domain Tables | Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang | In this paper, we suggest a new NLG task where a model is tasked with generating natural language statements that can be *logically entailed* by the facts in an open-domain semi-structured table. | related papers | related patents |
709 | Neural CRF Model for Sentence Alignment in Text Simplification | Chao Jiang, Mounica Maddela, Wuwei Lan, Yang Zhong, Wei Xu | We propose a novel neural CRF alignment model which not only leverages the sequential nature of sentences in parallel documents but also utilizes a neural sentence pair model to capture semantic similarity. | related papers | related patents |
710 | One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases | Xingdi Yuan, Tong Wang, Rui Meng, Khushboo Thaker, Peter Brusilovsky, Daqing He, Adam Trischler | In this study, we address this problem from both modeling and evaluation perspectives. | related papers | related patents |
711 | R^3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge | Tuhin Chakrabarty, Debanjan Ghosh, Smaranda Muresan, Nanyun Peng | We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence. | related papers | related patents |
712 | Structural Information Preserving for Graph-to-Text Generation | Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu | We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information. | related papers | related patents |
713 | A Joint Neural Model for Information Extraction with Global Features | Ying Lin, Heng Ji, Fei Huang, Lingfei Wu | In order to capture such cross-subtask and cross-instance inter-dependencies, we propose a joint neural framework, OneIE, that aims to extract the globally optimal IE result as a graph from an input sentence. | related papers | related patents |
714 | Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding | Xinya Du, Claire Cardie | To dynamically aggregate information captured by neural representations learned at different levels of granularity (e.g., the sentence- and paragraph-level), we propose a novel multi-granularity reader. | related papers | related patents |
715 | Exploiting the Syntax-Model Consistency for Neural Relation Extraction | Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen | In order to overcome these issues, we propose a novel deep learning model for RE that uses the dependency trees to extract the syntax-based importance scores for the words, serving as a tree representation to introduce syntactic information into the models with greater generalization. | related papers | related patents |
716 | From English to Code-Switching: Transfer Learning with Strong Morphological Clues | Gustavo Aguilar, Thamar Solorio | In this paper, we aim at adapting monolingual models to code-switched text in various tasks. | related papers | related patents |
717 | Learning Interpretable Relationships between Entities, Relations and Concepts via Bayesian Structure Learning on Open Domain Facts | Jingyuan Zhang, Mingming Sun, Yue Feng, Ping Li | In this paper, we propose the task of learning interpretable relationships from open-domain facts to enrich and refine concept graphs. | related papers | related patents |
718 | Multi-Sentence Argument Linking | Seth Ebner, Patrick Xia, Ryan Culkin, Kyle Rawlins, Benjamin Van Durme | We present a novel document-level model for finding argument spans that fill an event’s roles, connecting related ideas in sentence-level semantic role labeling and coreference resolution. | related papers | related patents |
719 | Rationalizing Medical Relation Prediction from Corpus-level Statistics | Zhen Wang, Jennifer Lee, Simon Lin, Huan Sun | Aiming to shed some light on how to rationalize medical relation prediction, we present a new interpretable framework inspired by existing theories on how human memory works, e.g., theories of recall and recognition. | related papers | related patents |
720 | Sources of Transfer in Multilingual Named Entity Recognition | David Mueller, Nicholas Andrews, Mark Dredze | To explain this phenomenon, we explore the sources of multilingual transfer in polyglot NER models and examine the weight structure of polyglot models compared to their monolingual counterparts. | related papers | related patents |
721 | ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages | Colin Lockard, Prashant Shiralkar, Xin Luna Dong, Hannaneh Hajishirzi | In this work, we propose a solution for “zero-shot” open-domain relation extraction from webpages with a previously unseen template, including from websites with little overlap with existing sources of knowledge for distant supervision and websites in entirely new subject verticals. | related papers | related patents |
722 | Soft Gazetteers for Low-Resource Named Entity Recognition | Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell | To address this problem, we propose a method of “soft gazetteers” that incorporates ubiquitously available information from English knowledge bases, such as Wikipedia, into neural named entity recognition models through cross-lingual entity linking. | related papers | related patents |
723 | A Prioritization Model for Suicidality Risk Assessment | Han-Chin Shing, Philip Resnik, Douglas Oard | Building on measures developed for resource-bounded document retrieval, we introduce a well founded evaluation paradigm, and demonstrate using an expert-annotated test collection that meaningful improvements over plausible cascade model baselines can be achieved using an approach that jointly ranks individuals and their social media posts. | related papers | related patents |
724 | CluHTM – Semantic Hierarchical Topic Modeling based on CluWords | Felipe Viegas, Washington Cunha, Christian Gomes, Antônio Pereira, Leonardo Rocha, Marcos Goncalves | In this paper, we advance the state-of-the-art on HTM by means of the design and evaluation of CluHTM, a novel non-probabilistic hierarchical matrix factorization aimed at solving the specific issues of HTM. | related papers | related patents |
725 | Empower Entity Set Expansion via Language Model Probing | Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han | In this study, we propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue. | related papers | related patents |
726 | Feature Projection for Improved Text Classification | Qi Qin, Wenpeng Hu, Bing Liu | In this paper, we propose a novel angle to further improve this representation learning, i.e., feature projection. | related papers | related patents |
727 | A negative case analysis of visual grounding methods for VQA | Robik Shrestha, Kushal Kafle, Christopher Kanan | However, we show that the performance improvements are not a result of improved visual grounding, but a regularization effect which prevents over-fitting to linguistic priors. | related papers | related patents |
728 | History for Visual Dialog: Do we really need it? | Shubham Agarwal, Trung Bui, Joon-Young Lee, Ioannis Konstas, Verena Rieser | In this paper, we show that co-attention models which explicitly encode dialogue history outperform models that don’t, achieving state-of-the-art performance (72% NDCG on the val set). | related papers | related patents |
729 | Mapping Natural Language Instructions to Mobile UI Action Sequences | Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge | We present a new problem: grounding natural language instructions to mobile user interface actions, and create three new datasets for it. | related papers | related patents |
730 | TVQA+: Spatio-Temporal Grounding for Video Question Answering | Jie Lei, Licheng Yu, Tamara Berg, Mohit Bansal | We present the task of Spatio-Temporal Video Question Answering, which requires intelligent systems to simultaneously retrieve relevant moments and detect referenced visual concepts (people and objects) to answer natural language questions about videos. | related papers | related patents |
731 | Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting | Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann | In this paper, we investigate how to utilize visual content for disambiguation and promoting latent space alignment in unsupervised MMT. | related papers | related patents |
732 | A Multitask Learning Approach for Diacritic Restoration | Sawsan Alqahtani, Ajay Mishra, Mona Diab | Thus, to compensate for this loss, we investigate the use of multi-task learning to jointly optimize diacritic restoration with related NLP problems namely word segmentation, part-of-speech tagging, and syntactic diacritization. | related papers | related patents |
733 | Frugal Paradigm Completion | Alexander Erdmann, Tom Kenter, Markus Becker, Christian Schallhart | We propose a frugal paradigm completion approach that predicts all related forms in a morphological paradigm from as few manually provided forms as possible. | related papers | related patents |
734 | Improving Chinese Word Segmentation with Wordhood Memory Networks | Yuanhe Tian, Yan Song, Fei Xia, Tong Zhang, Yonggang Wang | In this paper, we therefore propose a neural framework, WMSeg, which uses memory networks to incorporate wordhood information with several popular encoder-decoder combinations for CWS. | related papers | related patents |
735 | Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge | Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang | In this paper, we propose a neural model named TwASP for joint CWS and POS tagging following the character-based sequence labeling paradigm, where a two-way attention mechanism is used to incorporate both context feature and their corresponding syntactic knowledge for each input character. | related papers | related patents |
736 | Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging | Nasser Zalmout, Nizar Habash | Our approach models the different features jointly, whether lexicalized (on the character-level), or non-lexicalized (on the word-level). | related papers | related patents |
737 | Phonetic and Visual Priors for Decipherment of Informal Romanization | Maria Ryskina, Matthew R. Gormley, Taylor Berg-Kirkpatrick | We propose a noisy-channel WFST cascade model for deciphering the original non-Latin script from observed romanized text in an unsupervised fashion. | related papers | related patents |
738 | Active Learning for Coreference Resolution using Discrete Annotation | Belinda Z. Li, Gabriel Stanovsky, Luke Zettlemoyer | We improve upon pairwise annotation for active learning in coreference resolution, by asking annotators to identify mention antecedents if a presented mention pair is deemed not coreferent. | related papers | related patents |
739 | Beyond Possession Existence: Duration and Co-Possession | Dhivya Chinnappa, Srikala Murugan, Eduardo Blanco | This paper introduces two tasks: determining (a) the duration of possession relations and (b) co-possessions, i.e., whether multiple possessors possess a possessee at the same time. | related papers | related patents |
740 | Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks | Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith | We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. | related papers | related patents |
741 | Estimating Mutual Information Between Dense Word Embeddings | Vitalii Zhelezniak, Aleksandar Savkov, Nils Hammerla | In this work we go through a vast literature on estimating MI in such cases and single out the most promising methods, yielding a simple and elegant similarity measure for word embeddings. | related papers | related patents |
742 | Exploring Unexplored Generalization Challenges for Cross-Database Semantic Parsing | Alane Suhr, Ming-Wei Chang, Peter Shaw, Kenton Lee | We propose a challenging evaluation setup for cross-database semantic parsing, focusing on variation across database schemas and in-domain language use. | related papers | related patents |
743 | Predicting the Focus of Negation: Model and Error Analysis | Md Mosharaf Hossain, Kathleen Hamilton, Alexis Palmer, Eduardo Blanco | In this paper, we experiment with neural networks to predict the focus of negation. | related papers | related patents |
744 | Structured Tuning for Semantic Role Labeling | Tao Li, Parth Anand Jawale, Martha Palmer, Vivek Srikumar | In this paper, we present a structured tuning framework to improve models using softened constraints only at training time. | related papers | related patents |
745 | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data | Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel | In this paper we present TaBERT, a pretrained LM that jointly learns representations for NL sentences and (semi-)structured tables. | related papers | related patents |
746 | Universal Decompositional Semantic Parsing | Elias Stengel-Eskin, Aaron Steven White, Sheng Zhang, Benjamin Van Durme | We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores. | related papers | related patents |
747 | Unsupervised Cross-lingual Representation Learning at Scale | Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov | This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. | related papers | related patents |
748 | A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization | Dongfang Xu, Zeyu Zhang, Steven Bethard | In this paper, we propose an architecture consisting of a candidate generator and a list-wise ranker based on BERT. | related papers | related patents |
749 | Hierarchical Entity Typing via Multi-level Learning to Rank | Tongfei Chen, Yunmo Chen, Benjamin Van Durme | We propose a novel method for hierarchical entity classification that embraces ontological structure at both training and during prediction. | related papers | related patents |
750 | Multi-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference | Jing Wang, Mayank Kulkarni, Daniel Preotiuc-Pietro | We introduce a new architecture tailored to this task by using shared and private domain parameters and multi-task learning. | related papers | related patents |
751 | TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories | Giannis Karamanolakis, Jun Ma, Xin Luna Dong | This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. | related papers | related patents |
752 | TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition | Bill Yuchen Lin, Dong-Ho Lee, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren | In this paper, we introduce “entity triggers,” an effective proxy of human explanations for facilitating label-efficient learning of NER models. | related papers | related patents |
753 | Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation | Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong | This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs). | related papers | related patents |
754 | Balancing Training for Multilingual Neural Machine Translation | Xinyi Wang, Yulia Tsvetkov, Graham Neubig | In this paper, we propose a method that instead automatically learns how to weight training data through a data scorer that is optimized to maximize performance on all test languages. | related papers | related patents |
755 | Evaluating Robustness to Input Perturbations for Neural Machine Translation | Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan | This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input. | related papers | related patents |
756 | Parallel Corpus Filtering via Pre-trained Language Models | Boliang Zhang, Ajay Nagesh, Kevin Knight | In this paper, we propose a novel approach to filter out noisy sentence pairs from web-crawled corpora via pre-trained language models. | related papers | related patents |
757 | Regularized Context Gates on Transformer for Machine Translation | Xintong Li, Lemao Liu, Rui Wang, Guoping Huang, Max Meng | This paper first provides a method to identify source and target contexts, and then introduces a gate mechanism to control the source and target contributions in Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method to guide the learning of the gates with supervision automatically generated using pointwise mutual information. | related papers | related patents |
758 | A Multi-Perspective Architecture for Semantic Code Search | Rajarshi Haldar, Lingfei Wu, JinJun Xiong, Julia Hockenmaier | In this paper, we propose a novel multi-perspective cross-lingual neural framework for code-text matching, inspired in part by a previous model for monolingual text-to-text matching, to capture both global and local similarities. | related papers | related patents |
759 | Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring | Haoran Zhang, Diane Litman | This paper presents a method for linking AWE and neural AES, by extracting Topical Components (TCs) representing evidence from a source text using the intermediate output of attention layers. | related papers | related patents |
760 | Clinical Concept Linking with Contextualized Neural Representations | Elliot Schumacher, Andriy Mulyar, Mark Dredze | We propose an approach to concept linking that leverages recent work in contextualized neural models, such as ELMo (Peters et al. 2018), which create a token representation that integrates the surrounding context of the mention and concept name. | related papers | related patents |
761 | DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking | Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan | We show that current systems for FEVER are vulnerable to three categories of realistic challenges for fact-checking (multiple propositions, temporal reasoning, and ambiguity and lexical variation) and introduce a resource with these types of claims. Then we present a system designed to be resilient to these “attacks” using multiple pointer networks for document selection and jointly modeling a sequence of evidence sentences and veracity relation predictions. | related papers | related patents |
762 | Let Me Choose: From Verbal Context to Font Selection | Amirreza Shirani, Franck Dernoncourt, Jose Echevarria, Paul Asente, Nedim Lipka, Thamar Solorio | In this paper, we aim to learn associations between visual attributes of fonts and the verbal context of the texts they are typically applied to. We introduce a new dataset, containing examples of different topics in social media posts and ads, labeled through crowd-sourcing. | related papers | related patents |
763 | Multi-Label and Multilingual News Framing Analysis | Afra Feyza Akyürek, Lei Guo, Randa Elanwar, Prakash Ishwar, Margrit Betke, Derry Tanti Wijaya | In this work, we explore multilingual transfer learning to detect multiple frames from just the news headline in a genuinely low-resource context where there are few/no frame annotations in the target language. | related papers | related patents |
764 | Predicting Performance for Natural Language Processing Tasks | Mengzhou Xia, Antonios Anastasopoulos, Ruochen Xu, Yiming Yang, Graham Neubig | In this work, we attempt to explore the possibility of gaining plausible judgments of how well an NLP model can perform under an experimental setting, *without actually training or testing the model*. | related papers | related patents |
765 | ScriptWriter: Narrative-Guided Script Generation | Yutao Zhu, Ruihua Song, Zhicheng Dou, Jian-Yun NIE, Jin Zhou | In this paper, we address a key problem involved in these applications – guiding a dialogue by a narrative. | related papers | related patents |
766 | Should All Cross-Lingual Embeddings Speak English? | Antonios Anastasopoulos, Graham Neubig | First, we show that the choice of hub language can significantly impact downstream lexicon induction and zero-shot POS tagging performance. Second, we both expand a standard English-centered evaluation dictionary collection to include all language pairs using triangulation, and create new dictionaries for under-represented languages. | related papers | related patents |
767 | Smart To-Do: Automatic Generation of To-Do Items from Emails | Sudipto Mukherjee, Subhabrata Mukherjee, Marcello Hasegawa, Ahmed Hassan Awadallah, Ryen White | In this work, we explore a new application, Smart-To-Do, that helps users with task management over emails. | related papers | related patents |
768 | Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition | Paloma Jeretic, Alex Warstadt, Suvrat Bhooshan, Adina Williams | We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. | related papers | related patents |
769 | End-to-End Bias Mitigation by Modelling Biases in Corpora | Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson | We propose two learning strategies to train neural models, which are more robust to such biases and transfer better to out-of-domain datasets. | related papers | related patents |
770 | Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance | Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych | In this paper, we address this trade-off by introducing a novel debiasing method, called confidence regularization, which discourage models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples. | related papers | related patents |
771 | NILE : Natural Language Inference with Faithful Natural Language Explanations | Sawan Kumar, Partha Talukdar | We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce labels along with its faithful explanation. | related papers | related patents |
772 | QuASE: Question-Answer Driven Sentence Encoding | Hangfeng He, Qiang Ning, Dan Roth | This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? | related papers | related patents |
773 | Towards Robustifying NLI Models Against Lexical Dataset Biases | Xiang Zhou, Mohit Bansal | Using contradiction-word bias and word-overlapping bias as our two bias examples, this paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases. | related papers | related patents |
774 | Uncertain Natural Language Inference | Tongfei Chen, Zhengping Jiang, Adam Poliak, Keisuke Sakaguchi, Benjamin Van Durme | We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments. | related papers | related patents |
775 | Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches | Tianze Shi, Lillian Lee | We empirically compare these two common strategies, parsing and tagging, for predicting flat MWEs. Additionally, we propose an efficient joint decoding algorithm that combines scores from both strategies. | related papers | related patents |
776 | Revisiting Higher-Order Dependency Parsers | Erick Fonseca, André F. T. Martins | We tested this hypothesis and found that neural parsers may benefit from higher-order features, even when employing a powerful pre-trained encoder, such as BERT. | related papers | related patents |
777 | SeqVAT: Virtual Adversarial Training for Semi-Supervised Sequence Labeling | Luoxin Chen, Weitong Ruan, Xinyue Liu, Jianhua Lu | In this paper, we propose SeqVAT, a method which naturally applies VAT to sequence labeling models with CRF. | related papers | related patents |
778 | Treebank Embedding Vectors for Out-of-domain Dependency Parsing | Joachim Wagner, James Barry, Jennifer Foster | We build on this idea by 1) introducing a method to predict a treebank vector for sentences that do not come from a treebank used in training, and 2) exploring what happens when we move away from predefined treebank embedding vectors during test time and instead devise tailored interpolations. | related papers | related patents |