Paper Digest: ACL 2020 Highlights
Readers can also choose to read this highlight article on our console, which allows users to filter papers by keywords and find related papers.
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2020, it is being held online due to the COVID-19 pandemic. There were 3,429 paper submissions, of which 778 were accepted. ~90 papers also published their code (code download link).
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: ACL 2020 Papers
No. | Title | Authors | Highlight | Related Papers | Related Patents |
---|---|---|---|---|---|
1 | Learning to Understand Child-directed and Adult-directed Speech | Lieke Gelderloos, Grzegorz Chrupała, Afra Alishahi | This study explores the effect of child-directed speech when learning to extract semantic information from speech directly. | related papers | related patents |
2 | Predicting Depression in Screening Interviews from Latent Categorization of Interview Prompts | Alex Rinaldi, Jean Fox Tree, Snigdha Chaturvedi | We propose JLPC, a model that analyzes interview transcripts to identify depression while jointly categorizing interview prompts into latent categories. | related papers | related patents |
3 | Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling | Zihan Liu, Genta Indra Winata, Peng Xu, Pascale Fung | In this paper, we propose a Coarse-to-fine approach (Coach) for cross-domain slot filling. | related papers | related patents |
4 | Designing Precise and Robust Dialogue Response Evaluators | Tianyu Zhao, Divesh Lala, Tatsuya Kawahara | In this work, we propose to build a reference-free evaluator and exploit the power of semi-supervised training and pretrained (masked) language models. | related papers | related patents |
5 | Dialogue State Tracking with Explicit Slot Connection Modeling | Yawen Ouyang, Moxin Chen, Xinyu Dai, Yinggong Zhao, Shujian Huang, Jiajun CHEN | To handle these phenomena, we propose a Dialogue State Tracking with Slot Connections (DST-SC) model to explicitly consider slot correlations across different domains. | related papers | related patents |
6 | Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy | Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu | To address this issue, this paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge. | related papers | related patents |
7 | Guiding Variational Response Generator to Exploit Persona | Bowen Wu, MengYuan Li, Zongsheng Wang, Yifu Chen, Derek F. Wong, qihang feng, Junhong Huang, Baoxun Wang | This paper proposes to adopt the personality-related characteristics of human conversations into variational response generators, by designing a specific conditional variational autoencoder based deep model with two new regularization terms employed to the loss function, so as to guide the optimization towards the direction of generating both persona-aware and relevant responses. | related papers | related patents |
8 | Large Scale Multi-Actor Generative Dialog Modeling | Alex Boyd, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro | This work introduces the Generative Conversation Control model, an augmented and fine-tuned GPT-2 language model that conditions on past reference conversations to probabilistically model multi-turn conversations in the actor’s persona. | related papers | related patents |
9 | PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable | Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang | Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge grounded dialogues, and conversational question answering. | related papers | related patents |
10 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network | Yangming Li, Kaisheng Yao, Libo Qin, Wanxiang Che, Xiaolong Li, Ting Liu | In this paper, we study slot consistency for building reliable NLG systems with all slot values of input dialogue act (DA) properly generated in output sentences. | related papers | related patents |
11 | Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations | Samuel Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson | We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task. | related papers | related patents |
12 | Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking | Giovanni Campagna, Agata Foryciarz, Mehrad Moradshahi, Monica Lam | This paper proposes a new zero-shot transfer learning technique for dialogue state tracking where the in-domain training data are all synthesized from an abstract dialogue model and the ontology of the domain. | related papers | related patents |
13 | A Complete Shift-Reduce Chinese Discourse Parser with Robust Dynamic Oracle | Shyh-Shiun Hung, Hen-Hsen Huang, Hsin-Hsi Chen | This work proposes a standalone, complete Chinese discourse parser for practical applications. | related papers | related patents |
14 | TransS-Driven Joint Learning Architecture for Implicit Discourse Relation Recognition | Ruifang He, Jian Wang, Fengyu Guo, Yugui Han | Therefore, we propose a novel TransS-driven joint learning architecture to address the issues. | related papers | related patents |
15 | A Study of Non-autoregressive Model for Sequence Generation | Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, sheng zhao, Tie-Yan Liu | To quantify such dependency, we propose an analysis model called CoMMA to characterize the difficulty of different NAR sequence generation tasks. | related papers | related patents |
16 | Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage | Ashish V. Thapliyal, Radu Soricut | We describe an approach called Pivot-Language Generation Stabilization (PLuGS), which leverages directly at training time both existing English annotations (gold data) as well as their machine-translated versions (silver data); at run-time, it generates first an English caption and then a corresponding target-language caption. | related papers | related patents |
17 | Fact-based Text Editing | Hayate Iso, Chao Qiao, Hang Li | We propose a novel text editing task, referred to as *fact-based text editing*, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples). | related papers | related patents |
18 | Few-Shot NLG with Pre-Trained Language Model | Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang | In this work, we propose the new task of few-shot natural language generation. | related papers | related patents |
19 | Fluent Response Generation for Conversational Question Answering | Ashutosh Baheti, Alan Ritter, Kevin Small | In this work, we propose a method for situating QA responses within a SEQ2SEQ NLG approach to generate fluent grammatical answer responses while maintaining correctness. | related papers | related patents |
20 | Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs | Dong Bok Lee, Seanie Lee, Woo Tae Jeong, Donghwan Kim, Sung Ju Hwang | In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. | related papers | related patents |
21 | Learning to Ask More: Semi-Autoregressive Sequential Question Generation under Dual-Graph Interaction | Zi Chai, Xiaojun Wan | To this end, we generate questions in a semi-autoregressive way. Our model divides questions into different groups and generates each group of them in parallel. | related papers | related patents |
22 | Neural Syntactic Preordering for Controlled Paraphrase Generation | Tanya Goyal, Greg Durrett | Our work, inspired by pre-ordering literature in machine translation, uses syntactic transformations to softly “reorder” the source sentence and guide our neural paraphrasing model. | related papers | related patents |
23 | Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders | Yu Duan, Canwen Xu, Jiaxin Pei, Jialong Han, Chenliang Li | In this paper, we present a new framework named Pre-train and Plug-in Variational Auto-Encoder (PPVAE) towards flexible conditional text generation. | related papers | related patents |
24 | Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order | Yi Liao, Xin Jiang, Qun Liu | In this paper, we propose a probabilistic masking scheme for the masked language model, which we call probabilistically masked language model (PMLM). | related papers | related patents |
25 | Reverse Engineering Configurations of Neural Text Generation Models | Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins | In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated some piece of text. | related papers | related patents |
26 | Review-based Question Generation with Adaptive Instance Transfer and Augmentation | Qian Yu, Lidong Bing, Qiong Zhang, Wai Lam, Luo Si | To obtain proper training instances for the generation model, we propose an iterative learning framework with adaptive instance transfer and augmentation. | related papers | related patents |
27 | TAG : Type Auxiliary Guiding for Code Comment Generation | Ruichu Cai, Zhihao Liang, Boyan Xu, zijian li, Yuexing Hao, Yao Chen | In order to address the issues above, we propose a Type Auxiliary Guiding encoder-decoder framework for the code comment generation task which considers the source code as an N-ary tree with type information associated with each node. | related papers | related patents |
28 | Unsupervised Paraphrasing by Simulated Annealing | Xianggen Liu, Lili Mou, Fandong Meng, Hao Zhou, Jie Zhou, Sen Song | We propose UPSA, a novel approach that accomplishes Unsupervised Paraphrasing by Simulated Annealing. | related papers | related patents |
29 | A Joint Model for Document Segmentation and Segment Labeling | Joe Barrow, Rajiv Jain, Vlad Morariu, Varun Manjunatha, Douglas Oard, Philip Resnik | We introduce Segment Pooling LSTM (S-LSTM), which is capable of jointly segmenting a document and labeling segments. | related papers | related patents |
30 | Contextualized Weak Supervision for Text Classification | Dheeraj Mekala, Jingbo Shang | In this paper, we propose a novel framework ConWea, providing contextualized weak supervision for text classification. | related papers | related patents |
31 | Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks | Yufeng Zhang, Xueli Yu, Zeyu Cui, Shu Wu, Zhongzhen Wen, Liang Wang | Therefore in this work, to overcome such problems, we propose TextING for inductive text classification via GNN. | related papers | related patents |
32 | Neural Topic Modeling with Bidirectional Adversarial Training | Rui Wang, Xuemeng Hu, Deyu Zhou, Yulan He, Yuxuan Xiong, Chenchen Ye, Haiyang Xu | To address these limitations, we propose a neural topic modeling approach, called Bidirectional Adversarial Topic (BAT) model, which represents the first attempt of applying bidirectional adversarial training for neural topic modeling. | related papers | related patents |
33 | Text Classification with Negative Supervision | Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Chenhui Chu, Yuki Arase | To address this problem, we propose a simple multitask learning model that uses negative supervision. | related papers | related patents |
34 | Content Word Aware Neural Machine Translation | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita | To address this limitation, we first utilize word frequency information to distinguish between content and function words in a sentence, and then design a content word-aware NMT to improve translation performance. | related papers | related patents |
35 | Evaluating Explanation Methods for Neural Machine Translation | Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi | To this end, this paper proposes a principled metric based on fidelity with regard to the predictive behavior of the NMT model. | related papers | related patents |
36 | Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation | Junliang Guo, Linli Xu, Enhong Chen | In this work, we introduce a jointly masked sequence-to-sequence model and explore its application on non-autoregressive neural machine translation (NAT). | related papers | related patents |
37 | Learning Source Phrase Representations for Neural Machine Translation | Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang | In this paper, we first propose an attentive phrase representation generation mechanism which is able to generate phrase representations from corresponding token representations. In addition, we incorporate the generated phrase representations into the Transformer translation model to enhance its ability to capture long-distance relationships. | related papers | related patents |
38 | Lipschitz Constrained Parameter Initialization for Deep Transformers | Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang | In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers. | related papers | related patents |
39 | Location Attention for Extrapolation to Longer Sequences | Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni | In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackle it. We then focus on a specific type of extrapolation which is especially useful for natural language processing: generalization to sequences that are longer than the training ones. | related papers | related patents |
40 | Multiscale Collaborative Deep Models for Neural Machine Translation | Xiangpeng Wei, Heng Yu, Yue Hu, Yue Zhang, Rongxiang Weng, Weihua Luo | In this paper, we present a MultiScale Collaborative (MSC) framework to ease the training of NMT models that are substantially deeper than those used previously. | related papers | related patents |
41 | Norm-Based Curriculum Learning for Neural Machine Translation | Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao | In this paper, we aim to improve the efficiency of training an NMT by introducing a novel norm-based curriculum learning method. | related papers | related patents |
42 | Opportunistic Decoding with Timely Correction for Simultaneous Translation | Renjie Zheng, Mingbo Ma, Baigong Zheng, Kaibo Liu, Liang Huang | We propose an opportunistic decoding technique with timely correction ability, which always (over-)generates a certain amount of extra words at each step to keep the audience on track with the latest information. | related papers | related patents |
43 | A Formal Hierarchy of RNN Architectures | William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav | We develop a formal hierarchy of the expressive capacity of RNN architectures. | related papers | related patents |
44 | A Three-Parameter Rank-Frequency Relation in Natural Languages | Chenchen Ding, Masao Utiyama, Eiichiro Sumita | We show that the rank-frequency relation in textual data follows $f \propto r^{-\alpha}(r+\gamma)^{-\beta}$, where $f$ is the token frequency and $r$ is the rank by frequency, with ($\alpha$, $\beta$, $\gamma$) as parameters. | related papers | related patents |
45 | Dice Loss for Data-imbalanced NLP Tasks | Xiaoya Li, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, Jiwei Li | In this paper, we propose to use dice loss in replacement of the standard cross-entropy objective for data-imbalanced NLP tasks. | related papers | related patents |
46 | Emergence of Syntax Needs Minimal Supervision | Raphaël Bailly, Kata Gábor | This paper is a theoretical contribution to the debate on the learnability of syntax from a corpus without explicit syntax-specific guidance. | related papers | related patents |
47 | Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese | Tatsuki Kuribayashi, Takumi Ito, Jun Suzuki, Kentaro Inui | In this study, we explore whether the LM-based method is valid for analyzing the word order. | related papers | related patents |
48 | GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media | Yi-Ju Lu, Cheng-Te Li | Given the source short-text tweet and the corresponding sequence of retweet users without text comments, we aim to predict whether the source tweet is fake and to generate an explanation by highlighting the evidence on suspicious retweeters and the words they concern. | related papers | related patents |
49 | Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection | Lei Zhong, Juan Cao, Qiang Sheng, Junbo Guo, Ziang Wang | To overcome the first two limitations, we propose Topic-Post-Comment Graph Convolutional Network (TPC-GCN), which integrates the information from the graph structure and content of topics, posts, and comments for post-level controversy detection. | related papers | related patents |
50 | Predicting the Topical Stance and Political Leaning of Media using Tweets | Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov | In this paper, we propose a cascaded method that uses unsupervised learning to ascertain the stance of Twitter users with respect to a polarizing topic by leveraging their retweet behavior; then, it uses supervised learning based on user labels to characterize both the general political leaning of online media and of popular Twitter users, as well as their stance with respect to the target polarizing topic. | related papers | related patents |
51 | Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora | Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg | We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word. | related papers | related patents |
52 | CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation | Lei Shen, Yang Feng | To alleviate these problems, we propose a novel framework named Curriculum Dual Learning (CDL) which extends the emotion-controllable response generation to a dual task to generate emotional responses and emotional queries alternatively. | related papers | related patents |
53 | Efficient Dialogue State Tracking by Selectively Overwriting Memory | Sungdong Kim, Sohee Yang, Gyuwan Kim, Sang-Woo Lee | Here, we consider dialogue state as an explicit fixed-sized memory and propose a selectively overwriting mechanism for more efficient DST. | related papers | related patents |
54 | End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2 | Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim | In this paper, we present an end-to-end neural architecture for dialogue systems that addresses both challenges above. | related papers | related patents |
55 | Evaluating Dialogue Generation Systems via Response Selection | Shiki Sato, Reina Akama, Hiroki Ouchi, Jun Suzuki, Kentaro Inui | Specifically, we propose to construct test sets filtering out some types of false candidates: (i) those unrelated to the ground-truth response and (ii) those acceptable as appropriate responses. | related papers | related patents |
56 | Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection | Yefei Zha, Ruobing Li, Hui Lin | In this paper, we propose a novel approach for off-topic spoken response detection with high off-topic recall on both seen and unseen prompts. | related papers | related patents |
57 | Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment | Yinpei Dai, Hangyu Li, Chengguang Tang, Yongbin Li, Jian Sun, Xiaodan Zhu | In this paper, we propose the Meta-Dialog System (MDS), which combines the advantages of both meta-learning approaches and human-machine collaboration. | related papers | related patents |
58 | Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge | Keqing He, Yuanmeng Yan, Weiran XU | In this paper, we propose a novel knowledge-enhanced slot tagging model to integrate contextual representation of input text and the large-scale lexical background knowledge. | related papers | related patents |
59 | Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition | Ryuichi Takanobu, Runze Liang, Minlie Huang | To avoid explicitly building a user simulator beforehand, we propose Multi-Agent Dialog Policy Learning, which regards both the system and the user as the dialog agents. | related papers | related patents |
60 | Paraphrase Augmented Task-Oriented Dialog Generation | Silin Gao, Yichi Zhang, Zhijian Ou, Zhou Yu | We propose a paraphrase augmented response generation (PARG) framework that jointly trains a paraphrase model and a response generation model to improve the dialog generation performance. | related papers | related patents |
61 | Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation | Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, YIPING SONG, Xiaojiang Liu, Nevin L. Zhang | In this paper, we propose to create the document memory with some anticipated responses in mind. | related papers | related patents |
62 | Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation | Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang | To overcome this limitation, we propose a novel reward learning approach for semi-supervised policy learning. | related papers | related patents |
63 | Towards Unsupervised Language Understanding and Generation by Joint Dual Learning | Shang-Yu Su, Chao-Wei Huang, Yun-Nung Chen | However, the prior work still learned both components in a supervised manner; instead, this paper introduces a general learning framework to effectively exploit such duality, providing flexibility of incorporating both supervised and unsupervised learning algorithms to train language understanding and generation models in a joint fashion. | related papers | related patents |
64 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | Shikib Mehri, Maxine Eskenazi | To this end, this paper presents USR, an UnSupervised and Reference-free evaluation metric for dialog. | related papers | related patents |
65 | Explicit Semantic Decomposition for Definition Generation | Jiahuan Li, Yu Bao, Shujian Huang, Xinyu Dai, Jiajun CHEN | In this paper, we propose ESD, namely Explicit Semantic Decomposition for definition Generation, which explicitly decomposes the meaning of words into semantic components, and models them with discrete latent variables for definition generation. | related papers | related patents |
66 | Improved Natural Language Generation via Loss Truncation | Daniel Kang, Tatsunori Hashimoto | We propose loss truncation: a simple and scalable procedure which adaptively removes high log loss examples as a way to optimize for distinguishability. | related papers | related patents |
67 | Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks | Yanbin Zhao, Lu Chen, Zhi Chen, Ruisheng Cao, Su Zhu, Kai Yu | In this work, we propose a novel graph encoding framework which can effectively explore the edge relations. | related papers | related patents |
68 | Rigid Formats Controlled Text Generation | Piji Li, Haisong Zhang, Xiaojiang Liu, Shuming Shi | Therefore, we propose a simple and elegant framework named SongNet to tackle this problem. | related papers | related patents |
69 | Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation | Kaustubh Dhole, Christopher D. Manning | We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules which transform declarative sentences into question-answer pairs. | related papers | related patents |
70 | An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering | Jay Kumar, Junming Shao, Salah Uddin, Wazir Ali | Therefore, in this paper, we propose an Online Semantic-enhanced Dirichlet Model for short text stream clustering, called OSDM, which integrates the word-occurrence semantic information (i.e., context) into a new graphical model and clusters each arriving short text automatically in an online way. | related papers | related patents |
71 | Generative Semantic Hashing Enhanced via Boltzmann Machines | Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen | In this paper, to introduce correlations among the bits of hash codes, we propose to employ the distribution of Boltzmann machine as the variational posterior. | related papers | related patents |
72 | Interactive Construction of User-Centric Dictionary for Text Analytics | Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa | To optimize the interaction, we propose a new algorithm that effectively captures an analyst’s intention starting from only a small number of sample terms. | related papers | related patents |
73 | Tree-Structured Neural Topic Model | Masaru Isonuma, Junichiro Mori, Danushka Bollegala, Ichiro Sakata | This paper presents a tree-structured neural topic model, which has a topic distribution over a tree with an infinite number of branches. | related papers | related patents |
74 | Unsupervised FAQ Retrieval with Question Generation and BERT | Yosi Mass, Boaz Carmeli, Haggai Roitman, David Konopnicki | We present a fully unsupervised method that exploits the FAQ pairs to train two BERT models. | related papers | related patents |
75 | “The Boating Store Had Its Best Sail Ever”: Pronunciation-attentive Contextualized Pun Recognition | Yichao Zhou, Jyun-Yu Jiang, Jieyu Zhao, Kai-Wei Chang, Wei Wang | In this paper, we propose Pronunciation-attentive Contextualized Pun Recognition (PCPR) to perceive human humor, detect if a sentence contains puns and locate them in the sentence. | related papers | related patents |
76 | Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning | Joongbo Shin, Yoonhyung Lee, Seunghyun Yoon, Kyomin Jung | To resolve this limitation, we propose a novel deep bidirectional language model called a Transformer-based Text Autoencoder (T-TA). | related papers | related patents |
77 | Fine-grained Interest Matching for Neural News Recommendation | Heyuan Wang, Fangzhao Wu, Zheng Liu, Xing Xie | In this paper, we propose FIM, a Fine-grained Interest Matching method for neural news recommendation. | related papers | related patents |
78 | Interpretable Operational Risk Classification with Semi-Supervised Variational Autoencoder | Fan Zhou, Shengming Zhang, Yi Yang | To tackle these challenges, we present a semi-supervised text classification framework that integrates multi-head attention mechanism with Semi-supervised variational inference for Operational Risk Classification (SemiORC). | related papers | related patents |
79 | Interpreting Twitter User Geolocation | Ting Zhong, Tianliang Wang, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Yi Yang | In this work, we adopt influence functions to interpret the behavior of GNN-based models by identifying the importance of training users when predicting the locations of the testing users. | related papers | related patents |
80 | Modeling Code-Switch Languages Using Bilingual Parallel Corpus | Grandee Lee, Haizhou Li | We propose a bilingual attention language model (BALM) that simultaneously performs language modeling objective with a quasi-translation objective to model both the monolingual as well as the cross-lingual sequential dependency. | related papers | related patents |
81 | SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check | Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi | This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). | related papers | related patents |
82 | Spelling Error Correction with Soft-Masked BERT | Shaohua Zhang, Haoran Huang, Jicong Liu, Hang Li | In this work, we propose a novel neural architecture to address the aforementioned issue, which consists of a network for error detection and a network for error correction based on BERT, with the former being connected to the latter with what we call soft-masking technique. | related papers | related patents |
83 | A Frame-based Sentence Representation for Machine Reading Comprehension | Shaoru Guo, Ru Li, Hongye Tan, Xiaoli Li, Yong Guan, Hongyan Zhao, Yueping Zhang | To bridge the gap, we proposed a novel Frame-based Sentence Representation (FSR) method, which employs frame semantic knowledge to facilitate sentence modelling. | related papers | related patents |
84 | A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation | Jan Deriu, Katsiaryna Mlynchyk, Philippe Schläpfer, Alvaro Rodrigo, Dirk von Grünigen, Nicolas Kaiser, Kurt Stockinger, Eneko Agirre, Mark Cieliebak | In this paper, we introduce a novel methodology to efficiently construct a corpus for question answering over structured data. | related papers | related patents |
85 | Contextualized Sparse Representations for Real-Time Open-Domain Question Answering | Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang | In this paper, we aim to improve the quality of each phrase embedding by augmenting it with a contextualized sparse representation (Sparc). | related papers | related patents |
86 | Dynamic Sampling Strategies for Multi-Task Reading Comprehension | Ananth Gottumukkala, Dheeru Dua, Sameer Singh, Matt Gardner | We show that a simple dynamic sampling strategy, selecting instances for training proportional to the multi-task model’s current performance on a dataset relative to its single task performance, gives substantive gains over prior multi-task sampling strategies, mitigating the catastrophic forgetting that is common in multi-task learning. | related papers | related patents |
87 | Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension | Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang | In this paper, we propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision: (1) A mixed MRC task, which translates the question or passage to other languages and builds cross-lingual question-passage pairs; (2) A language-agnostic knowledge masking task by leveraging knowledge phrases mined from web. | related papers | related patents |
88 | Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading | Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael Lyu, Steven C.H. Hoi | In this paper, we present a new framework of conversational machine reading that comprises a novel Explicit Memory Tracker (EMT) to track whether conditions listed in the rule text have already been satisfied to make a decision. | related papers | related patents |
89 | Injecting Numerical Reasoning Skills into Language Models | Mor Geva, Ankit Gupta, Jonathan Berant | In this work, we show that numerical reasoning is amenable to automatic data generation, and thus one can inject this skill into pre-trained LMs, by generating large amounts of data, and training in a multi-task setup. | related papers | related patents |
90 | Learning to Identify Follow-Up Questions in Conversational Question Answering | Souvik Kundu, Qian Lin, Hwee Tou Ng | In this paper, we introduce a new follow-up question identification task. | related papers | related patents |
91 | Query Graph Generation for Answering Multi-hop Complex Questions from Knowledge Bases | Yunshi Lan, Jing Jiang | In this paper, we handle both types of complexity at the same time. | related papers | related patents |
92 | A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers | Shen-yun Miao, Chao-Chun Liang, Keh-Yih Su | We present ASDiv (Academia Sinica Diverse MWP Dataset), a diverse (in terms of both language patterns and problem types) English math word problem (MWP) corpus for evaluating the capability of various MWP solvers. | related papers | related patents |
93 | Improving Image Captioning Evaluation by Considering Inter References Variance | Yanzhi Yi, Hangyu Deng, Jinglu Hu | In this paper, we propose a novel metric based on BERTScore that could handle such a challenge and extend BERTScore with a few new features appropriately for image captioning evaluation. | related papers | related patents |
94 | Revisiting the Context Window for Cross-lingual Word Embeddings | Ryokan Ri, Yoshimasa Tsuruoka | In this work, we provide a thorough evaluation, in various languages, domains, and tasks, of bilingual embeddings trained with different context windows. | related papers | related patents |
95 | Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders | Terra Blevins, Luke Zettlemoyer | We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense. | related papers | related patents |
96 | Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection | Srijan Bansal, Vishal Garimella, Ayush Suhane, Jasabanta Patro, Animesh Mukherjee | In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. | related papers | related patents |
97 | DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification | Lianwei Wu, Yuan Rao, yongqiang zhao, Hao Liang, Ambreen Nazir | In this paper, we propose a Decision Tree-based Co-Attention model (DTCA) to discover evidence for explainable claim verification. | related papers | related patents |
98 | Towards Conversational Recommendation over Multi-Type Dialogs | Zeming Liu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu | We focus on the study of conversational recommendation in the context of multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user’s interests and feedback. | related papers | related patents |
99 | Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification | Guangfeng Yan, Lu Fan, Qimai Li, Han Liu, Xiaotong Zhang, Xiao-Ming Wu, Albert Y.S. Lam | This paper proposes a semantic-enhanced Gaussian mixture model (SEG) for unknown intent detection. | related papers | related patents |
100 | Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen | Yixin Cao, Ruihao Shui, Liangming Pan, Min-Yen Kan, Zhiyuan Liu, Tat-Seng Chua | We propose a new task of expertise style transfer and contribute a manually annotated dataset with the goal of alleviating such cognitive biases. | related papers | related patents |
101 | Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints | Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, Changyou Chen | In this paper, for the first time, we propose a novel Transformer-based generation framework to achieve the goal. | related papers | related patents |
102 | Dynamic Memory Induction Networks for Few-Shot Text Classification | Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu | This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification. | related papers | related patents |
103 | Exclusive Hierarchical Decoding for Deep Keyphrase Generation | Wang Chen, Hou Pong Chan, Piji Li, Irwin King | To overcome these limitations, we propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism. | related papers | related patents |
104 | Hierarchy-Aware Global Model for Hierarchical Text Classification | Jie Zhou, Chunping Ma, Dingkun Long, Guangwei Xu, Ning Ding, Haoyu Zhang, Pengjun Xie, Gongshen Liu | In this paper, we formulate the hierarchy as a directed graph and introduce hierarchy-aware structure encoders for modeling label dependencies. | related papers | related patents |
105 | Keyphrase Generation for Scientific Document Retrieval | Florian Boudin, Ygor Gallina, Akiko Aizawa | This study provides empirical evidence that such models can significantly improve retrieval performance, and introduces a new extrinsic evaluation framework that allows for a better understanding of the limitations of keyphrase generation models. | related papers | related patents |
106 | A Graph Auto-encoder Model of Derivational Morphology | Valentin Hofmann, Hinrich Schütze, Janet Pierrehumbert | We present a graph auto-encoder that learns embeddings capturing information about the compatibility of affixes and stems in derivation. | related papers | related patents |
107 | Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell | Djamé Seddah, Farah Essaidi, Amal Fethi, Matthieu Futeral, Benjamin Muller, Pedro Javier Ortiz Suárez, Benoît Sagot, Abhishek Srivastava | We introduce the first treebank for a romanized user-generated content variety of Algerian, a North-African Arabic dialect known for its frequent usage of code-switching. | related papers | related patents |
108 | Crawling and Preprocessing Mailing Lists At Scale for Dialog Analysis | Janek Bevendorff, Khalid Al Khatib, Martin Potthast, Benno Stein | This paper introduces the Webis Gmane Email Corpus 2019, the largest publicly available and fully preprocessed email corpus to date. | related papers | related patents |
109 | Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences | Dmitry Nikolaev, Ofir Arviv, Taelin Karidi, Neta Kenneth, Veronika Mitnik, Lilja Maria Saeboe, Omri Abend | We propose a framework for extracting divergence patterns for any language pair from a parallel corpus, building on Universal Dependencies. | related papers | related patents |
110 | Generating Counter Narratives against Online Hate Speech: Data and Strategies | Serra Sinem Tekiroğlu, Yi-Ling Chung, Marco Guerini | Being aware of the aforementioned limitations, we present a study on how to collect responses to hate effectively, employing large scale unsupervised language models such as GPT-2 for the generation of silver data, and the best annotation strategies/neural architectures that can be used for data filtering before expert validation/post-editing. | related papers | related patents |
111 | KLEJ: Comprehensive Benchmark for Polish Language Understanding | Piotr Rybak, Robert Mroczkowski, Janusz Tracz, Ireneusz Gawlik | To alleviate this issue, we introduce a comprehensive multi-task benchmark for Polish language understanding, accompanied by an online leaderboard. | related papers | related patents |
112 | Learning and Evaluating Emotion Lexicons for 91 Languages | Sven Buechel, Susanna Rücker, Udo Hahn | In order to break this bottleneck, we here introduce a methodology for creating almost arbitrarily large emotion lexicons for any target language. | related papers | related patents |
113 | Multi-Hypothesis Machine Translation Evaluation | Marina Fomicheva, Lucia Specia, Francisco Guzmán | In this paper, we propose an alternative approach: instead of modelling linguistic variation in human reference we exploit the MT model uncertainty to generate multiple diverse translations and use these: (i) as surrogates to reference translations; (ii) to obtain a quantification of translation variability to either complement existing metric scores or (iii) replace references altogether. | related papers | related patents |
114 | Multimodal Quality Estimation for Machine Translation | Shu Okabe, Frédéric Blain, Lucia Specia | We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE. | related papers | related patents |
115 | PuzzLing Machines: A Challenge on Learning From Small Data | Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych | To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students. | related papers | related patents |
116 | The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain | Annemarie Friedrich, Heike Adel, Federico Tomazic, Johannes Hingerl, Renou Benteau, Anika Marusczyk, Lukas Lange | This paper presents a new challenging information extraction task in the domain of materials science. | related papers | related patents |
117 | The TechQA Dataset | Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Michael McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avi Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang | We introduce TECHQA, a domain-adaptation question answering dataset for the technical support domain. | related papers | related patents |
118 | iSarcasm: A Dataset of Intended Sarcasm | Silviu Oprea, Walid Magdy | We show the limitations of previous labelling methods in capturing intended sarcasm and introduce the iSarcasm dataset of tweets labeled for sarcasm directly by their authors. | related papers | related patents |
119 | AMR Parsing via Graph-Sequence Iterative Inference | Deng Cai, Wai Lam | We propose a new end-to-end model that treats AMR parsing as a series of dual decisions on the input sequence and the incrementally constructed graph. | related papers | related patents |
120 | A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal | Demian Gholipour Ghalandari, Chris Hokamp, Nghia The Pham, John Glover, Georgiana Ifrim | This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters. | related papers | related patents |
121 | Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization | Junnan Zhu, Yu Zhou, Jiajun Zhang, Chengqing Zong | In this paper, we propose a novel method inspired by the translation pattern in the process of obtaining a cross-lingual summary. | related papers | related patents |
122 | Examining the State-of-the-Art in News Timeline Summarization | Demian Gholipour Ghalandari, Georgiana Ifrim | In this paper, we compare different TLS strategies using appropriate evaluation frameworks, and propose a simple and effective combination of methods that improves over the state-of-the-art on all tested benchmarks. | related papers | related patents |
123 | Improving Truthfulness of Headline Generation | Kazuki Matsumaru, Sho Takase, Naoaki Okazaki | This paper explores improving the truthfulness in headline generation on two popular datasets. | related papers | related patents |
124 | SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization | Yang Gao, Wei Zhao, Steffen Eger | We propose SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques. | related papers | related patents |
125 | Self-Attention Guided Copy Mechanism for Abstractive Summarization | Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He, Bowen Zhou | In this work, we propose a Transformer-based model to enhance the copy mechanism. | related papers | related patents |
126 | Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation | Weixin Liang, James Zou, Zhou Yu | To alleviate this problem, we formulate dialog evaluation as a comparison task. | related papers | related patents |
127 | Conversational Word Embedding for Retrieval-Based Dialog System | Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu | In this paper, we propose a conversational word embedding method named PR-Embedding, which utilizes the conversation pairs <post, reply> to learn word embedding. | related papers | related patents |
128 | Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network | Yutai Hou, Wanxiang Che, Yongkui Lai, Zhihan Zhou, Yijia Liu, Han Liu, Ting Liu | In this paper, we explore the slot tagging with only a few labeled support sentences (a.k.a. few-shot). | related papers | related patents |
129 | Learning Dialog Policies from Weak Demonstrations | Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen | We introduce Reinforced Fine-tune Learning, an extension to DQfD, enabling us to overcome the domain gap between the datasets and the environment. | related papers | related patents |
130 | MuTual: A Dataset for Multi-Turn Dialogue Reasoning | Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang, Ming Zhou | To facilitate the conversation reasoning research, we introduce MuTual, a novel dataset for Multi-Turn dialogue Reasoning, consisting of 8,860 manually annotated dialogues based on Chinese student English listening comprehension exams. | related papers | related patents |
131 | You Impress Me: Dialogue Generation via Mutual Persona Perception | Qian Liu, Yihong Chen, Bei Chen, Jian-Guang LOU, Zixuan Chen, Bin Zhou, Dongmei Zhang | Motivated by this, we propose P^2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding. | related papers | related patents |
132 | Bridging Anaphora Resolution as Question Answering | Yufang Hou | In this paper, we cast bridging anaphora resolution as question answering based on context. | related papers | related patents |
133 | Dialogue Coherence Assessment Without Explicit Dialogue Act Labels | Mohsen Mesgar, Sebastian Bücker, Iryna Gurevych | We address these issues by introducing a novel approach to dialogue coherence assessment. | related papers | related patents |
134 | Fast and Accurate Non-Projective Dependency Tree Linearization | Xiang Yu, Simon Tannert, Ngoc Thang Vu, Jonas Kuhn | We propose a graph-based method to tackle the dependency tree linearization task. | related papers | related patents |
135 | Semantic Graphs for Generating Deep Questions | Liangming Pan, Yuxi Xie, Yansong Feng, Tat-Seng Chua, Min-Yen Kan | This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information about the input passage. | related papers | related patents |
136 | A Novel Cascade Binary Tagging Framework for Relational Triple Extraction | Zhepei Wei, Jianlin Su, Yue Wang, Yuan Tian, Yi Chang | In this work, we introduce a fresh perspective to revisit the relational triple extraction task and propose a novel cascade binary tagging framework (CasRel) derived from a principled problem formulation. | related papers | related patents |
137 | In Layman’s Terms: Semi-Open Relation Extraction from Scientific Texts | Ruben Kruiper, Julian Vincent, Jessica Chen-Burger, Marc Desmulliez, Ioannis Konstas | In this work we combine the output of both types of systems to achieve Semi-Open Relation Extraction, a new task that we explore in the Biology domain. | related papers | related patents |
138 | NAT: Noise-Aware Training for Robust Neural Sequence Labeling | Marcin Namysl, Sven Behnke, Joachim Köhler | To this end, we formulate the noisy sequence labeling problem, where the input may undergo an unknown noising process and propose two Noise-Aware Training (NAT) objectives that improve robustness of sequence labeling performed on perturbed input: Our data augmentation method trains a neural model using a mixture of clean and noisy samples, whereas our stability training algorithm encourages the model to create a noise-invariant latent representation. | related papers | related patents |
139 | Named Entity Recognition without Labelled Data: A Weak Supervision Approach | Pierre Lison, Jeremy Barnes, Aliaksandr Hubin, Samia Touileb | This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. | related papers | related patents |
140 | Probing Linguistic Features of Sentence-Level Representations in Relation Extraction | Christoph Alt, Aleksandra Gabryszak, Leonhard Hennig | We introduce 14 probing tasks targeting linguistic properties relevant to RE, and we use them to study representations learned by more than 40 different encoder architecture and linguistic feature combinations trained on two datasets, TACRED and SemEval 2010 Task 8. | related papers | related patents |
141 | Reasoning with Latent Structure Refinement for Document-Level Relation Extraction | Guoshun Nan, Zhijiang Guo, Ivan Sekulic, Wei Lu | Unlike previous methods that may not be able to capture rich non-local interactions for inference, we propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph. | related papers | related patents |
142 | TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task | Christoph Alt, Aleksandra Gabryszak, Leonhard Hennig | In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement? | related papers | related patents |
143 | Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences | Xiangyu Duan, Baijun Ji, Hao Jia, Min Tan, Min Zhang, Boxing Chen, Weihua Luo, Yue Zhang | In this paper, we propose a new task of machine translation (MT), which is based on no parallel sentences but can refer to a ground-truth bilingual dictionary. | related papers | related patents |
144 | Boosting Neural Machine Translation with Similar Translations | Jitao XU, Josep Crego, Jean Senellart | This paper explores data augmentation methods for training Neural Machine Translation to make use of similar translations, in a comparable way a human translator employs fuzzy matches. | related papers | related patents |
145 | Character-Level Translation with Self-attention | Yingqiang Gao, Nikola I. Nikolov, Yuhuang Hu, Richard H.R. Hahnloser | We explore the suitability of self-attention models for character-level neural machine translation. | related papers | related patents |
146 | End-to-End Neural Word Alignment Outperforms GIZA++ | Thomas Zenkel, Joern Wuebker, John DeNero | We present the first end-to-end neural word alignment method that consistently outperforms GIZA++ on three data sets. | related papers | related patents |
147 | Enhancing Machine Translation with Dependency-Aware Self-Attention | Emanuele Bugliarello, Naoaki Okazaki | In this work, we investigate different approaches to incorporate syntactic knowledge in the Transformer model and also propose a novel, parameter-free, dependency-aware self-attention mechanism that improves its translation quality, especially for long sentences and in low-resource scenarios. | related papers | related patents |
148 | Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation | Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich | We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. | related papers | related patents |
149 | It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information | Emanuele Bugliarello, Sabrina J. Mielke, Antonios Anastasopoulos, Ryan Cotterell, Naoaki Okazaki | In this paper, we propose cross-mutual information (XMI): an asymmetric information-theoretic metric of machine translation difficulty that exploits the probabilistic nature of most neural machine translation models. | related papers | related patents |
150 | Language-aware Interlingua for Multilingual Neural Machine Translation | Changfeng Zhu, Heng Yu, Shanbo Cheng, Weihua Luo | In this paper, we incorporate a language-aware interlingua into the Encoder-Decoder architecture. | related papers | related patents |
151 | On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation | Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger | In this paper, we concern ourselves with reference-free machine translation (MT) evaluation where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. | related papers | related patents |
152 | Parallel Sentence Mining by Constrained Decoding | Pinzhen Chen, Nikolay Bogoychev, Kenneth Heafield, Faheem Kirefu | We present a novel method to extract parallel sentences from two monolingual corpora, using neural machine translation. | related papers | related patents |
153 | Self-Attention with Cross-Lingual Position Representation | Liang Ding, Longyue Wang, Dacheng Tao | In this paper, we augment SANs with *cross-lingual position representations* to model the bilingually aware latent structure for the input sentence. | related papers | related patents |
154 | “You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases | Dirk Hovy, Federico Bianchi, Tommaso Fornaciari | We show that, as a consequence, the output of three commercial machine translation systems (Bing, DeepL, Google) makes demographically diverse samples from five languages “sound” older and more male than the original. | related papers | related patents |
155 | MMPE: A Multi-Modal Interface for Post-Editing Machine Translation | Nico Herbig, Tim Düwel, Santanu Pal, Kalliopi Meladaki, Mahsa Monshizadeh, Antonio Krüger, Josef van Genabith | Since this paradigm shift offers potential for modalities other than mouse and keyboard, we present MMPE, the first prototype to combine traditional input modes with pen, touch, and speech modalities for PE of MT. | related papers | related patents |
156 | A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages | Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot | We use the multilingual OSCAR corpus, extracted from Common Crawl via language classification, filtering and cleaning, to train monolingual contextualized word embeddings (ELMo) for five mid-resource languages. | related papers | related patents |
157 | Will-They-Won’t-They: A Very Large Dataset for Stance Detection on Twitter | Costanza Conforti, Jakob Berndt, Mohammad Taher Pilehvar, Chryssi Giannitsarou, Flavio Toxvaerd, Nigel Collier | We present a new challenging stance detection dataset, called Will-They-Won’t-They (WT–WT), which contains 51,284 tweets in English, making it by far the largest available dataset of the type. | related papers | related patents |
158 | A Systematic Assessment of Syntactic Generalization in Neural Language Models | Jennifer Hu, Jon Gauthier, Peng Qian, Ethan Wilcox, Roger Levy | We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. | related papers | related patents |
159 | Inflecting When There’s No Majority: Limitations of Encoder-Decoder Neural Networks as Cognitive Models for German Plurals | Kate McCurdy, Sharon Goldwater, Adam Lopez | We conclude that modern neural models may still struggle with minority-class generalization. | related papers | related patents |
160 | Overestimation of Syntactic Representation in Neural Language Models | Jordan Kodner, Nitish Gupta | We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models: an n-gram model and an LSTM model trained on scrambled inputs. | related papers | related patents |
161 | Suspense in Short Stories is Predicted By Uncertainty Reduction over Neural Story Representation | David Wilmot, Frank Keller | We propose a hierarchical language model that encodes stories and computes surprise and uncertainty reduction. | related papers | related patents |
162 | You Don’t Have Time to Read This: An Exploration of Document Reading Time Prediction | Orion Weller, Jordan Hildebrandt, Ilya Reznik, Christopher Challis, E. Shannon Tass, Quinn Snell, Kevin Seppi | We seek to extend these works by examining whether or not document level predictions are effective, given additional information such as subject matter, font characteristics, and readability metrics. | related papers | related patents |
163 | A Generative Model for Joint Natural Language Understanding and Generation | Bo-Hsiang Tseng, Jianpeng Cheng, Yimai Fang, David Vandyke | In this work, we propose a generative model which couples NLU and NLG through a shared latent variable. | related papers | related patents |
164 | Automatic Detection of Generated Text is Easiest when Humans are Fooled | Daphne Ippolito, Daniel Duckworth, Chris Callison-Burch, Douglas Eck | Here, we perform careful benchmarking and analysis of three popular sampling-based decoding strategies (top-k, nucleus sampling, and untruncated random sampling) and show that improvements in decoding methods have primarily optimized for fooling humans. | related papers | related patents |
165 | Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing | Haoming Jiang, Chen Liang, Chong Wang, Tuo Zhao | To overcome this limitation, we propose a novel multi-domain NMT model using individual modules for each domain, on which we apply word-level, adaptive and layer-wise domain mixing. | related papers | related patents |
166 | Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation | Jun Xu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu | To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. | related papers | related patents |
167 | GPT-too: A Language-Model-First Approach for AMR-to-Text Generation | Manuel Mager, Ramón Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, Salim Roukos | In this paper, we propose an alternative approach that combines a strong pre-trained language model with cycle consistency-based re-scoring. | related papers | related patents |
168 | Learning to Update Natural Language Comments Based on Code Changes | Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li, Raymond Mooney | We propose an approach that learns to correlate changes across two distinct language representations, to generate a sequence of edits that are applied to the existing comment to reflect the source code modifications. | related papers | related patents |
169 | Politeness Transfer: A Tag and Generate Approach | Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye | This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. | related papers | related patents |
170 | BPE-Dropout: Simple and Effective Subword Regularization | Ivan Provilkov, Dmitrii Emelianenko, Elena Voita | We introduce BPE-dropout – a simple and effective subword regularization method based on and compatible with conventional BPE. | related papers | related patents |
171 | Improving Non-autoregressive Neural Machine Translation with Monolingual Data | Jiawei Zhou, Phillip Keung | Under this framework, we leverage large monolingual corpora to improve the NAR model’s performance, with the goal of transferring the AR model’s generalization ability while preventing overfitting. | related papers | related patents |
172 | Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization | Sajad Sotudeh Gharebagh, Nazli Goharian, Ross Filice | In this paper, we approach the content selection problem for clinical abstractive summarization by augmenting salient ontological terms into the summarizer. | related papers | related patents |
173 | On Faithfulness and Factuality in Abstractive Summarization | Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan McDonald | In this paper, we analyze the limitations of these models for abstractive document summarization and find that they are highly prone to hallucinate content that is unfaithful to the input document. | related papers | related patents |
174 | Screenplay Summarization Using Latent Narrative Structure | Pinelopi Papalampidi, Frank Keller, Lea Frermann, Mirella Lapata | In this paper, we propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models. | related papers | related patents |
175 | Unsupervised Opinion Summarization with Noising and Denoising | Reinald Kim Amplayo, Mirella Lapata | In this paper we enable the use of supervised learning for the setting where there are only documents available (e.g., product or business reviews) without ground truth summaries. | related papers | related patents |
176 | A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type | Trevor Cohen, Serguei Pakhomov | In this paper, we interrogate neural LMs trained on participants with and without dementia by using synthetic narratives previously developed to simulate progressive semantic dementia by manipulating lexical frequency. | related papers | related patents |
177 | Probing Linguistic Systematicity | Emily Goodwin, Koustuv Sinha, Timothy J. O’Donnell | We examine the notion of systematicity from a linguistic perspective, defining a set of probing tasks and a set of metrics to measure systematic behaviour. We also identify ways in which network architectures can generalize non-systematically, and discuss why such forms of generalization may be unsatisfying. | related papers | related patents |
178 | Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models | Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James Pennebaker | We introduce a measure of narrative flow and use this to examine the narratives for imagined and recalled events. | related papers | related patents |
179 | Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment | Forrest Davis, Marten van Schijndel | Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. | related papers | related patents |
180 | Speakers enhance contextually confusable words | Eric Meinhardt, Eric Bakovic, Leon Bergen | We develop a measure of contextual confusability during word recognition based on psychoacoustic data. Applying this measure to naturalistic speech corpora, we find evidence suggesting that speakers alter their productions to make contextually more confusable words easier to understand. | related papers | related patents |
181 | What determines the order of adjectives in English? Comparing efficiency-based theories using dependency treebanks | Richard Futrell, William Dyer, Greg Scontras | The four theories we test are subjectivity (Scontras et al., 2017), information locality (Futrell, 2019), integration cost (Dyer, 2017), and information gain, which we introduce. | related papers | related patents |
182 | “None of the Above”: Measure Uncertainty in Dialog Response Retrieval | Yulan Feng, Shikib Mehri, Maxine Eskenazi, Tiancheng Zhao | This paper discusses the importance of uncovering uncertainty in end-to-end dialog tasks and presents our experimental results on uncertainty classification on the processed Ubuntu Dialog Corpus. | related papers | related patents |
183 | Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills | Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau | In this work, we investigate several ways to combine models trained towards isolated capabilities, ranging from simple model aggregation schemes that require minimal additional training, to various forms of multi-task training that encompass several skills at all training stages. | related papers | related patents |
184 | Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs | Houyu Zhang, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu | This paper presents a new conversation generation model, ConceptFlow, which leverages commonsense knowledge graphs to explicitly model conversation flows. | related papers | related patents |
185 | Negative Training for Neural Dialogue Response Generation | Tianxing He, James Glass | In this work, we propose a framework named “Negative Training” to minimize such behaviors. | related papers | related patents |
186 | Recursive Template-based Frame Generation for Task Oriented Dialog | Rashmi Gangadharaiah, Balakrishnan Narayanaswamy | We propose a recursive, hierarchical frame-based representation and show how to learn it from data. | related papers | related patents |
187 | Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback | Ahmed Elgohary, Saghar Hosseini, Ahmed Hassan Awadallah | In this paper, we investigate a more interactive scenario where humans can further interact with the system by providing free-form natural language feedback to correct the system when it generates an inaccurate interpretation of an initial utterance. | related papers | related patents |
188 | Calibrating Structured Output Predictors for Natural Language Processing | Abhyuday Jagannatha, Hong Yu | In this study, we propose a general calibration scheme for output entities of interest in neural network-based structured prediction models. | related papers | related patents |
189 | Active Imitation Learning with Noisy Guidance | Kianté Brantley, Hal Daumé III, Amr Sharaf | To combat this query complexity, we consider an active learning setting in which the learning algorithm has additional access to a much cheaper noisy heuristic that provides noisy guidance. | related papers | related patents |
190 | ExpBERT: Representation Engineering with Natural Language Explanations | Shikhar Murty, Pang Wei Koh, Percy Liang | In this paper, we allow model developers to specify these types of inductive biases as natural language explanations. | related papers | related patents |
191 | GAN-BERT: Generative Adversarial Learning for Robust Text Classification with a Bunch of Labeled Examples | Danilo Croce, Giuseppe Castellucci, Roberto Basili | In this paper, we propose GAN-BERT that extends the fine-tuning of BERT-like architectures with unlabeled data in a generative adversarial setting. | related papers | related patents |
192 | Generalizing Natural Language Analysis through Span-relation Representations | Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig | In this paper, we provide the simple insight that a great variety of tasks can be represented in a single unified format consisting of labeling spans and relations between spans, and thus a single task-independent model can be used across different tasks. | related papers | related patents |
193 | Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling | Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren | In this paper, we propose a novel framework Consensus Network (ConNet) that can be trained on annotations from multiple sources (e.g., crowd annotation, cross-domain data). | related papers | related patents |
194 | MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification | Jiaao Chen, Zichao Yang, Diyi Yang | This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. | related papers | related patents |
195 | MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou | In this paper, we propose MobileBERT for compressing and accelerating the popular BERT model. | related papers | related patents |
196 | On Importance Sampling-Based Evaluation of Latent Language Models | Robert L Logan IV, Matt Gardner, Sameer Singh | In this paper, we carry out this analysis for three models: RNNG, EntityNLM, and KGLM. | related papers | related patents |
197 | SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao | To address such an issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning for pre-trained models to attain better generalization performance. | related papers | related patents |
198 | Stolen Probability: A Structural Weakness of Neural Language Models | David Demeter, Gregory Kimmel, Doug Downey | We present numerical, theoretical and empirical analyses which show that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull. | related papers | related patents |
199 | Taxonomy Construction of Unseen Domains via Graph-based Cross-Domain Knowledge Transfer | Chao Shang, Sarthak Dash, Md. Faisal Mahbub Chowdhury, Nandana Mihindukulasooriya, Alfio Gliozzo | In this paper, we propose Graph2Taxo, a GNN-based cross-domain transfer framework for the taxonomy construction task. | related papers | related patents |
200 | To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks | Sinong Wang, Madian Khabsa, Hao Ma | This paper examines the benefits of pretrained models as a function of the number of training samples used in the downstream task. | related papers | related patents |
201 | Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries | Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber | We address this limitation by retrofitting CLWE to the training dictionary, which pulls training translation pairs closer in the embedding space and overfits the training dictionary. | related papers | related patents |
202 | XtremeDistil: Multi-stage Distillation for Massive Multilingual Models | Subhabrata Mukherjee, Ahmed Hassan Awadallah | In this work we study knowledge distillation with a focus on multilingual Named Entity Recognition (NER). | related papers | related patents |
203 | A Girl Has A Name: Detecting Authorship Obfuscation | Asad Mahmood, Zubair Shafiq, Padmini Srinivasan | In this paper, we evaluate the stealthiness of state-of-the-art authorship obfuscation methods under an adversarial threat model. | related papers | related patents |
204 | DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference | Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin | We propose a simple but effective method, DeeBERT, to accelerate BERT inference. | related papers | related patents |
205 | Efficient Strategies for Hierarchical Text Classification: External Knowledge and Auxiliary Tasks | Kervy Rivas Rojas, Gina Bustamante, Arturo Oncevay, Marco Antonio Sobrevilla Cabezudo | In hierarchical text classification, we perform a sequence of inference steps to predict the category of a document from top to bottom of a given class taxonomy. | related papers | related patents |
206 | Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions | Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis | Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language identification (L1). | related papers | related patents |
207 | SPECTER: Document-level Representation Learning using Citation-informed Transformers | Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel Weld | We propose SPECTER, a new method to generate document-level embedding of scientific papers based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. | related papers | related patents |
208 | Semantic Scaffolds for Pseudocode-to-Code Generation | Ruiqi Zhong, Mitchell Stern, Dan Klein | We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program. | related papers | related patents |
209 | Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction | Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla | In this paper, we investigate whether it is possible to infer new facts directly from the open knowledge graph without any canonicalization or any supervision from curated knowledge. | related papers | related patents |
210 | INFOTABS: Inference on Tables as Semi-structured Data | Vivek Gupta, Maitrey Mehta, Pegah Nokhiz, Vivek Srikumar | In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding them requires not only comprehending the meaning of text fragments, but also implicit relationships between them. | related papers | related patents |
211 | Interactive Machine Comprehension with Information Seeking Agents | Xingdi Yuan, Jie Fu, Marc-Alexandre Côté, Yi Tay, Chris Pal, Adam Trischler | In this paper, we propose a simple method that reframes existing MRC datasets as interactive, partially observable environments. | related papers | related patents |
212 | Syntactic Data Augmentation Increases Robustness to Inference Heuristics | Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen | We explore several methods to augment standard training sets with syntactically informative examples, generated by applying syntactic transformations to sentences from the MNLI corpus. | related papers | related patents |
213 | Improved Speech Representations with Multi-Target Autoregressive Predictive Coding | Yu-An Chung, James Glass | In this paper we extend this hypothesis and aim to enrich the information encoded in the hidden states by training the model to make more accurate future predictions. | related papers | related patents |
214 | Integrating Multimodal Information in Large Pretrained Transformers | Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee, AmirAli Bagher Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque | In this paper, we propose an attachment to BERT and XLNet called Multimodal Adaptation Gate (MAG). | related papers | related patents |
215 | MultiQT: Multimodal learning for real-time question tracking in speech | Jakob D. Havtorn, Jan Latko, Joakim Edin, Lars Maaløe, Lasse Borgholt, Lorenzo Belgrano, Nicolai Jacobsen, Regitze Sdun, Željko Agić | We propose a novel multimodal approach to real-time sequence labeling in speech. | related papers | related patents |
216 | Multimodal and Multiresolution Speech Recognition with Transformers | Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram | This paper presents an audio-visual automatic speech recognition (AV-ASR) system using a Transformer-based architecture. | related papers | related patents |
217 | Phone Features Improve Speech Translation | Elizabeth Salesky, Alan W Black | We compare cascaded and end-to-end models across high, medium, and low-resource conditions, and show that cascades remain stronger baselines. | related papers | related patents |
218 | Grounding Conversations with Improvised Dialogues | Hyundong Cho, Jonathan May | We collect a corpus of more than 26,000 yes-and turns, transcribing them from improv dialogues and extracting them from larger, but more sparsely populated movie script dialogue corpora, via a bootstrapped classifier. | related papers | related patents |
219 | Image-Chat: Engaging Grounded Conversations | Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston | In this work we study large-scale architectures and datasets for this goal. | related papers | related patents |
220 | Learning an Unreferenced Metric for Online Dialogue Evaluation | Koustuv Sinha, Prasanna Parthasarathi, Jasmine Wang, Ryan Lowe, William L. Hamilton, Joelle Pineau | Here, we propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances, and leverages the temporal transitions that exist between them. | related papers | related patents |
221 | Neural Generation of Dialogue Response Timings | Matthew Roddy, Naomi Harte | We propose neural models that simulate the distributions of these response offsets, taking into account the response turn as well as the preceding turn. | related papers | related patents |
222 | The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents | Kurt Shuster, Da JU, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston | We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images. | related papers | related patents |
223 | Automatic Poetry Generation from Prosaic Text | Tim Van de Cruys | In this paper, we will explore how these approaches can be adapted and combined to model the linguistic and literary aspects needed for poetry generation. | related papers | related patents |
224 | Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation | Chao Zhao, Marilyn Walker, Snigdha Chaturvedi | To narrow this gap, we propose DualEnc, a dual encoding model that can not only incorporate the graph structure, but can also cater to the linear structure of the output text. | related papers | related patents |
225 | Enabling Language Models to Fill in the Blanks | Chris Donahue, Mina Lee, Percy Liang | We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document. | related papers | related patents |
226 | INSET: Sentence Infilling with INter-SEntential Transformer | Yichen Huang, Yizhe Zhang, Oussama Elachqar, Yu Cheng | In this paper, we propose a framework to decouple the challenge and address these three aspects respectively, leveraging the power of existing large-scale pre-trained models such as BERT and GPT-2. | related papers | related patents |
227 | Improving Adversarial Text Generation by Modeling the Distant Future | Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin | We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues. | related papers | related patents |
228 | Simple and Effective Retrieve-Edit-Rerank Text Generation | Nabil Hossain, Marjan Ghazvininejad, Luke Zettlemoyer | We propose to extend this framework with a simple and effective post-generation ranking approach. | related papers | related patents |
229 | BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps | Wang Zhu, Hexiang Hu, Jiacheng Chen, Zhiwei Deng, Vihan Jain, Eugene Ie, Fei Sha | In this paper, we study how an agent can navigate long paths when learning from a corpus that consists of shorter ones. | related papers | related patents |
230 | Cross-media Structured Common Space for Multimedia Event Extraction | Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang | We propose a novel method, Weakly Aligned Structured Embedding (WASE), that encodes structured representations of semantic information from textual and visual data into a common embedding space. | related papers | related patents |
231 | Learning to Segment Actions from Observation and Narration | Daniel Fried, Jean-Baptiste Alayrac, Phil Blunsom, Chris Dyer, Stephen Clark, Aida Nematzadeh | We apply a generative segmental model of task structure, guided by narration, to action segmentation in video. | related papers | related patents |
232 | Learning to execute instructions in a Minecraft dialogue | Prashant Jayannavar, Anjali Narayan-Chen, Julia Hockenmaier | We define the subtask of predicting correct action sequences (block placements and removals) in a given game context, and show that capturing the Builder B’s past actions as well as B’s perspective leads to a significant improvement in performance on this challenging language understanding problem. | related papers | related patents |
233 | MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning | Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara Berg, Mohit Bansal | Towards this goal, we propose a new approach called Memory-Augmented Recurrent Transformer (MART), which uses a memory module to augment the transformer architecture. | related papers | related patents |
234 | What is Learned in Visually Grounded Neural Syntax Acquisition | Noriyuki Kojima, Hadar Averbuch-Elor, Alexander Rush, Yoav Artzi | We also find that a simple lexical signal of noun concreteness plays the main role in the model’s predictions as opposed to more complex syntactic reasoning. | related papers | related patents |
235 | A Batch Normalized Inference Network Keeps the KL Vanishing Away | Qile Zhu, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li, Dapeng Wu | We propose to let the KL follow a distribution across the whole dataset, and analyze that it is sufficient to prevent posterior collapse by keeping the expectation of the KL’s distribution positive. | related papers | related patents |
236 | Contextual Embeddings: When Are They Worth It? | Simran Arora, Avner May, Jian Zhang, Christopher Ré | We study the settings for which deep contextual embeddings (e.g., BERT) give large improvements in performance relative to classic pretrained embeddings (e.g., GloVe), and an even simpler baseline (random word embeddings), focusing on the impact of the training set size and the linguistic properties of the task. | related papers | related patents |
237 | Interactive Classification by Asking Informative Questions | Lili Yu, Howard Chen, Sida I. Wang, Tao Lei, Yoav Artzi | We study the potential for interaction in natural language classification. | related papers | related patents |
238 | Knowledge Graph Embedding Compression | Mrinmaya Sachan | Thus, we propose an approach that compresses the KG embedding layer by representing each entity in the KG as a vector of discrete codes and then composes the embeddings from these codes. | related papers | related patents |
239 | Low Resource Sequence Tagging using Sentence Reconstruction | Tal Perl, Sriram Chaudhury, Raja Giryes | Specifically, our method demonstrates how by adding a decoding layer for sentence reconstruction, we can improve the performance of various baselines. | related papers | related patents |
240 | Masked Language Model Scoring | Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff | We release our library for language model scoring at https://github.com/awslabs/mlm-scoring. | related papers | related patents |
241 | Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding | Yun Tang, Jing Huang, Guangtao Wang, Xiaodong He, Bowen Zhou | In this work, we propose a novel distance-based approach for knowledge graph link prediction. | related papers | related patents |
242 | Posterior Calibrated Training on Sentence Classification Tasks | Taehee Jung, Dongyeop Kang, Hua Cheng, Lucas Mentch, Thomas Schaaf | Here we propose an end-to-end training procedure called posterior calibrated (PosCal) training that directly optimizes the objective while minimizing the difference between the predicted and empirical posterior probabilities. | related papers | related patents |
243 | Posterior Control of Blackbox Generation | Xiang Lisa Li, Alexander Rush | In this work, we consider augmenting neural generation models with discrete control states learned through a structured latent-variable approach. | related papers | related patents |
244 | Pretrained Transformers Improve Out-of-Distribution Robustness | Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song | We examine which factors affect robustness, finding that larger models are not necessarily more robust, distillation can be harmful, and more diverse pretraining data can enhance robustness. | related papers | related patents |
245 | Robust Encodings: A Framework for Combating Adversarial Typos | Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang | In this work, we introduce robust encodings (RobEn): a simple framework that confers guaranteed robustness, without making compromises on model architecture. | related papers | related patents |
246 | Showing Your Work Doesn’t Always Work | Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, Jimmy Lin | One exemplar publication, titled “Show Your Work: Improved Reporting of Experimental Results” (Dodge et al., 2019), advocates for reporting the expected validation effectiveness of the best-tuned model, with respect to the computational budget. In the present work, we critically examine this paper. As far as statistical generalizability is concerned, we find unspoken pitfalls and caveats with this approach. | related papers | related patents |
247 | Span Selection Pre-training for Question Answering | Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avi Sil | In this paper we introduce a new pre-training task inspired by reading comprehension to better align the pre-training from memorization to understanding. | related papers | related patents |
248 | Topological Sort for Sentence Ordering | Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black | In this paper, we propose a new framing of this task as a constraint solving problem and introduce a new technique to solve it. | related papers | related patents |
249 | Weight Poisoning Attacks on Pretrained Models | Keita Kurita, Paul Michel, Graham Neubig | In this paper, we show that it is possible to construct “weight poisoning” attacks where pre-trained weights are injected with vulnerabilities that expose “backdoors” after fine-tuning, enabling the attacker to manipulate the model prediction simply by injecting an arbitrary keyword. | related papers | related patents |
250 | schuBERT: Optimizing Elements of BERT | Ashish Khetan, Zohar Karnin | In this work we revisit the architecture choices of BERT in efforts to obtain a lighter model. | related papers | related patents |
251 | ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation | Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel | We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model. | related papers | related patents |
252 | Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation | Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu | In this work, we join these two lines of research and demonstrate the efficacy of monolingual data with self-supervision in multilingual NMT. | related papers | related patents |
253 | On The Evaluation of Machine Translation Systems Trained With Back-Translation | Sergey Edunov, Myle Ott, Marc’Aurelio Ranzato, Michael Auli | In this work, we show that this conjecture is not empirically supported and that back-translation improves the translation quality of both naturally occurring text and translationese according to professional human translators. | related papers | related patents |
254 | Simultaneous Translation Policies: From Fixed to Adaptive | Baigong Zheng, Kaibo Liu, Renjie Zheng, Mingbo Ma, Hairong Liu, Liang Huang | We design an algorithm to achieve adaptive policies via a simple heuristic composition of a set of fixed policies. | related papers | related patents |
255 | Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information | Michele Bevilacqua, Roberto Navigli | We present Enhanced WSD Integrating Synset Embeddings and Relations (EWISER), a neural supervised architecture that is able to tap into this wealth of knowledge by embedding information from the LKB graph within the neural architecture, and to exploit pretrained synset embeddings, enabling the network to predict synsets that are not in the training set. | related papers | related patents |
256 | Glyph2Vec: Learning Chinese Out-of-Vocabulary Word Embedding from Glyphs | Hong-You Chen, SZ-HAN YU, Shou-de Lin | We present a multi-modal model, Glyph2Vec, to tackle the Chinese out-of-vocabulary word embedding problem. | related papers | related patents |
257 | Multidirectional Associative Optimization of Function-Specific Word Representations | Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen | We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures. | related papers | related patents |
258 | Predicting Degrees of Technicality in Automatic Terminology Extraction | Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde | We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. | related papers | related patents |
259 | Verbal Multiword Expressions for Identification of Metaphor | Omid Rohanian, Marek Rei, Shiva Taslimipoor, Le An Ha | This work is the first attempt at analysing the interplay of metaphor and MWEs processing through the design of a neural architecture whereby classification of metaphors is enhanced by informing the model of the presence of MWEs. | related papers | related patents |
260 | Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer | Jieyu Zhao, Subhabrata Mukherjee, Saghar Hosseini, Kai-Wei Chang, Ahmed Hassan Awadallah | In this paper, we study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications. | related papers | related patents |
261 | Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? | Kobi Leins, Jey Han Lau, Timothy Baldwin | We examine this question with respect to a paper on automatic legal sentencing from EMNLP 2019 which was a source of some debate, in asking whether the paper should have been allowed to be published, who should have been charged with making such a decision, and on what basis. | related papers | related patents |
262 | Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds | Kawin Ethayarajh | Instead of annotating all the examples, can we annotate a subset of them and use that sample to estimate the bias? In this work, we propose using Bernstein bounds to represent this uncertainty about the bias estimate as a confidence interval. | related papers | related patents |
263 | It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations | Samson Tan, Shafiq Joty, Min-Yen Kan, Richard Socher | We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples that expose these biases in popular NLP models, e.g., BERT and Transformer, and show that adversarially fine-tuning them for a single epoch significantly improves robustness without sacrificing performance on clean data. | related papers | related patents |
264 | Mitigating Gender Bias Amplification in Distribution by Posterior Regularization | Shengyu Jia, Tao Meng, Jieyu Zhao, Kai-Wei Chang | In this paper, we investigate the gender bias amplification issue from the distribution perspective and demonstrate that the bias is amplified in the view of predicted probability distribution over labels. | related papers | related patents |
265 | Towards Understanding Gender Bias in Relation Extraction | Andrew Gaut, Tony Sun, Shirlyn Tang, Yuxin Huang, Jing Qian, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, William Yang Wang | In this paper, we create WikiGenderBias, a distantly supervised dataset composed of over 45,000 sentences including a 10% human annotated test set for the purpose of analyzing gender bias in relation extraction systems. | related papers | related patents |
266 | A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing | Kartik Goyal, Chris Dyer, Christopher Warren, Maxwell G’Sell, Taylor Berg-Kirkpatrick | We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents. | related papers | related patents |
267 | Attentive Pooling with Learnable Norms for Text Representation | Chuhan Wu, Fangzhao Wu, Tao Qi, Xiaohui Cui, Yongfeng Huang | In this paper, we propose an Attentive Pooling with Learnable Norms (APLN) approach for text representation. | related papers | related patents |
268 | Estimating the influence of auxiliary tasks for multi-task learning of sequence tagging tasks | Fynn Schröder, Chris Biemann | We propose new methods to automatically assess the similarity of sequence tagging datasets to identify beneficial auxiliary data for MTL or TL setups. | related papers | related patents |
269 | How Does Selective Mechanism Improve Self-Attention Networks? | Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu | In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. | related papers | related patents |
270 | Improving Transformer Models by Reordering their Sublayers | Ofir Press, Noah A. Smith, Omer Levy | We propose a new transformer pattern that adheres to this property, the sandwich transformer, and show that it improves perplexity on multiple word-level and character-level language modeling benchmarks, at no cost in parameters, memory, or training time. | related papers | related patents |
271 | Single Model Ensemble using Pseudo-Tags and Distinct Vectors | Ryosuke Kuwabara, Jun Suzuki, Hideki Nakayama | In this study, we propose a novel method that replicates the effects of a model ensemble with a single model. | related papers | related patents |
272 | Zero-shot Text Classification via Reinforced Self-training | Zhiquan Ye, Yuxia Geng, Jiaoyan Chen, Jingmin Chen, Xiaoxiao Xu, SuHang Zheng, Feng Wang, Jun Zhang, Huajun Chen | To tackle this problem, in this paper we propose a self-training based method to efficiently leverage unlabeled data. | related papers | related patents |
273 | A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation | Yongjing Yin, Fandong Meng, Jinsong Su, Chulun Zhou, Zhengyuan Yang, Jie Zhou, Jiebo Luo | To deal with this issue, in this paper, we propose a novel graph-based multi-modal fusion encoder for NMT. | related papers | related patents |
274 | A Relaxed Matching Procedure for Unsupervised BLI | Xu Zhao, Zihao Wang, Yong Zhang, Hao Wu | Thus, we propose a relaxed matching procedure to find a more precise matching between two languages. | related papers | related patents |
275 | Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation | Xuanli He, Gholamreza Haffari, Mohammad Norouzi | This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. | related papers | related patents |
276 | Geometry-aware domain adaptation for unsupervised alignment of word embeddings | Pratik Jawanpuria, Mayank Meghwanshi, Bamdev Mishra | We propose a novel manifold based geometric approach for learning unsupervised alignment of word embeddings between the source and the target languages. | related papers | related patents |
277 | Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation | Qiu Ran, Yankai Lin, Peng Li, Jie Zhou | To alleviate this problem, we propose a novel semi-autoregressive model RecoverSAT in this work, which generates a translation as a sequence of segments. | related papers | related patents |
278 | On the Inference Calibration of Neural Machine Translation | Shuo Wang, Zhaopeng Tu, Shuming Shi, Yang Liu | By carefully designing experiments on three language pairs, our work provides in-depth analyses of the correlation between calibration and translation performance as well as linguistic properties of miscalibration and reports a number of interesting findings that might help humans better analyze, understand and improve NMT models. | related papers | related patents |
279 | Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning | Zhuoren Jiang, Zhe Gao, Yu Duan, Yangyang Kang, Changlong Sun, Qiong Zhang, Xiaozhong Liu | We propose a Semi-supervIsed GeNerative Active Learning (SIGNAL) model to address the imbalance, efficiency, and text camouflage problems of Chinese text spam detection task. | related papers | related patents |
280 | Distinguish Confusing Law Articles for Legal Judgment Prediction | Nuo Xu, Pinghui Wang, Long Chen, Li Pan, Xiaoyan Wang, Junzhou Zhao | In this paper, we present an end-to-end model, LADAN, to solve the task of LJP. | related papers | related patents |
281 | Hiring Now: A Skill-Aware Multi-Attention Model for Job Posting Generation | Liting Liu, Jie Liu, Wenzheng Zhang, Ziming Chi, Wenxuan Shi, Yalou Huang | To this end, we propose a novel task of Job Posting Generation (JPG) which is cast as a conditional text generation problem to generate job requirements according to the job descriptions. | related papers | related patents |
282 | HyperCore: Hyperbolic and Co-graph Representation for Automatic ICD Coding | Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao, Shengping Liu, Weifeng Chong | In this paper, we propose a Hyperbolic and Co-graph Representation method (HyperCore) to address the above problem. | related papers | related patents |
283 | Hyperbolic Capsule Networks for Multi-Label Classification | Boli Chen, Xin Huang, Lin Xiao, Liping Jing | Thus, we propose Hyperbolic Capsule Networks (HyperCaps) for Multi-Label Classification (MLC), which have two merits. | related papers | related patents |
284 | Improving Segmentation for Technical Support Problems | Kushal Chauhan, Abhirut Gupta | In this paper, we address the problem of segmentation for technical support questions. | related papers | related patents |
285 | MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs | Jifan Yu, Gan Luo, Tong Xiao, Qingyang Zhong, Yuquan Wang, Wenzheng Feng, Junyi Luo, Chenyu Wang, Lei Hou, Juanzi Li, Zhiyuan Liu, Jie Tang | Therefore, we present MOOCCube, a large-scale data repository of over 700 MOOC courses, 100k concepts, and 8 million student behaviors, together with external resources. | related papers | related patents |
286 | Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs | Jun Chen, Xiaoya Dai, Quan Yuan, Chao Lu, Haifeng Huang | In this paper, we attempt to propose a solution by introducing a novel framework that stacks Bayesian Network Ensembles on top of Entity-Aware Convolutional Neural Networks (CNN) towards building an accurate yet interpretable diagnosis system. | related papers | related patents |
287 | Analyzing the Persuasive Effect of Style in News Editorial Argumentation | Roxanne El Baff, Henning Wachsmuth, Khalid Al Khatib, Benno Stein | In contrast, this paper studies how important the style of news editorials is to achieve persuasion. | related papers | related patents |
288 | ECPE-2D: Emotion-Cause Pair Extraction based on Joint Two-Dimensional Representation, Interaction and Prediction | Zixiang Ding, Rui Xia, Jianfei Yu | To address these shortcomings, in this paper we propose a new end-to-end approach, called ECPE-Two-Dimensional (ECPE-2D), to represent the emotion-cause pairs by a 2D representation scheme. | related papers | related patents |
289 | Effective Inter-Clause Modeling for End-to-End Emotion-Cause Pair Extraction | Penghui Wei, Jiahao Zhao, Wenji Mao | In this paper, we tackle emotion-cause pair extraction from a ranking perspective, i.e., ranking clause pair candidates in a document, and propose a one-step neural approach which emphasizes inter-clause modeling to perform end-to-end extraction. | related papers | related patents |
290 | Embarrassingly Simple Unsupervised Aspect Extraction | Stéphan Tulkens, Andreas van Cranenburgh | We present a simple but effective method for aspect identification in sentiment analysis. | related papers | related patents |
291 | Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge | Bowen Zhang, Min Yang, Xutao Li, Yunming Ye, Xiaofei Xu, Kuai Dai | In this paper, we propose a Semantic-Emotion Knowledge Transferring (SEKT) model for cross-target stance detection, which uses external knowledge (semantic and emotion lexicons) as a bridge to enable knowledge transfer across different targets. | related papers | related patents |
292 | KinGDOM: Knowledge-Guided DOMain Adaptation for Sentiment Analysis | Deepanway Ghosal, Devamanyu Hazarika, Abhinaba Roy, Navonil Majumder, Rada Mihalcea, Soujanya Poria | In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. | related papers | related patents |
293 | Modelling Context and Syntactical Features for Aspect-based Sentiment Analysis | Minh Hieu Phan, Philip O. Ogunbona | This paper explores the grammatical aspect of the sentence and employs the self-attention mechanism for syntactical learning. | related papers | related patents |
294 | Parallel Data Augmentation for Formality Style Transfer | Yi Zhang, Tao Ge, Xu SUN | In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task to obtain useful sentence pairs with easily accessible models and systems. | related papers | related patents |
295 | Relational Graph Attention Network for Aspect-based Sentiment Analysis | Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, Rui Wang | In this paper, we address this problem by means of effective encoding of syntax information. | related papers | related patents |
296 | SpanMlt: A Span-based Multi-Task Learning Framework for Pair-wise Aspect and Opinion Terms Extraction | He Zhao, Longtao Huang, Rong Zhang, Quan Lu, Hui Xue | To this end, this paper proposes an end-to-end method to solve the task of Pair-wise Aspect and Opinion Terms Extraction (PAOTE). | related papers | related patents |
297 | Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks | Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang | In this work, we try to enhance neural ORL models with syntactic knowledge by comparing and integrating different representations. | related papers | related patents |
298 | Towards Better Non-Tree Argument Mining: Proposition-Level Biaffine Parsing with Task-Specific Parameterization | Gaku Morio, Hiroaki Ozaki, Terufumi Morishita, Yuta Koreeda, Kohsuke Yanai | In this paper, we focus on non-tree argument mining with a neural model. | related papers | related patents |
299 | A Span-based Linearization for Constituent Trees | Yang Wei, Yuanbin Wu, Man Lan | We propose a novel linearization of a constituent tree, together with a new locally normalized model. | related papers | related patents |
300 | An Empirical Comparison of Unsupervised Constituency Parsing Methods | Jun Li, Yifan Cao, Jiong Cai, Yong Jiang, Kewei Tu | In this paper, we first examine experimental settings used in previous work and propose to standardize the settings for better comparability between methods. | related papers | related patents |
301 | Efficient Constituency Parsing by Pointing | Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li | We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks. | related papers | related patents |
302 | Efficient Second-Order TreeCRF for Neural Dependency Parsing | Yu Zhang, Zhenghua Li, Min Zhang | To address this issue, we propose an effective way to batchify the inside and Viterbi algorithms for direct large matrix operation on GPUs, and to avoid the complex outside algorithm via efficient back-propagation. | related papers | related patents |
303 | Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs | Michael Lepori, Tal Linzen, R. Thomas McCoy | We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure that increase performance on the subject-verb agreement prediction task. | related papers | related patents |
304 | Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu | In this paper, we propose to reduce the gap between monolingual models and the unified multilingual model by distilling the structural knowledge of several monolingual models (teachers) to the unified multilingual model (student). | related papers | related patents |
305 | Dynamic Online Conversation Recommendation | Xingshan Zeng, Jing Li, Lu Wang, Zhiming Mao, Kam-Fai Wong | Concretely, we propose a neural architecture to exploit changes of user interactions and interests over time, to predict which discussions they are likely to enter. | related papers | related patents |
306 | Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer | Jianfei Yu, Jing Jiang, Li Yang, Rui Xia | In this paper, we study Multimodal Named Entity Recognition (MNER) for social media posts. | related papers | related patents |
307 | Stock Embeddings Acquired from News Articles and Price History, and an Application to Portfolio Optimization | Xin Du, Kumiko Tanaka-Ishii | In contrast, this paper presents a method to encode the influence of news articles through a vector representation of stocks called a stock embedding. | related papers | related patents |
308 | What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context | Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov | Here, we study the impact of both, namely (i) what was written (i.e., what was published by the target medium, and how it describes itself in Twitter) vs. (ii) who reads it (i.e., analyzing the target medium’s audience on social media). | related papers | related patents |
309 | An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models | Hiroshi Noji, Hiroya Takamura | We explore the utilities of explicit negative examples in training neural language models. | related papers | related patents |
310 | On the Robustness of Language Encoders against Grammatical Errors | Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang | We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors. | related papers | related patents |
311 | Roles and Utilization of Attention Heads in Transformer-based Neural Language Models | Jae-young Jo, Sung-Hyon Myaeng | Meaningful insights are shown through the lens of heat map visualization and utilized to propose a relatively simple sentence representation method that takes advantage of the most influential attention heads, resulting in additional performance improvements on the downstream tasks. | related papers | related patents |
312 | Understanding Attention for Text Classification | Xiaobing Sun, Wei Lu | In this work, we present a study on understanding the internal mechanism of attention by looking into the gradient update process, checking its behavior when approaching a local minimum during training. | related papers | related patents |
313 | A Relational Memory-based Embedding Model for Triple Classification and Search Personalization | Dai Quoc Nguyen, Tu Nguyen, Dinh Phung | To this end, we introduce a novel embedding model, named R-MeN, that explores a relational memory network to encode potential dependencies in relationship triples. | related papers | related patents |
314 | Do you have the right scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods | Ning Miao, Yuxuan Song, Hao Zhou, Lei Li | In this paper, we propose MC-Tailor, a novel method to alleviate the above issue in text generation tasks by truncating and transferring the probability mass from over-estimated regions to under-estimated ones. | related papers | related patents |
315 | Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention | Yanzeng Li, Bowen Yu, Xue Mengge, Tingwen Liu | Hence, we propose a novel word-aligned attention to exploit explicit word information, which is complementary to various character-based Chinese pre-trained language models. | related papers | related patents |
316 | On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond | Chen Wu, Prince Zizhuang Wang, William Yang Wang | To this end, we propose Coupled-VAE, which couples a VAE model with a deterministic autoencoder with the same structure and improves the encoder and decoder parameterizations via encoder weight sharing and decoder signal matching. | related papers | related patents |
317 | SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions | Mao Ye, Chengyue Gong, Qiang Liu | In this work, we propose a certified robust method based on a new randomized smoothing technique, which constructs a stochastic ensemble by applying random word substitutions on the input sentences, and leverage the statistical properties of the ensemble to provably certify the robustness. | related papers | related patents |
318 | A Graph-based Coarse-to-fine Method for Unsupervised Bilingual Lexicon Induction | Shuo Ren, Shujie Liu, Ming Zhou, Shuai Ma | To deal with those issues, in this paper, we propose a novel graph-based paradigm to induce bilingual lexicons in a coarse-to-fine way. | related papers | related patents |
319 | A Reinforced Generation of Adversarial Examples for Neural Machine Translation | Wei Zou, Shujian Huang, Jun Xie, Xinyu Dai, Jiajun CHEN | Instead of collecting and analyzing bad cases using limited handcrafted error features, here we investigate this issue by generating adversarial examples via a new paradigm based on reinforcement learning. | related papers | related patents |
320 | A Retrieve-and-Rewrite Initialization Method for Unsupervised Machine Translation | Shuo Ren, Yu Wu, Shujie Liu, Ming Zhou, Shuai Ma | In this paper, we propose a novel retrieval and rewriting based method to better initialize unsupervised translation models. | related papers | related patents |
321 | A Simple and Effective Unified Encoder for Document-Level Machine Translation | Shuming Ma, Dongdong Zhang, Ming Zhou | In this work, we propose a simple and effective unified encoder that can outperform the baseline models of dual-encoder models in terms of BLEU and METEOR scores. | related papers | related patents |
322 | Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation | Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, changliang li | In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). | related papers | related patents |
323 | Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change | Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu | To improve the efficiency of our approach for large models, we propose a sampling approach to select gradients of parameters sensitive to the batch size. | related papers | related patents |
324 | Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation | Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao | In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. | related papers | related patents |
325 | Lexically Constrained Neural Machine Translation with Levenshtein Transformer | Raymond Hendy Susanto, Shamil Chollampatt, Liling Tan | This paper proposes a simple and effective algorithm for incorporating lexical constraints in neural machine translation. | related papers | related patents |
326 | On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation | Chaojun Wang, Rico Sennrich | In this paper, we link exposure bias to another well-known problem in NMT, namely the tendency to generate hallucinations under domain shift. | related papers | related patents |
327 | Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model | Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura | We propose an automatic evaluation method of machine translation that uses source language sentences regarded as additional pseudo references. | related papers | related patents |
328 | ChartDialogs: Plotting from Natural Language Instructions | Yutong Shao, Ndapa Nakashole | This paper presents the problem of conversational plotting agents that carry out plotting actions from natural language instructions. | related papers | related patents |
329 | GLUECoS: An Evaluation Benchmark for Code-Switched NLP | Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury | We present an evaluation benchmark, GLUECoS, for code-switched languages, that spans several NLP tasks in English-Hindi and English-Spanish. | related papers | related patents |
330 | MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization | Canwen Xu, Jiaxin Pei, Hongtao Wu, Yiyu Liu, Chenliang Li | We propose MATINF, the first jointly labeled large-scale dataset for classification, question answering and summarization. | related papers | related patents |
331 | MIND: A Large-scale Dataset for News Recommendation | Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou | In this paper, we present a large-scale dataset named MIND for news recommendation. | related papers | related patents |
332 | That is a Known Lie: Detecting Previously Fact-Checked Claims | Shaden Shaar, Nikolay Babulkov, Giovanni Da San Martino, Preslav Nakov | Interestingly, despite the importance of the task, it has been largely ignored by the research community so far. Here, we aim to bridge this gap. | related papers | related patents |
333 | Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation | Bo Pang, Erik Nijkamp, Wenjuan Han, Linqi Zhou, Yixian Liu, Kewei Tu | In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. | related papers | related patents |
334 | BiRRE: Learning Bidirectional Residual Relation Embeddings for Supervised Hypernymy Detection | Chengyu Wang, XIAOFENG HE | In this work, we revisit supervised distributional models for hypernymy detection. | related papers | related patents |
335 | Biomedical Entity Representations with Synonym Marginalization | Mujeen Sung, Hwisang Jeon, Jinhyuk Lee, Jaewoo Kang | In this paper, we focus on learning representations of biomedical entities solely based on the synonyms of entities. | related papers | related patents |
336 | Hypernymy Detection for Low-Resource Languages via Meta Learning | Changlong Yu, Jialong Han, Haisong Zhang, Wilfred Ng | This paper addresses the problem of low-resource hypernymy detection by combining high-resource languages. | related papers | related patents |
337 | Investigating Word-Class Distributions in Word Vector Spaces | Ryohei Sasano, Anna Korhonen | This paper presents an investigation on the distribution of word vectors belonging to a certain word class in a pre-trained word vector space. | related papers | related patents |
338 | Aspect Sentiment Classification with Document-level Sentiment Preference Modeling | Xiao Chen, Changlong Sun, Jingjing Wang, Shoushan Li, Luo Si, Min Zhang, Guodong Zhou | In this paper, we explore two kinds of sentiment preference information inside a document, i.e., contextual sentiment consistency w.r.t. the same aspect (namely intra-aspect sentiment consistency) and contextual sentiment tendency w.r.t. all the related aspects (namely inter-aspect sentiment tendency). | related papers | related patents |
339 | Don’t Eclipse Your Arts Due to Small Discrepancies: Boundary Repositioning with a Pointer Network for Aspect Extraction | Zhenkai Wei, Yu Hong, Bowei Zou, Meng Cheng, Jianmin YAO | In this paper, we propose to utilize a pointer network for repositioning the boundaries. | related papers | related patents |
340 | Relation-Aware Collaborative Learning for Unified Aspect-Based Sentiment Analysis | Zhuang Chen, Tieyun Qian | In order to fully exploit these relations, we propose a Relation-Aware Collaborative Learning (RACL) framework which allows the subtasks to work coordinately via the multi-task learning and relation propagation mechanisms in a stacked multi-layer network. | related papers | related patents |
341 | SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics | Da Yin, Tao Meng, Kai-Wei Chang | We propose SentiBERT, a variant of BERT that effectively captures compositional sentiment semantics. | related papers | related patents |
342 | Transition-based Directed Graph Construction for Emotion-Cause Pair Extraction | Chuang Fan, Chaofa Yuan, Jiachen Du, Lin Gui, Min Yang, Ruifeng Xu | Towards this issue, we propose a transition-based model to transform the task into a procedure of parsing-like directed graph construction. | related papers | related patents |
343 | CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality | Wenmeng Yu, Hua Xu, Fanyang Meng, Yilin Zhu, Yixiao Ma, Jiele Wu, Jiyun Zou, Kaicheng Yang | In this paper, we introduce a Chinese single- and multi-modal sentiment analysis dataset, CH-SIMS, which contains 2,281 refined video segments in the wild with both multimodal and independent unimodal annotations. | related papers | related patents |
344 | Curriculum Pre-training for End-to-End Speech Translation | Chengyi Wang, Yu Wu, Shujie Liu, Ming Zhou, Zhenglu Yang | Inspired by this, we propose a curriculum pre-training method that includes an elementary course for transcription learning and two advanced courses for understanding the utterance and mapping words in two languages. | related papers | related patents |
345 | How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems | Archiki Prasad, Preethi Jyothi | In this work, we present a detailed analysis of how accent information is reflected in the internal representation of speech in an end-to-end automatic speech recognition (ASR) system. | related papers | related patents |
346 | Improving Disfluency Detection by Self-Training a Self-Attentive Model | Paria Jamshid Lou, Mark Johnson | However, we show that self-training (a semi-supervised technique for incorporating unlabeled data) sets a new state-of-the-art for the self-attentive parser on disfluency detection, demonstrating that self-training provides benefits orthogonal to the pre-trained contextualized word representations. | related papers | related patents |
347 | Learning Spoken Language Representations with Neural Lattice Language Modeling | Chao-Wei Huang, Yun-Nung Chen | We propose a framework that trains neural lattice language models to provide contextualized representations for spoken language understanding tasks. | related papers | related patents |
348 | Meta-Transfer Learning for Code-Switched Speech Recognition | Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, Pascale Fung | We therefore propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting by judiciously extracting information from high-resource monolingual datasets. | related papers | related patents |
349 | Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association | Nan Xu, Zhixiong Zeng, Wenji Mao | To reason with multimodal sarcastic tweets, in this paper, we propose a novel method for modeling cross-modality contrast in the associated context. | related papers | related patents |
350 | SimulSpeech: End-to-End Simultaneous Speech to Text Translation | Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao QIN, Zhou Zhao, Tie-Yan Liu | In this work, we develop SimulSpeech, an end-to-end simultaneous speech to text translation system which translates speech in source language to text in target language concurrently. | related papers | related patents |
351 | Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations | Karan Singla, Zhuohao Chen, David Atkins, Shrikanth Narayanan | We propose a novel framework for predicting utterance-level labels directly from speech features, thus removing the dependency on first generating transcripts and enabling transcription-free behavioral coding. | related papers | related patents |
352 | Neural Temporal Opinion Modelling for Opinion Prediction on Twitter | Lixing Zhu, Yulan He, Deyu Zhou | In this paper, we model users’ tweet posting behaviour as a temporal point process to jointly predict the posting time and the stance label of the next tweet given a user’s historical tweet sequence and tweets posted by their neighbours. | related papers | related patents |
353 | It Takes Two to Lie: One to Lie, and One to Listen | Denis Peskov, Benny Cheng, Ahmed Elgohary, Joe Barrow, Cristian Danescu-Niculescu-Mizil, Jordan Boyd-Graber | We study the language and dynamics of deception in the negotiation-based game Diplomacy, where seven players compete for world domination by forging and breaking alliances with each other. | related papers | related patents |
354 | Learning Implicit Text Generation via Feature Matching | Inkit Padhi, Pierre Dognin, Ke Bai, Cícero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das | In this paper, we present new GFMN formulations that are effective for sequential data. | related papers | related patents |
355 | Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data | Hamidreza Shahidi, Ming Li, Jimmy Lin | In this work, we show that this is also the case for text generation from structured and unstructured data. | related papers | related patents |
356 | Bayesian Hierarchical Words Representation Learning | Oren Barkan, Idan Rejwan, Avi Caciularu, Noam Koenigstein | This paper presents the Bayesian Hierarchical Words Representation (BHWR) learning algorithm. | related papers | related patents |
357 | Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | Alexandre Tamborrino, Nicola Pellicanò, Baptiste Pannier, Pascal Voitot, Louise Naudin | In this paper, we introduce a new scoring method that casts a plausibility ranking task in a full-text format and leverages the masked language modeling head tuned during the pre-training phase. | related papers | related patents |
358 | SEEK: Segmented Embedding of Knowledge Graphs | Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu | To mitigate this problem, we propose a lightweight modeling framework that can achieve highly competitive relational expressiveness without increasing the model complexity. | related papers | related patents |
359 | Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation | Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way | In this work we analyse the impact that data translated with rule-based, phrase-based statistical and neural MT systems has on new MT systems. | related papers | related patents |
360 | Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture | Christopher Brix, Parnia Bahar, Hermann Ney | On the transformer architecture and the WMT 2014 English-to-German and English-to-French tasks, we show that stabilized lottery ticket pruning performs similar to magnitude pruning for sparsity levels of up to 85%, and propose a new combination of pruning techniques that outperforms all other techniques for even higher levels of sparsity. | related papers | related patents |
361 | A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction | Yilin Niu, Fangkai Jiao, Mantong Zhou, Ting Yao, jingfang xu, Minlie Huang | To address this problem, we present a Self-Training method (STM), which supervises the evidence extractor with auto-generated evidence labels in an iterative process. | related papers | related patents |
362 | Graph-to-Tree Learning for Solving Math Word Problems | Jipeng Zhang, Lei Wang, Roy Ka-Wei Lee, Yi Bin, Yan Wang, Jie Shao, Ee-Peng Lim | In this paper, we propose Graph2Tree, a novel deep learning architecture that combines the merits of the graph-based encoder and tree-based decoder to generate better solution expressions. | related papers | related patents |
363 | An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results | Enrique Amigo, Julio Gonzalo, Stefano Mizzaro, Jorge Carrillo-de-Albornoz | In this paper, we propose a new metric for Ordinal Classification, the Closeness Evaluation Measure, which is rooted in Measurement Theory and Information Theory. | related papers | related patents |
364 | Adaptive Compression of Word Embeddings | Yeachan Kim, Kang-Min Kim, SangKeun Lee | In this paper, we propose a novel method to adaptively compress word embeddings. | related papers | related patents |
365 | Analysing Lexical Semantic Change with Contextualised Word Representations | Mario Giulianelli, Marco Del Tredici, Raquel Fernández | This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. | related papers | related patents |
366 | Autoencoding Keyword Correlation Graph for Document Clustering | Billy Chiu, Sunil Kumar Sahu, Derek Thomas, Neha Sengupta, Mohammady Mahdy | To address this, we present a novel graph-based representation for document clustering that builds a graph autoencoder (GAE) on a Keyword Correlation Graph. | related papers | related patents |
367 | Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics | Guy Emerson | In this paper, I introduce the Pixie Autoencoder, which augments the generative model of Functional Distributional Semantics with a graph-convolutional neural network to perform amortised variational inference. | related papers | related patents |
368 | BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance | Timo Schick, Hinrich Schütze | In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models. | related papers | related patents |
369 | CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages | Tommaso Pasini, Federico Scozzafava, Bianca Scarlini | To address this issue, in this paper we present CluBERT, an automatic and multilingual approach for inducing the distributions of word senses from a corpus of raw sentences. | related papers | related patents |
370 | Adversarial and Domain-Aware BERT for Cross-Domain Sentiment Analysis | Chunning Du, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao | In this paper, we investigate how to efficiently apply the pre-trained language model BERT to unsupervised domain adaptation. | related papers | related patents |
371 | From Arguments to Key Points: Towards Automatic Argument Summarization | Roy Bar-Haim, Lilach Eden, Roni Friedman, Yoav Kantor, Dan Lahav, Noam Slonim | We propose to represent such summaries as a small set of talking points, termed key points, each scored according to its salience. We study the task of argument-to-key point mapping, and introduce a novel large-scale dataset for this task. | related papers | related patents |
372 | GoEmotions: A Dataset of Fine-Grained Emotions | Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Nemade, Sujith Ravi | We introduce GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral. | related papers | related patents |
373 | He said “who’s gonna take care of your children when you are at ACL?”: Reported Sexist Acts are Not Sexist | Patricia Chiril, Véronique MORICEAU, Farah Benamara, Alda Mari, Gloria Origgi, Marlène Coulomb-Gully | We propose: (1) a new characterization of sexist content inspired by speech acts theory and discourse analysis studies, (2) the first French dataset annotated for sexism detection, and (3) a set of deep learning experiments trained on top of a combination of several vectorial representations of tweets (word embeddings, linguistic features, and various generalization strategies). | related papers | related patents |
374 | SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis | Hao Tian, Can Gao, Xinyan Xiao, Hao Liu, Bolei He, Hua Wu, Haifeng Wang, feng wu | In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. | related papers | related patents |
375 | Do Neural Language Models Show Preferences for Syntactic Formalisms? | Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre | In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep syntactic style of analysis, and whether the patterns are consistent across different languages. | related papers | related patents |
376 | Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing | Daniel Fernández-González, Carlos Gómez-Rodríguez | In this paper, we show that these results can be improved by using an in-order linearization instead. | related papers | related patents |
377 | Exact yet Efficient Graph Parsing, Bi-directional Locality and the Constructivist Hypothesis | Yajie Ye, Weiwei Sun | We demonstrate, for the first time, that exact graph parsing can be efficient for large graphs and with large Hyperedge Replacement Grammars (HRGs). | related papers | related patents |
378 | Max-Margin Incremental CCG Parsing | Miloš Stanojević, Mark Steedman | Instead, we tackle all of these three biases at the same time using an improved version of beam search optimisation that minimises all beam search violations instead of minimising only the biggest violation. | related papers | related patents |
379 | Neural Reranking for Dependency Parsing: An Evaluation | Bich-Ngoc Do, Ines Rehbein | In the paper, we re-assess the potential of successful neural reranking models from the literature on English and on two morphologically rich(er) languages, German and Czech. | related papers | related patents |
380 | Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting | Guanhua Zhang, Bing Bai, Junqi Zhang, Kun Bai, Conghui Zhu, Tiejun Zhao | In this paper, we formalize the unintended biases in text classification datasets as a kind of selection bias from the non-discrimination distribution to the discrimination distribution. | related papers | related patents |
381 | Analyzing analytical methods: The case of phonology in neural models of spoken language | Grzegorz Chrupała, Bertrand Higy, Afra Alishahi | As a step in this direction we study the case of representations of phonology in neural network models of spoken language. | related papers | related patents |
382 | Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations | Oana-Maria Camburu, Brendan Shillingford, Pasquale Minervini, Thomas Lukasiewicz, Phil Blunsom | In this work, we show that such models are nonetheless prone to generating mutually inconsistent explanations, such as “Because there is a dog in the image.” | related papers | related patents |
383 | Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT | Zhiyong Wu, Yun Chen, Ben Kao, Qun Liu | Complementary to those works, we propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT). | related papers | related patents |
384 | Probing for Referential Information in Language Models | Ionut-Teodor Sorodoc, Kristina Gulordava, Gemma Boleda | We analyze two state of the art models with LSTM and Transformer architectures, via probe tasks and analysis on a coreference annotated corpus. | related papers | related patents |
385 | Quantifying Attention Flow in Transformers | Samira Abnar, Willem Zuidema | In this paper, we consider the problem of quantifying this flow of information through self-attention. | related papers | related patents |
386 | Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness? | Alon Jacovi, Yoav Goldberg | We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus on the faithfulness criteria. | related papers | related patents |
387 | Towards Transparent and Explainable Attention Models | Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran | To make attention mechanisms more faithful and plausible, we propose a modified LSTM cell with a diversity-driven training objective that ensures that the hidden representations learned at different time steps are diverse. | related papers | related patents |
388 | Tchebycheff Procedure for Multi-task Text Classification | Yuren Mao, Shuang Yun, Weiwei Liu, Bo Du | To address this issue, this paper presents a novel Tchebycheff procedure to optimize multi-task classification problems without a convexity assumption. | related papers | related patents |
389 | Modeling Word Formation in English–German Neural Machine Translation | Marion Weller-Di Marco, Alexander Fraser | This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology. | related papers | related patents |
390 | Empowering Active Learning to Jointly Optimize System and User Demands | Ji-Ung Lee, Christian M. Meyer, Iryna Gurevych | In this paper, we propose a new active learning approach that jointly optimizes the seemingly counteracting objectives of the active learning system (training efficiently) and the user (receiving useful instances). | related papers | related patents |
391 | Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction | Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui | This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). | related papers | related patents |
392 | Graph Neural News Recommendation with Unsupervised Preference Disentanglement | Linmei Hu, Siyong Xu, Chen Li, Cheng Yang, Chuan Shi, Nan Duan, Xing Xie, Ming Zhou | In this paper, we model the user-news interactions as a bipartite graph and propose a novel Graph Neural News Recommendation model with Unsupervised Preference Disentanglement, named GNUD. | related papers | related patents |
393 | Identifying Principals and Accessories in a Complex Case based on the Comprehension of Fact Description | Yakun Hu, Zhunchen Luo, Wenhan Chao | In this paper, we study the problem of identifying the principals and accessories from the fact description with multiple defendants in a criminal case. | related papers | related patents |
394 | Joint Modelling of Emotion and Abusive Language Detection | Santhosh Rajamanickam, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova | In this paper, we present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework that allows one task to inform the other. | related papers | related patents |
395 | Programming in Natural Language with fuSE: Synthesizing Methods from Spoken Utterances Using Deep Natural Language Understanding | Sebastian Weigelt, Vanessa Steurer, Tobias Hey, Walter F. Tichy | We examine how to teach intelligent systems new functions, expressed in natural language. | related papers | related patents |
396 | Toxicity Detection: Does Context Really Matter? | John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon, Nithum Thain, Ion Androutsopoulos | We investigate this assumption by focusing on two questions: (a) does context affect the human judgement, and (b) does conditioning on context improve performance of toxicity detection systems? | related papers | related patents |
397 | AMR Parsing with Latent Structural Information | Qiji Zhou, Yue Zhang, Donghong Ji, Hao Tang | We investigate parsing AMR with explicit dependency structures and interpretable latent structures. | related papers | related patents |
398 | TaPas: Weakly Supervised Table Parsing via Pre-training | Jonathan Herzig, Pawel Krzysztof Nowak, Thomas Müller, Francesco Piccinno, Julian Eisenschlos | In this paper, we present TaPas, an approach to question answering over tables without generating logical forms. | related papers | related patents |
399 | Target Inference in Argument Conclusion Generation | Milad Alshomary, Shahbaz Syed, Martin Potthast, Henning Wachsmuth | We develop two complementary target inference approaches: one ranks premise targets and selects the top-ranked target as the conclusion target, the other finds a new conclusion target in a learned embedding space using a triplet neural network. | related papers | related patents |
400 | Multimodal Transformer for Multimodal Machine Translation | Shaowei Yao, Xiaojun Wan | In this paper, we introduce the multimodal self-attention in Transformer to solve the issues above in MMT. | related papers | related patents |
401 | Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis | Dushyant Singh Chauhan, Dhanush S R, Asif Ekbal, Pushpak Bhattacharyya | In this paper, we hypothesize that sarcasm is closely related to sentiment and emotion, and thereby propose a multi-task deep learning framework to solve all these three problems simultaneously in a multi-modal conversational scenario. | related papers | related patents |
402 | Towards Emotion-aided Multi-modal Dialogue Act Classification | Tulika Saha, Aditya Patra, Sriparna Saha, Pushpak Bhattacharyya | In this work, we address the role of both multi-modality and emotion recognition (ER) in DAC. | related papers | related patents |
403 | Analyzing Political Parody in Social Media | Antonios Maronikolakis, Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras | In this paper, we present the first computational study of parody. | related papers | related patents |
404 | Masking Actor Information Leads to Fairer Political Claims Detection | Erenay Dayanik, Sebastian Padó | We propose two simple debiasing methods which mask proper names and pronouns during training of the model, thus removing personal information bias. | related papers | related patents |
405 | When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? | Kenneth Joseph, Jonathan Morgan | Here, we investigate the extent to which publicly-available word embeddings accurately reflect beliefs about certain kinds of people as measured via traditional survey methods. | related papers | related patents |
406 | “Who said it, and Why?” Provenance for Natural Language Claims | Yi Zhang, Zachary Ives, Dan Roth | This paper suggests that the key to a longer-term, holistic, and systematic approach to navigating this information pollution is capturing the provenance of claims. | related papers | related patents |
407 | Compositionality and Generalization In Emergent Languages | Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, Marco Baroni | In this paper, we study whether the language emerging in deep multi-agent simulations possesses a similar ability to refer to novel primitive combinations, and whether it accomplishes this feat by strategies akin to human-language compositionality. | related papers | related patents |
408 | ERASER: A Benchmark to Evaluate Rationalized NLP Models | Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace | We propose the Evaluating Rationales And Simple English Reasoning (ERASER) benchmark to advance research on interpretable models in NLP. | related papers | related patents |
409 | Learning to Faithfully Rationalize by Construction | Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace | We propose a simpler variant of this approach that provides faithful explanations by construction. In our scheme, named FRESH, arbitrary feature importance scores (e.g., gradients from a trained model) are used to induce binary labels over token inputs, which an extractor can be trained to predict. An independent classifier module is then trained exclusively on snippets provided by the extractor; these snippets thus constitute faithful explanations, even if the classifier is arbitrarily complex. | related papers | related patents |
410 | Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset | Xiang Yue, Bernal Jimenez Gutierrez, Huan Sun | In this paper, we provide an in-depth analysis of this dataset and the clinical reading comprehension (CliniRC) task. | related papers | related patents |
411 | DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering | Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian | We introduce DeFormer, a decomposed transformer, which substitutes the full self-attention with question-wide and passage-wide self-attentions in the lower layers. | related papers | related patents |
412 | Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings | Apoorv Saxena, Aditay Tripathi, Partha Talukdar | We fill this gap in this paper and propose EmbedKGQA. EmbedKGQA is particularly effective in performing multi-hop KGQA over sparse KGs. | related papers | related patents |
413 | Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering | Alexander Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang | We propose an unsupervised approach to training QA models with generated pseudo-training data. | related papers | related patents |
414 | Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering | Vikas Yadav, Steven Bethard, Mihai Surdeanu | We introduce a simple, fast, and unsupervised iterative evidence retrieval method, which relies on three ideas: (a) an unsupervised alignment approach to soft-align questions and answers with justification sentences using only GloVe embeddings, (b) an iterative process that reformulates queries focusing on terms that are not covered by existing justifications, which (c) stops when the terms in the given question and candidate answers are covered by the retrieved justifications. | related papers | related patents |
415 | A Corpus for Large-Scale Phonetic Typology | Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W Black, Jason Eisner | We present VoxClamantis v1.0, the first large-scale corpus for phonetic typology, with aligned segments and estimated phoneme-level labels in 690 readings spanning 635 languages, along with acoustic-phonetic measures of vowels and sibilants. | related papers | related patents |
416 | Dscorer: A Fast Evaluation Metric for Discourse Representation Structure Parsing | Jiangming Liu, Shay B. Cohen, Mirella Lapata | We introduce Dscorer, an efficient new metric which converts box-style DRSs to graphs and then measures the overlap of n-grams. | related papers | related patents |
417 | ParaCrawl: Web-Scale Acquisition of Parallel Corpora | Marta Bañón, Pinzhen Chen, Barry Haddow, Kenneth Heafield, Hieu Hoang, Miquel Esplà-Gomis, Mikel L. Forcada, Amir Kamran, Faheem Kirefu, Philipp Koehn, Sergio Ortiz Rojas, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Elsa Sarrías, Marek Strelec, Brian Thompson, William Waites, Dion Wiggins, Jaume Zaragoza | We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. | related papers | related patents |
418 | Toward Gender-Inclusive Coreference Resolution | Yang Trista Cao, Hal Daumé III | Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we build systems that lead to many potential harms. | related papers | related patents |
419 | Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? | Cansu Sen, Thomas Hartvigsen, Biao Yin, Xiangnan Kong, Elke Rundensteiner | In this work, we conduct the first quantitative assessment of human versus computational attention mechanisms for the text classification task. | related papers | related patents |
420 | Information-Theoretic Probing for Linguistic Structure | Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, Ryan Cotterell | We propose an information-theoretic operationalization of probing as estimating mutual information that contradicts this received wisdom: one should always select the highest performing probe one can, even if it is more complex, since it will result in a tighter estimate, and thus reveal more of the linguistic information inherent in the representation. | related papers | related patents |
421 | On the Cross-lingual Transferability of Monolingual Representations | Mikel Artetxe, Sebastian Ruder, Dani Yogatama | More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing parameters of all other layers. | related papers | related patents |
422 | Similarity Analysis of Contextual Word Representation Models | John Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass | This paper investigates contextual word representation models from the lens of similarity analysis. | related papers | related patents |
423 | SenseBERT: Driving Some Sense into BERT | Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham | This paper proposes a method to employ weak-supervision directly at the word sense level. | related papers | related patents |
424 | ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations | Fernando Alva-Manchego, Louis Martin, Antoine Bordes, Carolina Scarton, Benoît Sagot, Lucia Specia | To alleviate this limitation, this paper introduces ASSET, a new dataset for assessing sentence simplification in English. | related papers | related patents |
425 | Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts | Agostina Calabrese, Michele Bevilacqua, Roberto Navigli | We fill this gap by presenting BabelPic, a hand-labeled dataset built by cleaning the image-synset association found within the BabelNet Lexical Knowledge Base (LKB). | related papers | related patents |
426 | Modeling Label Semantics for Predicting Emotional Reactions | Radhika Gaonkar, Heeyoung Kwon, Mohaddeseh Bastan, Niranjan Balasubramanian, Nathanael Chambers | In this work, we explicitly model label classes via label embeddings, and add mechanisms that track label-label correlations both during training and inference. | related papers | related patents |
427 | CraftAssist Instruction Parsing: Semantic Parsing for a Voxel-World Assistant | Kavya Srinet, Yacine Jernite, Jonathan Gray, arthur szlam | We propose a semantic parsing dataset focused on instruction-driven communication with an agent in the game Minecraft. | related papers | related patents |
428 | Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training | Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston | We show that appropriate loss functions which regularize generated outputs to match human distributions are effective for the first three issues. For the last important general issue, we show applying unlikelihood to collected data of what a model should not do is effective for improving logical consistency, potentially paving the way to generative models with greater reasoning ability. | related papers | related patents |
429 | How does BERT’s attention change when you fine-tune? An analysis methodology and a case study in negation scope | Yiyun Zhao, Steven Bethard | We propose a procedure and analysis methods that take a hypothesis of how a transformer-based model might encode a linguistic phenomenon, and test the validity of that hypothesis based on a comparison between knowledge-related downstream tasks with downstream control tasks, and measurement of cross-dataset consistency. | related papers | related patents |
430 | Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models | Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta | We introduce influence paths, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. | related papers | related patents |
431 | Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings | Rishi Bommasani, Kelly Davis, Claire Cardie | Consequently, we introduce simple and fully general methods for converting from contextualized representations to static lookup-table embeddings which we apply to 5 popular pretrained models and 9 sets of pretrained weights. | related papers | related patents |
432 | Learning to Deceive with Attention-Based Explanations | Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton | We call the latter use of attention mechanisms into question by demonstrating a simple method for training models to produce deceptive attention masks. | related papers | related patents |
433 | On the Spontaneous Emergence of Discrete and Compositional Signals | Nur Geffen Lan, Emmanuel Chemla, Shane Steinert-Threlkeld | We propose a general framework to study language emergence through signaling games with neural agents. | related papers | related patents |
434 | Spying on Your Neighbors: Fine-grained Probing of Contextual Embeddings for Information about Surrounding Words | Josef Klafka, Allyson Ettinger | To address this question, we introduce a suite of probing tasks that enable fine-grained testing of contextual embeddings for encoding of information about surrounding words. | related papers | related patents |
435 | Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | Hyounghun Kim, Zineng Tang, Mohit Bansal | In this paper, we propose a video question answering model which effectively integrates multi-modal input sources and finds the temporally relevant information to answer questions. | related papers | related patents |
436 | Shaping Visual Representations with Language for Few-Shot Classification | Jesse Mu, Percy Liang, Noah Goodman | Instead, we propose language-shaped learning (LSL), an end-to-end model that regularizes visual representations to predict language. | related papers | related patents |
437 | Discrete Latent Variable Representations for Low-Resource Text Classification | Shuning Jin, Sam Wiseman, Karl Stratos, Karen Livescu | We consider several approaches to learning discrete latent variable models for text in the case where exact marginalization over these variables is intractable. | related papers | related patents |
438 | Learning Constraints for Structured Prediction Using Rectifier Networks | Xingyuan Pan, Maitrey Mehta, Vivek Srikumar | We frame the problem as that of training a two-layer rectifier network to identify valid structures or substructures, and show a construction for converting a trained network into a system of linear constraints over the inference variables. | related papers | related patents |
439 | Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models | Dan Iter, Kelvin Guu, Larry Lansing, Dan Jurafsky | We propose Conpono, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences. | related papers | related patents |
440 | A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks | Angela Lin, Sudha Rao, Asli Celikyilmaz, Elnaz Nouri, Chris Brockett, Debadeepta Dey, Bill Dolan | To address these challenges, we use an unsupervised alignment algorithm that learns pairwise alignments between instructions of different recipes for the same dish. | related papers | related patents |
441 | Adversarial NLI: A New Benchmark for Natural Language Understanding | Yixin Nie, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, Douwe Kiela | We introduce a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure. | related papers | related patents |
442 | Beyond Accuracy: Behavioral Testing of NLP Models with CheckList | Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh | Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. | related papers | related patents |
443 | Code and Named Entity Recognition in StackOverflow | Jeniya Tabassum, Mounica Maddela, Wei Xu, Alan Ritter | In this paper, we introduce a new named entity recognition (NER) corpus for the computer programming domain, consisting of 15,372 sentences annotated with 20 fine-grained entity types. | related papers | related patents |
444 | Dialogue-Based Relation Extraction | Dian Yu, Kai Sun, Claire Cardie, Dong Yu | We present the first human-annotated dialogue-based relation extraction (RE) dataset DialogRE, aiming to support the prediction of relation(s) between two arguments that appear in a dialogue. | related papers | related patents |
445 | Facet-Aware Evaluation for Extractive Summarization | Yuning Mao, Liyuan Liu, Qi Zhu, Xiang Ren, Jiawei Han | In this paper, we present a facet-aware evaluation setup for better assessment of the information coverage in extracted summaries. | related papers | related patents |
446 | More Diverse Dialogue Datasets via Diversity-Informed Data Collection | Katherine Stasaski, Grace Hui Yang, Marti A. Hearst | We introduce a new strategy to address this problem, called Diversity-Informed Data Collection. | related papers | related patents |
447 | S2ORC: The Semantic Scholar Open Research Corpus | Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, Daniel Weld | We introduce S2ORC, a large corpus of 81.1M English-language academic papers spanning many academic disciplines. | related papers | related patents |
448 | Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics | Nitika Mathur, Timothy Baldwin, Trevor Cohn | We show that current methods for judging metrics are highly sensitive to the translations used for assessment, particularly the presence of outliers, which often leads to falsely confident conclusions about a metric’s efficacy. | related papers | related patents |
449 | A Transformer-based Approach for Source Code Summarization | Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang | To learn code representation for summarization, we explore the Transformer model that uses a self-attention mechanism and has shown to be effective in capturing long-range dependencies. | related papers | related patents |
450 | Asking and Answering Questions to Evaluate the Factual Consistency of Summaries | Alex Wang, Kyunghyun Cho, Mike Lewis | We propose QAGS (pronounced “kags”), an automatic evaluation protocol that is designed to identify factual inconsistencies in a generated summary. | related papers | related patents |
451 | Discourse-Aware Neural Extractive Text Summarization | Jiacheng Xu, Zhe Gan, Yu Cheng, Jingjing Liu | To address these issues, we present a discourse-aware neural summarization model – DiscoBert. | related papers | related patents |
452 | Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction | Raphael Schumann, Lili Mou, Yao Lu, Olga Vechtomova, Katja Markert | A good summary is characterized by language fluency and high information overlap with the source sentence. We model these two aspects in an unsupervised objective function, consisting of language modeling and semantic similarity metrics. | related papers | related patents |
453 | Exploring Content Selection in Summarization of Novel Chapters | Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown | We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides. | related papers | related patents |
454 | FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization | Esin Durmus, He He, Mona Diab | We tackle the problem of evaluating faithfulness of a generated summary given its source document. | related papers | related patents |
455 | Fact-based Content Weighting for Evaluating Abstractive Summarisation | Xinnuo Xu, Ondřej Dušek, Jingyi Li, Verena Rieser, Ioannis Konstas | We introduce a new evaluation metric which is based on fact-level content weighting, i.e. relating the facts of the document to the facts of the summary. | related papers | related patents |
456 | Hooks in the Headline: Learning to Generate Headlines with Controlled Styles | Di Jin, Zhijing Jin, Joey Tianyi Zhou, Lisa Orii, Peter Szolovits | We propose a new task, Stylistic Headline Generation (SHG), to enrich the headlines with three style options (humor, romance and clickbait), thus attracting more readers. | related papers | related patents |
457 | Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward | Luyang Huang, Lingfei Wu, Lu Wang | In this paper, we present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD. | related papers | related patents |
458 | Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports | Yuhao Zhang, Derek Merck, Emily Tsai, Christopher D. Manning, Curtis Langlotz | In this work, we develop a general framework where we evaluate the factual correctness of a generated summary by fact-checking it automatically against its reference using an information extraction module. | related papers | related patents |
459 | Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset | Revanth Rameshkumar, Peter Bailey | This paper describes the Critical Role Dungeons and Dragons Dataset (CRD3) and related analyses. | related papers | related patents |
460 | The Summary Loop: Learning to Write Abstractive Summaries Without Examples | Philippe Laban, Andrew Hsi, John Canny, Marti A. Hearst | This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. | related papers | related patents |
461 | Unsupervised Opinion Summarization as Copycat-Review Generation | Arthur Bražinskas, Mirella Lapata, Ivan Titov | We define a generative model for a review collection which capitalizes on the intuition that when generating a new review given a set of other reviews of a product, we should be able to control the “amount of novelty” going into the new review or, equivalently, vary the extent to which it deviates from the input. | related papers | related patents |
462 | (Re)construing Meaning in NLP | Sean Trott, Tiago Timponi Torrent, Nancy Chang, Nathan Schneider | In this paper, we engage with an idea largely absent from discussions of meaning in natural language understanding, namely that the way something is expressed reflects different ways of conceptualizing or construing the information being conveyed. | related papers | related patents |
463 | Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data | Emily M. Bender, Alexander Koller | In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. | related papers | related patents |
464 | Examining Citations of Natural Language Processing Literature | Saif M. Mohammad | We extracted information from the ACL Anthology (AA) and Google Scholar (GS) to examine trends in citations of NLP papers. | related papers | related patents |
465 | How Can We Accelerate Progress Towards Human-like Linguistic Generalization? | Tal Linzen | This position paper describes and critiques the Pretraining-Agnostic Identically Distributed (PAID) evaluation paradigm, which has become a central tool for measuring progress in natural language understanding. | related papers | related patents |
466 | How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence | Haoxi Zhong, Chaojun Xiao, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, Maosong Sun | In this paper, we introduce the history, the current state, and the future directions of research in LegalAI. | related papers | related patents |
467 | Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? | Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman | To investigate this, we perform a large-scale study on the pretrained RoBERTa model with 110 intermediate-target task combinations. | related papers | related patents |
468 | Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview | Deven Santosh Shah, H. Andrew Schwartz, Dirk Hovy | In this paper, we propose a unifying predictive bias framework for NLP. | related papers | related patents |
469 | What Does BERT with Vision Look At? | Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang | In this work, we demonstrate that certain attention heads of a visually grounded language model actively ground elements of language to image regions. | related papers | related patents |
470 | Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards | Justine Zhang, Cristian Danescu-Niculescu-Mizil | In this work, we develop an unsupervised methodology to quantify how counselors manage this balance. | related papers | related patents |
471 | Detecting Perceived Emotions in Hurricane Disasters | Shrey Desai, Cornelia Caragea, Junyi Jessy Li | In this paper, we introduce HurricaneEmo, an emotion dataset of 15,000 English tweets spanning three hurricanes: Harvey, Irma, and Maria. | related papers | related patents |
472 | Hierarchical Modeling for User Personality Prediction: The Role of Message-Level Attention | Veronica Lynn, Niranjan Balasubramanian, H. Andrew Schwartz | In this paper, we present a novel model that uses message-level attention to learn the relative weight of users’ social media posts for assessing their five factor personality traits. | related papers | related patents |
473 | Measuring Forecasting Skill from Text | Shi Zong, Alan Ritter, Eduard Hovy | In this paper we explore connections between the language people use to describe their predictions and their forecasting skill. | related papers | related patents |
474 | Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates | Katherine Keith, David Jensen, Brendan O’Connor | Despite increased attention on adjusting for confounding using text, there are still many open problems, which we highlight in this paper. | related papers | related patents |
475 | Text-Based Ideal Points | Keyon Vafa, Suresh Naidu, David Blei | In this paper, we introduce the text-based ideal point model (TBIP), an unsupervised probabilistic topic model that analyzes texts to quantify the political positions of their authors. | related papers | related patents |
476 | Understanding the Language of Political Agreement and Disagreement in Legislative Texts | Maryam Davoodi, Eric Waltenburg, Dan Goldwasser | In this paper, we take the first step towards a better understanding of these processes and the underlying dynamics that shape them, using data-driven methods. | related papers | related patents |
477 | Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences | Yi Tay, Donovan Ong, Jie Fu, Alvin Chan, Nancy Chen, Anh Tuan Luu, Chris Pal | Concretely, we present a new task and corpus for learning alignments between machine and human preferences. | related papers | related patents |
478 | Discourse as a Function of Event: Profiling Discourse Structure in News Articles around the Main Event | Prafulla Kumar Choubey, Aaron Lee, Ruihong Huang, Lu Wang | To enable computational modeling of news structures, we apply an existing theory of functional discourse structure for news articles that revolves around the main event and create a human-annotated corpus of 802 documents spanning over four domains and three media sources. | related papers | related patents |
479 | Harnessing the linguistic signal to predict scalar inferences | Sebastian Schuster, Yuxing Chen, Judith Degen | In this work, we explore to what extent neural network sentence encoders can learn to predict the strength of scalar inferences. | related papers | related patents |
480 | Implicit Discourse Relation Classification: We Need to Talk about Evaluation | Najoung Kim, Song Feng, Chulaka Gunasekara, Luis Lastras | In this work, we highlight these inconsistencies and propose an improved evaluation protocol. | related papers | related patents |
481 | PeTra: A Sparsely Supervised Memory Model for People Tracking | Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu | We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots. | related papers | related patents |
482 | ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT | Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu | We propose to better explore their interaction by solving both tasks together, whereas previous work treats them separately. | related papers | related patents |
483 | Contextualizing Hate Speech Classifiers with Post-hoc Explanation | Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, Xiang Ren | We extract post-hoc explanations from fine-tuned BERT classifiers to detect bias towards identity terms. Then, we propose a novel regularization technique based on these explanations that encourages models to learn from the context of group identifiers in addition to the identifiers themselves. | related papers | related patents |
484 | Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation | Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong | We propose a simple but effective technique, Double Hard Debias, which purifies the word embeddings against such corpus regularities prior to inferring and removing the gender subspace. | related papers | related patents |
485 | Language (Technology) is Power: A Critical Survey of “Bias” in NLP | Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna Wallach | Based on these findings, we describe the beginnings of a path forward by proposing three recommendations that should guide work analyzing “bias” in NLP systems. | related papers | related patents |
486 | Social Bias Frames: Reasoning about Social and Power Implications of Language | Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi | We introduce Social Bias Frames, a new conceptual formalism that aims to model the pragmatic frames in which people project social biases and stereotypes onto others. | related papers | related patents |
487 | Social Biases in NLP Models as Barriers for Persons with Disabilities | Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, Stephen Denuyl | In this paper, we present evidence of such undesirable biases towards mentions of disability in two different English language models: toxicity prediction and sentiment analysis. | related papers | related patents |
488 | Towards Debiasing Sentence Representations | Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, Louis-Philippe Morency | In this paper, we investigate the presence of social biases in sentence-level representations and propose a new method, Sent-Debias, to reduce these biases. | related papers | related patents |
489 | A Re-evaluation of Knowledge Graph Completion Methods | Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, Yiming Yang | In this paper, we find that this can be attributed to the inappropriate evaluation protocol used by them and propose a simple evaluation protocol to address this problem. | related papers | related patents |
490 | Cross-Linguistic Syntactic Evaluation of Word Prediction Models | Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen | To investigate how these models’ ability to learn syntax varies by language, we introduce CLAMS (Cross-Linguistic Assessment of Models on Syntax), a syntactic evaluation suite for monolingual and multilingual models. | related papers | related patents |
491 | Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? | Peter Hase, Mohit Bansal | Through two kinds of simulation tests involving text and tabular data, we evaluate five explanation methods: (1) LIME, (2) Anchor, (3) Decision Boundary, (4) a Prototype model, and (5) a Composite approach that combines explanations from each method. | related papers | related patents |
492 | Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions | Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov | In this work, we investigate the use of influence functions for NLP, providing an alternative approach to interpreting neural text classifiers. | related papers | related patents |
493 | Finding Universal Grammatical Relations in Multilingual BERT | Ethan A. Chi, John Hewitt, Christopher D. Manning | Motivated by these results, we present an unsupervised analysis method that provides evidence that mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. | related papers | related patents |
494 | Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection | Hanjie Chen, Guangtao Zheng, Yangfeng Ji | In this work, we build hierarchical explanations by detecting feature interactions. | related papers | related patents |
495 | Obtaining Faithful Interpretations from Compositional Neural Networks | Sanjay Subramanian, Ben Bogin, Nitish Gupta, Tomer Wolfson, Sameer Singh, Jonathan Berant, Matt Gardner | In this work, we propose and conduct a systematic evaluation of the intermediate outputs of NMNs on NLVR2 and DROP, two datasets which require composing multiple reasoning steps. | related papers | related patents |
496 | Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport | Kyle Swanson, Lili Yu, Tao Lei | In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. | related papers | related patents |
497 | Benefits of Intermediate Annotations in Reading Comprehension | Dheeru Dua, Sameer Singh, Matt Gardner | In this work, we study the benefits of collecting intermediate reasoning supervision along with the answer during data collection. | related papers | related patents |
498 | Crossing Variational Autoencoders for Answer Retrieval | Wenhao Yu, Lingfei Wu, Qingkai Zeng, Shu Tao, Yu Deng, Meng Jiang | In this work, we propose to cross variational auto-encoders by generating questions with aligned answers and generating answers with aligned questions. | related papers | related patents |
499 | Logic-Guided Data Augmentation and Regularization for Consistent Question Answering | Akari Asai, Hannaneh Hajishirzi | This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions by integrating logic rules and neural models. | related papers | related patents |
500 | On the Importance of Diversity in Question Generation for QA | Md Arafat Sultan, Shubham Chandel, Ramón Fernandez Astudillo, Vittorio Castelli | In this paper, we ask: Is textual diversity in QG beneficial for downstream QA? | related papers | related patents |
501 | Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering | Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings. | related papers | related patents |
502 | SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations | Xiang Kong, Varun Gangal, Eduard Hovy | We introduce SCDE, a dataset to evaluate the performance of computational models through sentence prediction. | related papers | related patents |
503 | Selective Question Answering under Domain Shift | Amita Kamath, Robin Jia, Percy Liang | In this work, we propose the setting of selective question answering under domain shift, in which a QA model is tested on a mixture of in-domain and out-of-domain data, and must answer (i.e., not abstain on) as many questions as possible while maintaining high accuracy. | related papers | related patents |
504 | The Cascade Transformer: an Application for Efficient Answer Sentence Selection | Luca Soldaini, Alessandro Moschitti | In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers. | related papers | related patents |
505 | Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering | Changmao Li, Jinho D. Choi | We introduce a novel approach to transformers that learns hierarchical representations in multiparty dialogue. | related papers | related patents |
506 | Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses | Erfan Sadeqi Azer, Daniel Khashabi, Ashish Sabharwal, Dan Roth | We address this gap by contrasting various hypothesis assessment techniques, especially those not commonly used in the field (such as evaluations based on Bayesian inference). | related papers | related patents |
507 | STARC: Structured Annotations for Reading Comprehension | Yevgeni Berzak, Jonathan Malmaud, Roger Levy | We present STARC (Structured Annotations for Reading Comprehension), a new annotation framework for assessing reading comprehension with multiple choice questions. | related papers | related patents |
508 | WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge | Hongming Zhang, Xinran Zhao, Yangqiu Song | In this paper, we present the first comprehensive categorization of essential commonsense knowledge for answering the Winograd Schema Challenge (WSC). | related papers | related patents |
509 | Agreement Prediction of Arguments in Cyber Argumentation for Detecting Stance Polarity and Intensity | Joseph Sirrianni, Xiaoqing Liu, Douglas Adams | We introduce a new research problem, stance polarity and intensity prediction in response relationships between posts. | related papers | related patents |
510 | Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning | Hongliang Fei, Ping Li | We propose an unsupervised cross-lingual sentiment classification model named multi-view encoder-classifier (MVEC) that leverages an unsupervised machine translation (UMT) system and a language discriminator. | related papers | related patents |
511 | Efficient Pairwise Annotation of Argument Quality | Lukas Gienapp, Benno Stein, Matthias Hagen, Martin Potthast | We present an efficient annotation framework for argument quality, a feature that previous work has found difficult to measure reliably. | related papers | related patents |
512 | Entity-Aware Dependency-Based Deep Graph Attention Network for Comparative Preference Classification | Nianzu Ma, Sahisnu Mazumder, Hao Wang, Bing Liu | This paper proposes a novel Entity-aware Dependency-based Deep Graph Attention Network (ED-GAT) that employs a multi-hop graph attention over a dependency graph sentence representation to leverage both the semantic information from word embeddings and the syntactic information from the dependency graph to solve the problem. | related papers | related patents |
513 | OpinionDigest: A Simple Framework for Opinion Summarization | Yoshihiko Suhara, Xiaolan Wang, Stefanos Angelidis, Wang-Chiew Tan | We present OpinionDigest, an abstractive opinion summarization framework, which does not rely on gold-standard summaries for training. | related papers | related patents |
514 | A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks | Nastaran Babanejad, Ameeta Agrawal, Aijun An, Manos Papagelis | To address this limitation, we conduct a comprehensive analysis of the role of preprocessing techniques in affective analysis based on word vector models. | related papers | related patents |
515 | Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness | Sixing Wu, Ying Li, Dawei Zhang, Yang Zhou, Zhonghai Wu | To this end, this paper proposes a novel commonsense knowledge-aware dialogue generation model, ConKADI. We collect and build a large-scale Chinese dataset aligned with the commonsense knowledge for dialogue generation. | related papers | related patents |
516 | Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation | Haoyu Song, Yan Wang, Wei-Nan Zhang, Xiaojiang Liu, Ting Liu | In this work, we introduce a three-stage framework that employs a generate-delete-rewrite mechanism to delete inconsistent words from a generated response prototype and further rewrite it to a personality-consistent one. | related papers | related patents |
517 | Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks | Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang | In this paper, we propose an algorithm that can customize a unique dialogue model for each task in the few-shot setting. | related papers | related patents |
518 | Video-Grounded Dialogues with Pretrained Generation Language Models | Hung Le, Steven C.H. Hoi | In this paper, we leverage the power of pre-trained language models for improving video-grounded dialogue, which is very challenging and involves complex features of different dynamics: (1) Video features which can extend across both spatial and temporal dimensions; and (2) Dialogue features which involve semantic dependencies over multiple dialogue turns. | related papers | related patents |
519 | A Unified MRC Framework for Named Entity Recognition | Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li | In this paper, we propose a unified framework that is capable of handling both flat and nested NER tasks. | related papers | related patents |
520 | An Effective Transition-based Model for Discontinuous NER | Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris | We propose a simple, effective transition-based model with generic neural encoding for discontinuous NER. | related papers | related patents |
521 | IMoJIE: Iterative Memory-Based Joint Open Information Extraction | Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti | We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples. | related papers | related patents |
522 | Improving Event Detection via Open-domain Trigger Knowledge | Meihan Tong, Bin Xu, Shuai Wang, Yixin Cao, Lei Hou, Juanzi Li, Jun Xie | To address the issue, we propose a novel Enrichment Knowledge Distillation (EKD) model to leverage external open-domain trigger knowledge to reduce the in-built biases to frequent trigger words in annotations. | related papers | related patents |
523 | Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling | Canasai Kruengkrai, Thien Hai Nguyen, Sharifah Mahani Aljunied, Lidong Bing | We present a joint model that supports multi-class classification and introduce a simple variant of self-attention that allows the model to learn scaling factors. | related papers | related patents |
524 | Multi-Cell Compositional LSTM for NER Domain Adaptation | Chen Jia, Yue Zhang | We investigate a multi-cell compositional LSTM structure for multi-task learning, modeling each entity type using a separate cell state. | related papers | related patents |
525 | Pyramid: A Layered Model for Nested Named Entity Recognition | Jue Wang, Lidan Shou, Ke Chen, Gang Chen | This paper presents Pyramid, a novel layered model for Nested Named Entity Recognition (nested NER). | related papers | related patents |
526 | ReInceptionE: Relation-Aware Inception Network with Joint Local-Global Structural Information for Knowledge Graph Embedding | Zhiwen Xie, Guangyou Zhou, Jin Liu, Jimmy Xiangji Huang | In this paper, we take the benefits of ConvE and KBGAT together and propose a Relation-aware Inception network with joint local-global structural information for knowledge graph Embedding (ReInceptionE). | related papers | related patents |
527 | Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents | Daoyuan Chen, Yaliang Li, Kai Lei, Ying Shen | We propose a joint extraction approach to address this problem by re-labeling noisy instances with a group of cooperative multiagents. | related papers | related patents |
528 | Simplify the Usage of Lexicon in Chinese NER | Ruotian Ma, Minlong Peng, Qi Zhang, Zhongyu Wei, Xuanjing Huang | In this work, we propose a simple but effective method for incorporating the word lexicon into the character representations. | related papers | related patents |
529 | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein | In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). | related papers | related patents |
530 | Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns | KayYen Wong, Sameen Maruf, Gholamreza Haffari | In this work, we investigate the effect of future sentences as context by comparing the performance of a contextual NMT model trained with the future context to the one trained with the past context. | related papers | related patents |
531 | Improving Neural Machine Translation with Soft Template Prediction | Jian Yang, Shuming Ma, Dongdong Zhang, Zhoujun Li, Ming Zhou | Inspired by the success of template-based and syntax-based approaches in other fields, we propose to use extracted templates from tree structures as soft target templates to guide the translation procedure. | related papers | related patents |
532 | Tagged Back-translation Revisited: Why Does It Really Work? | Benjamin Marie, Raphael Rubino, Atsushi Fujita | In this paper, we show that neural machine translation (NMT) systems trained on large back-translated data overfit some of the characteristics of machine-translated texts. | related papers | related patents |
533 | Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation | Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee | Since the semantic correctness of the recognition decoder’s output is more critical than its literal accuracy, we propose to improve the multitask ST model by utilizing word embeddings as an intermediate representation. | related papers | related patents |
534 | Neural-DINF: A Neural Network based Framework for Measuring Document Influence | Jie Tan, Changlin Yang, Ying Li, Siliang Tang, Chen Huang, Yueting Zhuang | In this paper, we use both frequency changes and word semantic shifts to measure document influence by developing a neural network framework. | related papers | related patents |
535 | Paraphrase Generation by Learning How to Edit from Samples | Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah | To address these problems, we propose a novel retrieval-based method for paraphrase generation. | related papers | related patents |
536 | Emerging Cross-lingual Structure in Pretrained Language Models | Alexis Conneau, Shijie Wu, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov | We study the problem of multilingual masked language modeling, i.e. the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer. | related papers | related patents |
537 | FastBERT: a Self-distilling BERT with Adaptive Inference Time | Weijie Liu, Peng Zhou, Zhiruo Wang, Zhe Zhao, Haotang Deng, Qi Ju | To improve their efficiency without sacrificing model performance, we propose a novel speed-tunable FastBERT with adaptive inference time. | related papers | related patents |
538 | Incorporating External Knowledge through Pre-training for Natural Language to Code Generation | Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig | Motivated by the intuition that developers usually retrieve resources on the web when writing code, we explore the effectiveness of incorporating two varieties of external knowledge into NL-to-code generation: automatically mined NL-code pairs from the online programming QA forum StackOverflow and programming language API documentation. | related papers | related patents |
539 | LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network | Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin | In this work, we propose LogicalFactChecker, a neural network approach capable of leveraging logical operations for fact checking. | related papers | related patents |
540 | Word-level Textual Adversarial Attacking as Combinatorial Optimization | Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun | In this paper, we propose a novel attack model, which incorporates the sememe-based word substitution method and particle swarm optimization-based search algorithm to solve the two problems separately. | related papers | related patents |
541 | Benchmarking Multimodal Regex Synthesis with Complex Structures | Xi Ye, Qiaochu Chen, Isil Dillig, Greg Durrett | We introduce StructuredRegex, a new regex synthesis dataset differing from prior ones in three aspects. | related papers | related patents |
542 | Curriculum Learning for Natural Language Understanding | Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang | However, examples in NLU tasks can vary greatly in difficulty, and similar to human learning procedure, language models can benefit from an easy-to-difficult curriculum. Based on this idea, we propose our Curriculum Learning approach. | related papers | related patents |
543 | Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? | Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui | In this paper, we introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language, namely, the regularity for performing arbitrary inferences with generalization on composition. | related papers | related patents |
544 | Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder | Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou | To address this, we propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts. | related papers | related patents |
545 | How to Ask Good Questions? Try to Leverage Paraphrases | Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu | Specifically, we present a two-hand hybrid model leveraging a self-built paraphrase resource, which is automatically constructed via a simple back-translation method. | related papers | related patents |
546 | NeuInfer: Knowledge Inference on N-ary Facts | Saiping Guan, Xiaolong Jin, Jiafeng Guo, Yuanzhuo Wang, Xueqi Cheng | We represent each n-ary fact as a primary triple coupled with a set of its auxiliary descriptive attribute-value pair(s). | related papers | related patents |
547 | Neural Graph Matching Networks for Chinese Short Text Matching | Lu Chen, Yanbin Zhao, Boer Lyu, Lesheng Jin, Zhi Chen, Su Zhu, Kai Yu | To address this problem, we propose neural graph matching networks, a novel sentence matching framework capable of dealing with multi-granular input information. | related papers | related patents |
548 | Neural Mixed Counting Models for Dispersed Topic Discovery | Jiemin Wu, Yanghui Rao, Zusheng Zhang, Haoran Xie, Qing Li, Fu Lee Wang, Ziye Chen | In this paper, we propose two efficient neural mixed counting models, i.e., the Negative Binomial-Neural Topic Model (NB-NTM) and the Gamma Negative Binomial-Neural Topic Model (GNB-NTM) for dispersed topic discovery. | related papers | related patents |
549 | Reasoning Over Semantic-Level Graph for Fact Checking | Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin | In this work, we present a method suitable for reasoning about the semantic-level structure of evidence. | related papers | related patents |
550 | Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study | Xinyu Xing, Xiaosheng Fan, Xiaojun Wan | In this paper, we study the challenging problem of automatic generation of citation texts in scholarly papers. | related papers | related patents |
551 | Composing Elementary Discourse Units in Abstractive Summarization | Zhenwen Li, Wenhao Wu, Sujian Li | In this paper, we argue that elementary discourse unit (EDU) is a more appropriate textual unit of content selection than the sentence unit in abstractive summarization. | related papers | related patents |
552 | Extractive Summarization as Text Matching | Ming Zhong, Pengfei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang | This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. | related papers | related patents |
553 | Heterogeneous Graph Neural Networks for Extractive Document Summarization | Danqing Wang, Pengfei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang | In this paper, we present a heterogeneous graph-based neural network for extractive summarization (HeterSumGraph), which contains semantic nodes of different granularity levels apart from sentences. | related papers | related patents |
554 | Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization | Yue Cao, Hui Liu, Xiaojun Wan | In this paper, we propose to ease the cross-lingual summarization training by jointly learning to align and summarize. | related papers | related patents |
555 | Leveraging Graph to Improve Abstractive Multi-Document Summarization | Wei Li, Xinyan Xiao, Jiachen Liu, Hua Wu, Haifeng Wang, Junping Du | In this paper, we develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents such as similarity graph and discourse graph, to more effectively process multiple input documents and produce abstractive summaries. | related papers | related patents |
556 | Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization | Hanqi Jin, Tianming Wang, Xiaojun Wan | In this paper, we propose a multi-granularity interaction network for extractive and abstractive multi-document summarization, which jointly learn semantic representations for words, sentences, and documents. | related papers | related patents |
557 | Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference | Nikita Kitaev, Dan Klein | We present a constituency parsing algorithm that, like a supertagger, works by assigning labels to each word in a sentence. | related papers | related patents |
558 | Are we Estimating or Guesstimating Translation Quality? | Shuo Sun, Francisco Guzmán, Lucia Specia | Our findings suggest that although QE models might capture fluency of translated sentences and complexity of source sentences, they cannot model adequacy of translations effectively. | related papers | related patents |
559 | Language (Re)modelling: Towards Embodied Language Understanding | Ronen Tamari, Chen Shani, Tom Hope, Miriam R L Petruck, Omri Abend, Dafna Shahaf | This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). | related papers | related patents |
560 | The State and Fate of Linguistic Diversity and Inclusion in the NLP World | Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, Monojit Choudhury | In this paper we look at the relation between the types of languages, resources, and their representation in NLP conferences to understand the trajectory that different languages have followed over time. | related papers | related patents |
561 | The Unstoppable Rise of Computational Linguistics in Deep Learning | James Henderson | In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures. | related papers | related patents |
562 | To Boldly Query What No One Has Annotated Before? The Frontiers of Corpus Querying | Markus Gärtner, Kerstin Jung | This paper offers a broad overview of the history of corpora and corpus query tools. | related papers | related patents |
563 | A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking | Yong Shan, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Cheng Niu, Jie Zhou | In this paper, we propose to enhance the DST through employing a contextual hierarchical attention network to not only discern relevant information at both word level and turn level but also learn contextual representations. | related papers | related patents |
564 | Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight | Hengyi Cai, Hongshen Chen, Yonghao Song, Cheng Zhang, Xiaofang Zhao, Dawei Yin | In this paper, we propose a data manipulation framework to proactively reshape the data distribution towards reliable samples by augmenting and highlighting effective learning samples as well as reducing the effect of inefficient samples simultaneously. | related papers | related patents |
565 | Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog | Libo Qin, Xiao Xu, Wanxiang Che, Yue Zhang, Ting Liu | To this end, we investigate methods that can make explicit use of domain knowledge and introduce a shared-private network to learn shared and specific knowledge. | related papers | related patents |
566 | Learning Efficient Dialogue Policy from Demonstrations through Shaping | Huimin Wang, Baolin Peng, Kam-Fai Wong | In this paper, we present S^2Agent, which efficiently learns dialogue policy from demonstrations through policy shaping and reward shaping. | related papers | related patents |
567 | SAS: Dialogue State Tracking via Slot Attention and Slot Information Sharing | Jiaying Hu, Yan Yang, Chencai Chen, Liang He, Zhou Yu | We propose a Dialogue State Tracker with Slot Attention and Slot Information Sharing (SAS) to reduce redundant information’s interference and improve long dialogue context tracking. | related papers | related patents |
568 | Speaker Sensitive Response Evaluation Model | JinYeong Bak, Alice Oh | In this paper, we propose an automatic evaluation model based on that idea and learn the model parameters from an unlabeled conversation corpus. | related papers | related patents |
569 | A Top-down Neural Architecture towards Text-level Parsing of Discourse Rhetorical Structure | Longyin Zhang, Yuqing Xing, Fang Kong, Peifeng Li, Guodong Zhou | In this paper, we justify from both computational and perceptive points-of-view that the top-down architecture is more suitable for text-level DRS parsing. | related papers | related patents |
570 | Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification | Pratik Dutta, Sriparna Saha | In this paper, we argue that incorporating multimodal cues can improve the automatic identification of PPI. | related papers | related patents |
571 | Bipartite Flat-Graph Network for Nested Named Entity Recognition | Ying Luo, Hai Zhao | In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for nested named entity recognition (NER), which contains two subgraph modules: a flat NER module for outermost entities and a graph module for all the entities located in inner layers. | related papers | related patents |
572 | Connecting Embeddings for Knowledge Graph Entity Typing | Yu Zhao, Anxiang Zhang, Ruobing Xie, Kang Liu, Xiaojie Wang | In this paper, we propose a novel approach for KG entity typing which is trained by jointly utilizing local typing knowledge from existing entity type assertions and global triple knowledge in KGs. | related papers | related patents |
573 | Continual Relation Learning via Episodic Memory Activation and Reconsolidation | Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou | Inspired by the mechanism in human long-term memory formation, we introduce episodic memory activation and reconsolidation (EMAR) to continual relation learning. | related papers | related patents |
574 | Handling Rare Entities for Neural Sequence Labeling | Yangming Li, Han Li, Kaisheng Yao, Xiaolong Li | Most test set entities appear only a few times or are even unseen in the training corpus, yielding a large number of out-of-vocabulary (OOV) and low-frequency (LF) entities during evaluation. In this work, we propose approaches to address this problem. | related papers | related patents |
575 | Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition | Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Ryuto Konno, Kentaro Inui | In this study, we develop models possessing interpretable inference process for structured prediction. | related papers | related patents |
576 | MIE: A Medical Information Extractor towards Medical Dialogues | Yuanzhe Zhang, Zhongtao Jiang, Tao Zhang, Shiwan Liu, Jiarun Cao, Kang Liu, Shengping Liu, Jun Zhao | We then propose a Medical Information Extractor (MIE) towards medical dialogues. MIE is able to extract mentioned symptoms, surgeries, tests, other information and their corresponding status. | related papers | related patents |
577 | Named Entity Recognition as Dependency Parsing | Juntao Yu, Bernd Bohnet, Massimo Poesio | In this paper, we use ideas from graph-based dependency parsing to provide our model a global view on the input via a biaffine model (Dozat and Manning, 2017). | related papers | related patents |
578 | Neighborhood Matching Network for Entity Alignment | Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao | This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge. | related papers | related patents |
579 | Relation Extraction with Explanation | Hamed Shahbazi, Xiaoli Fern, Reza Ghaeini, Prasad Tadepalli | In this work we annotate a test set with ground-truth sentence-level explanations to evaluate the quality of explanations afforded by the relation extraction models. | related papers | related patents |
580 | Representation Learning for Information Extraction from Form-like Documents | Bodhisattwa Prasad Majumder, Navneet Potti, Sandeep Tata, James Bradley Wendt, Qi Zhao, Marc Najork | We propose a novel approach using representation learning for tackling the problem of extracting structured information from form-like document images. | related papers | related patents |
581 | Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language | Qianhui Wu, Zijia Lin, Börje Karlsson, Jian-Guang Lou, Biqing Huang | In this paper, we propose a teacher-student learning method to address such limitations, where NER models in the source languages are used as teachers to train a student model on unlabeled data in the target language. | related papers | related patents |
582 | Synchronous Double-channel Recurrent Network for Aspect-Opinion Pair Extraction | Shaowei Chen, Jie Liu, Yu Wang, Wenzheng Zhang, Ziming Chi | In this paper, we explore the Aspect-Opinion Pair Extraction (AOPE) task, which aims at extracting aspects and opinion expressions in pairs. To verify the performance of SDRN, we manually build three datasets based on SemEval 2014 and 2015 benchmarks. | related papers | related patents |
583 | Cross-modal Coherence Modeling for Caption Generation | Malihe Alikhani, Piyush Sharma, Shengjie Li, Radu Soricut, Matthew Stone | We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. | related papers | related patents |
584 | Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms | Simeon Schüz, Sina Zarrieß | We go beyond previous studies on colour terms using isolated colour swatches and study visual grounding of colour terms in realistic objects. | related papers | related patents |
585 | Span-based Localizing Network for Natural Language Video Localization | Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou | In this work, we address the NLVL task with a span-based QA approach by treating the input video as a text passage. | related papers | related patents |
586 | Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions | Arjun Akula, Spandana Gella, Yaser Al-Onaizan, Song-Chun Zhu, Siva Reddy | Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. | related papers | related patents |
587 | A Mixture of h – 1 Heads is Better than h Heads | Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith | In this work, we instead “reallocate” them: the model learns to activate different heads on different inputs. | related papers | related patents |
588 | Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification | Hao Tang, Donghong Ji, Chenliang Li, Qiji Zhou | To this end, we propose a dependency graph enhanced dual-transformer network (named DGEDT) by jointly considering the flat representations learnt from Transformer and graph-based representations learnt from the corresponding dependency graph in an iterative interaction manner. | related papers | related patents |
589 | Differentiable Window for Dynamic Local Attention | Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li | We propose Differentiable Window, a new neural module and general-purpose component for dynamic window selection. | related papers | related patents |
590 | Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples | Xiaoqing Zheng, Jiehang Zeng, Yi Zhou, Cho-Jui Hsieh, Minhao Cheng, Xuanjing Huang | In this study, we show that adversarial examples also exist in dependency parsing: we propose two approaches to study where and how parsers make mistakes by searching over perturbations to existing texts at sentence and phrase levels, and design algorithms to construct such examples in both of the black-box and white-box settings. | related papers | related patents |
591 | Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach | Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O’Donnell, Yoshua Bengio, Yue Zhang | In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. | related papers | related patents |
592 | Learning Architectures from an Extended Search Space for Language Modeling | Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li | Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. | related papers | related patents |
593 | The Right Tool for the Job: Matching Model and Instance Complexities | Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith | To better respect a given inference budget, we propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) “exit” from neural network calculations for simple instances, and late (and accurate) exit for hard instances. | related papers | related patents |
594 | Bootstrapping Techniques for Polysynthetic Morphological Analysis | William Lane, Steven Bird | To address this challenge, we offer linguistically-informed approaches for bootstrapping a neural morphological analyzer, and demonstrate its application to Kunwinjku, a polysynthetic Australian language. | related papers | related patents |
595 | Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation | Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Haitao Zheng | In order to simultaneously alleviate the issues, this paper intuitively couples distant annotation and adversarial training for cross-domain CWS. | related papers | related patents |
596 | Modeling Morphological Typology for Unsupervised Learning of Language Morphology | Hongzhi Xu, Jordan Kodner, Mitchell Marcus, Charles Yang | This paper describes a language-independent model for fully unsupervised morphological analysis that exploits a universal framework leveraging morphological typology. | related papers | related patents |
597 | Predicting Declension Class from Form and Meaning | Adina Williams, Tiago Pimentel, Hagen Blix, Arya D. McCarthy, Eleanor Chodroff, Ryan Cotterell | More specifically, we operationalize this by measuring how much information, in bits, we can glean about declension class from knowing the form and/or meaning of nouns. | related papers | related patents |
598 | Unsupervised Morphological Paradigm Completion | Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya McCarthy, Katharina Kann | We propose the task of unsupervised morphological paradigm completion. | related papers | related patents |
599 | Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension | Bo Zheng, Haoyang Wen, Yaobo Liang, Nan Duan, Wanxiang Che, Daxin Jiang, Ming Zhou, Ting Liu | To address this issue, we present a novel multi-grained machine reading comprehension framework that models documents at their natural hierarchy of granularities: documents, paragraphs, sentences, and tokens. | related papers | related patents |
600 | Harvesting and Refining Question-Answer Pairs for Unsupervised QA | Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu | In this work, we introduce two approaches to improve unsupervised QA. | related papers | related patents |
601 | Low-Resource Generation of Multi-hop Reasoning Questions | Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin | Since the labeled data is limited and insufficient for training, we propose to learn the model with the help of a large amount of unlabeled data that is much easier to obtain. | related papers | related patents |
602 | R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason | Naoya Inoue, Pontus Stenetorp, Kentaro Inui | We present a reliable, crowdsourced framework for scalably annotating RC datasets with derivations. | related papers | related patents |
603 | Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension | Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu | In this paper, we study machine reading comprehension (MRC) on long texts: where a model takes as inputs a lengthy document and a query, extracts a text span from the document as an answer. | related papers | related patents |
604 | RikiNet: Reading Wikipedia Pages for Natural Question Answering | Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan | In this paper, we introduce a new model, called RikiNet, which reads Wikipedia pages for natural question answering. | related papers | related patents |
605 | Parsing into Variable-in-situ Logico-Semantic Graphs | Yufei Chen, Weiwei Sun | We propose variable-in-situ logico-semantic graphs to bridge the gap between semantic graph and logical form parsing. | related papers | related patents |
606 | Semantic Parsing for English as a Second Language | Yuanyuan Zhao, Weiwei Sun, Junjie Cao, Xiaojun Wan | Motivated by the theoretical emphasis on the learning challenges that occur at the syntax-semantics interface during second language acquisition, we formulate the task based on the divergence between literal and intended meanings. | related papers | related patents |
607 | Semi-Supervised Semantic Dependency Parsing Using CRF Autoencoders | Zixia Jia, Youmi Ma, Jiong Cai, Kewei Tu | We propose an approach to semi-supervised learning of semantic dependency parsers based on the CRF autoencoder framework. | related papers | related patents |
608 | Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing | Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen, Kai Yu | Aiming to reduce nontrivial human labor, we propose a two-stage semantic parsing framework, where the first stage utilizes an unsupervised paraphrase model to convert an unlabeled natural language utterance into the canonical utterance. | related papers | related patents |
609 | DRTS Parsing with Structure-Aware Encoding and Decoding | Qiankun Fu, Yue Zhang, Jiangming Liu, Meishan Zhang | In this work, we propose a structure-aware model at both the encoder and decoder phases to integrate structural information, where a graph attention network (GAT) is exploited for effective modeling. | related papers | related patents |
610 | A Two-Stage Masked LM Method for Term Set Expansion | Guy Kushilevitz, Shaul Markovitch, Yoav Goldberg | We harness the power of neural masked language models (MLM) and propose a novel TSE algorithm, which combines the pattern-based and distributional approaches. | related papers | related patents |
611 | FLAT: Chinese NER Using Flat-Lattice Transformer | Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang | In this paper, we propose FLAT: Flat-LAttice Transformer for Chinese NER, which converts the lattice structure into a flat structure consisting of spans. | related papers | related patents |
612 | Improving Entity Linking through Semantic Reinforced Entity Embeddings | Feng Hou, Ruili Wang, Jun He, Yi Zhou | We propose a simple yet effective method, FGS2EE, to inject fine-grained semantic information into entity embeddings to reduce the distinctiveness and facilitate the learning of contextual commonality. | related papers | related patents |
613 | Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain | Shadi Saleh, Pavel Pecina | We present a thorough comparison of two principal approaches to Cross-Lingual Information Retrieval: document translation (DT) and query translation (QT). | related papers | related patents |
614 | Learning Robust Models for e-Commerce Product Search | Thanh Nguyen, Nikhil Rao, Karthik Subbian | In this paper, we develop a deep, end-to-end model that learns to effectively classify mismatches and to generate hard mismatched examples to improve the classifier. | related papers | related patents |
615 | Generalized Entropy Regularization or: There’s Nothing Special about Label Smoothing | Clara Meister, Elizabeth Salesky, Ryan Cotterell | We introduce a parametric family of entropy regularizers, which includes label smoothing as a special case, and use it to gain a better understanding of the relationship between the entropy of a model and its performance on language generation tasks. | related papers | related patents |
616 | Highway Transformer: Self-Gating Enhanced Self-Attentive Networks | Yekun Chai, Shuo Jin, Xinwen Hou | Through a pseudo information highway, we introduce a gated component, self-dependency units (SDU), that incorporates LSTM-styled gating units to replenish internal semantic importance within the multi-dimensional latent space of individual representations. | related papers | related patents |
617 | Low-Dimensional Hyperbolic Knowledge Graph Embeddings | Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, Christopher Ré | In this work, we introduce a class of hyperbolic KG embedding models that simultaneously capture hierarchical and logical patterns. | related papers | related patents |
618 | Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction | Mladen Karan, Ivan Vulić, Anna Korhonen, Goran Glavaš | In this work, we present ClassyMap, a classification-based approach to self-learning, yielding a more robust and a more effective induction of projection-based CLWEs. | related papers | related patents |
619 | Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus | Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia A. Di Gangi, Roldano Cattoni, Marco Turchi | We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French). | related papers | related patents |
620 | Uncertainty-Aware Curriculum Learning for Neural Machine Translation | Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao | We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of current training stage. | related papers | related patents |
621 | Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain | Lukas Lange, Heike Adel, Jannik Strötgen | In this paper, we close this gap by reporting concept extraction performance on automatically anonymized data and investigating joint models for de-identification and concept extraction. | related papers | related patents |
622 | CorefQA: Coreference Resolution as Query-based Span Prediction | Wei Wu, Fei Wang, Arianna Yuan, Fei Wu, Jiwei Li | In this paper, we present CorefQA, an accurate and extensible approach for the coreference resolution task. | related papers | related patents |
623 | Estimating predictive uncertainty for rumour verification models | Elena Kochkina, Maria Liakata | We propose two methods for uncertainty-based instance rejection, supervised and unsupervised. | related papers | related patents |
624 | From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains | Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych | We therefore present a novel domain-agnostic Human-In-The-Loop annotation approach: we use recommenders that suggest potential concepts and adaptive candidate ranking, thereby speeding up the overall annotation process and making it less tedious for users. | related papers | related patents |
625 | Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions | Tian Jin, Zhun Liu, Shengjia Yan, Alexandre Eichenberger, Louis-Philippe Morency | In this paper, we propose N3 (Neural Networks from Natural Language), a new paradigm of synthesizing task-specific neural networks from language descriptions and a generic pre-trained model. | related papers | related patents |
626 | Controlled Crowdsourcing for High-Quality QA-SRL Annotation | Paul Roit, Ayal Klein, Daniela Stepanov, Jonathan Mamou, Julian Michael, Gabriel Stanovsky, Luke Zettlemoyer, Ido Dagan | In this paper, we present an improved crowdsourcing protocol for complex semantic annotation, involving worker selection and training, and a data consolidation phase. | related papers | related patents |
627 | Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus | Hao Fei, Meishan Zhang, Donghong Ji | In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. | related papers | related patents |
628 | Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity | Nina Poerner, Ulli Waltinger, Hinrich Schütze | We address the task of unsupervised Semantic Textual Similarity (STS) by ensembling diverse pre-trained sentence encoders into sentence meta-embeddings. | related papers | related patents |
629 | Transition-based Semantic Dependency Parsing with Pointer Networks | Daniel Fernández-González, Carlos Gómez-Rodríguez | In order to further test the capabilities of these powerful neural networks on a harder NLP problem, we propose a transition system that, thanks to Pointer Networks, can straightforwardly produce labelled directed acyclic graphs and perform semantic dependency parsing. | related papers | related patents |
630 | tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection | Nicole Peinelt, Dong Nguyen, Maria Liakata | We propose a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and show that our model improves performance over strong neural baselines across a variety of English language datasets. | related papers | related patents |
631 | Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation | Kun Li, Chengbo Chen, Xiaojun Quan, Qing Ling, Yan Song | In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels. | related papers | related patents |
632 | Exploiting Personal Characteristics of Debaters for Predicting Persuasiveness | Khalid Al Khatib, Michael Völske, Shahbaz Syed, Nikolay Kolyada, Benno Stein | In this paper, we model debaters’ prior beliefs, interests, and personality traits based on their previous activity, without dependence on explicit user profiles or questionnaires. | related papers | related patents |
633 | Out of the Echo Chamber: Detecting Countering Debate Speeches | Matan Orbach, Yonatan Bilu, Assaf Toledo, Dan Lahav, Michal Jacovi, Ranit Aharonov, Noam Slonim | Given such a speech, we aim to identify, from among a set of speeches on the same topic and with an opposing stance, the ones that directly counter it. | related papers | related patents |
634 | Diversifying Dialogue Generation with Non-Conversational Text | Hui Su, Xiaoyu Shen, Sanqiang Zhao, Zhou Xiao, Pengwei Hu, Randy Zhong, Cheng Niu, Jie Zhou | In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. | related papers | related patents |
635 | KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation | Hao Zhou, Chujie Zheng, Kaili Huang, Minlie Huang, Xiaoyan Zhu | In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. | related papers | related patents |
636 | Meta-Reinforced Multi-Domain State Generator for Dialogue Systems | Yi Huang, Junlan Feng, Min Hu, Xiaoting Wu, Xiaoyu Du, Shuo Ma | In this paper, we propose a Meta-Reinforced Multi-Domain State Generator (MERET). | related papers | related patents |
637 | Modeling Long Context for Task-Oriented Dialogue State Generation | Jun Quan, Deyi Xiong | Based on the recently proposed transferable dialogue state generator (TRADE) that predicts dialogue states from utterance-concatenated dialogue context, we propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model as an auxiliary task for task-oriented dialogue state generation. | related papers | related patents |
638 | Multi-Domain Dialogue Acts and Response Co-Generation | Kai Wang, Junfeng Tian, Rui Wang, Xiaojun Quan, Jianxing Yu | To address these issues, we propose a neural co-generation model that generates dialogue acts and responses concurrently. | related papers | related patents |
639 | Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer | Chulun Zhou, Liangyu Chen, Jiachen Liu, Xinyan Xiao, Jinsong Su, Sheng Guo, Hua Wu | In this paper, we propose a novel attentional sequence-to-sequence (Seq2seq) model that dynamically exploits the relevance of each output word to the target style for unsupervised style transfer. | related papers | related patents |
640 | Heterogeneous Graph Transformer for Graph-to-Sequence Learning | Shaowei Yao, Tianming Wang, Xiaojun Wan | In this paper, we propose the Heterogeneous Graph Transformer to independently model the different relations in the individual subgraphs of the original graph, including direct relations, indirect relations and multiple possible relations between nodes. | related papers | related patents |
641 | Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence | Xiaoyu Shen, Ernie Chang, Hui Su, Cheng Niu, Dietrich Klakow | To address this concern, we propose to explicitly segment target text into fragment units and align them with their data correspondences. | related papers | related patents |
642 | Aligned Dual Channel Graph Convolutional Network for Visual Question Answering | Qingbao Huang, Jielong Wei, Yi Cai, Changmeng Zheng, Junying Chen, Ho-fung Leung, Qing Li | To simultaneously capture the relations between objects in an image and the syntactic dependency relations between words in a question, we propose a novel dual channel graph convolutional network (DC-GCN) for better combining visual and textual advantages. | related papers | related patents |
643 | Multimodal Neural Graph Memory Networks for Visual Question Answering | Mahmoud Khademi | We introduce a new neural network architecture, Multimodal Neural Graph Memory Networks (MN-GMN), for visual question answering. | related papers | related patents |
644 | Refer360°: A Referring Expression Recognition Dataset in 360° Images | Volkan Cirik, Taylor Berg-Kirkpatrick, Louis-Philippe Morency | We propose a novel large-scale referring expression recognition dataset, Refer360°, consisting of 17,137 instruction sequences and ground-truth actions for completing these instructions in 360° scenes. | related papers | related patents |
645 | CamemBERT: a Tasty French Language Model | Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric de la Clergerie, Djamé Seddah, Benoît Sagot | In this paper, we investigate the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks. | related papers | related patents |
646 | Effective Estimation of Deep Generative Language Models | Tom Pelsmaeker, Wilker Aziz | We concentrate on one such model, the variational auto-encoder, which we argue is an important building block in hierarchical probabilistic models of language. | related papers | related patents |
647 | Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection | Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg | We present Iterative Null-space Projection (INLP), a novel method for removing information from neural representations. | related papers | related patents |
648 | 2kenize: Tying Subword Sequences for Chinese Script Conversion | Pranav A, Isabelle Augenstein | Here, we propose a model that can disambiguate between mappings and convert between the two scripts. | related papers | related patents |
649 | Predicting the Growth of Morphological Families from Social and Linguistic Factors | Valentin Hofmann, Janet Pierrehumbert, Hinrich Schütze | We present the first study that examines the evolution of morphological families, i.e., sets of morphologically related words such as “trump”, “antitrumpism”, and “detrumpify”, in social media. | related papers | related patents |
650 | Semi-supervised Contextual Historical Text Normalization | Peter Makarov, Simon Clematide | By utilizing a simple generative normalization model and obtaining powerful contextualization from the target-side language model, we train accurate models with unlabeled historical data. | related papers | related patents |
651 | ClarQ: A large-scale and diverse dataset for Clarification Question Generation | Vaibhav Kumar, Alan W Black | In order to overcome these limitations, we devise a novel bootstrapping framework (based on self-supervision) that assists in the creation of a diverse, large-scale dataset of clarification questions based on post-comment tuples extracted from stackexchange. | related papers | related patents |
652 | DoQA – Accessing Domain-Specific FAQs via Conversational QA | Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre | The goal of this work is to build conversational Question Answering (QA) interfaces for the large body of domain-specific information available in FAQ sites. | related papers | related patents |
653 | MLQA: Evaluating Cross-lingual Extractive Question Answering | Patrick Lewis, Barlas Oguz, Ruty Rinott, Sebastian Riedel, Holger Schwenk | We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area. | related papers | related patents |
654 | Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering | Ming Yan, Hao Zhang, Di Jin, Joey Tianyi Zhou | To address this challenge, we propose a multi-source meta transfer (MMT) for low-resource MCQA. | related papers | related patents |
655 | Fine-grained Fact Verification with Kernel Graph Attention Network | Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu | This paper presents Kernel Graph Attention Network (KGAT), which conducts more fine-grained fact verification with kernel-based attentions. | related papers | related patents |
656 | Generating Fact Checking Explanations | Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein | This paper provides the first study of how these explanations can be generated automatically based on available claim context, and how this task can be modelled jointly with veracity prediction. | related papers | related patents |
657 | Premise Selection in Natural Language Mathematical Texts | Deborah Ferreira, André Freitas | We propose an approach to solve this task as a link prediction problem, using Deep Convolutional Graph Neural Networks. | related papers | related patents |
658 | A Call for More Rigor in Unsupervised Cross-lingual Learning | Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre | We review motivations, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them. | related papers | related patents |
659 | A Tale of a Probe and a Parser | Rowan Hall Maudslay, Josef Valvoda, Tiago Pimentel, Adina Williams, Ryan Cotterell | To explore whether syntactic probes would do better to make use of existing techniques, we compare the structural probe to a more traditional parser with an identical lightweight parameterisation. | related papers | related patents |
660 | From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)? | Reut Tsarfaty, Dan Bareket, Stav Klein, Amit Seker | Here we reflect on parsing MRLs in that decade, highlight the solutions and lessons learned for the architectural, modeling and lexical challenges in the pre-neural era, and argue that similar challenges re-emerge in neural architectures for MRLs. | related papers | related patents |
661 | Speech Translation and the End-to-End Promise: Taking Stock of Where We Are | Matthias Sperber, Matthias Paulik | This paper provides a unifying categorization and nomenclature that covers both traditional and recent approaches and that may help researchers by highlighting both trade-offs and open research questions. | related papers | related patents |
662 | What Question Answering can Learn from Trivia Nerds | Jordan Boyd-Graber, Benjamin Börschinger | We argue that creating a QA dataset (and the ubiquitous leaderboard that goes with it) closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. | related papers | related patents |
663 | What are the Goals of Distributional Semantics? | Guy Emerson | In this paper, I take a broad linguistic perspective, looking at how well current models can deal with various semantic challenges. | related papers | related patents |
664 | Improving Image Captioning with Better Use of Caption | Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu | In this paper, we present a novel image captioning architecture to better explore semantics available in captions and leverage that to enhance both image representation and caption generation. | related papers | related patents |
665 | Shape of Synth to Come: Why We Should Use Synthetic Data for English Surface Realization | Henry Elder, Robert Burke, Alexander O’Connor, Jennifer Foster | We analyse the effects of synthetic data, and we argue that its use should be encouraged rather than prohibited so that future research efforts continue to explore systems that can take advantage of such data. | related papers | related patents |
666 | Toward Better Storylines with Sentence-Level Language Models | Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch | We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives. | related papers | related patents |
667 | A Two-Step Approach for Implicit Event Argument Detection | Zhisong Zhang, Xiang Kong, Zhengzhong Liu, Xuezhe Ma, Eduard Hovy | In this work, we explore the implicit event argument detection task, which studies event arguments beyond sentence boundaries. | related papers | related patents |
668 | Machine Reading of Historical Events | Or Honovich, Lucas Torroba Hennigen, Omri Abend, Shay B. Cohen | Within this broad framework, we address the task of machine reading the time of historical events, compile datasets for the task, and develop a model for tackling it. | related papers | related patents |
669 | Revisiting Unsupervised Relation Extraction | Thy Thy Tran, Phong Le, Sophia Ananiadou | However, we demonstrate that by using only named entities to induce relation types, we can outperform existing methods on two popular datasets. | related papers | related patents |
670 | SciREX: A Challenge Dataset for Document-Level Information Extraction | Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy | In this paper, we introduce SciREX, a document level IE dataset that encompasses multiple IE tasks, including salient entity identification and document level N-ary relation identification from scientific articles. | related papers | related patents |
671 | Contrastive Self-Supervised Learning for Commonsense Reasoning | Tassilo Klein, Moin Nabi | We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Schema Challenge problems. | related papers | related patents |
672 | Do Transformers Need Deep Long-Range Memory? | Jack Rae, Ali Razavi | We perform a set of interventions to show that comparable performance can be obtained with 6X fewer long range memories and better performance can be obtained by limiting the range of attention in lower layers of the network. | related papers | related patents |
673 | Improving Disentangled Text Representation Learning with Information-Theoretic Guidance | Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin | Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text, without any supervision on semantics. | related papers | related patents |
674 | Understanding Advertisements with BERT | Kanika Kalra, Bhargav Kurma, Silpa Vadakkeeveetil Sreelatha, Manasi Patwardhan, Shirish Karande | We consider a task based on the CVPR 2018 challenge dataset on advertisement (Ad) understanding. The task involves detecting the viewer's interpretation of an Ad image captured as text. | related papers | related patents |
675 | Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces | Goran Glavaš, Ivan Vulić | We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings. | related papers | related patents |
676 | Good-Enough Compositional Data Augmentation | Jacob Andreas | We propose a simple data augmentation protocol aimed at providing a compositional inductive bias in conditional and unconditional sequence models. | related papers | related patents |
677 | RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers | Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson | We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. | related papers | related patents |
678 | Temporal Common Sense Acquisition with Minimal Supervision | Ben Zhou, Qiang Ning, Daniel Khashabi, Dan Roth | This work proposes a novel sequence modeling approach that exploits explicit and implicit mentions of temporal common sense, extracted from a large corpus, to build TacoLM, a temporal common sense language model. | related papers | related patents |
679 | The Sensitivity of Language Models and Humans to Winograd Schema Perturbations | Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard | Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. | related papers | related patents |
680 | Temporally-Informed Analysis of Named Entity Recognition | Shruti Rijhwani, Daniel Preotiuc-Pietro | We analyze and propose methods that make better use of temporally-diverse training data, with a focus on the task of named entity recognition. To support these experiments, we introduce a novel data set of English tweets annotated with named entities. | related papers | related patents |
681 | Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation | Aakanksha Naik, Carolyn Rose | We tackle the task of building supervised event trigger identification models which can generalize better across domains. | related papers | related patents |
682 | CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning | Alessandro Suglia, Ioannis Konstas, Andrea Vanzo, Emanuele Bastianelli, Desmond Elliott, Stella Frank, Oliver Lemon | To remedy this, we present GroLLA, an evaluation framework for Grounded Language Learning with Attributes based on three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation. | related papers | related patents |
683 | Cross-Modality Relevance for Reasoning on Language and Vision | Chen Zheng, Quan Guo, Parisa Kordjamshidi | This work deals with the challenge of learning and reasoning over language and vision data for the related downstream tasks such as visual question answering (VQA) and natural language for visual reasoning (NLVR). | related papers | related patents |
684 | Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context | Shashank Srivastava, Oleksandr Polozov, Nebojsa Jojic, Christopher Meek | We explore learning web-based tasks from a human teacher through natural language explanations and a single demonstration. | related papers | related patents |
685 | Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning | Angeliki Lazaridou, Anna Potapenko, Olivier Tieleman | We present a method for combining multi-agent communication and traditional data-driven approaches to natural language learning, with an end goal of teaching agents to communicate with humans in natural language. | related papers | related patents |
686 | HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han | To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search. | related papers | related patents |
687 | Hard-Coded Gaussian Attention for Neural Machine Translation | Weiqiu You, Simeng Sun, Mohit Iyyer | We push further in this direction by developing a “hard-coded” attention variant without any learned parameters. | related papers | related patents |
688 | In Neural Machine Translation, What Does Transfer Learning Transfer? | Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich | We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to gain a black-box understanding of transfer learning. | related papers | related patents |
689 | Learning a Multi-Domain Curriculum for Neural Machine Translation | Wei Wang, Ye Tian, Jiquan Ngiam, Yinfei Yang, Isaac Caswell, Zarana Parekh | This is achieved by carefully introducing instance-level domain-relevance features and automatically constructing a training curriculum to gradually concentrate on multi-domain relevant and noise-reduced data batches. | related papers | related patents |
690 | Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem | Danielle Saunders, Bill Byrne | At inference time, we propose a lattice-rescoring scheme which outperforms all systems evaluated in Stanovsky et al., 2019 on WinoMT with no degradation of general test set BLEU. | related papers | related patents |
691 | Translationese as a Language in “Multilingual” NMT | Parker Riley, Isaac Caswell, Markus Freitag, David Grangier | Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target text? | related papers | related patents |
692 | Unsupervised Domain Clusters in Pretrained Language Models | Roee Aharoni, Yoav Goldberg | We harness this property and propose domain data selection methods based on such models, which require only a small set of in-domain monolingual data. | related papers | related patents |
693 | Using Context in Neural Machine Translation Training Objectives | Danielle Saunders, Felix Stahlberg, Bill Byrne | We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. | related papers | related patents |
694 | Variational Neural Machine Translation with Normalizing Flows | Hendra Setiawan, Matthias Sperber, Udhyakumar Nallasamy, Matthias Paulik | In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. | related papers | related patents |
695 | The Paradigm Discovery Problem | Alexander Erdmann, Micha Elsner, Shijie Wu, Ryan Cotterell, Nizar Habash | This work treats the paradigm discovery problem (PDP), the task of learning an inflectional morphological system from unannotated sentences. | related papers | related patents |
696 | Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi | Aryaman Arora, Luke Gessler, Nathan Schneider | We present the first statistical schwa deletion classifier for Hindi, which relies solely on the orthography as the input and outperforms previous approaches. | related papers | related patents |
697 | Automated Evaluation of Writing — 50 Years and Counting | Beata Beigman Klebanov, Nitin Madnani | In this theme paper, we focus on Automated Writing Evaluation (AWE), using Ellis Page’s seminal 1966 paper to frame the presentation. | related papers | related patents |
698 | Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly | Nora Kassner, Hinrich Schütze | Building on Petroni et al. 2019, we propose two new probing tasks analyzing factual knowledge stored in Pretrained Language Models (PLMs). | related papers | related patents |
699 | On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology | Marcel Bollmann, Desmond Elliott | In this paper, we address this question through bibliographic analysis. | related papers | related patents |
700 | Returning the N to NLP: Towards Contextually Personalized Classification Models | Lucie Flek | This paper surveys the landscape of personalization in natural language processing and related fields, and offers a path forward to mitigate the decades of deviation of NLP tools from sociolinguistic findings, allowing them to flexibly process the “natural” language of each user rather than enforcing a uniform NLP treatment. | related papers | related patents |
701 | To Test Machine Comprehension, Start by Defining Comprehension | Jesse Dunietz, Greg Burnham, Akash Bharadwaj, Owen Rambow, Jennifer Chu-Carroll, Dave Ferrucci | First, we argue that existing approaches do not adequately define comprehension; they are too unsystematic about what content is tested. Second, we present a detailed definition of comprehension, a “Template of Understanding”, for a widely useful class of texts, namely short narratives. | related papers | related patents |
702 | Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations | Saif M. Mohammad | In this work, we examine female first author percentages and the citations to their papers in Natural Language Processing. | related papers | related patents |
703 | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer | We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. | related papers | related patents |
704 | BLEURT: Learning Robust Metrics for Text Generation | Thibault Sellam, Dipanjan Das, Ankur Parikh | We propose BLEURT, a learned evaluation metric for English based on BERT. | related papers | related patents |
705 | Distilling Knowledge Learned in BERT for Text Generation | Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu | In this paper, we present a novel approach, Conditional Masked Language Modeling (C-MLM), to enable the finetuning of BERT on target generation tasks. | related papers | related patents |
706 | ESPRIT: Explaining Solutions to Physical Reasoning Tasks | Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev | We propose ESPRIT, a framework for commonsense reasoning about qualitative physics in natural language that generates interpretable descriptions of physical events. | related papers | related patents |
707 | Iterative Edit-Based Unsupervised Sentence Simplification | Dhruv Kumar, Lili Mou, Lukasz Golab, Olga Vechtomova | We present a novel iterative, edit-based approach to unsupervised sentence simplification. | related papers | related patents |
708 | Logical Natural Language Generation from Open-Domain Tables | Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang | In this paper, we suggest a new NLG task where a model is tasked with generating natural language statements that can be *logically entailed* by the facts in an open-domain semi-structured table. | related papers | related patents |
709 | Neural CRF Model for Sentence Alignment in Text Simplification | Chao Jiang, Mounica Maddela, Wuwei Lan, Yang Zhong, Wei Xu | We propose a novel neural CRF alignment model which not only leverages the sequential nature of sentences in parallel documents but also utilizes a neural sentence pair model to capture semantic similarity. | related papers | related patents |
710 | One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases | Xingdi Yuan, Tong Wang, Rui Meng, Khushboo Thaker, Peter Brusilovsky, Daqing He, Adam Trischler | In this study, we address this problem from both modeling and evaluation perspectives. | related papers | related patents |
711 | R^3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge | Tuhin Chakrabarty, Debanjan Ghosh, Smaranda Muresan, Nanyun Peng | We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence. | related papers | related patents |
712 | Structural Information Preserving for Graph-to-Text Generation | Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu | We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information. | related papers | related patents |
713 | A Joint Neural Model for Information Extraction with Global Features | Ying Lin, Heng Ji, Fei Huang, Lingfei Wu | In order to capture such cross-subtask and cross-instance inter-dependencies, we propose a joint neural framework, OneIE, that aims to extract the globally optimal IE result as a graph from an input sentence. | related papers | related patents |
714 | Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding | Xinya Du, Claire Cardie | To dynamically aggregate information captured by neural representations learned at different levels of granularity (e.g., the sentence- and paragraph-level), we propose a novel multi-granularity reader. | related papers | related patents |
715 | Exploiting the Syntax-Model Consistency for Neural Relation Extraction | Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen | In order to overcome these issues, we propose a novel deep learning model for RE that uses the dependency trees to extract the syntax-based importance scores for the words, serving as a tree representation to introduce syntactic information into the models with greater generalization. | related papers | related patents |
716 | From English to Code-Switching: Transfer Learning with Strong Morphological Clues | Gustavo Aguilar, Thamar Solorio | In this paper, we aim at adapting monolingual models to code-switched text in various tasks. | related papers | related patents |
717 | Learning Interpretable Relationships between Entities, Relations and Concepts via Bayesian Structure Learning on Open Domain Facts | Jingyuan Zhang, Mingming Sun, Yue Feng, Ping Li | In this paper, we propose the task of learning interpretable relationships from open-domain facts to enrich and refine concept graphs. | related papers | related patents |
718 | Multi-Sentence Argument Linking | Seth Ebner, Patrick Xia, Ryan Culkin, Kyle Rawlins, Benjamin Van Durme | We present a novel document-level model for finding argument spans that fill an event’s roles, connecting related ideas in sentence-level semantic role labeling and coreference resolution. | related papers | related patents |
719 | Rationalizing Medical Relation Prediction from Corpus-level Statistics | Zhen Wang, Jennifer Lee, Simon Lin, Huan Sun | Aiming to shed some light on how to rationalize medical relation prediction, we present a new interpretable framework inspired by existing theories on how human memory works, e.g., theories of recall and recognition. | related papers | related patents |
720 | Sources of Transfer in Multilingual Named Entity Recognition | David Mueller, Nicholas Andrews, Mark Dredze | To explain this phenomenon, we explore the sources of multilingual transfer in polyglot NER models and examine the weight structure of polyglot models compared to their monolingual counterparts. | related papers | related patents |
721 | ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages | Colin Lockard, Prashant Shiralkar, Xin Luna Dong, Hannaneh Hajishirzi | In this work, we propose a solution for “zero-shot” open-domain relation extraction from webpages with a previously unseen template, including from websites with little overlap with existing sources of knowledge for distant supervision and websites in entirely new subject verticals. | related papers | related patents |
722 | Soft Gazetteers for Low-Resource Named Entity Recognition | Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell | To address this problem, we propose a method of “soft gazetteers” that incorporates ubiquitously available information from English knowledge bases, such as Wikipedia, into neural named entity recognition models through cross-lingual entity linking. | related papers | related patents |
723 | A Prioritization Model for Suicidality Risk Assessment | Han-Chin Shing, Philip Resnik, Douglas Oard | Building on measures developed for resource-bounded document retrieval, we introduce a well founded evaluation paradigm, and demonstrate using an expert-annotated test collection that meaningful improvements over plausible cascade model baselines can be achieved using an approach that jointly ranks individuals and their social media posts. | related papers | related patents |
724 | CluHTM – Semantic Hierarchical Topic Modeling based on CluWords | Felipe Viegas, Washington Cunha, Christian Gomes, Antônio Pereira, Leonardo Rocha, Marcos Goncalves | In this paper, we advance the state-of-the-art on HTM by means of the design and evaluation of CluHTM, a novel non-probabilistic hierarchical matrix factorization aimed at solving the specific issues of HTM. | related papers | related patents |
725 | Empower Entity Set Expansion via Language Model Probing | Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han | In this study, we propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue. | related papers | related patents |
726 | Feature Projection for Improved Text Classification | Qi Qin, Wenpeng Hu, Bing Liu | In this paper, we propose a novel angle to further improve this representation learning, i.e., feature projection. | related papers | related patents |
727 | A negative case analysis of visual grounding methods for VQA | Robik Shrestha, Kushal Kafle, Christopher Kanan | However, we show that the performance improvements are not a result of improved visual grounding, but a regularization effect which prevents over-fitting to linguistic priors. | related papers | related patents |
728 | History for Visual Dialog: Do we really need it? | Shubham Agarwal, Trung Bui, Joon-Young Lee, Ioannis Konstas, Verena Rieser | In this paper, we show that co-attention models which explicitly encode dialogue history outperform models that don’t, achieving state-of-the-art performance (72% NDCG on the val set). | related papers | related patents |
729 | Mapping Natural Language Instructions to Mobile UI Action Sequences | Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge | We present a new problem: grounding natural language instructions to mobile user interface actions, and create three new datasets for it. | related papers | related patents |
730 | TVQA+: Spatio-Temporal Grounding for Video Question Answering | Jie Lei, Licheng Yu, Tamara Berg, Mohit Bansal | We present the task of Spatio-Temporal Video Question Answering, which requires intelligent systems to simultaneously retrieve relevant moments and detect referenced visual concepts (people and objects) to answer natural language questions about videos. | related papers | related patents |
731 | Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting | Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann | In this paper, we investigate how to utilize visual content for disambiguation and promoting latent space alignment in unsupervised MMT. | related papers | related patents |
732 | A Multitask Learning Approach for Diacritic Restoration | Sawsan Alqahtani, Ajay Mishra, Mona Diab | Thus, to compensate for this loss, we investigate the use of multi-task learning to jointly optimize diacritic restoration with related NLP problems namely word segmentation, part-of-speech tagging, and syntactic diacritization. | related papers | related patents |
733 | Frugal Paradigm Completion | Alexander Erdmann, Tom Kenter, Markus Becker, Christian Schallhart | We propose a frugal paradigm completion approach that predicts all related forms in a morphological paradigm from as few manually provided forms as possible. | related papers | related patents |
734 | Improving Chinese Word Segmentation with Wordhood Memory Networks | Yuanhe Tian, Yan Song, Fei Xia, Tong Zhang, Yonggang Wang | In this paper, we therefore propose a neural framework, WMSeg, which uses memory networks to incorporate wordhood information with several popular encoder-decoder combinations for CWS. | related papers | related patents |
735 | Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge | Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang | In this paper, we propose a neural model named TwASP for joint CWS and POS tagging following the character-based sequence labeling paradigm, where a two-way attention mechanism is used to incorporate both context feature and their corresponding syntactic knowledge for each input character. | related papers | related patents |
736 | Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging | Nasser Zalmout, Nizar Habash | Our approach models the different features jointly, whether lexicalized (on the character-level), or non-lexicalized (on the word-level). | related papers | related patents |
737 | Phonetic and Visual Priors for Decipherment of Informal Romanization | Maria Ryskina, Matthew R. Gormley, Taylor Berg-Kirkpatrick | We propose a noisy-channel WFST cascade model for deciphering the original non-Latin script from observed romanized text in an unsupervised fashion. | related papers | related patents |
738 | Active Learning for Coreference Resolution using Discrete Annotation | Belinda Z. Li, Gabriel Stanovsky, Luke Zettlemoyer | We improve upon pairwise annotation for active learning in coreference resolution, by asking annotators to identify mention antecedents if a presented mention pair is deemed not coreferent. | related papers | related patents |
739 | Beyond Possession Existence: Duration and Co-Possession | Dhivya Chinnappa, Srikala Murugan, Eduardo Blanco | This paper introduces two tasks: determining (a) the duration of possession relations and (b) co-possessions, i.e., whether multiple possessors possess a possessee at the same time. | related papers | related patents |
740 | Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks | Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith | We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. | related papers | related patents |
741 | Estimating Mutual Information Between Dense Word Embeddings | Vitalii Zhelezniak, Aleksandar Savkov, Nils Hammerla | In this work we go through a vast literature on estimating MI in such cases and single out the most promising methods, yielding a simple and elegant similarity measure for word embeddings. | related papers | related patents |
742 | Exploring Unexplored Generalization Challenges for Cross-Database Semantic Parsing | Alane Suhr, Ming-Wei Chang, Peter Shaw, Kenton Lee | We propose a challenging evaluation setup for cross-database semantic parsing, focusing on variation across database schemas and in-domain language use. | related papers | related patents |
743 | Predicting the Focus of Negation: Model and Error Analysis | Md Mosharaf Hossain, Kathleen Hamilton, Alexis Palmer, Eduardo Blanco | In this paper, we experiment with neural networks to predict the focus of negation. | related papers | related patents |
744 | Structured Tuning for Semantic Role Labeling | Tao Li, Parth Anand Jawale, Martha Palmer, Vivek Srikumar | In this paper, we present a structured tuning framework to improve models using softened constraints only at training time. | related papers | related patents |
745 | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data | Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel | In this paper we present TaBERT, a pretrained LM that jointly learns representations for NL sentences and (semi-)structured tables. | related papers | related patents |
746 | Universal Decompositional Semantic Parsing | Elias Stengel-Eskin, Aaron Steven White, Sheng Zhang, Benjamin Van Durme | We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores. | related papers | related patents |
747 | Unsupervised Cross-lingual Representation Learning at Scale | Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov | This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. | related papers | related patents |
748 | A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization | Dongfang Xu, Zeyu Zhang, Steven Bethard | In this paper, we propose an architecture consisting of a candidate generator and a list-wise ranker based on BERT. | related papers | related patents |
749 | Hierarchical Entity Typing via Multi-level Learning to Rank | Tongfei Chen, Yunmo Chen, Benjamin Van Durme | We propose a novel method for hierarchical entity classification that embraces ontological structure at both training and during prediction. | related papers | related patents |
750 | Multi-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference | Jing Wang, Mayank Kulkarni, Daniel Preotiuc-Pietro | We introduce a new architecture tailored to this task by using shared and private domain parameters and multi-task learning. | related papers | related patents |
751 | TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories | Giannis Karamanolakis, Jun Ma, Xin Luna Dong | This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. | related papers | related patents |
752 | TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition | Bill Yuchen Lin, Dong-Ho Lee, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren | In this paper, we introduce “entity triggers,” an effective proxy of human explanations for facilitating label-efficient learning of NER models. | related papers | related patents |
753 | Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation | Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong | This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs). | related papers | related patents |
754 | Balancing Training for Multilingual Neural Machine Translation | Xinyi Wang, Yulia Tsvetkov, Graham Neubig | In this paper, we propose a method that instead automatically learns how to weight training data through a data scorer that is optimized to maximize performance on all test languages. | related papers | related patents |
755 | Evaluating Robustness to Input Perturbations for Neural Machine Translation | Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan | This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input. | related papers | related patents |
756 | Parallel Corpus Filtering via Pre-trained Language Models | Boliang Zhang, Ajay Nagesh, Kevin Knight | In this paper, we propose a novel approach to filter out noisy sentence pairs from web-crawled corpora via pre-trained language models. | related papers | related patents |
757 | Regularized Context Gates on Transformer for Machine Translation | Xintong Li, Lemao Liu, Rui Wang, Guoping Huang, Max Meng | This paper first provides a method to identify source and target contexts, and then introduces a gate mechanism to control the source and target contributions in Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method to guide the learning of the gates with supervision automatically generated using pointwise mutual information. | related papers | related patents |
758 | A Multi-Perspective Architecture for Semantic Code Search | Rajarshi Haldar, Lingfei Wu, JinJun Xiong, Julia Hockenmaier | In this paper, we propose a novel multi-perspective cross-lingual neural framework for code-text matching, inspired in part by a previous model for monolingual text-to-text matching, to capture both global and local similarities. | related papers | related patents |
759 | Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring | Haoran Zhang, Diane Litman | This paper presents a method for linking AWE and neural AES, by extracting Topical Components (TCs) representing evidence from a source text using the intermediate output of attention layers. | related papers | related patents |
760 | Clinical Concept Linking with Contextualized Neural Representations | Elliot Schumacher, Andriy Mulyar, Mark Dredze | We propose an approach to concept linking that leverages recent work in contextualized neural models, such as ELMo (Peters et al. 2018), which create a token representation that integrates the surrounding context of the mention and concept name. | related papers | related patents |
761 | DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking | Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan | We show that current systems for FEVER are vulnerable to three categories of realistic challenges for fact-checking (multiple propositions, temporal reasoning, and ambiguity and lexical variation) and introduce a resource with these types of claims. Then we present a system designed to be resilient to these “attacks” using multiple pointer networks for document selection and jointly modeling a sequence of evidence sentences and veracity relation predictions. | related papers | related patents |
762 | Let Me Choose: From Verbal Context to Font Selection | Amirreza Shirani, Franck Dernoncourt, Jose Echevarria, Paul Asente, Nedim Lipka, Thamar Solorio | In this paper, we aim to learn associations between visual attributes of fonts and the verbal context of the texts they are typically applied to. We introduce a new dataset, containing examples of different topics in social media posts and ads, labeled through crowd-sourcing. | related papers | related patents |
763 | Multi-Label and Multilingual News Framing Analysis | Afra Feyza Akyürek, Lei Guo, Randa Elanwar, Prakash Ishwar, Margrit Betke, Derry Tanti Wijaya | In this work, we explore multilingual transfer learning to detect multiple frames from just the news headline in a genuinely low-resource context where there are few/no frame annotations in the target language. | related papers | related patents |
764 | Predicting Performance for Natural Language Processing Tasks | Mengzhou Xia, Antonios Anastasopoulos, Ruochen Xu, Yiming Yang, Graham Neubig | In this work, we attempt to explore the possibility of gaining plausible judgments of how well an NLP model can perform under an experimental setting, *without actually training or testing the model*. | related papers | related patents |
765 | ScriptWriter: Narrative-Guided Script Generation | Yutao Zhu, Ruihua Song, Zhicheng Dou, Jian-Yun NIE, Jin Zhou | In this paper, we address a key problem involved in these applications – guiding a dialogue by a narrative. | related papers | related patents |
766 | Should All Cross-Lingual Embeddings Speak English? | Antonios Anastasopoulos, Graham Neubig | First, we show that the choice of hub language can significantly impact downstream lexicon induction and zero-shot POS tagging performance. Second, we both expand a standard English-centered evaluation dictionary collection to include all language pairs using triangulation, and create new dictionaries for under-represented languages. | related papers | related patents |
767 | Smart To-Do: Automatic Generation of To-Do Items from Emails | Sudipto Mukherjee, Subhabrata Mukherjee, Marcello Hasegawa, Ahmed Hassan Awadallah, Ryen White | In this work, we explore a new application, Smart-To-Do, that helps users with task management over emails. | related papers | related patents |
768 | Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition | Paloma Jeretic, Alex Warstadt, Suvrat Bhooshan, Adina Williams | We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. | related papers | related patents |
769 | End-to-End Bias Mitigation by Modelling Biases in Corpora | Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson | We propose two learning strategies to train neural models, which are more robust to such biases and transfer better to out-of-domain datasets. | related papers | related patents |
770 | Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance | Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych | In this paper, we address this trade-off by introducing a novel debiasing method, called confidence regularization, which discourage models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples. | related papers | related patents |
771 | NILE : Natural Language Inference with Faithful Natural Language Explanations | Sawan Kumar, Partha Talukdar | We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce labels along with its faithful explanation. | related papers | related patents |
772 | QuASE: Question-Answer Driven Sentence Encoding | Hangfeng He, Qiang Ning, Dan Roth | This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? | related papers | related patents |
773 | Towards Robustifying NLI Models Against Lexical Dataset Biases | Xiang Zhou, Mohit Bansal | Using contradiction-word bias and word-overlapping bias as our two bias examples, this paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases. | related papers | related patents |
774 | Uncertain Natural Language Inference | Tongfei Chen, Zhengping Jiang, Adam Poliak, Keisuke Sakaguchi, Benjamin Van Durme | We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments. | related papers | related patents |
775 | Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches | Tianze Shi, Lillian Lee | We empirically compare these two common strategies, parsing and tagging, for predicting flat MWEs. Additionally, we propose an efficient joint decoding algorithm that combines scores from both strategies. | related papers | related patents |
776 | Revisiting Higher-Order Dependency Parsers | Erick Fonseca, André F. T. Martins | We tested this hypothesis and found that neural parsers may benefit from higher-order features, even when employing a powerful pre-trained encoder, such as BERT. | related papers | related patents |
777 | SeqVAT: Virtual Adversarial Training for Semi-Supervised Sequence Labeling | Luoxin Chen, Weitong Ruan, Xinyue Liu, Jianhua Lu | In this paper, we propose SeqVAT, a method which naturally applies VAT to sequence labeling models with CRF. | related papers | related patents |
778 | Treebank Embedding Vectors for Out-of-domain Dependency Parsing | Joachim Wagner, James Barry, Jennifer Foster | We build on this idea by 1) introducing a method to predict a treebank vector for sentences that do not come from a treebank used in training, and 2) exploring what happens when we move away from predefined treebank embedding vectors during test time and instead devise tailored interpolations. | related papers | related patents |