Most Influential ArXiv (Computation and Language) Papers (2024-10)
The field of Computation and Language in arXiv covers natural language processing. Roughly, it includes material in ACM Subject Class I.2.7. Note that work on artificial languages (programming languages, logics, formal systems) that does not explicitly address natural-language issues broadly construed (natural-language processing, computational linguistics, speech, text retrieval, etc.) is not appropriate for this area. The Paper Digest Team analyzes all papers published in this field in past years and presents up to 30 of the most influential papers for each year. The ranking is constructed automatically from citations in both research papers and granted patents, and is updated frequently to reflect the most recent changes. To find the latest version of this list, or the most influential papers from other conferences and journals, please visit the Best Paper Digest page. Note: the most influential papers may or may not include papers that won best paper awards. (Version: 2024-10).
This list is created by the Paper Digest Team.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Most Influential ArXiv (Computation and Language) Papers (2024-10)
Year | Rank | Paper | Author(s) |
---|---|---|---|
2024 | 1 | Gemini 1.5: Unlocking Multimodal Understanding Across Millions of Tokens of Context (IF:6). Highlight: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. | GEMINI TEAM et al. |
2024 | 2 | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (IF:5). Highlight: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. | MARAH ABDIN et al. |
2024 | 3 | Yi: Open Foundation Models By 01.AI (IF:5). Highlight: We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. | 01.AI et al. |
2024 | 4 | TinyLlama: An Open-Source Small Language Model (IF:4). Highlight: We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. | Peiyuan Zhang; Guangtao Zeng; Tianduo Wang; Wei Lu |
2024 | 5 | Gemma: Open Models Based on Gemini Research and Technology (IF:4). Highlight: This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. | GEMMA TEAM et al. |
2024 | 6 | Self-Rewarding Language Models (IF:4). Highlight: Current approaches commonly train reward models from human preferences, which may then be bottlenecked by human performance level, and secondly these separate frozen reward models cannot then learn to improve during LLM training. In this work, we study Self-Rewarding Language Models, where the language model itself is used via LLM-as-a-Judge prompting to provide its own rewards during training. | WEIZHE YUAN et al. |
2024 | 7 | OLMo: Accelerating The Science of Language Models (IF:4). Highlight: To this end, we have built OLMo, a competitive, truly Open Language Model, to enable the scientific study of language models. | DIRK GROENEVELD et al. |
2024 | 8 | Qwen2 Technical Report (IF:4). Highlight: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. | AN YANG et al. |
2024 | 9 | DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (IF:4). Highlight: We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. | XIAO BI et al. |
2024 | 10 | DoRA: Weight-Decomposed Low-Rank Adaptation (IF:4). Highlight: Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA). | SHIH-YANG LIU et al. |
2024 | 11 | How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety By Humanizing LLMs (IF:4). Highlight: As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective to jailbreak LLMs as human-like communicators, to explore this overlooked intersection between everyday language interaction and AI safety. | YI ZENG et al. |
2024 | 12 | DeepSeekMath: Pushing The Limits of Mathematical Reasoning in Open Language Models (IF:4). Highlight: In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. | ZHIHONG SHAO et al. |
2024 | 13 | Joint Lemmatization and Morphological Tagging with LEMMING (IF:4). Highlight: We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features. | Thomas Muller; Ryan Cotterell; Alexander Fraser; Hinrich Schütze |
2024 | 14 | Large Language Models: A Survey (IF:4). Highlight: In this paper, we review some of the most prominent LLMs, including three popular LLM families (GPT, LLaMA, PaLM), and discuss their characteristics, contributions and limitations. | SHERVIN MINAEE et al. |
2024 | 15 | MiniCPM: Unveiling The Potential of Small Language Models with Scalable Training Strategies (IF:4). Highlight: For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS), conducive to continuous training and domain adaptation. | SHENGDING HU et al. |
2024 | 16 | Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research (IF:4). Highlight: We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices. | LUCA SOLDAINI et al. |
2024 | 17 | TrustLLM: Trustworthiness in Large Language Models (IF:3). Highlight: This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. | YUE HUANG et al. |
2024 | 18 | SimPO: Simple Preference Optimization with A Reference-Free Reward (IF:3). Highlight: In this work, we propose SimPO, a simpler yet more effective approach. | Yu Meng; Mengzhou Xia; Danqi Chen |
2024 | 19 | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (IF:3). Highlight: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. | AIXIN LIU et al. |
2024 | 20 | BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation (IF:3). Highlight: In this paper, we present a new embedding model, called M3-Embedding, which is distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. | JIANLV CHEN et al. |
2024 | 21 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models (IF:3). Highlight: This paper presents a comprehensive survey of over 32 techniques developed to mitigate hallucination in LLMs. | S. M TOWHIDUL ISLAM TONMOY et al. |
2024 | 22 | BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains (IF:3). Highlight: Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges. In this paper, we introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. | YANIS LABRAK et al. |
2024 | 23 | Hallucination Is Inevitable: An Innate Limitation of Large Language Models (IF:3). Highlight: In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. | Ziwei Xu; Sanjay Jain; Mohan Kankanhalli |
2024 | 24 | The Era of 1-bit LLMs: All Large Language Models Are in 1.58 Bits (IF:3). Highlight: In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}. | SHUMING MA et al. |
2024 | 25 | Jamba: A Hybrid Transformer-Mamba Language Model (IF:3). Highlight: We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. | OPHER LIEBER et al. |
2024 | 26 | Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs (IF:3). Highlight: In this work, we conduct the first systematic analysis of work using OpenAI’s GPT-3.5 and GPT-4, the most prominently used LLMs today, in the context of data contamination. | Simone Balloccu; Patrícia Schmidtová; Mateusz Lango; Ondřej Dušek |
2024 | 27 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools (IF:3). Highlight: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. | TEAM GLM et al. |
2024 | 28 | Contrastive Preference Optimization: Pushing The Boundaries of LLM Performance in Machine Translation (IF:3). Highlight: However, even the top-performing 13B LLM-based translation models, like ALMA, does not match the performance of state-of-the-art conventional encoder-decoder translation models or larger-scale LLMs such as GPT-4. In this study, we bridge this performance gap. | HAORAN XU et al. |
2024 | 29 | Creating Emoji Lexica from Unsupervised Sentiment Analysis of Their Descriptions (IF:4). Highlight: About twenty billion are typed in Twitter nowadays, and new emojis keep appearing in each new Unicode version, making them increasingly relevant to sentiment analysis tasks. This has motivated us to propose a novel approach to predict the sentiments expressed by emojis in online textual messages, such as tweets, that does not require human effort to manually annotate data and saves valuable time for other analysis tasks. | Milagros Fernández-Gavilanes; Jonathan Juncal-Martínez; Silvia García-Méndez; Enrique Costa-Montenegro; Francisco Javier González-Castaño |
2024 | 30 | DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models (IF:3). Highlight: In response, we propose the DeepSeekMoE architecture towards ultimate expert specialization. | DAMAI DAI et al. |
2023 | 1 | LLaMA: Open and Efficient Foundation Language Models (IF:9). Highlight: We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. | HUGO TOUVRON et al. |
2023 | 2 | Llama 2: Open Foundation and Fine-Tuned Chat Models (IF:9). Highlight: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. | HUGO TOUVRON et al. |
2023 | 3 | GPT-4 Technical Report (IF:8). Highlight: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. | JOSH ACHIAM et al. |
2023 | 4 | Sparks of Artificial General Intelligence: Early Experiments with GPT-4 (IF:8). Highlight: In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. | SÉBASTIEN BUBECK et al. |
2023 | 5 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena (IF:8). Highlight: To address this, we explore using strong LLMs as judges to evaluate these models on more open-ended questions. We examine the usage and limitations of LLM-as-a-judge, including position, verbosity, and self-enhancement biases, as well as limited reasoning ability, and propose solutions to mitigate some of them. | LIANMIN ZHENG et al. |
2023 | 6 | A Survey of Large Language Models (IF:8). Highlight: In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. | WAYNE XIN ZHAO et al. |
2023 | 7 | Code Llama: Open Foundation Models for Code (IF:8). Highlight: We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama – Python), and instruction-following models (Code Llama – Instruct) with 7B, 13B, 34B and 70B parameters each. | BAPTISTE ROZIÈRE et al. |
2023 | 8 | Toolformer: Language Models Can Teach Themselves to Use Tools (IF:8). Highlight: In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. | TIMO SCHICK et al. |
2023 | 9 | A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity (IF:8). Highlight: This paper proposes a framework for quantitatively evaluating interactive LLMs such as ChatGPT using publicly available data sets. | YEJIN BANG et al. |
2023 | 10 | Mistral 7B (IF:8). Highlight: We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. | ALBERT Q. JIANG et al. |
2023 | 11 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models (IF:7). Highlight: This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. | SHUNYU YAO et al. |
2023 | 12 | PaLM 2 Technical Report (IF:7). Highlight: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. | ROHAN ANIL et al. |
2023 | 13 | Self-Refine: Iterative Refinement with Self-Feedback (IF:7). Highlight: Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. | AMAN MADAAN et al. |
2023 | 14 | Lost in The Middle: How Language Models Use Long Contexts (IF:7). Highlight: We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. | NELSON F. LIU et al. |
2023 | 15 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (IF:7). Highlight: How do these patterns change as models scale? To answer these questions, we introduce Pythia, a suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. | STELLA BIDERMAN et al. |
2023 | 16 | Qwen Technical Report (IF:7). Highlight: In this work, we introduce Qwen, the first installment of our large language model series. | JINZE BAI et al. |
2023 | 17 | Universal and Transferable Adversarial Attacks on Aligned Language Models (IF:8). Highlight: In this paper, we propose a simple and effective attack method that causes aligned language models to generate objectionable behaviors. | ANDY ZOU et al. |
2023 | 18 | A Survey on Evaluation of Large Language Models (IF:7). Highlight: Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. | YUPENG CHANG et al. |
2023 | 19 | WizardLM: Empowering Large Language Models to Follow Complex Instructions (IF:7). Highlight: In this paper, we show an avenue for creating large amounts of instruction data with varying levels of complexity using LLM instead of humans. | CAN XU et al. |
2023 | 20 | mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality (IF:7). Highlight: In this study, we introduce mPLUG-Owl, a novel training paradigm that equips LLMs with multi-modal abilities through modularized learning of foundation LLM, a visual knowledge module, and a visual abstractor module. | QINGHAO YE et al. |
2023 | 21 | G-Eval: NLG Evaluation Using GPT-4 with Better Human Alignment (IF:7). Highlight: In this work, we present G-Eval, a framework of using large language models with chain-of-thoughts (CoT) and a form-filling paradigm, to assess the quality of NLG outputs. | YANG LIU et al. |
2023 | 22 | HuggingGPT: Solving AI Tasks with ChatGPT and Its Friends in Hugging Face (IF:7). Highlight: Considering large language models (LLMs) have exhibited exceptional abilities in language understanding, generation, interaction, and reasoning, we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks, with language serving as a generic interface to empower this. Based on this philosophy, we present HuggingGPT, an LLM-powered agent that leverages LLMs (e.g., ChatGPT) to connect various AI models in machine learning communities (e.g., Hugging Face) to solve AI tasks. | YONGLIANG SHEN et al. |
2023 | 23 | Gemini: A Family of Highly Capable Multimodal Models (IF:7). Highlight: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. | GEMINI TEAM et al. |
2023 | 24 | Retrieval-Augmented Generation for Large Language Models: A Survey (IF:7). Highlight: RAG synergistically merges LLMs’ intrinsic knowledge with the vast, dynamic repositories of external databases. This comprehensive review paper offers a detailed examination of the progression of RAG paradigms, encompassing the Naive RAG, the Advanced RAG, and the Modular RAG. | YUNFAN GAO et al. |
2023 | 25 | The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only (IF:7). Highlight: Large language models are commonly trained on a mixture of filtered web data and curated high-quality corpora, such as social media conversations, books, or technical papers. | GUILHERME PENEDO et al. |
2023 | 26 | Capabilities of GPT-4 on Medical Challenge Problems (IF:7). Highlight: We present a comprehensive evaluation of GPT-4, a state-of-the-art LLM, on medical competency examinations and benchmark datasets. | Harsha Nori; Nicholas King; Scott Mayer McKinney; Dean Carignan; Eric Horvitz |
2023 | 27 | Spanish Pre-trained BERT Model and Evaluation Data (IF:7). Highlight: Nevertheless, finding resources to train or evaluate Spanish language models is not an easy task. In this paper we help bridge this gap by presenting a BERT-based language model pre-trained exclusively on Spanish data. | JOSÉ CAÑETE et al. |
2023 | 28 | Is ChatGPT A General-Purpose Natural Language Processing Task Solver? (IF:7). Highlight: In this work, we empirically analyze the zero-shot learning ability of ChatGPT by evaluating it on 20 popular NLP datasets covering 7 representative task categories. | CHENGWEI QIN et al. |
2023 | 29 | Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (IF:7). Highlight: We present Video-LLaMA, a multi-modal framework that empowers Large Language Models (LLMs) with the capability of understanding both visual and auditory content in the video. | Hang Zhang; Xin Li; Lidong Bing |
2023 | 30 | Baichuan 2: Open Large-scale Language Models (IF:6). Highlight: However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch, on 2.6 trillion tokens. | AIYUAN YANG et al. |
2022 | 1 | Training Language Models to Follow Instructions with Human Feedback (IF:8). Highlight: In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. | LONG OUYANG et al. |
2022 | 2 | PaLM: Scaling Language Modeling with Pathways (IF:8). Highlight: To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). | AAKANKSHA CHOWDHERY et al. |
2022 | 3 | OPT: Open Pre-trained Transformer Language Models (IF:8). Highlight: We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. | SUSAN ZHANG et al. |
2022 | 4 | Large Language Models Are Zero-Shot Reasoners (IF:8). Highlight: Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars. | Takeshi Kojima; Shixiang Shane Gu; Machel Reid; Yutaka Matsuo; Yusuke Iwasawa |
2022 | 5 | Self-Consistency Improves Chain of Thought Reasoning in Language Models (IF:8). Highlight: In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. | XUEZHI WANG et al. |
2022 | 6 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (IF:8). Highlight: While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. | BIGSCIENCE WORKSHOP et al. |
2022 | 7 | Emergent Abilities of Large Language Models (IF:8). Highlight: We consider an ability to be emergent if it is not present in smaller models but is present in larger models. | JASON WEI et al. |
2022 | 8 | Training A Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (IF:8). Highlight: Alongside our main results, we perform peripheral analyses on calibration, competing objectives, and the use of OOD detection, compare our models with human writers, and provide samples from our models using prompts appearing in recent related work. | YUNTAO BAI et al. |
2022 | 9 | Self-Instruct: Aligning Language Models with Self-Generated Instructions (IF:9). Highlight: We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. | YIZHONG WANG et al. |
2022 | 10 | Survey of Hallucination in Natural Language Generation (IF:8). Highlight: In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. | ZIWEI JI et al. |
2022 | 11 | ReAct: Synergizing Reasoning and Acting in Language Models (IF:8). Highlight: In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. | SHUNYU YAO et al. |
2022 | 12 | Training Compute-Optimal Large Language Models (IF:8). Highlight: We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. | JORDAN HOFFMANN et al. |
2022 | 13 | Large Language Models Encode Clinical Knowledge (IF:8). Highlight: We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. | KARAN SINGHAL et al. |
2022 | 14 | LaMDA: Language Models for Dialog Applications (IF:9). Highlight: We present LaMDA: Language Models for Dialog Applications. | ROMAL THOPPILAN et al. |
2022 | 15 | Beyond The Imitation Game: Quantifying and Extrapolating The Capabilities of Language Models (IF:8). Highlight: In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). | AAROHI SRIVASTAVA et al. |
2022 | 16 | Rethinking The Role of Demonstrations: What Makes In-Context Learning Work? IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that ground truth demonstrations are in fact not required: randomly replacing labels in the demonstrations barely hurts performance on a range of classification and multi-choice tasks, consistently over 12 different models including GPT-3. |
SEWON MIN et. al. |
2022 | 17 | BERTopic: Neural Topic Modeling with A Class-based TF-IDF Procedure IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BERTopic, a topic model that extends this process by extracting coherent topic representation through the development of a class-based variation of TF-IDF. |
Maarten Grootendorst; |
2022 | 18 | GLM-130B: An Open Bilingual Pre-trained Model IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. |
AOHAN ZENG et. al. |
2022 | 19 | No Language Left Behind: Scaling Human-Centered Machine Translation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. |
NLLB TEAM et. al. |
2022 | 20 | Locating and Editing Factual Associations in GPT IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. |
Kevin Meng; David Bau; Alex Andonian; Yonatan Belinkov; |
2022 | 21 | Holistic Evaluation of Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. |
PERCY LIANG et. al. |
2022 | 22 | GPT-NeoX-20B: An Open-Source Autoregressive Language Model IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. |
SID BLACK et. al. |
2022 | 23 | Learn to Explain: Multimodal Reasoning Via Thought Chains for Science Question Answering IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present Science Question Answering (ScienceQA), a new benchmark that consists of ~21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations. |
PAN LU et. al. |
2022 | 24 | Super-NaturalInstructions: Generalization Via Declarative Instructions on 1600+ NLP Tasks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? |
YIZHONG WANG et. al. |
2022 | 25 | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As the result of a joint effort between Microsoft and NVIDIA, we present details on the training of the largest monolithic transformer based language model, Megatron-Turing NLG 530B (MT-NLG), with 530 billion parameters. |
SHADEN SMITH et. al. |
2022 | 26 | Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on a suite of 23 challenging BIG-Bench tasks which we call BIG-Bench Hard (BBH). |
MIRAC SUZGUN et. al. |
2022 | 27 | Galactica: A Large Language Model for Science IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. |
ROSS TAYLOR et. al. |
2022 | 28 | Diffusion-LM Improves Controllable Text Generation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there has been little progress on complex, fine-grained controls (e.g., syntactic structure). To address this challenge, we develop a new non-autoregressive language model based on continuous diffusions that we call Diffusion-LM. |
Xiang Lisa Li; John Thickstun; Ishaan Gulrajani; Percy Liang; Tatsunori B. Hashimoto; |
2022 | 29 | BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature. |
RENQIAN LUO et. al. |
2022 | 30 | Solving Quantitative Reasoning Problems with Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. |
AITOR LEWKOWYCZ et. al. |
2021 | 1 | LoRA: Low-Rank Adaptation of Large Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. |
EDWARD J. HU et. al. |
2021 | 2 | Prefix-Tuning: Optimizing Continuous Prompts For Generation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen, but optimizes a small continuous task-specific vector (called the prefix). |
Xiang Lisa Li; Percy Liang; |
2021 | 3 | The Power of Scale for Parameter-Efficient Prompt Tuning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore prompt tuning, a simple yet effective mechanism for learning soft prompts to condition frozen language models to perform specific downstream tasks. |
Brian Lester; Rami Al-Rfou; Noah Constant; |
2021 | 4 | Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g., the choice of pre-trained models, prompts, and tuning strategies. |
PENGFEI LIU et. al. |
2021 | 5 | Finetuned Language Models Are Zero-Shot Learners IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores a simple method for improving the zero-shot learning abilities of language models. |
JASON WEI et. al. |
2021 | 6 | SimCSE: Simple Contrastive Learning of Sentence Embeddings IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. |
Tianyu Gao; Xingcheng Yao; Danqi Chen; |
2021 | 7 | HuBERT: Self-Supervised Speech Representation Learning By Masked Prediction of Hidden Units IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To deal with these three problems, we propose the Hidden-Unit BERT (HuBERT) approach for self-supervised speech representation learning, which utilizes an offline clustering step to provide aligned target labels for a BERT-like prediction loss. |
WEI-NING HSU et. al. |
2021 | 8 | WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To tackle the problem, we propose a new pre-trained model, WavLM, to solve full-stack downstream speech tasks. |
SANYUAN CHEN et. al. |
2021 | 9 | RoFormer: Enhanced Transformer with Rotary Position Embedding IF:8 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Position encoding has recently been shown to be effective in the transformer architecture. It enables valuable supervision for dependency modeling between elements at different positions of … |
JIANLIN SU et. al. |
2021 | 10 | CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers. |
Yue Wang; Weishi Wang; Shafiq Joty; Steven C. H. Hoi; |
2021 | 11 | TruthfulQA: Measuring How Models Mimic Human Falsehoods IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a benchmark to measure whether a language model is truthful in generating answers to questions. |
Stephanie Lin; Jacob Hilton; Owain Evans; |
2021 | 12 | Calibrate Before Use: Improving Few-Shot Performance of Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We demonstrate that this instability arises from the bias of language models towards predicting certain answers, e.g., those that are placed near the end of the prompt or are common in the pre-training data. |
Tony Z. Zhao; Eric Wallace; Shi Feng; Dan Klein; Sameer Singh; |
2021 | 13 | GLM: General Language Model Pretraining with Autoregressive Blank Infilling IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a General Language Model (GLM) based on autoregressive blank infilling to address this challenge. |
ZHENGXIAO DU et. al. |
2021 | 14 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales — from models with tens of millions of parameters up to a 280 billion parameter model called Gopher. |
JACK W. RAE et. al. |
2021 | 15 | GPT Understands, Too IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel method P-Tuning that employs trainable continuous prompt embeddings in concatenation with discrete prompts. |
XIAO LIU et. al. |
2021 | 16 | WebGPT: Browser-assisted Question-answering with Human Feedback IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. |
REIICHIRO NAKANO et. al. |
2021 | 17 | Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we use the generative nature of language models to construct an artificial development set and based on entropy statistics of the candidate permutations on this set, we identify performant prompts. |
Yao Lu; Max Bartolo; Alastair Moore; Sebastian Riedel; Pontus Stenetorp; |
2021 | 18 | DeBERTaV3: Improving DeBERTa Using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new pre-trained language model, DeBERTaV3, which improves the original DeBERTa model by replacing masked language modeling (MLM) with replaced token detection (RTD), a more sample-efficient pre-training task. |
Pengcheng He; Jianfeng Gao; Weizhu Chen; |
2021 | 19 | Improving Language Models By Retrieving from Trillions of Tokens IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. |
SEBASTIAN BORGEAUD et. al. |
2021 | 20 | SUPERB: Speech Processing Universal PERformance Benchmark IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To bridge this gap, we introduce Speech processing Universal PERformance Benchmark (SUPERB). We present a simple framework to solve SUPERB tasks by learning task-specialized lightweight prediction heads on top of the frozen shared model. |
SHU-WEN YANG et. al. |
2021 | 21 | Ethical and Social Risks of Harm from Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). |
LAURA WEIDINGER et. al. |
2021 | 22 | Towards A Unified View of Parameter-Efficient Transfer Learning IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we break down the design of state-of-the-art parameter-efficient transfer learning methods and present a unified framework that establishes connections between them. |
Junxian He; Chunting Zhou; Xuezhe Ma; Taylor Berg-Kirkpatrick; Graham Neubig; |
2021 | 23 | A Survey of Data Augmentation Approaches for NLP IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner. |
STEVEN Y. FENG et. al. |
2021 | 24 | BARTScore: Evaluating Generated Text As Text Generation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conceptualize the evaluation of generated text as a text generation problem, modeled using pre-trained sequence-to-sequence models. |
Weizhe Yuan; Graham Neubig; Pengfei Liu; |
2021 | 25 | Recent Advances in Natural Language Processing Via Large Pre-Trained Language Models: A Survey IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a survey of recent work that uses these large language models to solve NLP tasks via pre-training then fine-tuning, prompting, or text generation approaches. |
BONAN MIN et. al. |
2021 | 26 | P-Tuning V2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel empirical finding that properly optimized prompt tuning can be universally effective across a wide range of model scales and NLU tasks. |
XIAO LIU et. al. |
2021 | 27 | Prompt Programming for Large Language Models: Beyond The Few-Shot Paradigm IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we discuss methods of prompt programming, emphasizing the usefulness of considering prompts through the lens of natural language. |
Laria Reynolds; Kyle McDonell; |
2021 | 28 | Unified Pre-training for Program Understanding and Generation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces PLBART, a sequence-to-sequence model capable of performing a broad spectrum of program and language understanding and generation tasks. |
Wasi Uddin Ahmad; Saikat Chakraborty; Baishakhi Ray; Kai-Wei Chang; |
2021 | 29 | Cross-Task Generalization Via Natural Language Crowdsourcing Instructions IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To study this, we introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs). |
Swaroop Mishra; Daniel Khashabi; Chitta Baral; Hannaneh Hajishirzi; |
2021 | 30 | Are NLP Models Really Able to Solve Simple Math Word Problems? IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we restrict our attention to English MWPs taught in grades four and lower. Further, we introduce a challenge dataset, SVAMP, created by applying carefully chosen variations over examples sampled from existing datasets. |
Arkil Patel; Satwik Bhattamishra; Navin Goyal; |
2020 | 1 | Language Models Are Few-Shot Learners IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. |
TOM B. BROWN et. al. |
2020 | 2 | Wav2vec 2.0: A Framework For Self-Supervised Learning Of Speech Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. |
Alexei Baevski; Henry Zhou; Abdelrahman Mohamed; Michael Auli; |
2020 | 3 | Longformer: The Long-Document Transformer IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer. |
Iz Beltagy; Matthew E. Peters; Arman Cohan; |
2020 | 4 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RAG models where the parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever. |
PATRICK LEWIS et. al. |
2020 | 5 | Dense Passage Retrieval For Open-Domain Question Answering IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. |
VLADIMIR KARPUKHIN et. al. |
2020 | 6 | ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As an alternative, we propose a more sample-efficient pre-training task called replaced token detection. |
Kevin Clark; Minh-Thang Luong; Quoc V. Le; Christopher D. Manning; |
2020 | 7 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. |
Pengcheng He; Xiaodong Liu; Jianfeng Gao; Weizhu Chen; |
2020 | 8 | mT5: A Massively Multilingual Pre-trained Text-to-text Transformer IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. |
LINTING XUE et. al. |
2020 | 9 | CodeBERT: A Pre-Trained Model For Programming And Natural Languages IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). Furthermore, to investigate what type of knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing, and evaluate in a zero-shot setting where parameters of pre-trained models are fixed. |
ZHANGYIN FENG et. al. |
2020 | 10 | Making Pre-trained Language Models Better Few-shot Learners IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LM-BFF (better few-shot fine-tuning of language models), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples. |
Tianyu Gao; Adam Fisch; Danqi Chen; |
2020 | 11 | A Survey on Knowledge Graphs: Representation, Acquisition and Applications IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this survey, we provide a comprehensive review of knowledge graph covering overall research topics about 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph, and 4) knowledge-aware applications, and summarize recent breakthroughs and perspective directions to facilitate future research. |
Shaoxiong Ji; Shirui Pan; Erik Cambria; Pekka Marttinen; Philip S. Yu; |
2020 | 12 | Multilingual Denoising Pre-training For Neural Machine Translation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present mBART — a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. |
YINHAN LIU et. al. |
2020 | 13 | The Pile: An 800GB Dataset Of Diverse Text For Language Modeling IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With this in mind, we present the Pile: an 825 GiB English text corpus targeted at training large-scale language models. |
LEO GAO et. al. |
2020 | 14 | Stanza: A Python Natural Language Processing Toolkit For Many Human Languages IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Stanza, an open-source Python natural language processing toolkit supporting 66 human languages. |
Peng Qi; Yuhao Zhang; Yuhui Zhang; Jason Bolton; Christopher D. Manning; |
2020 | 15 | Learning to Summarize from Human Feedback IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. |
NISAN STIENNON et. al. |
2020 | 16 | Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. |
YU GU et. al. |
2020 | 17 | A Primer In BERTology: What We Know About How BERT Works IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Transformer-based models have pushed state of the art in many areas of NLP, but our understanding of what is behind their success is still limited. |
Anna Rogers; Olga Kovaleva; Anna Rumshisky; |
2020 | 18 | Pre-trained Models for Natural Language Processing: A Survey IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this survey, we provide a comprehensive review of PTMs for NLP. |
XIPENG QIU et. al. |
2020 | 19 | BLEURT: Learning Robust Metrics For Text Generation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose BLEURT, a learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples. |
Thibault Sellam; Dipanjan Das; Ankur P. Parikh; |
2020 | 20 | MiniLM: Deep Self-Attention Distillation For Task-Agnostic Compression Of Pre-Trained Transformers IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a simple and effective approach to compress large Transformer (Vaswani et al., 2017) based pre-trained models, termed as deep self-attention distillation. |
WENHUI WANG et. al. |
2020 | 21 | Beyond Accuracy: Behavioral Testing Of NLP Models With CheckList IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. |
Marco Tulio Ribeiro; Tongshuang Wu; Carlos Guestrin; Sameer Singh; |
2020 | 22 | Recipes For Building An Open-domain Chatbot IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available. |
STEPHEN ROLLER et. al. |
2020 | 23 | RealToxicityPrompts: Evaluating Neural Toxic Degeneration In Language Models IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration. |
Samuel Gehman; Suchin Gururangan; Maarten Sap; Yejin Choi; Noah A. Smith; |
2020 | 24 | Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how much these models can benefit from retrieving text passages, potentially containing evidence. |
Gautier Izacard; Edouard Grave; |
2020 | 25 | XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we introduce the Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark, a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks. We release the benchmark to encourage research on cross-lingual learning methods that transfer linguistic knowledge across a diverse and representative set of languages and tasks. |
JUNJIE HU et. al. |
2020 | 26 | Towards A Human-like Open-Domain Chatbot IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. |
DANIEL ADIWARDANA et. al. |
2020 | 27 | MPNet: Masked And Permuted Pre-training For Language Understanding IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. |
Kaitao Song; Xu Tan; Tao Qin; Jianfeng Lu; Tie-Yan Liu; |
2020 | 28 | Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an easy and efficient method to extend existing sentence embedding models to new languages. |
Nils Reimers; Iryna Gurevych; |
2020 | 29 | GShard: Scaling Giant Models With Conditional Computation And Automatic Sharding IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We demonstrate that such a giant model can efficiently be trained on 2048 TPU v3 accelerators in 4 days to achieve far superior quality for translation from 100 languages to English compared to the prior art. |
DMITRY LEPIKHIN et. al. |
2020 | 30 | COMET: A Neural Framework For MT Evaluation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present COMET, a neural framework for training multilingual machine translation evaluation models which obtains new state-of-the-art levels of correlation with human judgements. |
Ricardo Rei; Craig Stewart; Ana C Farinha; Alon Lavie; |
2019 | 1 | RoBERTa: A Robustly Optimized BERT Pretraining Approach IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. |
YINHAN LIU et. al. |
2019 | 2 | HuggingFace’s Transformers: State-of-the-art Natural Language Processing IF:9 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building … |
THOMAS WOLF et. al. |
2019 | 3 | Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. |
Nils Reimers; Iryna Gurevych; |
2019 | 4 | BART: Denoising Sequence-to-Sequence Pre-training For Natural Language Generation, Translation, And Comprehension IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. |
MIKE LEWIS et. al. |
2019 | 5 | XLNet: Generalized Autoregressive Pretraining For Language Understanding IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. |
ZHILIN YANG et. al. |
2019 | 6 | DistilBERT, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can then be fine-tuned with good performances on a wide range of tasks like its larger counterparts. |
Victor Sanh; Lysandre Debut; Julien Chaumond; Thomas Wolf; |
2019 | 7 | ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. |
ZHENZHONG LAN et. al. |
2019 | 8 | Unsupervised Cross-lingual Representation Learning At Scale IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. |
ALEXIS CONNEAU et. al. |
2019 | 9 | BioBERT: A Pre-trained Biomedical Language Representation Model For Biomedical Text Mining IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this article, we investigate how the recently introduced pre-trained language model BERT can be adapted for biomedical corpora. |
JINHYUK LEE et. al. |
2019 | 10 | BERTScore: Evaluating Text Generation With BERT IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose BERTScore, an automatic evaluation metric for text generation. |
Tianyi Zhang; Varsha Kishore; Felix Wu; Kilian Q. Weinberger; Yoav Artzi; |
2019 | 11 | Fairseq: A Fast, Extensible Toolkit For Sequence Modeling IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. |
MYLE OTT et. al. |
2019 | 12 | The Curious Case Of Neural Text Degeneration IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we reveal surprising distributional differences between human text and machine text. |
Ari Holtzman; Jan Buys; Li Du; Maxwell Forbes; Yejin Choi; |
2019 | 13 | Cross-lingual Language Model Pretraining IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. |
Guillaume Lample; Alexis Conneau; |
2019 | 14 | SciBERT: A Pretrained Language Model For Scientific Text IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate on a suite of tasks including sequence tagging, sentence classification and dependency parsing, with datasets from a variety of scientific domains. |
Iz Beltagy; Kyle Lo; Arman Cohan; |
2019 | 15 | Energy And Policy Considerations For Deep Learning In NLP IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. |
Emma Strubell; Ananya Ganesh; Andrew McCallum; |
2019 | 16 | Language Models As Knowledge Bases? IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. |
FABIO PETRONI et. al. |
2019 | 17 | LXMERT: Learning Cross-Modality Encoder Representations From Transformers IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus propose the LXMERT (Learning Cross-Modality Encoder Representations from Transformers) framework to learn these vision-and-language connections. |
Hao Tan; Mohit Bansal; |
2019 | 18 | SuperGLUE: A Stickier Benchmark For General-Purpose Language Understanding Systems IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. |
ALEX WANG et. al. |
2019 | 19 | SpanBERT: Improving Pre-training By Representing And Predicting Spans IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. |
MANDAR JOSHI et. al. |
2019 | 20 | PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. |
Jingqing Zhang; Yao Zhao; Mohammad Saleh; Peter J. Liu; |
2019 | 21 | EDA: Easy Data Augmentation Techniques For Boosting Performance On Text Classification Tasks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. |
Jason Wei; Kai Zou; |
2019 | 22 | Publicly Available Clinical BERT Embeddings IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address this need by exploring and releasing BERT models for clinical text: one for generic clinical text and another for discharge summaries specifically. |
EMILY ALSENTZER et. al. |
2019 | 23 | TinyBERT: Distilling BERT For Natural Language Understanding IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By leveraging this new KD method, the plenty of knowledge encoded in a large teacher BERT can be effectively transferred to a small student Tiny-BERT. |
XIAOQI JIAO et. al. |
2019 | 24 | GQA: A New Dataset For Real-World Visual Reasoning And Compositional Question Answering IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GQA, a new dataset for real-world visual reasoning and compositional question answering, seeking to address key shortcomings of previous VQA datasets. |
Drew A. Hudson; Christopher D. Manning; |
2019 | 25 | HellaSwag: Can A Machine Really Finish Your Sentence? IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset. |
Rowan Zellers; Ari Holtzman; Yonatan Bisk; Ali Farhadi; Yejin Choi; |
2019 | 26 | Unified Language Model Pre-training For Natural Language Understanding And Generation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks. |
LI DONG et. al. |
2019 | 27 | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present our techniques for training very large transformer models and implement a simple, efficient intra-layer model parallel approach that enables training transformer models with billions of parameters. |
MOHAMMAD SHOEYBI et. al. |
2019 | 28 | DialoGPT: Large-Scale Generative Pre-training For Conversational Response Generation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer). |
YIZHE ZHANG et. al. |
2019 | 29 | How To Fine-Tune BERT For Text Classification? IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification task and provide a general solution for BERT fine-tuning. |
Chi Sun; Xipeng Qiu; Yige Xu; Xuanjing Huang; |
2019 | 30 | BERT Rediscovers The Classical NLP Pipeline IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. |
Ian Tenney; Dipanjan Das; Ellie Pavlick; |
2018 | 1 | BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. |
Jacob Devlin; Ming-Wei Chang; Kenton Lee; Kristina Toutanova; |
2018 | 2 | Deep Contextualized Word Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). |
MATTHEW E. PETERS et. al. |
2018 | 3 | GLUE: A Multi-Task Benchmark And Analysis Platform For Natural Language Understanding IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. |
ALEX WANG et. al. |
2018 | 4 | Universal Language Model Fine-tuning For Text Classification IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. |
Jeremy Howard; Sebastian Ruder; |
2018 | 5 | SentencePiece: A Simple And Language Independent Subword Tokenizer And Detokenizer For Neural Text Processing IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes SentencePiece, a language-independent subword tokenizer and detokenizer designed for Neural-based text processing, including Neural Machine Translation. |
Taku Kudo; John Richardson; |
2018 | 6 | A Call For Clarity In Reporting BLEU Scores IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Pointing to the success of the parsing community, I suggest machine translation researchers settle upon the BLEU scheme used by the annual Conference on Machine Translation (WMT), which does not allow for user-supplied reference processing, and provide a new tool, SacreBLEU, to facilitate this. |
Matt Post; |
2018 | 7 | Self-Attention With Relative Position Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we present an alternative approach, extending the self-attention mechanism to efficiently consider representations of the relative positions, or distances between sequence elements. |
Peter Shaw; Jakob Uszkoreit; Ashish Vaswani; |
2018 | 8 | HotpotQA: A Dataset For Diverse, Explainable Multi-hop Question Answering IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce HotpotQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and explain the predictions; (4) we offer a new type of factoid comparison questions to test QA systems’ ability to extract relevant facts and perform necessary comparison. |
ZHILIN YANG et. al. |
2018 | 9 | Universal Sentence Encoder IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. |
DANIEL CER et. al. |
2018 | 10 | Graph Convolutional Networks For Text Classification IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose to use graph convolutional networks for text classification. |
Liang Yao; Chengsheng Mao; Yuan Luo; |
2018 | 11 | Deep Learning For Sentiment Analysis : A Survey IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction … |
Lei Zhang; Shuai Wang; Bing Liu; |
2018 | 12 | Speech Commands: A Dataset For Limited-Vocabulary Speech Recognition IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems. Discusses why this task is an interesting challenge, and why it requires a … |
Pete Warden; |
2018 | 13 | Hierarchical Neural Story Generation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. |
Angela Fan; Mike Lewis; Yann Dauphin; |
2018 | 14 | ESPnet: End-to-End Speech Processing Toolkit IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new open source platform for end-to-end speech processing named ESPnet. |
SHINJI WATANABE et. al. |
2018 | 15 | FEVER: A Large-scale Dataset For Fact Extraction And VERification IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. |
James Thorne; Andreas Vlachos; Christos Christodoulopoulos; Arpit Mittal; |
2018 | 16 | Federated Learning For Mobile Keyboard Prediction IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We train a recurrent neural network language model using a distributed, on-device learning framework called federated learning for the purpose of next-word prediction in a virtual keyboard for smartphones. |
ANDREW HARD et. al. |
2018 | 17 | Learning Word Vectors For 157 Languages IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe how we trained such high quality word representations for 157 languages. We also introduce three new word analogy datasets to evaluate these word vectors, for French, Hindi and Polish. |
Edouard Grave; Piotr Bojanowski; Prakhar Gupta; Armand Joulin; Tomas Mikolov; |
2018 | 18 | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering. |
Alon Talmor; Jonathan Herzig; Nicholas Lourie; Jonathan Berant; |
2018 | 19 | AllenNLP: A Deep Semantic Natural Language Processing Platform IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. |
MATT GARDNER et. al. |
2018 | 20 | Neural Network Acceptability Judgments IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Corpus of Linguistic Acceptability (CoLA), a set of 10,657 English sentences labeled as grammatical or ungrammatical from published linguistics literature. |
Alex Warstadt; Amanpreet Singh; Samuel R. Bowman; |
2018 | 21 | XNLI: Evaluating Cross-lingual Sentence Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we construct an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus (MultiNLI) to 15 languages, including low-resource languages such as Swahili and Urdu. |
ALEXIS CONNEAU et. al. |
2018 | 22 | CoQA: A Conversational Question Answering Challenge IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce CoQA, a novel dataset for building Conversational Question Answering systems. |
Siva Reddy; Danqi Chen; Christopher D. Manning; |
2018 | 23 | Annotation Artifacts In Natural Language Inference Data IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. |
SUCHIN GURURANGAN et. al. |
2018 | 24 | QANet: Combining Local Convolution With Global Self-Attention For Reading Comprehension IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new Q&A architecture called QANet, which does not require recurrent networks: Its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions. |
ADAMS WEI YU et. al. |
2018 | 25 | Subword Regularization: Improving Neural Network Translation Models With Multiple Subword Candidates IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The question addressed in this paper is whether it is possible to harness the segmentation ambiguity as a noise to improve the robustness of NMT. |
Taku Kudo; |
2018 | 26 | Can A Suit Of Armor Conduct Electricity? A New Dataset For Open Book Question Answering IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. |
Todor Mihaylov; Peter Clark; Tushar Khot; Ashish Sabharwal; |
2018 | 27 | A Survey On Deep Learning For Named Entity Recognition IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we provide a comprehensive review on existing deep learning techniques for NER. |
Jing Li; Aixin Sun; Jianglei Han; Chenliang Li; |
2018 | 28 | Spider: A Large-Scale Human-Labeled Dataset For Complex And Cross-Domain Semantic Parsing And Text-to-SQL Task IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. |
TAO YU et. al. |
2018 | 29 | Massively Multilingual Sentence Embeddings For Zero-Shot Cross-Lingual Transfer And Beyond IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts. We also introduce a new test set of aligned sentences in 112 languages, and show that our sentence embeddings obtain strong results in multilingual similarity search even for low-resource languages. |
Mikel Artetxe; Holger Schwenk; |
2018 | 30 | Generating Natural Language Adversarial Examples IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given these challenges, we use a black-box population-based optimization algorithm to generate semantically and syntactically similar adversarial examples that fool well-trained sentiment analysis and textual entailment models with success rates of 97% and 70%, respectively. |
MOUSTAFA ALZANTOT et. al. |
2017 | 1 | Attention Is All You Need IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. |
ASHISH VASWANI et. al. |
2017 | 2 | A Broad-Coverage Challenge Corpus For Sentence Understanding Through Inference IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. |
Adina Williams; Nikita Nangia; Samuel R. Bowman; |
2017 | 3 | Get To The Point: Summarization With Pointer-Generator Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. |
Abigail See; Peter J. Liu; Christopher D. Manning; |
2017 | 4 | Convolutional Sequence To Sequence Learning IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an architecture based entirely on convolutional neural networks. |
Jonas Gehring; Michael Auli; David Grangier; Denis Yarats; Yann N. Dauphin; |
2017 | 5 | Recent Trends In Deep Learning Based Natural Language Processing IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we review significant deep learning related models and methods that have been employed for numerous NLP tasks and provide a walk-through of their evolution. |
Tom Young; Devamanyu Hazarika; Soujanya Poria; Erik Cambria; |
2017 | 6 | Natural TTS Synthesis By Conditioning WaveNet On Mel Spectrogram Predictions IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. |
JONATHAN SHEN et. al. |
2017 | 7 | Automated Hate Speech Detection And The Problem Of Offensive Language IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. |
Thomas Davidson; Dana Warmsley; Michael Macy; Ingmar Weber; |
2017 | 8 | Supervised Learning Of Universal Sentence Representations From Natural Language Inference Data IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. |
Alexis Conneau; Douwe Kiela; Holger Schwenk; Loic Barrault; Antoine Bordes; |
2017 | 9 | A Structured Self-attentive Sentence Embedding IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. |
ZHOUHAN LIN et. al. |
2017 | 10 | TriviaQA: A Large Scale Distantly Supervised Challenge Dataset For Reading Comprehension IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. |
Mandar Joshi; Eunsol Choi; Daniel S. Weld; Luke Zettlemoyer; |
2017 | 11 | OpenNMT: Open-source Toolkit For Neural Machine Translation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an open-source toolkit for neural machine translation (NMT) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements. |
GUILLAUME KLEIN et. al. |
2017 | 12 | Reading Wikipedia To Answer Open-Domain Questions IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. |
Danqi Chen; Adam Fisch; Jason Weston; Antoine Bordes; |
2017 | 13 | Tacotron: Towards End-to-End Speech Synthesis IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from characters. |
YUXUAN WANG et. al. |
2017 | 14 | A Simple Neural Network Module For Relational Reasoning IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning. |
ADAM SANTORO et. al. |
2017 | 15 | Word Translation Without Parallel Data IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that we can build a bilingual dictionary between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way. |
Alexis Conneau; Guillaume Lample; Marc’Aurelio Ranzato; Ludovic Denoyer; Hervé Jégou; |
2017 | 16 | Adversarial Examples For Evaluating Reading Comprehension Systems IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). |
Robin Jia; Percy Liang; |
2017 | 17 | A Deep Reinforced Model For Abstractive Summarization IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). |
Romain Paulus; Caiming Xiong; Richard Socher; |
2017 | 18 | Supervised Speech Separation Based On Deep Learning: An Overview IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Much of the overview is on separation algorithms where we review monaural methods, including speech enhancement (speech-nonspeech separation), speaker separation (multi-talker separation), and speech dereverberation, as well as multi-microphone techniques. |
DeLiang Wang; Jitong Chen; |
2017 | 19 | Advances In Pre-Training Distributed Word Representations IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show how to train high-quality word vector representations by using a combination of known tricks that are however rarely used together. |
Tomas Mikolov; Edouard Grave; Piotr Bojanowski; Christian Puhrsch; Armand Joulin; |
2017 | 20 | DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. |
YANRAN LI et. al. |
2017 | 21 | RACE: Large-scale ReAding Comprehension Dataset From Examinations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task. |
Guokun Lai; Qizhe Xie; Hanxiao Liu; Yiming Yang; Eduard Hovy; |
2017 | 22 | State-of-the-art Speech Recognition With Sequence-to-Sequence Models IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. |
CHUNG-CHENG CHIU et. al. |
2017 | 23 | Six Challenges For Neural Machine Translation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore six challenges for neural machine translation: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search. |
Philipp Koehn; Rebecca Knowles; |
2017 | 24 | Unsupervised Machine Translation Using Monolingual Corpora Only IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take this research direction to the extreme and investigate whether it is possible to learn to translate even without any parallel data. |
Guillaume Lample; Alexis Conneau; Ludovic Denoyer; Marc’Aurelio Ranzato; |
2017 | 25 | Tensor Fusion Network For Multimodal Sentiment Analysis IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. |
Amir Zadeh; Minghai Chen; Soujanya Poria; Erik Cambria; Louis-Philippe Morency; |
2017 | 26 | Seq2SQL: Generating Structured Queries From Natural Language Using Reinforcement Learning IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. |
Victor Zhong; Caiming Xiong; Richard Socher; |
2017 | 27 | Deep Learning For Hate Speech Detection In Tweets IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We define this task as being able to classify a tweet as racist, sexist or neither. |
Pinkesh Badjatiya; Shashank Gupta; Manish Gupta; Vasudeva Varma; |
2017 | 28 | HotFlip: White-Box Adversarial Examples For Text Classification IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. |
Javid Ebrahimi; Anyi Rao; Daniel Lowd; Dejing Dou; |
2017 | 29 | Comparative Study Of CNN And RNN For Natural Language Processing IF:8 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Deep neural networks (DNN) have revolutionized the field of natural language processing (NLP). Convolutional neural network (CNN) and recurrent neural network (RNN), the two main … |
Wenpeng Yin; Katharina Kann; Mo Yu; Hinrich Schütze; |
2017 | 30 | Adversarial Learning For Neural Dialogue Generation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, drawing intuition from the Turing test, we propose using adversarial training for open-domain dialogue generation: the system is trained to produce sequences that are indistinguishable from human-generated dialogue utterances. |
JIWEI LI et. al. |
2016 | 1 | Enriching Word Vectors With Subword Information IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. |
Piotr Bojanowski; Edouard Grave; Armand Joulin; Tomas Mikolov; |
2016 | 2 | SQuAD: 100,000+ Questions For Machine Comprehension Of Text IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. |
Pranav Rajpurkar; Jian Zhang; Konstantin Lopyrev; Percy Liang; |
2016 | 3 | Google’s Neural Machine Translation System: Bridging The Gap Between Human And Machine Translation IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present GNMT, Google’s Neural Machine Translation system, which attempts to address many of these issues. |
YONGHUI WU et. al. |
2016 | 4 | Bag Of Tricks For Efficient Text Classification IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores a simple and efficient baseline for text classification. |
Armand Joulin; Edouard Grave; Piotr Bojanowski; Tomas Mikolov; |
2016 | 5 | Neural Architectures For Named Entity Recognition IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce two new neural architectures—one based on bidirectional LSTMs and conditional random fields, and the other that constructs and labels segments using a transition-based approach inspired by shift-reduce parsers. |
Guillaume Lample; Miguel Ballesteros; Sandeep Subramanian; Kazuya Kawakami; Chris Dyer; |
2016 | 6 | Man Is To Computer Programmer As Woman Is To Homemaker? Debiasing Word Embeddings IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This raises concerns because their widespread use, as we describe, often tends to amplify these biases. |
Tolga Bolukbasi; Kai-Wei Chang; James Zou; Venkatesh Saligrama; Adam Kalai; |
2016 | 7 | ConceptNet 5.5: An Open Multilingual Graph Of General Knowledge IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present here a new version of the linked open data resource ConceptNet that is particularly well suited to be used with modern NLP techniques such as word embeddings. |
Robyn Speer; Joshua Chin; Catherine Havasi; |
2016 | 8 | Abstractive Text Summarization Using Sequence-to-Sequence RNNs And Beyond IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora. |
Ramesh Nallapati; Bowen Zhou; Cicero Nogueira dos Santos; Caglar Gulcehre; Bing Xiang; |
2016 | 9 | MS MARCO: A Human Generated MAchine Reading COmprehension Dataset IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a large scale MAchine Reading COmprehension dataset, which we name MS MARCO. |
PAYAL BAJAJ et. al. |
2016 | 10 | Language Modeling With Gated Convolutional Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens. |
Yann N. Dauphin; Angela Fan; Michael Auli; David Grangier; |
2016 | 11 | Bidirectional Attention Flow For Machine Comprehension IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. |
Minjoon Seo; Aniruddha Kembhavi; Ali Farhadi; Hannaneh Hajishirzi; |
2016 | 12 | A Decomposable Attention Model For Natural Language Inference IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a simple neural architecture for natural language inference. |
Ankur P. Parikh; Oscar Täckström; Dipanjan Das; Jakob Uszkoreit; |
2016 | 13 | Deep Reinforcement Learning For Dialogue Generation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show how to integrate these goals, applying deep reinforcement learning to model future reward in chatbot dialogue. |
JIWEI LI et. al. |
2016 | 14 | A Network-based End-to-End Trainable Task-oriented Dialogue System IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we introduce a neural network-based text-in, text-out end-to-end trainable goal-oriented dialogue system along with a new way of collecting dialogue data based on a novel pipe-lined Wizard-of-Oz framework. |
TSUNG-HSIEN WEN et. al. |
2016 | 15 | Recurrent Neural Network For Text Classification With Multi-Task Learning IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we use the multi-task learning framework to jointly learn across multiple related tasks. |
Pengfei Liu; Xipeng Qiu; Xuanjing Huang; |
2016 | 16 | SummaRuNNer: A Recurrent Neural Network Based Sequence Model For Extractive Summarization Of Documents IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art. |
Ramesh Nallapati; Feifei Zhai; Bowen Zhou; |
2016 | 17 | Deep Biaffine Attention For Neural Dependency Parsing IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper builds off recent work from Kiperwasser & Goldberg (2016) using neural attention in a simple graph-based dependency parser. |
Timothy Dozat; Christopher D. Manning; |
2016 | 18 | Exploring The Limits Of Language Modeling IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. |
Rafal Jozefowicz; Oriol Vinyals; Mike Schuster; Noam Shazeer; Yonghui Wu; |
2016 | 19 | End-to-End Relation Extraction Using LSTMs On Sequences And Tree Structures IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel end-to-end neural model to extract entities and relations between them. |
Makoto Miwa; Mohit Bansal; |
2016 | 20 | A Hierarchical Latent Variable Encoder-Decoder Model For Generating Dialogues IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. |
IULIAN VLAD SERBAN et. al. |
2016 | 21 | Enhanced LSTM For Natural Language Inference IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a new state-of-the-art result, achieving the accuracy of 88.6% on the Stanford Natural Language Inference Dataset. |
QIAN CHEN et. al. |
2016 | 22 | Long Short-Term Memory-Networks For Machine Reading IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we address the question of how to render sequence-level networks better at handling structured input. |
Jianpeng Cheng; Li Dong; Mirella Lapata; |
2016 | 23 | A Persona-Based Neural Conversation Model IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present persona-based models for handling the issue of speaker consistency in neural response generation. |
JIWEI LI et. al. |
2016 | 24 | Sequence-Level Knowledge Distillation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neural models in other domains to the problem of NMT. |
Yoon Kim; Alexander M. Rush; |
2016 | 25 | Very Deep Convolutional Networks For Text Classification IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new architecture (VDCNN) for text processing which operates directly at the character level and uses only small convolutions and pooling operations. |
Alexis Conneau; Holger Schwenk; Loïc Barrault; Yann LeCun; |
2016 | 26 | Permutation Invariant Training Of Deep Models For Speaker-Independent Multi-talker Speech Separation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel deep learning model, which supports permutation invariant training (PIT), for speaker independent multi-talker speech separation, commonly known as the cocktail-party problem. |
Dong Yu; Morten Kolbæk; Zheng-Hua Tan; Jesper Jensen; |
2016 | 27 | Key-Value Memory Networks For Directly Reading Documents IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we introduce a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation. |
ALEXANDER MILLER et. al. |
2016 | 28 | Joint CTC-Attention Based End-to-End Speech Recognition Using Multi-task Learning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel method for end-to-end speech recognition to improve robustness and achieve fast convergence by using a joint CTC-attention model within the multi-task learning framework, thereby mitigating the alignment issue. |
Suyoun Kim; Takaaki Hori; Shinji Watanabe; |
2016 | 29 | Diachronic Word Embeddings Reveal Statistical Laws Of Semantic Change IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity—the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation—independent of frequency, words that are more polysemous have higher rates of semantic change. |
William L. Hamilton; Jure Leskovec; Dan Jurafsky; |
2016 | 30 | Aspect Level Sentiment Classification With Deep Memory Network IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a deep memory network for aspect level sentiment classification. |
Duyu Tang; Bing Qin; Ting Liu; |
2015 | 1 | Effective Approaches To Attention-based Neural Machine Translation IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. |
Minh-Thang Luong; Hieu Pham; Christopher D. Manning; |
2015 | 2 | Neural Machine Translation Of Rare Words With Subword Units IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units. |
Rico Sennrich; Barry Haddow; Alexandra Birch; |
2015 | 3 | VQA: Visual Question Answering IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the task of free-form and open-ended Visual Question Answering (VQA). |
AISHWARYA AGRAWAL et. al. |
2015 | 4 | A Large Annotated Corpus For Learning Natural Language Inference IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. |
Samuel R. Bowman; Gabor Angeli; Christopher Potts; Christopher D. Manning; |
2015 | 5 | Bidirectional LSTM-CRF Models For Sequence Tagging IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. |
Zhiheng Huang; Wei Xu; Kai Yu; |
2015 | 6 | Teaching Machines To Read And Comprehend IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we define a new methodology that resolves this bottleneck and provides large scale supervised reading comprehension data. |
KARL MORITZ HERMANN et. al. |
2015 | 7 | Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. |
Kai Sheng Tai; Richard Socher; Christopher D. Manning; |
2015 | 8 | Deep Speech 2: End-to-End Speech Recognition In English And Mandarin IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech, two vastly different languages. |
DARIO AMODEI et. al. |
2015 | 9 | A Neural Attention Model For Abstractive Sentence Summarization IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a fully data-driven approach to abstractive sentence summarization. |
Alexander M. Rush; Sumit Chopra; Jason Weston; |
2015 | 10 | Improving Neural Machine Translation Models With Monolingual Data IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Target-side monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT. |
Rico Sennrich; Barry Haddow; Alexandra Birch; |
2015 | 11 | Skip-Thought Vectors IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe an approach for unsupervised learning of a generic, distributed sentence encoder. |
RYAN KIROS et. al. |
2015 | 12 | Word Sense Disambiguation: A Survey IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we made a survey on Word Sense Disambiguation (WSD). |
Alok Ranjan Pal; Diganta Saha; |
2015 | 13 | A Diversity-Promoting Objective Function For Neural Conversation Models IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead we propose using Maximum Mutual Information (MMI) as the objective function in neural models. |
Jiwei Li; Michel Galley; Chris Brockett; Jianfeng Gao; Bill Dolan; |
2015 | 14 | Named Entity Recognition With Bidirectional LSTM-CNNs IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel neural network architecture that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering. |
Jason P. C. Chiu; Eric Nichols; |
2015 | 15 | A Neural Conversational Model IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. |
Oriol Vinyals; Quoc Le; |
2015 | 16 | Character-Aware Neural Language Models IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe a simple neural language model that relies only on character-level inputs. |
Yoon Kim; Yacine Jernite; David Sontag; Alexander M. Rush; |
2015 | 17 | Convolutional Neural Network Architectures For Matching Natural Language Sentences IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a step toward this goal, we propose convolutional neural network models for matching two sentences, by adapting the convolutional strategy in vision and speech. |
Baotian Hu; Zhengdong Lu; Hang Li; Qingcai Chen; |
2015 | 18 | End-to-End Attention-based Large Vocabulary Speech Recognition IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose two methods to speed up this operation: limiting the scan to a subset of most promising frames and pooling over time the information contained in neighboring frames, thereby reducing source sequence length. |
Dzmitry Bahdanau; Jan Chorowski; Dmitriy Serdyuk; Philemon Brakel; Yoshua Bengio; |
2015 | 19 | Neural Responding Machine For Short-Text Conversation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Neural Responding Machine (NRM), a neural network-based response generator for Short-Text Conversation. |
Lifeng Shang; Zhengdong Lu; Hang Li; |
2015 | 20 | A Primer On Neural Network Models For Natural Language Processing IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation. |
Yoav Goldberg; |
2015 | 21 | ABCNN: Attention-Based Convolutional Neural Network For Modeling Sentence Pairs IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences. |
Wenpeng Yin; Hinrich Schütze; Bing Xiang; Bowen Zhou; |
2015 | 22 | Semantically Conditioned LSTM-based Natural Language Generation For Spoken Dialogue Systems IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a statistical language generator based on a semantically controlled Long Short-term Memory (LSTM) structure. |
TSUNG-HSIEN WEN et. al. |
2015 | 23 | Effective LSTMs For Target-Dependent Sentiment Classification IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we develop two target dependent long short-term memory (LSTM) models, where target information is automatically taken into account. |
Duyu Tang; Bing Qin; Xiaocheng Feng; Ting Liu; |
2015 | 24 | A C-LSTM Neural Network For Text Classification IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we combine the strengths of both architectures and propose a novel and unified model called C-LSTM for sentence representation and text classification. |
Chunting Zhou; Chonglin Sun; Zhiyuan Liu; Francis C. M. Lau; |
2015 | 25 | Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis And Application To Information Retrieval IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, the LSTM-RNN is trained in a weakly supervised manner on user click-through data logged by a commercial web search engine. |
HAMID PALANGI et. al. |
2015 | 26 | Transition-Based Dependency Parsing With Stack Long Short-Term Memory IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a technique for learning representations of parser states in transition-based dependency parsers. |
Chris Dyer; Miguel Ballesteros; Wang Ling; Austin Matthews; Noah A. Smith; |
2015 | 27 | Sentiment Of Emojis IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we propose our Emoji Sentiment Ranking as a European language-independent resource for automated sentiment analysis. |
Petra Kralj Novak; Jasmina Smailović; Borut Sluban; Igor Mozetič; |
2015 | 28 | PTE: Predictive Text Embedding Through Large-scale Heterogeneous Text Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fill this gap by proposing a semi-supervised representation learning method for text data, which we call the predictive text embedding (PTE). |
Jian Tang; Meng Qu; Qiaozhu Mei; |
2015 | 29 | Reasoning About Entailment With Neural Attention IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a neural model that reads two sentences to determine entailment using long short-term memory units. |
Tim Rocktäschel; Edward Grefenstette; Karl Moritz Hermann; Tomáš Kočiský; Phil Blunsom; |
2015 | 30 | A Sensitivity Analysis Of (and Practitioners’ Guide To) Convolutional Neural Networks For Sentence Classification IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus conduct a sensitivity analysis of one-layer CNNs to explore the effect of architecture components on model performance; our aim is to distinguish between important and comparatively inconsequential design decisions for sentence classification. |
Ye Zhang; Byron Wallace; |
2014 | 1 | Neural Machine Translation By Jointly Learning To Align And Translate IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. |
Dzmitry Bahdanau; Kyunghyun Cho; Yoshua Bengio; |
2014 | 2 | Sequence To Sequence Learning With Neural Networks IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. |
Ilya Sutskever; Oriol Vinyals; Quoc V. Le; |
2014 | 3 | Convolutional Neural Networks For Sentence Classification IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. |
Yoon Kim; |
2014 | 4 | Learning Phrase Representations Using RNN Encoder-Decoder For Statistical Machine Translation IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). |
KYUNGHYUN CHO et. al. |
2014 | 5 | Distributed Representations Of Sentences And Documents IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. |
Quoc V. Le; Tomas Mikolov; |
2014 | 6 | A Convolutional Neural Network For Modelling Sentences IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. |
Nal Kalchbrenner; Edward Grefenstette; Phil Blunsom; |
2014 | 7 | On The Properties Of Neural Machine Translation: Encoder-Decoder Approaches IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on analyzing the properties of the neural machine translation using two models; RNN Encoder–Decoder and a newly proposed gated recursive convolutional neural network. |
Kyunghyun Cho; Bart van Merrienboer; Dzmitry Bahdanau; Yoshua Bengio; |
2014 | 8 | Embedding Entities And Relations For Learning And Inference In Knowledge Bases IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Under this framework, we compare a variety of embedding models on the link prediction task. |
Bishan Yang; Wen-tau Yih; Xiaodong He; Jianfeng Gao; Li Deng; |
2014 | 9 | Deep Speech: Scaling Up End-to-end Speech Recognition IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a state-of-the-art speech recognition system developed using end-to-end deep learning. |
AWNI HANNUN et. al. |
2014 | 10 | Word2vec Explained: Deriving Mikolov Et Al.’s Negative-sampling Word-embedding Method IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The learning models behind the software are described in two research papers. |
Yoav Goldberg; Omer Levy; |
2014 | 11 | SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present SimLex-999, a gold standard resource for evaluating distributional semantic models that improves on existing resources in several important ways. |
Felix Hill; Roi Reichart; Anna Korhonen; |
2014 | 12 | On Using Very Large Target Vocabulary For Neural Machine Translation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a method that allows us to use a very large target vocabulary without increasing training complexity, based on importance sampling. |
Sébastien Jean; Kyunghyun Cho; Roland Memisevic; Yoshua Bengio; |
2014 | 13 | Grammar As A Foreign Language IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. |
ORIOL VINYALS et. al. |
2014 | 14 | Effective Use Of Word Order For Text Categorization With Convolutional Neural Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of using low-dimensional word vectors as input as is often done, we directly apply CNN to high-dimensional text data, which leads to directly learning embedding of small text regions for use in classification. |
Rie Johnson; Tong Zhang; |
2014 | 15 | Word2vec Parameter Learning Explained IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the appendix, a review on the basics of neuron networks and backpropagation is provided. |
Xin Rong; |
2014 | 16 | Addressing The Rare Word Problem In Neural Machine Translation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose and implement an effective technique to address this problem. |
Minh-Thang Luong; Ilya Sutskever; Quoc V. Le; Oriol Vinyals; Wojciech Zaremba; |
2014 | 17 | Question Answering With Subgraph Embeddings IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few hand-crafted features. |
Antoine Bordes; Sumit Chopra; Jason Weston; |
2014 | 18 | Statistically Significant Detection Of Linguistic Change IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new computational approach for tracking and detecting statistically significant linguistic shifts in the meaning and usage of words. |
Vivek Kulkarni; Rami Al-Rfou; Bryan Perozzi; Steven Skiena; |
2014 | 19 | Wikipedia-based Semantic Interpretation For Natural Language Processing IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. |
Evgeniy Gabrilovich; Shaul Markovitch; |
2014 | 20 | Comparing And Combining Sentiment Analysis Methods IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study aims at filling this gap by presenting comparisons of eight popular sentiment analysis methods in terms of coverage (i.e., the fraction of messages whose sentiment is identified) and agreement (i.e., the fraction of identified sentiments that are in tune with ground truth). |
Pollyanna Gonçalves; Matheus Araújo; Fabrício Benevenuto; Meeyoung Cha; |
2014 | 21 | Deep Learning For Answer Sentence Selection IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel approach to solving this task via means of distributed representations, and learn to match questions with answers by considering their semantic encoding. |
Lei Yu; Karl Moritz Hermann; Phil Blunsom; Stephen Pulman; |
2014 | 22 | Word Representations Via Gaussian Embedding IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper advocates for density-based distributed embeddings and presents a method for learning representations in the space of Gaussian distributions. |
Luke Vilnis; Andrew McCallum; |
2014 | 23 | Analysis Of Named Entity Recognition And Linking For Tweets IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art. |
LEON DERCZYNSKI et. al. |
2014 | 24 | Improving Zero-shot Learning By Mitigating The Hubness Problem IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: After illustrating the problem empirically, we propose a simple method to correct it by taking the proximity distribution of potential neighbours across many mapped vectors into account. |
Georgiana Dinu; Angeliki Lazaridou; Marco Baroni; |
2014 | 25 | An Autoencoder Approach To Learning Bilingual Word Representations IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are aligned between two languages, while not relying on word-level alignments. |
SARATH CHANDAR A P et. al. |
2014 | 26 | Open Question Answering With Weakly Supervised Embedding Models IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we instead take the radical approach of learning to map questions to vectorial feature representations. |
Antoine Bordes; Jason Weston; Nicolas Usunier; |
2014 | 27 | Temporal Analysis Of Language Through Neural Language Models IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We provide a method for automatically detecting change in language across time through a chronologically trained neural language model. |
Yoon Kim; Yi-I Chiu; Kentaro Hanaki; Darshan Hegde; Slav Petrov; |
2014 | 28 | Multilingual Models For Compositional Distributed Semantics IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. |
Karl Moritz Hermann; Phil Blunsom; |
2014 | 29 | Lexicon Infused Phrase Embeddings For Named Entity Resolution IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present two contributions: a new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both CoNLL and Ontonotes NER. |
Alexandre Passos; Vineet Kumar; Andrew McCallum; |
2014 | 30 | Constructing Long Short-Term Memory Based Deep Recurrent Neural Networks For Large Vocabulary Speech Recognition IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To achieve a further performance improvement, in this research, deep extensions on LSTM are investigated considering that deep hierarchical model has turned out to be more efficient than a shallow one. |
Xiangang Li; Xihong Wu; |
2013 | 1 | Distributed Representations Of Words And Phrases And Their Compositionality IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we present several extensions that improve both the quality of the vectors and the training speed. |
Tomas Mikolov; Ilya Sutskever; Kai Chen; Greg Corrado; Jeffrey Dean; |
2013 | 2 | Efficient Estimation Of Word Representations In Vector Space IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose two novel model architectures for computing continuous vector representations of words from very large data sets. |
Tomas Mikolov; Kai Chen; Greg Corrado; Jeffrey Dean; |
2013 | 3 | Exploiting Similarities Among Languages For Machine Translation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper develops a method that can automate the process of generating and extending dictionaries and phrase tables. |
Tomas Mikolov; Quoc V. Le; Ilya Sutskever; |
2013 | 4 | One Billion Word Benchmark For Measuring Progress In Statistical Language Modeling IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a new benchmark corpus to be used for measuring progress in statistical language modeling. |
CIPRIAN CHELBA et. al. |
2013 | 5 | NRC-Canada: Building The State-of-the-Art In Sentiment Analysis Of Tweets IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe how we created two state-of-the-art SVM classifiers, one to detect the sentiment of messages such as tweets and SMS (message-level task) and one to detect the sentiment of a term within a message (term-level task). Our submissions stood first in both tasks on tweets, obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. |
Saif M. Mohammad; Svetlana Kiritchenko; Xiaodan Zhu; |
2013 | 6 | Linear Models And Linear Mixed Effects Models In R With Linguistic Applications IF:7 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This text is a conceptual introduction to mixed effects modeling with linguistic applications, using the R programming environment. The reader is introduced to linear modeling and … |
Bodo Winter; |
2013 | 7 | Polyglot: Distributed Word Representations For Multilingual NLP IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we train word embeddings for more than 100 languages using their corresponding Wikipedias. |
Rami Al-Rfou; Bryan Perozzi; Steven Skiena; |
2013 | 8 | A Computational Approach To Politeness With Application To Social Factors IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for identifying linguistic aspects of politeness. |
Cristian Danescu-Niculescu-Mizil; Moritz Sudhof; Dan Jurafsky; Jure Leskovec; Christopher Potts; |
2013 | 9 | Good Debt Or Bad Debt: Detecting Semantic Orientations In Economic Texts IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The objective of this article is to investigate how semantic orientations can be better detected in financial and economic news by accommodating the overall phrase-structure information and domain-specific use of language. |
Pekka Malo; Ankur Sinha; Pyry Takala; Pekka Korhonen; Jyrki Wallenius; |
2013 | 10 | Sentiment Analysis In The News IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given these definitions, we present work on mining opinions about entities in English language news, in which (a) we test the relative suitability of various sentiment dictionaries and (b) we attempt to separate positive or negative opinion from good or bad news. |
ALEXANDRA BALAHUR et. al. |
2013 | 11 | Fast And Accurate Sentiment Classification Using An Enhanced Naive Bayes Model IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We have explored different methods of improving the accuracy of a Naive Bayes classifier for sentiment analysis. |
Vivek Narayanan; Ishan Arora; Arjun Bhatia; |
2013 | 12 | Recurrent Convolutional Neural Networks For Discourse Compositionality IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce both a sentence model and a discourse model corresponding to the two levels of compositionality. |
Nal Kalchbrenner; Phil Blunsom; |
2013 | 13 | Connecting Language And Knowledge Bases With Embedding Models For Relation Extraction IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel approach for relation extraction from free text which is trained to jointly use information from the text and from existing knowledge. |
Jason Weston; Antoine Bordes; Oksana Yakhnenko; Nicolas Usunier; |
2013 | 14 | Cross-Recurrence Quantification Analysis Of Categorical And Continuous Time Series: An R Package IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the R package crqa to perform cross-recurrence quantification analysis of two time series of either a categorical or continuous nature. |
Moreno I. Coco; Rick Dale; |
2013 | 15 | Domain And Function: A Dual-Space Model Of Semantic Relations And Compositions IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a dual-space model that unifies these two tasks. |
Peter D. Turney; |
2013 | 16 | From Once Upon A Time To Happily Ever After: Tracking Emotions In Novels And Fairy Tales IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in both individual books and across very large collections. |
Saif Mohammad; |
2013 | 17 | Tracking Sentiment In Mail: How Genders Differ On Emotional Axes IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in many types of mail. |
Saif M. Mohammad; |
2013 | 18 | Multilingual Distributed Representations Without Word Alignment IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We combine these two approaches by proposing a method for learning distributed representations in a multilingual setup. |
Karl Moritz Hermann; Phil Blunsom; |
2013 | 19 | DGT-TM: A Freely Available Translation Memory In 22 Languages IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this reference paper for DGT-TM, we introduce this new resource, provide statistics regarding its size, and explain how it was produced and how to use it. |
Ralf Steinberger; Andreas Eisele; Szymon Klocek; Spyridon Pilos; Patrick Schlüter; |
2013 | 20 | Description And Evaluation Of Semantic Similarity Measures Approaches IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The aim of this paper is to give an efficient evaluation of all these measures, which helps researchers and practitioners select the measure that best fits their requirements. |
Thabet Slimani; |
2013 | 21 | Probabilistic Frame Induction IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the first probabilistic approach to frame induction, which incorporates frames, events, participants as latent topics and learns those frame and event transitions that best explain the text. |
Jackie Chi Kit Cheung; Hoifung Poon; Lucy Vanderwende; |
2013 | 22 | Computing Lexical Contrast IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an automatic method to identify contrasting word pairs that is based on the hypothesis that if a pair of words, A and B, are contrasting, then there is a pair of opposites, C and D, such that A and C are strongly related and B and D are strongly related. |
Saif M. Mohammad; Bonnie J. Dorr; Graeme Hirst; Peter D. Turney; |
2013 | 23 | Sentiment Analysis: How To Derive Prior Polarities From SentiWordNet IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare the most often used techniques together with newly proposed ones and incorporate all of them in a learning framework to see whether blending them can further improve the estimation of prior polarity scores. |
Marco Guerini; Lorenzo Gatti; Marco Turchi; |
2013 | 24 | Opinion Mining And Analysis: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have surveyed and analyzed in this paper, various techniques that have been developed for the key tasks of opinion mining. |
Arti Buche; Dr. M. B. Chandak; Akshay Zadgaonkar; |
2013 | 25 | An Introduction To The Europe Media Monitor Family Of Applications IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the European Union with its 23 official languages, it is particularly important to cover media reports in many languages in order to capture the complementary news content published in the different countries. |
Ralf Steinberger; Bruno Pouliquen; Erik van der Goot; |
2013 | 26 | A Probabilistic Framework For Analysing The Compositionality Of Conceptual Combinations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Utilizing formal frameworks developed for analyzing composite systems in quantum theory, we present two methods that allow the semantics of conceptual combinations to be classified as compositional or non-compositional. |
Peter D. Bruza; Kirsty Kitto; Brentyn J. Ramm; Laurianne Sitbon; |
2013 | 27 | JRC-Names: A Freely Available, Highly Multilingual Named Entity Resource IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes a new, freely available, highly multilingual named entity resource for person and organisation names that has been compiled over seven years of large-scale multilingual news analysis combined with Wikipedia mining, resulting in 205,000 person and organisation names plus about the same number of spelling variants written in over 20 different scripts and in many more languages. |
Ralf Steinberger; Bruno Pouliquen; Mijail Kabadjov; Erik van der Goot; |
2013 | 28 | Word Emdeddings Through Hellinger PCA IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurrence matrix. |
Rémi Lebret; Ronan Collobert; |
2013 | 29 | LDC Arabic Treebanks And Associated Corpora: Data Divisions Manual IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This document details a set of rules that have been defined to enable consistent divisions for old and new Arabic treebanks (ATB) and related corpora. |
Mona Diab; Nizar Habash; Owen Rambow; Ryan Roth; |
2013 | 30 | The Placement Of The Head That Minimizes Online Memory: A Complex Systems Approach IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We discuss various aspects of the dynamics of word order of subject (S), verb (V) and object (O) from a complex systems perspective and suggest that word orders tend to evolve by swapping adjacent constituents from an initial or early SOV configuration that is attracted towards a central word order by online memory minimization. |
Ramon Ferrer-i-Cancho; |
2012 | 1 | Learning To Map Sentences To Logical Form: Structured Classification With Probabilistic Categorial Grammars IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a learning algorithm that takes as input a training set of sentences labeled with expressions in the lambda calculus. |
Luke S. Zettlemoyer; Michael Collins; |
2012 | 2 | A Fast And Simple Algorithm For Training Neural Probabilistic Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a fast and simple algorithm for training NPLMs based on noise-contrastive estimation, a newly introduced procedure for estimating unnormalized continuous distributions. |
Andriy Mnih; Yee Whye Teh; |
2012 | 3 | Gender Identity And Lexical Variation In Social Media IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a study of the relationship between gender, linguistic style, and social networks, using a novel corpus of 14,000 Twitter users. |
David Bamman; Jacob Eisenstein; Tyler Schnoebelen; |
2012 | 4 | Roget’s Thesaurus And Semantic Similarity IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have implemented a system that measures semantic similarity using a computerized 1987 Roget’s Thesaurus, and evaluated it by performing a few typical tests. |
Mario Jarmasz; Stan Szpakowicz; |
2012 | 5 | A Joint Model Of Language And Perception For Grounded Attribute Learning IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an approach for joint learning of language and perception models for grounded attribute induction. |
Cynthia Matuszek; Nicholas FitzGerald; Luke Zettlemoyer; Liefeng Bo; Dieter Fox; |
2012 | 6 | Learning Attitudes And Attributes From Multi-Aspect Reviews IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we build models for rating systems in which such dimensions are explicit, in the sense that users leave separate ratings for each aspect of a product. |
Julian McAuley; Jure Leskovec; Dan Jurafsky; |
2012 | 7 | Diffusion Of Lexical Change In Social Media IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Computer-mediated communication is driving fundamental changes in the nature of written language. |
Jacob Eisenstein; Brendan O’Connor; Noah A. Smith; Eric P. Xing; |
2012 | 8 | Multilingual Topic Models For Unaligned Text IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop the multilingual topic model for unaligned text (MuTo), a probabilistic model of text that is designed to analyze corpora composed of documents in two languages. |
Jordan Boyd-Graber; David Blei; |
2012 | 9 | TempEval-3: Evaluating Events, Time Expressions, And Temporal Relations IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the TempEval-3 task which is currently in preparation for the SemEval-2013 evaluation exercise. |
NAUSHAD UZZAMAN et. al. |
2012 | 10 | OCR Post-Processing Error Correction Algorithm Using Google Online Spelling Suggestion IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a post-processing context-based error correction algorithm for detecting and correcting OCR non-word and real-word errors. |
Youssef Bassil; Mohammad Alwani; |
2012 | 11 | You Had Me At Hello: How Phrasing Affects Memorability IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we develop an analysis framework and build a corpus of movie quotes, annotated with memorability information, in which we are able to control for both the speaker and the setting of the quotes. |
Cristian Danescu-Niculescu-Mizil; Justin Cheng; Jon Kleinberg; Lillian Lee; |
2012 | 12 | Exploring Text Virality In Social Networks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to shed some light on the concept of virality – especially in social networks – and to provide new insights on its structure. |
Marco Guerini; Carlo Strapparava; Gozde Ozbal; |
2012 | 13 | Distributional Measures Of Semantic Distance: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, this paper presents a detailed study of distributional measures. |
Saif M. Mohammad; Graeme Hirst; |
2012 | 14 | Two Step CCA: A New Spectral Method For Estimating Vector Models Of Words IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new spectral method based on CCA to learn an eigenword dictionary. |
Paramveer Dhillon; Jordan Rodu; Dean Foster; Lyle Ungar; |
2012 | 15 | Cross Language Text Classification Via Subspace Co-Regularized Multi-View Learning IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we develop a novel subspace co-regularized multi-view learning method for cross language text classification. |
Yuhong Guo; Min Xiao; |
2012 | 16 | A Practical Approach To Language Complexity: A Wikipedia Case Study IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present statistical analysis of English texts from Wikipedia. |
Taha Yasseri; András Kornai; János Kertész; |
2012 | 17 | Large Scale Language Modeling In Automatic Speech Recognition IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models have been proven quite beneficial for a variety of automatic speech recognition tasks in Google. |
Ciprian Chelba; Dan Bikel; Maria Shugrina; Patrick Nguyen; Shankar Kumar; |
2012 | 18 | Average Word Length Dynamics As Indicator Of Cultural Changes In Society IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Dynamics of average length of words in Russian and English is analysed in the article. |
Vladimir V. Bochkarev; Anna V. Shevlyakova; Valery D. Solovyev; |
2012 | 19 | Roget’s Thesaurus As A Lexical Resource For Natural Language Processing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by WordNet’s success, we propose as an alternative a similar resource, based on the 1987 Penguin edition of Roget’s Thesaurus of English Words and Phrases. |
Mario Jarmasz; |
2012 | 20 | Distributional Measures As Proxies For Semantic Relatedness IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper is a detailed study of some of the major distributional measures; it lists their respective merits and limitations. |
Saif M Mohammad; Graeme Hirst; |
2012 | 21 | Segmentation Similarity And Agreement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance, essentially using edit distance as a penalty function and scaling penalties by segmentation size. |
Chris Fournier; Diana Inkpen; |
2012 | 22 | OCR Context-Sensitive Error Correction Based On Google Web 1T 5-Gram Data Set IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a post-processing OCR context-sensitive error correction method for detecting and correcting non-word and real-word OCR errors. |
Youssef Bassil; Mohammad Alwani; |
2012 | 23 | A Comparative Study Of Root-based And Stem-based Approaches For Measuring The Similarity Between Arabic Words For Arabic Text Mining Applications IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare and contrast the effect of two preprocessing techniques applied to an Arabic corpus: Root-based (Stemming) and Stem-based (Light Stemming) approaches for measuring the similarity between Arabic words with the well-known abstractive model, Latent Semantic Analysis (LSA), with a wide variety of distance functions and similarity measures, such as the Euclidean Distance, Cosine Similarity, Jaccard Coefficient, and the Pearson Correlation Coefficient. |
Hanane Froud; Abdelmonaim Lachkar; Said Alaoui Ouatik; |
2012 | 24 | Recognizing Bangla Grammar Using Predictive Parser IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a Context Free Grammar (CFG) for Bangla language and hence we propose a Bangla parser based on the grammar. |
K. M. Azharul Hasan; Amit Mondal; Amit Saha; |
2012 | 25 | A Principled Approach To Grammars For Controlled Natural Languages And Predictive Editors IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A parsing approach for Codeco based on an extended chart parsing algorithm is presented. |
Tobias Kuhn; |
2012 | 26 | Distinct Word Length Frequencies: Distributions And Symbol Entropies IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The distribution of frequency counts of distinct words by length in a language’s vocabulary will be analyzed using two methods. |
Reginald D. Smith; |
2012 | 27 | Post-Editing Error Correction Algorithm For Speech Recognition Using Bing Spelling Suggestion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a post-editing ASR error correction method and algorithm based on Bing’s online spelling suggestion. |
Youssef Bassil; Mohammad Alwani; |
2012 | 28 | Arabic Keyphrase Extraction Using Linguistic Knowledge And Machine Learning Techniques IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a supervised learning technique for extracting keyphrases of Arabic documents is presented. |
Tarek El-shishtawy; Abdulwahab Al-sammak; |
2012 | 29 | Beyond Sentiment: The Manifold Of Human Emotions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we consider higher dimensional extensions of the sentiment concept, which represent a richer set of human emotions. |
Seungyeon Kim; Fuxin Li; Guy Lebanon; Irfan Essa; |
2012 | 30 | Not As Easy As It Seems: Automating The Construction Of Lexical Chains Using Roget’s Thesaurus IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Morris and Hirst present a method of linking significant words that are about the same topic. |
Mario Jarmasz; Stan Szpakowicz; |
2011 | 1 | LexRank: Graph-based Lexical Centrality As Salience In Text Summarization IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. |
Gunes Erkan; Dragomir R. Radev; |
2011 | 2 | Finding Deceptive Opinion Spam By Any Stretch Of The Imagination IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90% accurate on our gold-standard opinion spam dataset. |
Myle Ott; Yejin Choi; Claire Cardie; Jeffrey T. Hancock; |
2011 | 3 | A Universal Part-of-Speech Tagset IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To facilitate future research in unsupervised induction of syntactic structure and to standardize best-practices, we propose a tagset that consists of twelve universal part-of-speech categories. |
Slav Petrov; Dipanjan Das; Ryan McDonald; |
2011 | 4 | User-level Sentiment Analysis Incorporating Social Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Employing Twitter as a source for our experimental data, and working within a semi-supervised framework, we propose models that are induced either from the Twitter follower/followee network or from the network in Twitter formed by users referring to each other using @ mentions. |
CHENHAO TAN et. al. |
2011 | 5 | Chameleons In Imagined Conversations: A New Approach To Understanding Coordination Of Linguistic Style In Dialogs IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Indeed, we find significant coordination across many families of function words in our large movie-script corpus. |
Cristian Danescu-Niculescu-Mizil; Lillian Lee; |
2011 | 6 | Experimental Support For A Categorical Compositional Distributional Model Of Meaning IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We implement the abstract categorical model of Coecke et al. (arXiv:1003.4394v1 [cs.CL]) using data from the BNC and evaluate it. |
Edward Grefenstette; Mehrnoosh Sadrzadeh; |
2011 | 7 | Mark My Words! Linguistic Style Accommodation In Social Media IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. |
Cristian Danescu-Niculescu-Mizil; Michael Gamon; Susan Dumais; |
2011 | 8 | Individual And Domain Adaptation In Sentence Planning For Dialogue IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present and evaluate a trainable sentence planner for providing restaurant information in the MATCH dialogue system. |
F. Mairesse; R. Prasad; A. Stent; M. A. Walker; |
2011 | 9 | OMG U Got Flu? Analysis Of Shared Health Messages For Bio-surveillance IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to ground user messages in epidemic response we focused on tracking reports of self-protective behaviour such as avoiding public gatherings or increased sanitation as the basis for further risk analysis. |
Nigel Collier; Nguyen Truong Son; Ngoc Mai Nguyen; |
2011 | 10 | Creating A Live, Public Short Message Service Corpus: The NUS SMS Corpus IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe our efforts to collect a public SMS corpus to address this problem. |
Tao Chen; Min-Yen Kan; |
2011 | 11 | Positive Words Carry Less Information Than Negative Words IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that the frequency of word use is determined not only by the word length [Zipf 1935] and the average information content [Piantadosi 2011], but also by its emotional content. |
David Garcia; Antonios Garas; Frank Schweitzer; |
2011 | 12 | Learning Content Selection Rules For Generating Object Descriptions In Dialogue IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we use the annotated COCONUT corpus of task-oriented design dialogues to develop feature sets based on Dale and Reiter’s (1995) incremental model, Brennan and Clark’s (1996) conceptual pact model, and Jordan’s (2000b) intentional influences model, and use these feature sets in a machine learning experiment to automatically learn a model of content selection for object descriptions. |
P. W. Jordan; M. A. Walker; |
2011 | 13 | Acquiring Correct Knowledge For Natural Language Generation IF:4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Natural language generation (NLG) systems are computer software systems that produce texts in English and other human languages, often from non-linguistic input data. NLG systems, … |
E. Reiter; R. Robertson; S. G. Sripada; |
2011 | 14 | Combining Knowledge- And Corpus-based Word-Sense-Disambiguation Methods IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we concentrate on the resolution of the lexical ambiguity that arises when a given word has several different meanings. |
A. Montoyo; M. Palomar; G. Rigau; A. Suarez; |
2011 | 15 | Learning Sentence-internal Temporal Relations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose a data intensive approach for inferring sentence-internal temporal relations. |
M. Lapata; A. Lascarides; |
2011 | 16 | Acquiring Word-Meaning Mappings For Natural Language Interfaces IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. |
C. Thompson; |
2011 | 17 | A Context-theoretic Framework For Compositionality In Distributional Semantics IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors, based on a theoretical analysis which assumes that meaning is determined by context. |
Daoud Clarke; |
2011 | 18 | Experimenting With Transitive Verbs In A DisCoCat IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we extend this study by examining transitive verbs, represented as matrices in a DisCoCat. |
Edward Grefenstette; Mehrnoosh Sadrzadeh; |
2011 | 19 | Statistical Sign Language Machine Translation: From English Written Text To American Sign Language Gloss IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Contributions of this work are the use of a new language pair, English/ASL, and an improvement of statistical machine translation based on string matching using the Jaro distance. |
Achraf Othman; Mohamed Jemni; |
2011 | 20 | A Comparison Of Different Machine Transliteration Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Four machine transliteration models — grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model — have been proposed by several researchers. |
K. Choi; H. Isahara; J. Oh; |
2011 | 21 | Syndromic Classification Of Twitter Messages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent studies have shown strong correlation between social networking data and national influenza rates. |
Nigel Collier; Son Doan; |
2011 | 22 | Recognizing Uncertainty In Speech IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address the problem of inferring a speaker’s level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. |
Heather Pon-Barry; Stuart M. Shieber; |
2011 | 23 | NEMO: Extraction And Normalization Of Organization Names From PubMed Affiliation Strings IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose NEMO, a system for extracting organization names in the affiliation and normalizing them to a canonical organization name. |
Siddhartha Jonnalagadda; Philip Topham; |
2011 | 24 | NP Animacy Identification For Anaphora Resolution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, two methods for animacy identification are proposed and evaluated using intrinsic and extrinsic measures. |
R. J. Evans; C. Orasan; |
2011 | 25 | Malagasy Dialects And The Peopling Of Madagascar IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research we try to answer these problems together with other ones, such as the historical configuration of Malagasy dialects, by types of analysis related to lexicostatistics and glottochronology which draw upon the automated method recently proposed by the authors [Serva 2008, Holman 2008, Petroni 2008, Bakker 2009]. |
M. Serva; F. Petroni; D. Volchenkov; S. Wichmann; |
2011 | 26 | BioSimplify: An Open Source Sentence Simplification Engine To Improve Recall In Automatic Biomedical Information Extraction IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: BioSimplify is an open source tool written in Java that introduces and facilitates the use of a novel model for sentence simplification tuned for automatic discourse analysis and … |
Siddhartha Jonnalagadda; Graciela Gonzalez; |
2011 | 27 | Fitting Ranked English And Spanish Letter Frequency Distribution In U.S. And Mexican Presidential Speeches IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to critically compare various functions, we apply the statistical model selections on ten functions, using the texts of U.S. and Mexican presidential speeches in the last 1-2 centuries. |
Wentian Li; Pedro Miramontes; |
2011 | 28 | Exploring Twitter Hashtags IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a dataset of 29 million messages, I explore relations among these hashtags with respect to co-occurrences. |
Jan Pöschko; |
2011 | 29 | What’s Unusual In Online Disease Outbreak News? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we address the issue of systematically evaluating online health news to support automatic alerting using daily disease-country counts text mined from real world data using BioCaster. |
Nigel Collier; |
2010 | 1 | From Frequency To Meaning: Vector Space Models Of Semantics IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field. |
Peter D. Turney; Patrick Pantel; |
2010 | 2 | Speech Recognition By Machine, A Review IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of speech recognition systems and to identify research topics and applications which are at the forefront of this exciting and challenging field. |
M. A. Anusuya; S. K. Katti; |
2010 | 3 | Mathematical Foundations For A Compositional Distributional Model Of Meaning IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. |
Bob Coecke; Mehrnoosh Sadrzadeh; Stephen Clark; |
2010 | 4 | A PDTB-Styled End-to-End Discourse Parser IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a comprehensive evaluation from both component-wise and error-cascading perspectives. |
Ziheng Lin; Hwee Tou Ng; Min-Yen Kan; |
2010 | 5 | Syntactic Topic Models IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We derive a fast posterior inference algorithm based on variational methods. |
Jordan Boyd-Graber; David M. Blei; |
2010 | 6 | For The Sake Of Simplicity: Unsupervised Extraction Of Lexical Simplifications From Wikipedia IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider two main approaches: (1) deriving simplification probabilities via an edit model that accounts for a mixture of different operations, and (2) using metadata to focus on edits that are more likely to be simplification operations. |
Mark Yatskar; Bo Pang; Cristian Danescu-Niculescu-Mizil; Lillian Lee; |
2010 | 7 | The Semantic Mapping Of Words And Co-words In Contexts IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This communication provides an introduction, an example, pointers to relevant software, and summarizes the choices that can be made by the analyst. |
Loet Leydesdorff; Kasper Welbers; |
2010 | 8 | Displacement Calculus IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we introduce the displacement calculus, a generalization of Lambek calculus, which preserves its good proof-theoretic properties while embracing discontinuity and subsuming it. |
Glyn Morrill; Oriol Valentín; |
2010 | 9 | Concrete Sentence Spaces For Compositional Distributional Models Of Meaning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we provide a concrete method for implementing this linear meaning map, by constructing a corpus-based vector space for the type of sentence. |
Edward Grefenstette; Mehrnoosh Sadrzadeh; Stephen Clark; Bob Coecke; Stephen Pulman; |
2010 | 10 | Towards Effective Sentence Simplification For Automatic Processing Of Biomedical Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a text simplification process, bioSimplify, that seeks to reduce the complexity of sentences in biomedical abstracts in order to improve the performance of syntactic parsers on the processed sentences. |
Siddhartha Jonnalagadda; Luis Tari; Jorg Hakenberg; Chitta Baral; Graciela Gonzalez; |
2010 | 11 | Niche As A Determinant Of Word Fate In Online Groups IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. |
Eduardo G. Altmann; Janet B. Pierrehumbert; Adilson E. Motter; |
2010 | 12 | The Probabilistic Analysis Of Language Acquisition: Theoretical, Computational, And Experimental Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we present a new experiment which tests these learnability predictions. |
Anne S. Hsu; Nick Chater; Paul M. B. Vitanyi; |
2010 | 13 | Learning Recursive Segments For Discourse Parsing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. |
Stergos Afantenos; Pascal Denis; Philippe Muller; Laurence Danlos; |
2010 | 14 | A Generic Tool To Generate A Lexicon For NLP From Lexicon-Grammar Tables IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce LGExtract, a generic tool for generating a syntactic lexicon for NLP from the Lexicon-Grammar tables. |
Matthieu Constant; Elsa Tolone; |
2010 | 15 | Sentence Simplification Aids Protein-Protein Interaction Extraction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we report on the impact that automatic simplification of sentences has on the performance of a state-of-the-art PPI extraction system, showing a substantial improvement in recall (8%) when the sentence simplification method is applied, without significant impact on precision. |
Siddhartha Jonnalagadda; Graciela Gonzalez; |
2010 | 16 | Opinion Polarity Identification Through Adjectives IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a technique for identifying polarity of reviews by identifying the polarity of the adjectives that appear in them. |
Samaneh Moghaddam; Fred Popowich; |
2010 | 17 | Lexical Co-occurrence, Statistical Significance, And Word Association IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a theoretical framework for discovering statistically significant lexical co-occurrences from a given corpus. |
Dipak Chaudhari; Om P. Damani; Srivatsan Laxman; |
2010 | 18 | Testing SDRT’s Right Frontier IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we provide strong empirical support for SDRT’s version of RFC. |
Stergos Afantenos; Nicholas Asher; |
2010 | 19 | La Représentation Formelle Des Concepts Spatiaux Dans La Langue IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we assume that systematically studying the semantics of spatial markers in language provides a means to reveal fundamental properties and concepts characterizing conceptual representations of space. |
Michel Aurnague; Laure Vieu; Andrée Borillo; |