Paper Digest: SIGIR 2016 Highlights
SIGIR (Annual International ACM SIGIR Conference on Research and Development in Information Retrieval) is one of the top information retrieval conferences in the world.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly get the main idea of each paper.
If you do not want to miss any interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: SIGIR 2016 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Understanding Human Language: Can NLP and Deep Learning Help? | Christopher Manning | My talk will emphasize the two topics of how NLP can contribute to understanding textual relationships and how deep learning approaches substantially aid in this goal. |
2 | Big Data in Climate: Opportunities and Challenges for Machine Learning | Vipin Kumar | This talk will present an overview of research being done in a large interdisciplinary project on the development of novel data mining and machine learning approaches for analyzing the massive amounts of climate and ecosystem data now available from satellite and ground-based sensors, and physics-based climate model simulations. |
3 | Statistical Significance, Power, and Sample Sizes: A Systematic Review of SIGIR and TOIS, 2006-2015 | Tetsuya Sakai | The original objective of the study was to identify IR effectiveness experiments that are seriously underpowered (i.e., the sample size is far too small so that the probability of missing a real difference is extremely high) or overpowered (i.e., the sample size is so large that a difference will be considered statistically significant even if the actual effect size is extremely small). |
4 | Bayesian Performance Comparison of Text Classifiers | Dell Zhang, Jun Wang, Emine Yilmaz, Xiaoling Wang, Yuxin Zhou | In this paper, we propose a novel Bayesian approach to the performance comparison of text classifiers, and argue its advantages over the traditional frequentist approach based on t-test etc. |
5 | A General Linear Mixed Models Approach to Study System Component Effects | Nicola Ferro, Gianmaria Silvello | In this paper, we face the problem of studying system variance in order to better understand how much system components contribute to overall performances. |
6 | Searching by Talking: Analysis of Voice Queries on Mobile Web Search | Ido Guy | In this paper, we examine the logs of a commercial search engine’s mobile interface, and compare the spoken queries to the typed-in queries. |
7 | Predicting User Satisfaction with Intelligent Assistants | Julia Kiseleva, Kyle Williams, Ahmed Hassan Awadallah, Aidan C. Crook, Imed Zitouni, Tasos Anastasakos | In this paper, we propose an automatic method to predict user satisfaction with intelligent assistants that exploits all the interaction signals, including voice commands and physical touch gestures on the device. |
8 | Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System | Rui Yan, Yiping Song, Hua Wu | In this paper, we propose a retrieval-based conversation system with the deep learning-to-respond schema through a deep neural network framework driven by web data. |
9 | Document Retrieval Using Entity-Based Language Models | Hadas Raviv, Oren Kurland, David Carmel | We address the ad hoc document retrieval task by devising novel types of entity-based language models. |
10 | Engineering Quality and Reliability in Technology-Assisted Review | Gordon V. Cormack, Maura R. Grossman | The objective of technology-assisted review ("TAR") is to find as much relevant information as possible with reasonable effort. |
11 | A Sequential Decision Formulation of the Interface Card Model for Interactive IR | Yinan Zhang, Chengxiang Zhai | We propose a novel formulation of the Interface Card model based on sequential decision theory, leading to a general framework for formal modeling of user states and stopping actions. |
12 | Generalized BROOF-L2R: A General Framework for Learning to Rank Based on Boosting and Random Forests | Clebson C.A. de Sá, Marcos A. Gonçalves, Daniel X. Sousa, Thiago Salles | In this paper, we propose a general framework that smoothly combines ensembles of additive trees, specifically Random Forests, with Boosting in an original way for the task of L2R. |
13 | An Optimization Framework for Remapping and Reweighting Noisy Relevance Labels | Yury Ustinovskiy, Valentina Fedorova, Gleb Gusev, Pavel Serdyukov | The major goal of this paper is to unify existing approaches to consensus modeling and noise reduction within a learning to rank framework. |
14 | Learning to Rank with Selection Bias in Personal Search | Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork | In this paper, we study the problem of how to leverage sparse click data in personal search and introduce a novel selection bias problem and address it in the learning-to-rank framework. |
15 | On Effective Personalized Music Retrieval by Exploring Online User Behaviors | Zhiyong Cheng, Shen Jialie, Steven C.H. Hoi | In this paper, we study the problem of personalized text-based music retrieval, which takes users’ music preferences on songs into account via the analysis of online listening behaviours and social tags. |
16 | Semantification of Identifiers in Mathematics for Better Math Information Retrieval | Moritz Schubotz, Alexey Grigorev, Marcus Leich, Howard S. Cohl, Norman Meuschke, Bela Gipp, Abdou S. Youssef, Volker Markl | As scientific communities tend to establish standard (identifier) notations, we use the document domain to infer the actual meaning of an identifier. |
17 | Multi-Stage Math Formula Search: Using Appearance-Based Similarity Metrics at Scale | Richard Zanibbi, Kenny Davila, Andrew Kane, Frank Wm. Tompa | Using a Symbol Layout Tree representation for formula appearance, we propose the Maximum Subtree Similarity (MSS) for ranking formulae based upon the subexpression whose symbols and layout best match a query formula. |
18 | Explainable User Clustering in Short Text Streams | Yukun Zhao, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, Maarten de Rijke | To address this problem, we propose a dynamic user clustering topic model (or UCT for short). |
19 | Topic Modeling for Short Texts with Auxiliary Word Embeddings | Chenliang Li, Haoran Wang, Zhiqian Zhang, Aixin Sun, Zongyang Ma | To this end, we propose a simple, fast, and effective topic model for short texts, named GPU-DMM. |
20 | Interleaved Evaluation for Retrospective Summarization and Prospective Notification on Document Streams | Xin Qian, Jimmy Lin, Adam Roegiest | We propose and validate a novel interleaved evaluation methodology for two complementary information seeking tasks on document streams: retrospective summarization and prospective notification. |
21 | Learning Query and Document Relevance from a Web-scale Click Graph | Shan Jiang, Yuening Hu, Changsung Kang, Tim Daly, Dawei Yin, Yi Chang, Chengxiang Zhai | This paper proposes a vector propagation algorithm on the click graph to learn vector representations for both queries and documents in the same semantic space. |
22 | Click-based Hot Fixes for Underperforming Torso Queries | Masrour Zoghi, Tomáš Tunys, Lihong Li, Damien Jose, Junyan Chen, Chun Ming Chin, Maarten de Rijke | In this paper, we address the challenge of dealing with torso queries on which the production ranker is performing poorly. |
23 | A Context-aware Time Model for Web Search | Alexey Borisov, Ilya Markov, Maarten de Rijke, Pavel Serdyukov | To account for this context bias effect, we propose a context-aware time model (CATM). |
24 | Novelty based Ranking of Human Answers for Community Questions | Adi Omari, David Carmel, Oleg Rokhlenko, Idan Szpektor | We propose a novel answer ranking algorithm that borrows ideas from aspect ranking and multi-document summarization, but adapts them to our scenario. |
25 | That’s Not My Question: Learning to Weight Unmatched Terms in CQA Vertical Search | Boaz Petersil, Avihai Mejer, Idan Szpektor, Koby Crammer | In this work we propose a novel term weighting model that directly assesses the weights of unmatched terms, and show its benefits. |
26 | When a Knowledge Base Is Not Enough: Question Answering over Knowledge Bases with External Text Data | Denis Savenkov, Eugene Agichtein | We introduce a new system, Text2KB, that enriches question answering over a knowledge base by using external text data. |
27 | Transfer Learning for Cross-Lingual Sentiment Classification with Weakly Shared Deep Neural Networks | Guangyou Zhou, Zhao Zeng, Jimmy Xiangji Huang, Tingting He | Transfer Learning for Cross-Lingual Sentiment Classification with Weakly Shared Deep Neural Networks |
28 | Query to Knowledge: Unsupervised Entity Extraction from Shopping Queries using Adaptor Grammars | Ke Zhai, Zornitsa Kozareva, Yuening Hu, Qi Li, Weiwei Guo | In this paper, we focus on the problem of automatically identifying brand and product entities from a large collection of web queries in online shopping domain. We present three different sets of grammar rules used to infer query structures and extract brand and product entities. |
29 | Learning for Efficient Supervised Query Expansion via Two-stage Feature Selection | Zhiwei Zhang, Qifan Wang, Luo Si, Jianfeng Gao | In this paper, we point out that the cost of SQE mainly comes from term feature extraction, and propose a Two-stage Feature Selection framework (TFS) to address this problem. |
30 | Leveraging Context-Free Grammar for Efficient Inverted Index Compression | Zhaohua Zhang, Jiancong Tong, Haibing Huang, Jin Liang, Tianlong Li, Rebecca J. Stones, Gang Wang, Xiaoguang Liu | In this paper, we propose a new grammar-based inverted index compression scheme, which can improve the performance of both index compression and query processing. |
31 | Fast and Compact Hamming Distance Index | Simon Gog, Rossano Venturini | In this paper we propose new solutions for the approximate dictionary queries problem. |
32 | Fast First-Phase Candidate Generation for Cascading Rankers | Qi Wang, Constantinos Dimopoulos, Torsten Suel | Our contribution is to propose an alternative framework that builds specialized single-term and pairwise index structures, and then during query time selectively accesses these structures based on a cost budget and a set of early termination techniques. |
33 | Learning to Rank Features for Recommendation over Multiple Categories | Xu Chen, Zheng Qin, Yongfeng Zhang, Tao Xu | Learning to Rank Features for Recommendation over Multiple Categories |
34 | How Much Novelty is Relevant?: It Depends on Your Curiosity | Pengfei Zhao, Dik Lun Lee | In this paper, we propose a curiosity-based recommendation system (CBRS) framework which generates recommendations with a personalized amount of DU’s to fit the user’s curiosity level. |
35 | Discrete Collaborative Filtering | Hanwang Zhang, Fumin Shen, Wei Liu, Xiangnan He, Huanbo Luan, Tat-Seng Chua | In this paper, we propose a principled CF hashing framework called Discrete Collaborative Filtering (DCF), which directly tackles the challenging discrete optimization that should have been treated adequately in hashing. |
36 | Understanding Information Need: An fMRI Study | Yashar Moshfeghi, Peter Triantafillou, Frank E. Pollick | In this paper, we investigate the connection between an information need and brain activity. |
37 | User Behavior in Asynchronous Slow Search | Ryan Burton, Kevyn Collins-Thompson | In this work, we present the first study to analyze users’ willingness to wait and their search success, when given a Web search system that embodies characteristics of slow search, where speed can be traded for an improvement in quality. |
38 | Going back in Time: An Investigation of Social Media Re-finding | Florian Meier, David Elsweiler | We present results from a 5-month-long naturalistic, log-based study of user interaction with Twitter, which suggest re-finding to be a regular activity and that Tweets can offer utility for longer than one might think. |
39 | R-Susceptibility: An IR-Centric Approach to Assessing Privacy Risks for Users in Online Communities | Joanna Asia Biega, Krishna P. Gummadi, Ida Mele, Dragan Milchevski, Christos Tryfonopoulos, Gerhard Weikum | This paper presents a ranking-based approach to the assessment of privacy risks emerging from textual contents in online communities, focusing on sensitive topics, such as being depressed. |
40 | Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising | Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Ricardo Baeza-Yates, Andrew Feng, Erik Ordentlich, Lee Yang, Gavin Owens | We present a novel advance match approach based on the idea of semantic embeddings of queries and ads. |
41 | Retrieving Non-Redundant Questions to Summarize a Product Review | Mengwen Liu, Yi Fang, Dae Hoon Park, Xiaohua Hu, Zhengtao Yu | In our study, we aim to help customers who want to quickly capture the main idea of a lengthy product review before they read the details. |
42 | Modeling Document Novelty with Neural Tensor Network for Search Result Diversification | Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng | In this paper, we propose to model the novelty of a document with a neural tensor network. |
43 | ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search | Kazutoshi Umemoto, Takehiro Yamamoto, Katsumi Tanaka | With the goal of helping searchers discover unexplored aspects and find the appropriate timing for search stopping in intrinsically diverse tasks, we propose ScentBar, a query suggestion interface visualizing the amount of important information that a user potentially misses collecting from the search results of individual queries. |
44 | Evaluating Search Result Diversity using Intent Hierarchies | Xiaojie Wang, Zhicheng Dou, Tetsuya Sakai, Ji-Rong Wen | In this paper, we introduce intent hierarchies to model the relationships among intents. |
45 | Robust and Collective Entity Disambiguation through Semantic Embeddings | Stefan Zwicklbauer, Christin Seifert, Michael Granitzer | We propose a new collective, graph-based disambiguation algorithm utilizing semantic entity and document embeddings for robust entity disambiguation. |
46 | Parameterized Fielded Term Dependence Models for Ad-hoc Entity Retrieval from Knowledge Graph | Fedor Nikolaev, Alexander Kotov, Nikita Zhiltsov | In this paper, we demonstrate that existing retrieval models for ad-hoc structured and unstructured document retrieval fall short of addressing this problem, due to their rigid assumptions. |
47 | Hierarchical Random Walk Inference in Knowledge Graphs | Qiao Liu, Liuyi Jiang, Minghao Han, Yao Liu, Zhiguang Qin | The central problem in the study of relational inference is to infer unknown relations between entities from the facts given in the knowledge bases. |
48 | When Watson Went to Work: Leveraging Cognitive Computing in the Real World | Aya Soffer, David Konopnicki, Haggai Roitman | When Watson Went to Work: Leveraging Cognitive Computing in the Real World |
49 | Ask Your TV: Real-Time Question Answering with Recurrent Neural Networks | Ferhan Ture, Oliver Jojic | We describe a real-time factoid question answering (QA) system, using our internal KG for training (i.e., generating labeled example question-answer pairs) and for retrieval at test time. |
50 | Amazon Search: The Joy of Ranking Products | Daria Sorokina, Erick Cantu-Paz | In this talk we are going to cover a number of relevance algorithms used in Amazon Search today. |
51 | Learning to Rank Personalized Search Results in Professional Networks | Viet Ha-Thuc, Shakti Sinha | This paper presents our approach to achieving this by mining various data sources available in LinkedIn to infer searchers’ intents (such as hiring, job seeking, etc.), as well as extending the concept of homophily to capture the searcher-result similarities on many aspects. |
52 | When does Relevance Mean Usefulness and User Satisfaction in Web Search? | Jiaxin Mao, Yiqun Liu, Ke Zhou, Jian-Yun Nie, Jingtao Song, Min Zhang, Shaoping Ma, Jiashen Sun, Hengliang Luo | In this study, we confirm the difference by a laboratory study in which we collect relevance annotations by external assessors, usefulness and user satisfaction information by users, for a set of search tasks. |
53 | How Many Workers to Ask?: Adaptive Exploration for Collecting High Quality Labels | Ittai Abraham, Omar Alonso, Vasilis Kandylas, Rajesh Patel, Steven Shelford, Aleksandrs Slivkins | In this paper we investigate how to devise better stopping rules given workers’ performance quality scores. |
54 | Risk-Sensitive Evaluation and Learning to Rank using Multiple Baselines | B. Taner Dinçer, Craig Macdonald, Iadh Ounis | Based upon the Chi-squared statistic, we propose a new measure ZRisk that exhibits more promise since it takes into account multiple baselines when measuring risk, and a derivative measure called GeoRisk, which enhances ZRisk by also taking into account the overall magnitude of effectiveness. |
55 | Event Digest: A Holistic View on Past Events | Arunav Mishra, Klaus Berberich | We propose a problem of automatic event digest generation to aid effective and efficient retrospection. |
56 | Terms over LOAD: Leveraging Named Entities for Cross-Document Extraction and Summarization of Events | Andreas Spitz, Michael Gertz | In this paper, we introduce the LOAD model for cross-document event extraction in large-scale document collections. |
57 | GeoBurst: Real-Time Local Event Detection in Geo-Tagged Tweet Streams | Chao Zhang, Guangyu Zhou, Quan Yuan, Honglei Zhuang, Yu Zheng, Lance Kaplan, Shaowen Wang, Jiawei Han | We propose GeoBurst, a method that enables effective and real-time local event detection from geo-tagged tweet streams. |
58 | Building a Self-Learning Search Engine: From Research to Business | Manos Tsagkias, Wouter Weerkamp | In this presentation we describe how to go from research to business and the challenges this brings along. |
59 | Sedano: A News Stream Processor for Business | Ugo Scaiella, Giacomo Berardi, Giuliano Mega, Roberto Santoro | We present Sedano, a system for processing and indexing a continuous stream of business-related news. |
60 | Ranking Financial Tweets | Diego Ceccarelli, Francesco Nidito, Miles Osborne | Here we consider whether popularity factors within Twitter can be used as a signal for popularity within the domain of financial experts. |
61 | Contextual Bandits in a Collaborative Environment | Qingyun Wu, Huazheng Wang, Quanquan Gu, Hongning Wang | In this paper, we develop a collaborative contextual bandit algorithm, in which the adjacency graph among users is leveraged to share context and payoffs among neighboring users while online updating. |
62 | Collaborative Filtering Bandits | Shuai Li, Alexandros Karatzoglou, Claudio Gentile | In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. |
63 | Fast Matrix Factorization for Online Recommendation with Implicit Feedback | Xiangnan He, Hanwang Zhang, Min-Yen Kan, Tat-Seng Chua | We highlight two critical issues of existing works. |
64 | Leveraging User Interaction Signals for Web Image Search | Neil O’Hare, Paloma de Juan, Rossano Schifanella, Yunlong He, Dawei Yin, Yi Chang | In this paper we propose a number of implicit relevance feedback features based on these additional interactions: hover-through rate, ‘converted-hover’ rate, referral page click through, and a number of dwell time features. |
65 | Self-Paced Cross-Modal Subspace Matching | Jian Liang, Zhihang Li, Dong Cao, Ran He, Jingdong Wang | This paper proposes a Self-Paced Cross-Modal Subspace Matching (SCSM) method for unsupervised multimodal data. |
66 | Composite Correlation Quantization for Efficient Multimodal Retrieval | Mingsheng Long, Yue Cao, Jianmin Wang, Philip S. Yu | In this paper, we approach seamless multimodal hashing by proposing a novel Composite Correlation Quantization (CCQ) model. |
67 | Principles for the Design of Online A/B Metrics | Widad Machmouchi, Georg Buscher | In this paper, we describe principles for designing metrics in the context of A/B experiments. |
68 | Visual Recommendation Use Case for an Online Marketplace Platform: allegro.pl | Anna Wróblewska, Łukasz Rączkowski | In this paper we describe a small content-based visual recommendation project built as part of the Allegro online marketplace platform. |
69 | AOL’s Named Entity Resolver: Solving Disambiguation via Document Strongly Connected Components and Ad-Hoc Edges Construction | Roni Wiener, Yonatan Ben-Simhon, Anna Chen | In this work we present AOL’s Named Entity Resolver which was designed to handle real life scenarios including empty entries. |
70 | The Data Stack in Information Retrieval | Omar Alonso | I propose to look at information retrieval applications from the perspective of the data stack infrastructure that is needed in research prototypes and production systems. |
71 | Predicting User Engagement with Direct Displays Using Mouse Cursor Information | Ioannis Arapakis, Luis A. Leiva | In this paper, we conduct a crowdsourcing study and examine how users engage with a prominent web search engine component such as the knowledge module (KM) display. |
72 | Search Result Prefetching Using Cursor Movement | Fernando Diaz, Qi Guo, Ryen W. White | We present methods that leverage searchers’ cursor movements on search result pages in real time to dynamically estimate the result that searchers will request next. |
73 | Predicting Search User Examination with Visual Saliency | Yiqun Liu, Zeyang Liu, Ke Zhou, Meng Wang, Huanbo Luan, Chao Wang, Min Zhang, Shaoping Ma | To predict user examination on SERPs containing heterogeneous components without user interaction information, we propose a new prediction model based on visual saliency maps and page content features. |
74 | A Comparison of Cache Blocking Methods for Fast Execution of Ensemble-based Score Computation | Xin Jin, Tao Yang, Xun Tang | This paper provides an analytic comparison of cache blocking methods on their data access performance with an approximation and proposes a fast guided sampling scheme to select a traversal method and blocking parameters for effective use of memory hierarchy. |
75 | Improved Caching Techniques for Large-Scale Image Hosting Services | Xiao Bai, B. Barla Cambazoglu, Archie Russell | In this paper, we formalize the static caching problem in image serving systems which provide on-the-fly image resizing functionality in their edge caches or regional caches. |
76 | A Complete & Comprehensive Movie Review Dataset (CCMR) | Xuezhi Cao, Weiyue Huang, Yong Yu | Therefore, in this paper we assemble and publish such a dataset (CCMR) for the community. |
77 | A Cross-Platform Collection of Social Network Profiles | Maria Han Veiga, Carsten Eickhoff | To enable the structured study of such adversarial effects, this paper presents a dedicated dataset of cross-platform social network personas (i.e., the same person has accounts on multiple platforms). |
78 | A Test Collection for Matching Patients to Clinical Trials | Bevan Koopman, Guido Zuccon | The collection described in this paper provides: i) a large corpus of clinical trials; ii) 60 patient case reports used as topics; iii) multiple query representations for a single topic (long, short and ad-hoc); iv) a user-provided estimate of how many trials they expect each patient topic would be eligible for; and v) relevance assessments by medical professionals. |
79 | ArabicWeb16: A New Crawl for Today’s Arabic Web | Reem Suwaileh, Mucahid Kutlu, Nihal Fathima, Tamer Elsayed, Matthew Lease | To remedy this, we present ArabicWeb16, a new public Web crawl of roughly 150M Arabic Web pages with significant coverage of dialectal Arabic as well as Modern Standard Arabic. |
80 | Building Test Collections for Evaluating Temporal IR | Hideo Joho, Adam Jatowt, Roi Blanco, Haitao Yu, Shuhei Yamamoto | This paper describes our efforts for building test collections for the purpose of fostering temporal IR research. |
81 | DAJEE: A Dataset of Joint Educational Entities for Information Retrieval in Technology Enhanced Learning | Vladimir Estivill-Castro, Carla Limongelli, Matteo Lombardi, Alessandro Marani | This paper presents DAJEE, a dataset built from the crawling of MOOCs hosted on the Coursera platform. |
82 | Evaluating Retrieval over Sessions: The TREC Session Track 2011-2014 | Ben Carterette, Paul Clough, Mark Hall, Evangelos Kanoulas, Mark Sanderson | A key challenge in the study of this interaction is the creation of suitable evaluation resources to assess the effectiveness of IR systems over sessions. |
83 | EveTAR: A New Test Collection for Event Detection in Arabic Tweets | Hind Almerekhi, Maram Hasanain, Tamer Elsayed | In this paper, we present EveTAR, the first publicly-available test collection for event detection in Arabic tweets. |
84 | GNMID14: A Collection of 110 Million Global Music Identification Matches | Cameron Summers, Greg Tronel, Jason Cramer, Aneesh Vartakavi, Phillip Popp | In this paper, we characterize the dataset and demonstrate its utility for Information Retrieval (IR) research. |
85 | Longitudinal Navigation Log Data on a Large Web Domain | Suzan Verberne, Bram Arends, Wessel Kraaij, Arjen de Vries | We have collected the access logs for our university’s web domain over a time span of 4.5 years. We now release the pre-processed data of a 3-month period for research into user navigation behavior. |
86 | New Collection Announcement: Focused Retrieval Over the Web | Ivan Habernal, Maria Sukhareva, Fiana Raiber, Anna Shtok, Oren Kurland, Hadar Ronen, Judit Bar-Ilan, Iryna Gurevych | All sentences in the relevant documents were judged for relevance. |
87 | NTCIR Lifelog: The First Test Collection for Lifelog Research | Cathal Gurrin, Hideo Joho, Frank Hopfgartner, Liting Zhou, Rami Albatal | In this paper, the requirements for the test collection are motivated, the process of creating the test collection is described, along with an overview of the test collection. |
88 | SOGOU-2012-CRAWL: A Crawl of Search Results in the Sogou 2012 Chinese Query Log | Stewart Whiting, Joemon M. Jose, Omar Alonso | Based on this, we propose a simple approach for modelling the past/present/future temporal intent of queries based on the date the query was submitted by the user, and the dates appearing in the clicked search results. |
89 | The BOLT IR Test Collections of Multilingual Passage Retrieval from Discussion Forums | Ian Soboroff, Kira Griffitt, Stephanie Strassel | This paper describes a new test collection for passage retrieval from multilingual, informal text. |
90 | The Factoid Queries Collection | Ido Guy, Dan Pelleg | We present a collection of over 15,000 queries, issued to commercial web search engines, whose answer is a single fact. |
91 | The LExR Collection for Expertise Retrieval in Academia | Vitor Mangaravite, Rodrygo L.T. Santos, Isac S. Ribeiro, Marcos André Gonçalves, Alberto H.F. Laender | Expertise retrieval has been the subject of intense research over the past decade, particularly with the public availability of benchmark test collections for expertise retrieval in enterprises. |
92 | UQV100: A Test Collection with Query Variability | Peter Bailey, Alistair Moffat, Falk Scholer, Paul Thomas | We describe the UQV100 test collection, designed to incorporate variability from users. |
93 | A Dynamic Recurrent Model for Next Basket Recommendation | Feng Yu, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan | In this work, we propose a novel model, Dynamic REcurrent bAsket Model (DREAM), based on Recurrent Neural Network (RNN). |
94 | A Simple Enhancement for Ad-hoc Information Retrieval via Topic Modelling | Fanghong Jian, Jimmy Xiangji Huang, Jiashu Zhao, Tingting He, Po Hu | In this paper, we consider term-based information and semantic information as two features of query terms and propose a simple enhancement for ad-hoc IR via topic modeling. |
95 | An Empirical Study of Learning to Rank for Entity Search | Jing Chen, Chenyan Xiong, Jamie Callan | This work investigates the effectiveness of learning to rank methods for entity search. |
96 | An Exploration of Evaluation Metrics for Mobile Push Notifications | Luchen Tan, Adam Roegiest, Jimmy Lin, Charles L.A. Clarke | In this paper, we explore various evaluation metrics for this task, focusing specifically on measuring relevance. |
97 | An Improved Multileaving Algorithm for Online Ranker Evaluation | Brian Brost, Ingemar J. Cox, Yevgeny Seldin, Christina Lioma | We propose a new multileaving method for handling this problem and demonstrate that it substantially outperforms existing methods, in some cases reducing errors by as much as 50%. |
98 | An Unsupervised Approach to Anomaly Detection in Music Datasets | Yen-Cheng Lu, Chih-Wei Wu, Chang-Tien Lu, Alexander Lerch | This paper presents an unsupervised method for systematically identifying anomalies in music datasets. |
99 | Anonymizing Query Logs by Differential Privacy | Sicong Zhang, Hui Yang, Lisa Singh | We introduce a framework to anonymize query logs by differential privacy, the latest development in privacy research. |
100 | Audio Features Affected by Music Expressiveness: Experimental Setup and Preliminary Results on Tuba Players | Alberto Introini, Giorgio Presti, Giuseppe Boccignone | Within a Music Information Retrieval perspective, the goal of the study presented here is to investigate the impact on sound features of the musician’s affective intention, namely when trying to intentionally convey emotional contents via expressiveness. |
101 | Automatic Identification and Contextual Reformulation of Implicit System-Related Queries | Adam Fourney, Susan T. Dumais | In this paper we analyze a 3-month log of web search queries posed via the Cortana virtual assistant. |
102 | Axiomatic Analysis for Improving the Log-Logistic Feedback Model | Ali Montazeralghaem, Hamed Zamani, Azadeh Shakery | Several PRF methods have so far been proposed for many retrieval models. |
103 | Balancing Relevance Criteria through Multi-Objective Optimization | Joost van Doorn, Daan Odijk, Diederik M. Roijers, Maarten de Rijke | We propose to mitigate this by viewing multiple relevance criteria as objectives and learning a set of rankers that provide different trade-offs w.r.t. these objectives. |
104 | Build Emotion Lexicon from the Mood of Crowd via Topic-Assisted Joint Non-negative Matrix Factorization | Kaisong Song, Wei Gao, Ling Chen, Shi Feng, Daling Wang, Chengqi Zhang | In the research of building emotion lexicons, we witness the exploitation of crowd-sourced affective annotation given by readers of online news articles. |
105 | Burst Detection in Social Media Streams for Tracking Interest Profiles in Real Time | Cody Buntain, Jimmy Lin | This work presents RTTBurst, an end-to-end system for ingesting descriptions of user interest profiles and discovering new and relevant tweets based on those interest profiles using a simple model for identifying bursts in token usage. |
106 | Cluster-based Joint Matrix Factorization Hashing for Cross-Modal Retrieval | Dimitrios Rafailidis, Fabio Crestani | In this study, we propose a cross-modal hashing method by following a cluster-based joint matrix factorization strategy. |
107 | Collaborative Ranking with Social Relationships for Top-N Recommendations | Dimitrios Rafailidis, Fabio Crestani | Here, to account for the fact that the selections of social friends can leverage the recommendation accuracy, we propose SCR, a Social CR model. |
108 | Community-based Cyberreading for Information Understanding | Zhuoren Jiang, Xiaozhong Liu, Liangcai Gao, Zhi Tang | For this proposed problem, we investigate novel methods to assist scholars (readers) to better understand scientific publications by enabling physical and virtual collaboration. |
109 | Computational Creativity Based Video Recommendation | Wei Lu, Fu-lai Chung | Tensor models offer effective approaches for complex multi-relational data learning and missing element completion. |
110 | Controversy Detection in Wikipedia Using Collective Classification | Shiri Dori-Hacohen, David Jensen, James Allan | We hypothesize that intensities of controversy among related pages are not independent; thus, we propose a stacked model which exploits the dependencies among related pages. |
111 | Discovering Author Interest Evolution in Topic Modeling | Min Yang, Jincheng Mei, Fei Xu, Wenting Tu, Ziyu Lu | In this paper, we propose an interest drift model (IDM), which monitors the evolution of author interests in time-stamped documents. |
112 | Distributional Random Oversampling for Imbalanced Text Classification | Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani | We present a new oversampling method specifically designed for classifying data (such as text) for which the distributional hypothesis holds, according to which the meaning of a feature is somehow determined by its distribution in large corpora of data. |
113 | Doc2Sent2Vec: A Novel Two-Phase Approach for Learning Document Representation | Ganesh J, Manish Gupta, Vasudeva Varma | In the first phase, the model learns a vector for each sentence in the document using a standard word-level language model. |
114 | Dynamically Integrating Item Exposure with Rating Prediction in Collaborative Filtering | Ting-Yi Shih, Ting-Chang Hou, Jian-De Jiang, Yen-Chieh Lien, Chia-Rui Lin, Pu-Jen Cheng | The paper proposes a novel approach to appropriately promote those items with few ratings in collaborative filtering. |
115 | Effective Trend Detection within a Dynamic Search Context | Anat Hashavit, Roy Levin, Ido Guy, Gilad Kutiel | We present RT-Trend, an online trend detection algorithm that promptly finds relevant in-context trends as users issue search queries over a dataset of documents. |
116 | Enhancing First Story Detection using Word Embeddings | Sean Moran, Richard McCreadie, Craig Macdonald, Iadh Ounis | In this paper we show how word embeddings can be used to increase the effectiveness of a state-of-the-art Locality Sensitive Hashing (LSH) based first story detection (FSD) system over a standard tweet corpus. |
117 | Examining the Coherence of the Top Ranked Tweet Topics | Anjie Fang, Craig Macdonald, Iadh Ounis, Philip Habel | In this paper, we conduct large-scale experiments using three topic modelling approaches over two Twitter datasets, and apply a state-of-the-art coherence metric to study the coherence of the top ranked topics and how K affects such coherence. |
118 | Explicit In Situ User Feedback for Web Search Results | Jin Young Kim, Jaime Teevan, Nick Craswell | In this paper, we aim to address these limitations by collecting explicit feedback on web search results from users in situ as they search. |
119 | Exploiting CPU SIMD Extensions to Speed-up Document Scoring with Tree Ensembles | Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini | We propose V-QuickScorer (vQS), which exploits SIMD extensions to vectorize the document scoring, i.e., to perform the ensemble traversal by evaluating multiple documents simultaneously. |
120 | Exploiting Semantic Coherence Features for Information Retrieval | Xinhui Tu, Jimmy Xiangji Huang, Jing Luo, Tingting He | In this paper, we propose a heuristic approach, in which the degree of semantic coherence of the query terms with a document is adopted to improve the information retrieval performance. |
121 | Extracting Information Seeking Intentions for Web Search Sessions | Matthew Mitsui, Chirag Shah, Nicholas J. Belkin | We present a method for extracting the self-reported intentions of users engaged in an information seeking episode. |
122 | First Story Detection using Multiple Nearest Neighbors | Jeroen B.P. Vuurens, Arjen P. de Vries | We propose a novel FSD approach that is more effective, by adapting a recently proposed method for news summarization based on 3-nearest neighbor clustering. |
123 | Health Monitoring on Social Media over Time | Sumit Sidana, Shashwat Mishra, Sihem Amer-Yahia, Marianne Clausel, Massih-Reza Amini | In this work, we are interested in monitoring people’s health over time. |
124 | How Informative is a Term?: Dispersion as a measure of Term Specificity | Rodney McDonell, Justin Zobel, Bodo Billerbeck | We argue in this paper that the distribution of within-document frequencies across a collection is also pertinent to informativeness, a measure that has not been considered in prior work: the most informative words tend to be those whose frequency of occurrence has high variance. |
125 | Identifying Careless Workers in Crowdsourcing Platforms: A Game Theory Approach | Yashar Moshfeghi, Alvaro F. Huertas-Rosero, Joemon M. Jose | In this paper we introduce a game scenario for crowdsourcing (CS) using incentives as a bait for careless (gambler) workers, who respond to them in a characteristic way. |
126 | Impact of Review-Set Selection on Human Assessment for Text Classification | Adam Roegiest, Gordon V. Cormack | Impact of Review-Set Selection on Human Assessment for Text Classification |
127 | Improving Automated Controversy Detection on the Web | Myungha Jang, James Allan | In this paper, we discover two major weaknesses in the prior work and propose modifications. |
128 | Improving Language Estimation with the Paragraph Vector Model for Ad-hoc Retrieval | Qingyao Ai, Liu Yang, Jiafeng Guo, W. Bruce Croft | In this paper, we study how to effectively use the PV model to improve ad-hoc retrieval. |
129 | Improving Retrieval Quality Using Pseudo Relevance Feedback in Content-Based Image Retrieval | Dinesha Chathurani Nanayakkara Wasam Uluwitige, Timothy Chappell, Shlomo Geva, Vinod Chandran | Improving Retrieval Quality Using Pseudo Relevance Feedback in Content-Based Image Retrieval |
130 | Ingrams: A Neuropsychological Explanation For Why People Search | Peter Bailey, Nick Craswell | Our goal in this paper is to provide a simple yet general explanation for these acts that has its basis in neuropsychology and observed user behavior. |
131 | Investment Recommendation using Investor Opinions in Social Media | Wenting Tu, David W. Cheung, Nikos Mamoulis, Min Yang, Ziyu Lu | In this paper, we improve investment recommendation by modeling and using the quality of each investment opinion. |
132 | "Is Sven Seven?": A Search Intent Module for Children | Nevena Dragovic, Ion Madrazo Azpiazu, Maria Soledad Pera | To enhance web search environments in response to children’s behaviors and expectations, in this paper we discuss an initial effort to verify well-known issues, and identify yet to be explored ones, that affect children in formulating (natural language or keyword) queries. |
133 | Is This Your Final Answer?: Evaluating the Effect of Answers on Good Abandonment in Mobile Search | Kyle Williams, Julia Kiseleva, Aidan C. Crook, Imed Zitouni, Ahmed Hassan Awadallah, Madian Khabsa | We study these two aspects by analyzing the logs of a commercial search engine and through a user study. |
134 | Jointly Modeling Review Content and Aspect Ratings for Review Rating Prediction | Zhipeng Jin, Qiudan Li, Daniel D. Zeng, YongCheng Zhan, Ruoran Liu, Lei Wang, Hongyuan Ma | In this paper, we propose a novel review rating prediction method, which improves the prediction accuracy by capturing deep semantics of review content and alleviating data missing problem of aspect ratings. |
135 | Learning to Project and Binarise for Hashing Based Approximate Nearest Neighbour Search | Sean Moran | In this paper we focus on improving the effectiveness of hashing-based approximate nearest neighbour search. |
136 | Linking Organizational Social Network Profiles | Jerome Cheng, Kazunari Sugiyama, Min-Yen Kan | We classify profiles as to whether they belong to an organization or its affiliates. |
137 | Load-Balancing in Distributed Selective Search | Yubin Kim, Jamie Callan, J. Shane Culpepper, Alistair Moffat | Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. |
138 | Multi-Rate Deep Learning for Temporal Recommendation | Yang Song, Ali Mamdouh Elkahky, Xiaodong He | In this work, we propose a novel deep neural network based architecture that models the combination of long-term static and short-term temporal user preferences to improve the recommendation performance. |
139 | Network-Aware Recommendations of Novel Tweets | Noor Aldeen Alawad, Aris Anagnostopoulos, Stefano Leonardi, Ida Mele, Fabrizio Silvestri | In this paper, we present a novel tweet-recommendation approach, which exploits network, content, and retweet analyses for making recommendations of tweets. |
140 | Not All Links Are Created Equal: An Adaptive Embedding Approach for Social Personalized Ranking | Qing Zhang, Houfeng Wang | To address this challenge, we propose an adaptive embedding approach to solve both jointly for better recommendation in a real-world setting. |
141 | On a Topic Model for Sentences | Georgios Balikas, Massih-Reza Amini, Marianne Clausel | In this paper, we propose sentenceLDA, an extension of LDA whose goal is to overcome this limitation by incorporating the structure of the text in the generative and inference processes. |
142 | On Information-Theoretic Document-Person Associations for Expert Search in Academia | Vitor Mangaravite, Rodrygo L.T. Santos | In this paper, we address expert search in academia, where the authorship of a document can be determined with reasonable certainty. |
143 | On the Applicability of Delicious for Temporal Search on Web Archives | Helge Holzmann, Wolfgang Nejdl, Avishek Anand | In this paper we investigate the applicability of external longitudinal resources to identify important and popular websites in the past and analyze the social bookmarking service Delicious for this purpose. |
144 | On the Effectiveness of Contextualisation Techniques in Spoken Query Spoken Content Retrieval | David N. Racca, Gareth J.F. Jones | In this paper, we evaluate different contextualisation techniques, including a recently proposed technique based on positional language models (PLM) on the task of retrieving relevant spoken passages in response to a spoken query. |
145 | Ordinal Text Quantification | Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani | In this paper we tackle, for the first time, the problem of ordinal text quantification, defined as the task of performing text quantification when a total order is defined on the set of classes; estimating the prevalence of "five stars" reviews in a set of reviews of a given product, and monitoring this prevalence across time, is an example application. |
146 | Pearson Rank: A Head-Weighted Gap-Sensitive Score-Based Correlation Coefficient | Ning Gao, Mossaab Bagdouri, Douglas W. Oard | This paper introduces such a measure, referred to as Pearson Rank. |
147 | Polarized User and Topic Tracking in Twitter | Mauro Coletto, Claudio Lucchese, Salvatore Orlando, Raffaele Perego | In this work, we focus on polarisation classes, i.e., those topics that require the user to side exclusively with one position. |
148 | Post-Learning Optimization of Tree Ensembles for Efficient Ranking | Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Fabrizio Silvestri, Salvatore Trani | In this paper we propose a new framework, named CLEaVER, for optimizing machine-learned ranking models based on ensembles of regression trees. |
149 | Quit While Ahead: Evaluating Truncated Rankings | Fei Liu, Alistair Moffat, Timothy Baldwin, Xiuzhen Zhang | In this work we explore a generalized approach for representing truncated result sets, and propose modifications to a number of popular evaluation metrics. |
150 | Quote Recommendation in Dialogue using Deep Neural Network | Hanbit Lee, Yeonchan Ahn, Haejun Lee, Seungdo Ha, Sang-goo Lee | In this paper, we introduce the task of recommending quotes that are suitable for a given dialogue context, and we present a deep learning recommender system which combines a recurrent neural network and a convolutional neural network in order to learn a semantic representation of each utterance and construct a sequence model for the dialogue thread. We collected a large set of Twitter dialogues with quote occurrences in order to evaluate the proposed recommender system. |
151 | Ranking Documents Through Stochastic Sampling on Bayesian Network-based Models: A Pilot Study | Xing Tan, Jimmy Xiangji Huang, Aijun An | Using approximate inference techniques, we investigate in this paper the applicability of Bayesian Networks to the problem of ranking a large set of documents. |
152 | Ranking Health Web Pages with Relevance and Understandability | Joao Palotti, Lorraine Goeuriot, Guido Zuccon, Allan Hanbury | We propose a method that integrates relevance and understandability to rank health web documents. |
153 | Rethinking the Cost of Information Search Behavior | Yinglong Zhang, Jacek Gwizdka | In this paper, we present a cognitive-economic approach to examining the cost in information search. |
154 | Retrievability of Code Mixed Microblogs | Debasis Ganguly, Ayan Bandyopadhyay, Mandar Mitra, Gareth J.F. Jones | In this paper, we investigate indexing and retrieval strategies for a mixed collection of documents comprising code-mixed and monolingual documents. |
155 | Retweeting Behavior Prediction Based on One-Class Collaborative Filtering in Social Networks | Bo Jiang, Jiguang Liang, Ying Sha, Rui Li, Wei Liu, Hongyuan Ma, Lihong Wang | We can only observe which messages users retweet. |
156 | Sampling Strategies and Active Learning for Volume Estimation | Haotian Zhang, Jimmy Lin, Gordon V. Cormack, Mark D. Smucker | We propose a simple yet effective technique for determining this "switchover" point, which intuitively can be understood as the "knee" in an effort vs. recall gain curve, as well as alternative sampling strategies beyond the knee. |
157 | Search-based Evaluation from Truth Transcripts for Voice Search Applications | François Mairesse, Paul Raccuglia, Shiv Vitaladevuni | This paper therefore proposes an evaluation method that compares the search results of the speech recognition hypotheses with the search results produced by a human transcript. |
158 | Seeking Serendipity: A Living Lab Approach to Understanding Creative Retrieval in Broadcast Media Production | Sabrina Sauer, Maarten de Rijke | This paper presents a method to map user needs and integrate serendipitous search behaviors in search algorithm development: the living lab approach. |
159 | Selectively Personalizing Query Auto-Completion | Fei Cai, Maarten de Rijke | Based on a lenient personalized QAC strategy that encodes the ranking signal as a trade-off between query popularity and search context, we propose a model for selectively personalizing query auto-completion (SP-QAC) to study this trade-off. |
160 | SG++: Word Representation with Sentiment and Negation for Twitter Sentiment Classification | Qinmin Hu, Yijun Pei, Qin Chen, Liang He | Here we propose an advanced Skip-gram model to incorporate both word sentiment and negation information. |
161 | SGT Framework: Social, Geographical and Temporal Relevance for Recreational Queries in Web Search | Stewart Whiting, Omar Alonso | In this work we characterize such queries as recreational queries, and propose a relevance framework for ranking points of interest (POIs) to present in the web search recreational vertical using signals from query logs and LBSNs. |
162 | SimCC-AT: A Method to Compute Similarity of Scientific Papers with Automatic Parameter Tuning | Masoud Reyhani Hamedani, Sang-Wook Kim | In this paper, we propose SimCC-AT (similarity based on content and citations with automatic parameter tuning) to compute the similarity of scientific papers. |
163 | Simple Dynamic Emission Strategies for Microblog Filtering | Luchen Tan, Adam Roegiest, Charles L.A. Clarke, Jimmy Lin | Push notifications from social media provide a method to keep up-to-date on topics of personal interest. |
164 | Subspace Clustering Based Tag Sharing for Inductive Tag Matrix Refinement with Complex Errors | Yuqing Hou, Zhouchen Lin, Jin-ge Yao | In this paper, we propose an image annotation framework which sequentially performs tag completion and refinement. |
165 | Temporal Query Intent Disambiguation using Time-Series Data | Yue Zhao, Claudia Hauff | In order to classify queries according to their temporal intent (e.g. Past or Future), we explore the usage of time-series data derived from Wikipedia page views as a feature source. |
166 | To Blend or Not to Blend?: Perceptual Speed, Visual Memory and Aggregated Search | Lauren Turpin, Diane Kelly, Jaime Arguello | This study evaluates the relationship between two cognitive abilities, perceptual speed and visual memory, and aggregated search. |
167 | Topic Model based Privacy Protection in Personalized Web Search | Wasi Uddin Ahmad, Md Masudur Rahman, Hongning Wang | To address this privacy issue, we propose a Topic-based Privacy Protection solution on the client side. |
168 | Topic Quality Metrics Based on Distributed Word Representations | Sergey I. Nikolenko | In this work, we propose several new metrics for evaluating topic quality with the help of distributed word representations; our experiments suggest that the new metrics are a better match for human judgement, which is the gold standard in this case, than previously developed approaches. |
169 | Toward Estimating the Rank Correlation between the Test Collection Results and the True System Performance | Julián Urbano, Mónica Marrero | All the results in this paper are fully reproducible with data and code available online. |
170 | Tracking Sentiment by Time Series Analysis | Anastasia Giachanou, Fabio Crestani | In this study, we explore conventional time series analysis methods and their applicability on topic and sentiment trend analysis. |
171 | Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder | Soroush Vosoughi, Prashanth Vijayaraghavan, Deb Roy | We present Tweet2Vec, a novel method for generating general-purpose vector representation of tweets. |
172 | Two Sample T-tests for IR Evaluation: Student or Welch? | Tetsuya Sakai | Using past data from both TREC and NTCIR, the present study demonstrates that the latter advice should not be followed blindly in the context of IR system evaluation. |
173 | Uncovering Task Based Behavioral Heterogeneities in Online Search Behavior | Rishabh Mehrotra, Prasanta Bhattacharya, Emine Yilmaz | In the current work, we quantify user search task behavior for both single- as well as multi-task search sessions and relate it to tasks and topics. |
174 | Understanding Website Behavior based on User Agent | Kien Pham, Aécio Santos, Juliana Freire | In this paper, we discuss the results of a large-scale study of web site behavior based on their responses to different user-agents. |
175 | Using Word Embedding to Evaluate the Coherence of Topics from Twitter Data | Anjie Fang, Craig Macdonald, Iadh Ounis, Philip Habel | Hence, in this paper, we propose new Word Embedding-based topic coherence metrics. |
176 | Utilizing Focused Relevance Feedback | Elinor Brondwine, Anna Shtok, Oren Kurland | We present a novel study of ad hoc retrieval methods utilizing document-level relevance feedback and/or focused relevance feedback; namely, passages marked as (non-)relevant. |
177 | What Makes a Query Temporally Sensitive? | Craig Willis, Garrick Sherman, Miles Efron | We use qualitative and quantitative techniques to analyze 660 topics from the Text Retrieval Conference (TREC) previously used in the experimental evaluation of temporal retrieval models. |
178 | Which Information Sources are More Effective and Reliable in Video Search | Zhiyong Cheng, Xuanchong Li, Jialie Shen, Alexander G. Hauptmann | In this study, we explore the effectiveness of various video features on the performance of video hyperlinking, including subtitle, metadata, content features (i.e., audio and visual), surrounding context, as well as the combinations of those features. |
179 | Why do you Think this Query is Difficult?: A User Study on Human Query Prediction | Stefano Mizzaro, Josiane Mothe | In this paper, we focus rather on understanding Why a query is perceived by humans as difficult. |
180 | A Platform for Streaming Push Notifications to Mobile Assessors | Adam Roegiest, Luchen Tan, Jimmy Lin, Charles L.A. Clarke | We present an assessment platform for gathering online relevance judgments for mobile push notifications that will be deployed in the newly-created TREC 2016 Real-Time Summarization (RTS) track. |
181 | A Visual Analytics Approach for What-If Analysis of Information Retrieval Systems | Marco Angelini, Nicola Ferro, Giuseppe Santucci, Gianmaria Silvello | We present the innovative visual analytics approach of the VATE system, which eases and makes more effective the experimental evaluation process by introducing the what-if analysis. |
182 | An Architecture for Privacy-Preserving and Replicable High-Recall Retrieval Experiments | Adam Roegiest, Gordon V. Cormack | We demonstrate the infrastructure used in the TREC 2015 Total Recall track to facilitate controlled simulation of "assessor in the loop" high-recall retrieval experimentation. |
183 | Analysing Temporal Evolution of Interlingual Wikipedia Article Pairs | Simon Gottschalk, Elena Demidova | This can lead to different points of views reflected in the articles, as well as complementary and inconsistent information. |
184 | Cobwebs from the Past and Present: Extracting Large Social Networks using Internet Archive Data | Miroslav Shaltev, Jan-Hendrik Zab, Philipp Kemkes, Stefan Siersdorfer, Sergej Zerr | In this paper we present SocGraph – an extraction and exploration system for social relations from the content of around 2 billion web pages collected by the Internet Archive over the 17-year period between 1996 and 2013. |
185 | Context-Sensitive Auto-Completion for Searching with Entities and Categories | Andreas Schmidt, Johannes Hoffart, Dragan Milchevski, Gerhard Weikum | We have developed a semantic auto-completion system, where suggestions for entities and categories are computed in real-time from the context of already entered entities or categories and from entity-level co-occurrence statistics for the underlying corpus. |
186 | EAIMS: Emergency Analysis Identification and Management System | Richard McCreadie, Craig Macdonald, Iadh Ounis | This system exploits machine learning over data gathered from past emergencies and disasters to build effective models for identifying new events as they occur, tracking developments within those events and analyzing those developments for the purposes of enhancing the decision making processes of emergency response agencies. |
187 | Expedition: A Time-Aware Exploratory Search System Designed for Scholars | Jaspreet Singh, Wolfgang Nejdl, Avishek Anand | In this paper we present Expedition – a time-aware exploratory search system that addresses the requirements and information needs of scholars. |
188 | iGlasses: A Novel Recommendation System for Best-fit Glasses | Xiaoling Gu, Lidan Shou, Pai Peng, Ke Chen, Sai Wu, Gang Chen | As conventional recommendation techniques such as collaborative filtering become inapplicable in the problem, we propose a new recommendation method which exploits the implicit matching rules between human faces and eyeglasses. |
189 | InfoScout: An Interactive, Entity Centric, Person Search Tool | Sean McKeown, Martynas Buivys, Leif Azzopardi | This paper describes InfoScout, a search tool which is intended to reduce the time it takes to identify and gather subject centric information on the Web. |
190 | InLook: Revisiting Email Search Experience | Pranav Ramarao, Suresh Iyengar, Pushkar Chitnis, Raghavendra Udupa, Balasubramanyan Ashok | In this work, we present a lightweight email application codenamed InLook, that intends to provide a productive search experience. |
191 | Interacting with Financial Data using Natural Language | Vassilis Plachouras, Charese Smiley, Hiroko Bretz, Ola Taylor, Jochen L. Leidner, Dezhao Song, Frank Schilder | This work presents a novel system that enables both experts in the finance domain and non-expert users to search financial data with both keyword and natural language queries. |
192 | LONLIES: Estimating Property Values for Long Tail Entities | Mina Farid, Ihab F. Ilyas, Steven Euijong Whang, Cong Yu | We present Lonlies, a system for estimating property values of long tail entities by leveraging their relationships to head topics and entities. |
193 | Personalised News and Blog Recommendations based on User Location, Facebook and Twitter User Profiling | Gabriella Kazai, Iskander Yusof, Daoud Clarke | We build individual models for each user and each location. |
194 | PULP: A System for Exploratory Search of Scientific Literature | Alan Medlar, Kalle Ilves, Ping Wang, Wray Buntine, Dorota Glowacka | We present a system called PULP that supports exploratory search for scientific literature, though the system can be easily adapted to other types of literature. |
195 | SECC: A Novel Search Engine Interface with Live Chat Channel | Cheng Zhang, Peng Zhang, Jingfei Li, Dawei Song | In this paper, we present a demo of a novel Search Engine with a live Chat Channel (SECC). |
196 | Simulating Interactive Information Retrieval: SimIIR: A Framework for the Simulation of Interaction | David Maxwell, Leif Azzopardi | This paper describes the SimIIR framework and the different components that can be configured and extended as required. |
197 | The ComeWithMe System for Searching and Ranking Activity-Based Carpooling Rides | Vinicius Monteiro de Lira, Chiara Renso, Raffaele Perego, Salvatore Rinzivillo, Valeria Cesario Times | The ComeWithMe System for Searching and Ranking Activity-Based Carpooling Rides |
198 | ThingSeek: A Crawler and Search Engine for the Internet of Things | Ali Shemshadi, Quan Z. Sheng, Yongrui Qin | To shed light on this line of research, in this paper, we firstly create a set of tools to capture IoT data from a set of given data sources. |
199 | Tweetviz: Visualizing Tweets for Business Intelligence | Bas Sijtsma, Pernilla Qvarfordt, Francine Chen | This paper presents Tweetviz, an interactive tool to help businesses extract actionable information from a large set of noisy Twitter messages. |
200 | Where the Event Lies: Predicting Event Occurrence in Textual Documents | Andrea Ceroni, Ujwal Gadiraju, Jan Matschke, Simon Wingert, Marco Fisichella | In this paper, we present a system to automatize event validation, defined as the task of determining whether a given event occurs in a given document or corpus. |
201 | A Novel Approach to Define and Model Contextual Features in Recommender Systems | Parisa Lak | We are conducting a series of studies to detect, define, select, model and incorporate the most relevant contextual features for RS algorithms. |
202 | A Study of Information Seeking Behavior Using Physical and Online Explorations | Dongho Choi | Considering the analogy between information exploration and geographical exploration, I want to identify the interconnections between these behaviors and predict individuals’ … |
203 | Appearance-Based Retrieval of Mathematical Notation in Documents and Lecture Videos | Kenny Davila | Based on the notion that visually similar formulas are related, we propose a framework for appearance-based formula retrieval in two different modalities: symbolic for text documents and image-based for videos. |
204 | Beyond Topical Relevance: Studying Understandability and Reliability in Consumer Health Search | Joao Palotti | Other studies have examined how poor the quality of health information on the web can be. |
205 | Enhancing Information Retrieval with Adapted Word Embedding | Navid Rekabsaz | In this paper, we propose addressing the question of combining the term-to-term similarity of word embedding with IR models. |
206 | Fairness in Information Retrieval | Aldo Lipani | Witnessing a need in industry for measures that ‘make sense’, I focus on the problematics of the two fundamental IR evaluation measures, Precision at cut-off P@n and Recall at cut-off R@n. |
207 | Going Beyond Relevance: Incorporating Effort in Information Retrieval | Manisha Verma | We identified factors that are associated with effort for a single document and gathered judgments for the same. |
208 | Measuring Interestingness of Political Documents | Hosein Azarbonyad | The main aim of our research is developing a ranking method for political documents which captures the interesting content within political documents. |
209 | Modeling User Feedback in Dynamic Search and Browsing | Jiyun Luo | In this work, we model session searches as Partially Observable Markov Decision Processes (POMDP). |
210 | Modelling User Search Behaviour Based on Process | Mengdie Zhuang | In general, we assume that they are indicative of the search outcomes (e.g. performance, opinion). |
211 | Retrievability: An Independent Evaluation Measure | Colin Wilkie | Retrievability, a document centric evaluation measure, introduced by Azzopardi and Vinay, provides an alternative approach to evaluation [1]. |
212 | Significant Words Representations of Entities | Mostafa Dehghani | Inspired by the early work of Luhn, we propose significant words language models of a set of documents that capture all, and only, the significant shared terms from them. |
213 | Time-Quality Trade-offs in Search | Ryan Burton | In this paper, I propose a research agenda surrounding the notion of slow search, where retrieval speed may be traded for improvements in result quality. |
214 | Torii: Attribute-based Polarity Analysis with Big Datasets | Fernando O. Gallego | Attribute-based polarity analysis is a fine-grained approach that computes if the opinion about an attribute of (a component of) an item is positive, negative, or neutral. |
215 | User Interaction in Mobile Web Search | Jaewon Kim | In the proposed research, we intend to explore user interaction while searching with the aim of improving search experience on mobile devices. |
216 | Collaborative Information Seeking: Art and Science of Achieving 1+1>2 in IR | Chirag Shah | In this half-day tutorial, this concept, along with some of the foundational works and latest developments in the field of collaborative information seeking (CIS) will be presented. |
217 | Constructing and Mining Web-scale Knowledge Graphs | Evgeniy Gabrilovich, Nicolas Usunier | In this tutorial, we present the state of the art in constructing, mining, and growing knowledge graphs. |
218 | Counterfactual Evaluation and Learning for Search, Recommendation and Ad Placement | Thorsten Joachims, Adith Swaminathan | And can we train systems to optimize online metrics without subjecting users to an online learning algorithm? |
219 | Deep Learning for Information Retrieval | Hang Li, Zhengdong Lu | In the first part, we introduce the fundamental techniques of deep learning for natural language processing and information retrieval, such as word embedding, recurrent neural networks, and convolutional neural networks. |
220 | From Design to Analysis: Conducting Controlled Laboratory Experiments with Users | Diane Kelly, Anita Crescenzi | The goals of the tutorial are (1) to increase participants? |
221 | Instant Search: A Hands-on Tutorial | Ganesh Venkataraman, Abhimanyu Lad, Viet Ha-Thuc, Dhruv Arya | We present techniques for prefix-based retrieval as well as injecting custom ranking functions into Elasticsearch. We first present the challenges involved in putting together an instant search solution at scale, followed by a survey of IR and NLP techniques that can be used to address them. |
222 | Online Learning to Rank for Information Retrieval: SIGIR 2016 Tutorial | Artem Grotov, Maarten de Rijke | Below we describe why we believe that the time is right for an intermediate-level tutorial on online learning to rank, the objectives of the proposed tutorial, its relevance, as well as more practical details, such as format, schedule and support materials. |
223 | Question Answering with Knowledge Base, Web and Beyond | Wen-tau Yih, Hao Ma | In this tutorial, we give the audience a coherent overview of the research of question answering (QA). |
224 | Scalability and Efficiency Challenges in Large-Scale Web Search Engines | B. Barla Cambazoglu, Ricardo Baeza-Yates | In particular, the tutorial provides an in-depth architectural overview of a web search engine, mainly focusing on the web crawling, indexing, and query processing components. |
225 | Simulation of Interaction: A Tutorial on Modelling and Simulating User Interaction and Search Behaviour | Leif Azzopardi | In this tutorial, we aim to provide researchers with an overview of simulation, detailing the various types of simulation, models of search behavior used to simulate interaction, along with an overview of the various models of querying, stopping, selecting and marking. |
226 | Succinct Data Structures in Information Retrieval: Theory and Practice | Simon Gog, Rossano Venturini | In this tutorial we will introduce this field of research by presenting the most important succinct data structures to represent sets of integers, sets of points, trees, graphs and strings together with their most important applications to Information Retrieval problems. |
227 | Temporal Information Retrieval | Nattiya Kanhabua, Avishek Anand | In the latter session, we will describe research issues centered on determining the temporal intent of queries, and time-aware query enhancement, e.g., temporal relevance feedback, and time-aware query reformulation. |
228 | Third International Workshop on | Michael Meder, Frank Hopfgartner, Gabriella Kazai, Udo Kruschwitz | Third International Workshop on |
229 | HIA’16: The 2nd International Workshop on Heterogeneous Information Access at SIGIR 2016 | Ke Zhou, Yiqun Liu, Roger Jie Luo, Joemon Jose | HIA’16: The 2nd International Workshop on Heterogeneous Information Access at SIGIR 2016 |
230 | Medical Information Search Workshop (MEDIR) | Steven Bedrick, Lorraine Goeuriot, Gareth J.F. Jones, Anastasia Krithara, Henning Mueller, George Paliouras | Medical Information Search Workshop (MEDIR) |
231 | Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval | Nick Craswell, W. Bruce Croft, Jiafeng Guo, Bhaskar Mitra, Maarten de Rijke | The purpose is to provide an opportunity for people to present new work and early results, compare notes on neural network toolkits, share best practices, and discuss the main challenges facing this line of research. |
232 | Privacy-Preserving IR 2016: Differential Privacy, Search, and Social Media | Hui Yang, Ian Soboroff, Li Xiong, Charles L.A. Clarke, Simson L. Garfinkel | The goals of this workshop include (1) bringing together the two research fields, and (2) yielding fruitful collaborations. |
233 | Search as Learning (SAL) Workshop 2016 | Jacek Gwizdka, Preben Hansen, Claudia Hauff, Jiyin He, Noriko Kando | The "Search as Learning" (SAL) workshop is focused on an area within the information retrieval field that is only beginning to emerge: supporting users in their learning whilst interacting with information content. |
234 | SIGIR 2016 Workshop WebQA II: Web Question Answering Beyond Factoids | Alessandro Moschitti, Lluís Márquez, Preslav Nakov, Eugene Agichtein, Charles Clarke, Idan Szpektor | Unlike the more formal conference format, the aim of this workshop is to bring together researchers in diverse areas working on this problem, including those from NLP, IR, social media and recommender systems communities. |