Paper Digest: WWW 2013 Highlights
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights and summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn for updates on new conference digests.
Paper Digest Team
TABLE 1: WWW 2013 Papers
|Real-time recommendation of diverse related articles
|Sofiane Abbar, Sihem Amer-Yahia, Piotr Indyk, Sepideh Mahabadi
|We formalize a novel recommendation problem where the goal is to find the closest most diverse articles to the one the user is currently browsing.
|Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages
|Rahul Agrawal, Archit Gupta, Yashoteja Prabhu, Manik Varma
|In this paper, we eschew this paradigm, and demonstrate that it is possible to efficiently predict the relevant subset of queries from a large set of monetizable ones by posing the problem as a multi-label learning task with each query being represented by a separate label.
|Hierarchical geographical modeling of user locations from social media posts
|Amr Ahmed, Liangjie Hong, Alexander J. Smola
|This paper presents an integrated generative model of location and message content.
|Distributed large-scale natural graph factorization
|Amr Ahmed, Nino Shervashidze, Shravan Narayanamurthy, Vanja Josifovski, Alexander J. Smola
|We propose a framework for large-scale graph decomposition and inference.
|A CRM system for social media: challenges and experiences
|Jitendra Ajmera, Hyung-iL Ahn, Meena Nagarajan, Ashish Verma, Danish Contractor, Stephen Dill, Matthew Denesuk
|In this work we present our experiences in building a system that mines conversations on social platforms to identify and prioritize those posts and messages that are relevant to enterprises.
|Here’s my cert, so trust me, maybe?: understanding TLS errors on the web
|Devdatta Akhawe, Bernhard Amann, Matthias Vallentin, Robin Sommer
|To guide that process, we perform a large-scale measurement study of common TLS warnings.
|Towards a robust modeling of temporal interest change patterns for behavioral targeting
|Mohamed Aly, Sandeep Pandey, Vanja Josifovski, Kunal Punera
|In this paper we explore how the change in user behavior can be used to predict future actions and show how it complements the traditional models of decaying interest and action recency to build a complete picture about the user interests and better predict conversions.
|The anatomy of LDNS clusters: findings and implications for web content delivery
|Hussein A. Alzoubi, Michael Rabinovich, Oliver Spatscheck
|We present a large-scale measurement of clusters of hosts sharing the same local DNS servers.
|Steering user behavior with badges
|Ashton Anderson, Daniel Huttenlocher, Jon Kleinberg, Jure Leskovec
|In this paper, we study how badges can influence and steer user behavior on a site—leading both to increased participation and to changes in the mix of activities a user pursues on the site.
|Cascading tree sheets and recombinant HTML: better encapsulation and retargeting of web content
|Edward O. Benson, David R. Karger
|This paper presents Cascading Tree Sheets (CTS), a CSS-like language for separating this presentational HTML from real content.
|CopyCatch: stopping group attacks by spotting lockstep behavior in social networks
|Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos
|In this paper we focus on the social network Facebook and the problem of discerning ill-gotten Page Likes, made by spammers hoping to turn a profit, from legitimate Page Likes.
|Inferring the demographics of search users: social data meets search queries
|Bin Bi, Milad Shokouhi, Michal Kosinski, Thore Graepel
|In this paper, we offer a solution to this problem by showing how user demographic traits such as age and gender, and even political and religious views can be efficiently and accurately inferred based on their search query histories.
|Strategyproof mechanisms for competitive influence in networks
|Allan Borodin, Mark Braverman, Brendan Lucier, Joel Oren
|We study this model from the perspective of a central mechanism, such as a social networking platform, that can optimize seed placement as a service for the advertisers.
|Alessandro Bozzon, Marco Brambilla, Stefano Ceri, Andrea Mauri
|Most crowdsourcing systems only provide limited and predefined controls; in contrast, we present an approach to crowdsourcing which provides fine-level, powerful and flexible controls.
|On participation in group chats on Twitter
|Ceren Budak, Rakesh Agrawal
|To predict whether a user who attended her first session in a particular Twitter chat group will return to the group, we build the 5F Model, which captures five different factors: individual initiative, group characteristics, perceived receptivity, linguistic affinity and geographical proximity.
|The role of web hosting providers in detecting compromised websites
|Davide Canali, Davide Balzarotti, Aurélien Francillon
|In this paper we test the ability of web hosting providers to detect compromised websites and react to user complaints.
|Your browsing behavior for a Big Mac: economics of personal information online
|Juan Pablo Carrascal, Christopher Riederer, Vijay Erramilli, Mauro Cherubini, Rodrigo de Oliveira
|In this work, we rely on refined Experience Sampling – a data collection method that probes users to valuate their PII at the time and place where it was generated in order to minimize retrospective recall and hence increase measurement validity.
|Is this app safe for children?: a comparison study of maturity ratings on Android and iOS applications
|Ying Chen, Heng Xu, Yilu Zhou, Sencun Zhu
|To address these issues, this research aims to systematically uncover the extent and severity of unreliable maturity ratings for mobile apps.
|Traveling the Silk Road: a measurement analysis of a large anonymous online marketplace
|We perform a comprehensive measurement analysis of Silk Road, an anonymous, international online marketplace that operates as a Tor hidden service and uses Bitcoin as its exchange currency.
|Group chats on Twitter
|James Cook, Krishnaram Kenthapadi, Nina Mishra
|We develop a definition of a group that is inspired by how sociologists define groups and present an algorithm for discovering groups.
|How to grow more pairs: suggesting review targets for comparison-friendly review ecosystems
|James Cook, Alex Fabrikant, Avinatan Hassidim
|We consider the algorithmic challenges behind a novel interface that simplifies consumer research of online reviews by surfacing relevant comparable review bundles: reviews for two or more of the items being researched, all generated in similar enough circumstances to provide for easy comparison.
|A framework for benchmarking entity-annotation systems
|Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita
|In this paper we design and implement a benchmarking framework for fair and exhaustive comparison of entity-annotation systems.
|A framework for learning web wrappers from the crowd
|Valter Crescenzi, Paolo Merialdo, Disheng Qiu
|We introduce a framework to support a supervised wrapper inference system with training data generated by the crowd.
|Lightweight server support for browser-based CSRF protection
|Alexei Czeskis, Alexander Moshchuk, Tadayoshi Kohno, Helen J. Wang
|In this paper, we present a browser/server solution, Allowed Referrer Lists (ARLs), that addresses the root cause of CSRFs and removes ambient authority for participating web sites that want to be resilient to CSRF attacks.
|Aggregating crowdsourced binary ratings
|Nilesh Dalvi, Anirban Dasgupta, Ravi Kumar, Vibhor Rastogi
|In this paper we analyze a crowdsourcing system consisting of a set of users and a set of binary choice questions.
|Optimal hashing schemes for entity matching
|Nilesh Dalvi, Vibhor Rastogi, Anirban Dasgupta, Anish Das Sarma, Tamas Sarlos
|In this paper, we consider the problem of devising blocking schemes for entity matching.
|No country for old members: user lifecycle and linguistic change in online communities
|Cristian Danescu-Niculescu-Mizil, Robert West, Dan Jurafsky, Jure Leskovec, Christopher Potts
|We propose a framework for tracking linguistic change as it happens and for understanding how specific users react to these evolving norms.
|Crowdsourced judgement elicitation with endogenous proficiency
|Anirban Dasgupta, Arpita Ghosh
|Our main contribution is a simple, new mechanism for binary information elicitation for multiple tasks when agents have endogenous proficiencies, with the following properties: (i) Exerting maximum effort followed by truthful reporting of observations is a Nash equilibrium.
|Timespent based models for predicting user retention
|Kushal S. Dave, Vishal Vaingankar, Sumanth Kolar, Vasudeva Varma
|In this paper, we attempt to address the problem of predicting user retention based on the user’s previous sessions.
|Attributing authorship of revisioned content
|Luca de Alfaro, Michael Shavlovsky
|Since content can be deleted, only to be later re-inserted, we introduce a notion of authorship that requires comparing each new revision with the entire set of past revisions.
|ClausIE: clause-based open information extraction
|Luciano Del Corro, Rainer Gemulla
|We propose ClausIE, a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text.
|Pick-a-crowd: tell me what you like, and I’ll tell you what to do
|Djellel Eddine Difallah, Gianluca Demartini, Philippe Cudré-Mauroux
|In this paper, we propose and extensively evaluate a different Crowdsourcing approach based on a push methodology.
|Compact explanation of data fusion decisions
|Xin Luna Dong, Divesh Srivastava
|We propose techniques that can efficiently generate correct and compact explanations.
|From query to question in one click: suggesting synthetic questions to searchers
|Gideon Dror, Yoelle Maarek, Avihai Mejer, Idan Szpektor
|To this end, we introduce a learning-based approach that improves not only the relevance of the suggested questions to the original query, but also their grammatical correctness.
|Perception and understanding of social annotations in web search
|Jennifer Fernquist, Ed H. Chi
|In this paper, we describe a study conducted with a new eyetracking mix-method using a live traffic search engine with the suggested design changes on real users using the same experimental procedures.
|AMIE: association rule mining under incomplete evidence in ontological knowledge bases
|Luis Antonio Galárraga, Christina Teflioudi, Katja Hose, Fabian Suchanek
|In this paper, we develop a rule mining model that is explicitly tailored to support the OWA scenario.
|PrefixSolve: efficiently solving multi-source multi-destination path queries on RDF graphs by sharing suffix computations
|Sidan Gao, Kemafor Anyanwu
|In this paper, we propose an optimization technique for general MSMD path queries that generalizes an efficient algebraic approach for solving a variety of single-source path problems.
|When tolerance causes weakness: the case of injection-friendly browsers
|Yossi Gilad, Amir Herzberg
|We present a practical off-path TCP-injection attack for connections between current, non-buggy browsers and web-servers.
|Exploiting innocuous activity for correlating users across sites
|Oana Goga, Howard Lei, Sree Hari Krishnan Parthasarathi, Gerald Friedland, Robin Sommer, Renata Teixeira
|We study how potential attackers can identify accounts on different social network sites that all belong to the same user, exploiting only innocuous activity that inherently comes with posted content.
|The cost of annoying ads
|Daniel G. Goldstein, R. Preston McAfee, Siddharth Suri
|We conclude by proposing a theoretical model which relates ad quality to publisher market share, illustrating how our empirical findings could affect the economics of Internet advertising.
|Researcher homepage classification using unlabeled data
|Sujatha Das Gollapalli, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles
|As an alternative to obtaining datasets to retrain the classifier for the new content, we propose to use effectively unlimited amounts of unlabeled data readily available from these websites in a co-training scenario.
|Google+ or Google-?: dissecting the evolution of the new OSN in its first year
|Roberto Gonzalez, Ruben Cuevas, Reza Motamedi, Reza Rejaie, Angel Cuevas
|This paper tackles the above question by presenting a detailed characterization of G+ based on large scale measurements.
|Probabilistic group recommendation via information matching
|Jagadeesh Gorla, Neal Lathia, Stephen Robertson, Jun Wang
|Research to date in this domain has proposed two approaches: computing recommendations for the group by merging any members’ ratings into a single profile, or computing ranked recommendations for each individual that are then merged via a range of heuristics.
|WTF: the who to follow service at Twitter
|Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, Reza Zadeh
|We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA.
|Mining expertise and interests from social media
|Ido Guy, Uri Avraham, David Carmel, Sigalit Ur, Michal Jacovi, Inbal Ronen
|In this work, we provide an extensive study that explores the use of social media to infer expertise within a large global organization.
|Measuring personalization of web search
|Aniko Hannak, Piotr Sapiezynski, Arash Molavi Kakhki, Balachander Krishnamurthy, David Lazer, Alan Mislove, Christo Wilson
|In light of this situation, we make three contributions.
|Estimating clustering coefficients and size of social networks via random walk
|Stephen J. Hardiman, Liran Katzir
|In this work, we provide efficient algorithms for estimating these measures which (1) assume no prior knowledge about the network; and (2) access the network using only the publicly available interface.
|Exploiting annotations for the rapid development of collaborative web applications
|Matthias Heinrich, Franz Josef Grüneberger, Thomas Springer, Martin Gaedke
|To ease the development of collaborative web applications, we propose a set of source code annotations representing a lightweight mechanism to introduce concurrency control services into mature web frameworks.
|Web usage mining with semantic analysis
|Laura Hollink, Peter Mika, Roi Blanco
|In our work, we aim to characterize websites in terms of the semantics of the queries that lead to them by linking queries to large knowledge bases on the Web.
|Organizational overlap on social networks and its applications
|Cho-Jui Hsieh, Mitul Tiwari, Deepak Agarwal, Xinyi (Lisa) Huang, Sam Shah
|In this paper, we address the problem of computing edge affinity between two users on a social network, based on the users belonging to organizations such as companies, schools, and online groups.
|Space-efficient data structures for Top-k completion
|Bo-June (Paul) Hsu, Giuseppe Ottaviano
|In this paper, we focus on the case where the string set is so large that compression is needed to fit the data structure in memory.
|Personalized recommendation via cross-domain triadic factorization
|Liang Hu, Jian Cao, Guandong Xu, Longbing Cao, Zhiping Gu, Can Zhu
|In this paper, we propose a generalized Cross Domain Triadic Factorization (CDTF) model over the triadic relation user-item-domain, which can better capture the interactions between domain-specific user factors and item factors.
|Unsupervised sentiment analysis with emotional signals
|Xia Hu, Jiliang Tang, Huiji Gao, Huan Liu
|Inspired by the wide availability of emotional signals in social media, we propose to study the problem of unsupervised sentiment analysis with emotional signals.
|An analysis of socware cascades in online social networks
|Ting-Kai Huang, Md Sazzadur Rahman, Harsha V. Madhyastha, Michalis Faloutsos, Bruno Ribeiro
|In this paper, we analyze data from the walls of roughly 3 million Facebook users over five months, with the goal of developing a better understanding of socware cascades.
|Measurement and analysis of child pornography trafficking on P2P networks
|Ryan Hurley, Swagatika Prusty, Hamed Soroush, Robert J. Walls, Jeannie Albrecht, Emmanuel Cecchet, Brian Neil Levine, Marc Liberatore, Brian Lynn, Janis Wolak
|In this paper, we examine observations of peers sharing known CP on the eMule and Gnutella networks, which were collected by law enforcement using forensic tools that we developed.
|HeteroMF: recommendation in heterogeneous information networks using context dependent factor models
|Mohsen Jamali, Laks Lakshmanan
|In this paper, we propose a context-dependent matrix factorization model, HeteroMF, that considers a general latent factor for entities of every entity type and context-dependent latent factors for every context in which the entities are involved.
|Interactive exploratory search for multi page search results
|Xiaoran Jin, Marc Sloan, Jun Wang
|Instead, we propose a new feedback scheme that makes use of existing UIs and does not alter user’s browsing behaviour; to maximise retrieval performance over multiple result pages, we propose a novel retrieval optimisation framework and show that the optimal ranking policy should choose a diverse, exploratory ranking to display on the first page.
|Spatio-temporal dynamics of online memes: a study of geo-tagged tweets
|Krishna Y. Kamath, James Caverlee, Kyumin Lee, Zhiyuan Cheng
|In our analysis, we (i) examine the impact of location, time, and distance on the adoption of hashtags, which is important for understanding meme diffusion and information propagation; (ii) examine the spatial propagation of hashtags through their focus, entropy, and spread; and (iii) present two methods that leverage the spatio-temporal propagation of hashtags to characterize locations.
|Accountable key infrastructure (AKI): a proposal for a public-key validation infrastructure
|Tiffany Hyun-Jin Kim, Lin-Shung Huang, Adrian Perrig, Collin Jackson, Virgil Gligor
|In this paper, we propose AKI as a new public-key validation infrastructure, to reduce the level of trust in CAs.
|DIGTOBI: a recommendation system for Digg articles using probabilistic modeling
|Younghoon Kim, Yoonjae Park, Kyuseok Shim
|In this paper, we propose DIGTOBI, a personalized recommendation system for Digg articles using a novel probabilistic modeling.
|Understanding latency variations of black box services
|Darja Krushevskaja, Mark Sandler
|We propose a general framework for understanding performance of arbitrary black box services.
|Diversified recommendation on graphs: pitfalls, measures, and algorithms
|Onur Küçüktunç, Erik Saule, Kamer Kaya, Ümit V. Çatalyürek
|In this paper, we show the deficiencies of popular evaluation techniques of diversification methods, and investigate multiple relevance and diversity measures to understand whether they have any correlations.
|What is the added value of negative links in online social networks?
|Jérôme Kunegis, Julia Preusse, Felix Schwagereit
|To answer the question whether negative links have an added value for an online social network, we investigate the machine learning problem of predicting the negative links of such a network using only the positive links as a basis, with the idea that if this problem can be solved with high accuracy, then the "negative link" feature is redundant.
|Voices of victory: a computational focus group framework for tracking opinion shift in real time
|Yu-Ru Lin, Drew Margolin, Brian Keegan, David Lazer
|We propose a novel approach to tame these sources of uncertainty through the introduction of "computational focus groups" to track opinion shifts in social media streams.
|Rethinking the web as a personal archive
|Siân E. Lindley, Catherine C. Marshall, Richard Banks, Abigail Sellen, Tim Regan
|In the study reported here, we follow up on research that suggests a sense of ownership and control can be reinforced by federating online content as a virtual, single store; we do this by conducting interviews with 14 individuals about their Web-based content.
|Expressive languages for selecting groups from graph-structured data
|Vitaliy Liptchinsky, Benjamin Satzger, Rostyslav Zabolotnyi, Schahram Dustdar
|We present an efficient algorithm for evaluating group queries in polynomial time from an input data graph.
|Modeling/predicting the evolution trend of OSN-based applications
|Han Liu, Atif Nazir, Jinoo Joung, Chen-Nee Chuah
|This paper presents a new continuous graph evolution model aimed to capture microscopic user-level behaviors that govern the growth of the UAG and collectively define the overall graph structure.
|SoCo: a social network aided context-aware recommender system
|Xin Liu, Karl Aberer
|In this paper, we propose SoCo, a novel context-aware recommender system incorporating elaborately processed social network information.
|Using stranger as sensors: temporal and geo-sensitive question answering via social media
|Yefeng Liu, Todorka Alexandrova, Tatsuo Nakajima
|We analyze the usage patterns and behaviors of the real-world end-users, discuss the lessons learned, and outline the future directions and possible applications that could be built on top of MoboQ.
|James Teng Kin Lo, Eric Wohlstadter, Ali Mesbah
|Gender swapping and user behaviors in online social games
|Jing-Kai Lou, Kunwoo Park, Meeyoung Cha, Juyong Park, Chin-Laung Lei, Kuan-Ta Chen
|In this paper we investigate the phenomenon of "gender swapping," which refers to players choosing avatars of genders opposite to their natural ones.
|Mining structural hole spanners through information diffusion in social networks
|Tiancheng Lou, Jie Tang
|In this work, we precisely define the problem of mining top-k structural hole spanners in large-scale social networks and provide an objective (quality) function to formalize the problem.
|On the evolution of the internet economic ecosystem
|Richard T.B. Ma, John C.S. Lui, Vishal Misra
|We propose a network aware, macroscopic model that captures the characteristics and interactions of the application and network providers, and show how it leads to a market equilibrium of the ecosystem.
|Two years of short URLs internet measurement: security threats and countermeasures
|Federico Maggi, Alessandro Frossi, Stefano Zanero, Gianluca Stringhini, Brett Stone-Gross, Christopher Kruegel, Giovanni Vigna
|Although short URLs are a significant new security risk, in accordance with reports from observations of overall phishing and spamming activity, we found that only a relatively small fraction of users ever encountered malicious short URLs.
|Know your personalization: learning topic level personalization in online services
|Anirban Majumder, Nisheeth Shrivastava
|In this paper, we capture OSP’s personalization for a user in a new data structure called the personalization vector.
|Saving, reusing, and remixing web video: using attitudes and practices to reveal social norms
|Catherine C. Marshall, Frank M. Shipman
|Saving, reusing, and remixing web video: using attitudes and practices to reveal social norms
|From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews
|Julian John McAuley, Jure Leskovec
|Thus our goal in this paper is to recommend products that a user will enjoy now, while acknowledging that their tastes may have changed over time, and may change again in the future.
|The FLDA model for aspect-based opinion mining: addressing the cold start problem
|Samaneh Moghaddam, Martin Ester
|In this paper, we propose a probabilistic graphical model based on LDA, called Factorized LDA (FLDA), to address the cold start problem.
|Iolaus: securing online content rating systems
|Arash Molavi Kakhki, Chloe Kliman-Silver, Alan Mislove
|In this paper, we present Iolaus, a system that leverages the underlying social network of online content rating systems to defend against such attacks.
|On cognition, emotion, and interaction aspects of search tasks with different search intentions
|Yashar Moshfeghi, Joemon M. Jose
|Results show that we can learn a model that predicts the search task types with reasonable accuracy.
|Ad impression forecasting for sponsored search
|Abhirup Nath, Shibnath Mukherjee, Prateek Jain, Navin Goyal, Srivatsan Laxman
|In this paper, we develop a generative model based approach that addresses these drawbacks.
|Measurement and modeling of eye-mouse behavior in the presence of nonlinear page layouts
|Vidhya Navalpakkam, LaDawn Jentzsch, Rory Sayres, Sujith Ravi, Amr Ahmed, Alex Smola
|We present a lab study on the effect of a rich informational panel to the right of the search result column, on eye and mouse behavior.
|Understanding and decreasing the network footprint of catch-up TV
|Gianfranco Nencioni, Nishanth Sastry, Jigna Chandaria, Jon Crowcroft
|We find that catch-up has certain natural scaling properties compared to traditional TV: The on-demand nature spreads load over time, and users have much higher completion rates for content streams than previously reported.
|Sorry, I don’t speak SPARQL: translating SPARQL queries into natural language
|Axel-Cyrille Ngonga Ngomo, Lorenz Bühmann, Christina Unger, Jens Lehmann, Daniel Gerber
|This paper addresses this drawback by presenting SPARQL2NL, a generic approach that allows verbalizing SPARQL queries, i.e., converting them into natural language.
|Bitsquatting: exploiting bit-flips for fun, or profit?
|Nick Nikiforakis, Steven Van Acker, Wannes Meert, Lieven Desmet, Frank Piessens, Wouter Joosen
|In this paper, we report on a large-scale experiment, measuring the adoption of bitsquatting by the domain-squatting community through the tracking of registrations of bitsquatting domains targeting popular web sites over a 9-month period.
|One-class collaborative filtering with random graphs
|Ulrich Paquet, Noam Koenigstein
|In this paper we present a novel Bayesian generative model for implicit collaborative filtering.
|Latent credibility analysis
|Jeff Pasternack, Dan Roth
|We introduce a new approach to information credibility, Latent Credibility Analysis (LCA), constructing strongly principled, probabilistic models where the truth of each claim is a latent variable and the credibility of a source is captured by a set of model parameters.
|Predicting group stability in online social networks
|Akshay Patil, Juan Liu, Jie Gao
|In this paper, we study two different types of social networks as exemplar platforms for modeling and predicting group stability dynamics.
|Predictive web automation assistant for people with vision impairments
|Yury Puzis, Yevgen Borodin, Rami Puzis, I.V. Ramakrishnan
|In this paper, we propose a novel model-based approach that facilitates web automation without having to either record or replay macros.
|Mining collective intelligence in diverse groups
|Guo-Jun Qi, Charu C. Aggarwal, Jiawei Han, Thomas Huang
|In order to address this issue, we propose a probabilistic model to jointly assess the reliability of sources and find the true data.
|Trade area analysis using user generated mobile location data
|Yan Qu, Jun Zhang
|In this paper, we illustrate how User Generated Mobile Location Data (UGMLD) like Foursquare check-ins can be used in Trade Area Analysis (TAA) by introducing a new framework and corresponding analytic methods.
|Psychological maps 2.0: a web engagement enterprise starting in London
|Daniele Quercia, Joao Paulo Pesce, Virgilio Almeida, Jon Crowcroft
|We build a web game that puts the recognizability of London’s streets to the test. We collect data from 2,255 participants (a sample one order of magnitude larger) and build a recognizability map of London based on their responses.
|Towards realistic team formation in social networks based on densest subgraphs
|Syama Sundar Rangapuram, Thomas Bühler, Matthias Hein
|The goal of this paper is to consider the team formation problem in a realistic setting and present a novel formulation based on densest subgraphs.
|Efficient community detection in large networks using content and links
|Yiye Ruan, David Fuhry, Srinivasan Parthasarathy
|In this paper we discuss a very simple approach of combining content and link information in graph structures for the purpose of community discovery, a fundamental task in network analysis.
|Learning joint query interpretation and response ranking
|Uma Sawant, Soumen Chakrabarti
|We propose two new, natural formulations for joint query interpretation and response ranking that exploit bidirectional flow of information between the knowledge base and the corpus.
|A model for green design of online news media services
|Daniel Schien, Paul Shabajee, Stephen G. Wood, Chris Preist
|In this work we describe a new method which combines models of energy consumption during the use of digital media with models of the behavior of the audience.
|Potential networks, contagious communities, and understanding social network structure
|In this paper we study how the network of agents adopting a particular technology relates to the structure of the underlying network over which the technology adoption spreads.
|Do social explanations work?: studying and modeling the effects of social explanations in recommender systems
|Amit Sharma, Dan Cosley
|Based on these insights, we present a generative probabilistic model that explains the interplay between explanations and background information on music preferences, and how that leads to a final likelihood rating for an artist.
|Question answering on interlinked data
|Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Sören Auer
|We present a question answering system, which transforms user supplied queries (i.e. natural language sentences or keywords) into conjunctive SPARQL queries over a set of interlinked data sources.
|Pricing mechanisms for crowdsourcing markets
|Yaron Singer, Manas Mittal
|In this paper, we introduce a framework for designing mechanisms with provable guarantees in crowdsourcing markets.
|Truthful incentives in crowdsourcing tasks using regret minimization mechanisms
|Adish Singla, Andreas Krause
|In this paper, we address these questions and present mechanisms using the approach of regret minimization in online learning.
|A predictive model for advertiser value-per-click in sponsored search
|Eric Sodomka, Sébastien Lahaie, Dustin Hillard
|In this paper we propose an approach to keyword value prediction that draws on advertiser bidding behavior across the terms and campaigns in an account.
|I know the shortened URLs you clicked on Twitter: inference attack using public click analytics and Twitter metadata
|Jonghyuk Song, Sangho Lee, Jong Kim
|In this paper, we propose a practical attack technique that can infer who clicks what shortened URLs on Twitter.
|Exploring and exploiting user search behavior on mobile and tablet devices to improve search relevance
|Yang Song, Hao Ma, Hongning Wang, Kuansan Wang
|In this paper, we present a log-based study on user search behavior comparisons on three different platforms: desktop, mobile and tablet.
|Evaluating and predicting user engagement change with degraded search relevance
|Yang Song, Xiaolin Shi, Xin Fu
|We believe that insights from this study can be leveraged by search engine companies to detect and intervene in search relevance degradation and to prevent long-term drops in user engagement.
|Data-Fu: a language and an interpreter for interaction with read/write linked data
|Steffen Stadtmüller, Sebastian Speiser, Andreas Harth, Rudi Studer
|For declaratively specifying interactions between web resources we introduce Data-Fu, a lightweight declarative rule language with state transition systems as formal grounding.
|NIFTY: a system for large scale information flow tracking and clustering
|Caroline Suen, Sandy Huang, Chantat Eksombatchai, Rok Sosic, Jure Leskovec
|We describe the News Information Flow Tracking, Yay! (NIFTY) system for large-scale information flow tracking and clustering.
|When relevance is not enough: promoting diversity and freshness in personalized question recommendation
|Idan Szpektor, Yoelle Maarek, Dan Pelleg
|We found that those drawing-board requirements fail to capture user’s interests.
|Mining acronym expansions and their meanings using query click log
|Bilyana Taneva, Tao Cheng, Kaushik Chakrabarti, Yeye He
|We present a novel, end-to-end solution that addresses the above challenges.
|Groundhog day: near-duplicate detection on Twitter
|Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ujwal Gadiraju
|We investigate the problem of near-duplicate detection on Twitter and introduce a framework that analyzes the tweets by comparing (i) syntactical characteristics, (ii) semantic similarity, and (iii) contextual information.
|Uncovering locally characterizing regions within geotagged data
|Bart Thomee, Adam Rae
|We propose a novel algorithm for uncovering the colloquial boundaries of locally characterizing regions present in collections of labeled geospatial data.
|Spectral analysis of communication networks using Dirichlet eigenvalues
|Alexander Tsiatas, Iraj Saniee, Onuttom Narayan, Matthew Andrews
|Spectral methods provide effective means to estimate the smallest Cheeger ratio via the spectral gap of the graph Laplacian.
|Subgraph frequencies: mapping the empirical and extremal geography of large graph collections
|Johan Ugander, Lars Backstrom, Jon Kleinberg
|In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph.
|The self-feeding process: a unifying model for communication dynamics in the web
|Pedro Olmo S. Vaz de Melo, Christos Faloutsos, Renato Assunção, Antonio Loureiro
|We show here that, surprisingly, both approaches are correct, being corner cases of the proposed Self-Feeding Process (SFP).
|Google+ Ripples: a native visualization of information flow
|Fernanda Viégas, Martin Wattenberg, Jack Hebert, Geoffrey Borggaard, Alison Cichowlas, Jonathan Feinberg, Jon Orwant, Christopher Wren
|We describe the visualization technique, which is a new mix of node-and-link and circular treemap metaphors.
|Whom to mention: expand the diffusion of tweets by @ recommendation on micro-blogging systems
|Beidou Wang, Can Wang, Jiajun Bu, Chun Chen, Wei Vivian Zhang, Deng Cai, Xiaofei He
|In this paper, whom-to-mention is formulated as a ranking problem and we try to address several new challenges which are not well studied in the traditional information retrieval tasks.
|Wisdom in the social crowd: an analysis of Quora
|Gang Wang, Konark Gill, Manish Mohanlal, Haitao Zheng, Ben Y. Zhao
|In this paper, we present results of a detailed analysis of Quora using measurements.
|Learning to extract cross-session search tasks
|Hongning Wang, Yang Song, Ming-Wei Chang, Xiaodong He, Ryen W. White, Wei Chu
|In this work, we target the identification of long-term, or cross-session, search tasks (transcending session boundaries) by investigating inter-query dependencies learned from users’ searching behaviors.
|Content-aware click modeling
|Hongning Wang, ChengXiang Zhai, Anlei Dong, Yi Chang
|In this work, we proposed a novel Bayesian Sequential State model for modeling the user click behaviors, where the document content and dependencies among the sequential click events within a query are characterized by a set of descriptive features via a probabilistic graphical model.
|Is it time for a career switch?
|Jian Wang, Yi Zhang, Christian Posse, Anmol Bhasin
|Is it time for a career switch?
|From cookies to cooks: insights on dietary patterns via analysis of web usage logs
|Robert West, Ryen W. White, Eric Horvitz
|In this preliminary study, we focus on patterns of sodium identified in recipes over time and patterns of admission for congestive heart failure, a chronic illness that can be exacerbated by increases in sodium intake.
|Enhancing personalized search by mining and modeling task behavior
|Ryen W. White, Wei Chu, Ahmed Hassan, Xiaodong He, Yang Song, Hongning Wang
|We describe a method whereby we mine historic search-engine logs to find other users performing similar tasks to the current user and leverage their on-task behavior to identify Web pages to promote in the current ranking.
|Inferring dependency constraints on parameters for web services
|Qian Wu, Ling Wu, Guangtai Liang, Qianxiang Wang, Tao Xie, Hong Mei
|To address this issue, we propose a novel approach, called INDICATOR, to automatically infer dependency constraints on parameters for web services, via a hybrid analysis of heterogeneous web service artifacts, including the service documentation, the service SDKs, and the web services themselves.
|Predicting advertiser bidding behaviors in sponsored search by rationality modeling
|Haifeng Xu, Bin Gao, Diyi Yang, Tie-Yan Liu
|In this paper, we explicitly model these limitations in the rationality of advertisers, and build a probabilistic advertiser behavior model from the perspective of a search engine.
|A biterm topic model for short texts
|Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Xueqi Cheng
|In this paper, we propose a novel way of modeling topics in short texts, referred to as the biterm topic model (BTM).
|Unified entity search in social media community
|Ting Yao, Yuan Liu, Chong-Wah Ngo, Tao Mei
|To infer the strength of intra-relations, we propose a circular propagation scheme, which reinforces the mutual exchange of information across different entity types in a cyclic manner.
|MATRI: a multi-aspect and transitive trust inference model
|Yuan Yao, Hanghang Tong, Xifeng Yan, Feng Xu, Jian Lu
|In this paper, we propose a multi-aspect trust inference model by exploring an equally important property of trust, i.e., the multi-aspect property.
|Predicting positive and negative links in signed social networks by transfer learning
|Jihang Ye, Hong Cheng, Zhe Zhu, Minghua Chen
|Different from a large body of research on social networks that has focused almost exclusively on positive relationships, we study signed social networks with both positive and negative links.
|Sparse online topic models
|Aonan Zhang, Jun Zhu, Bo Zhang
|In this paper, we present a sparse online topic model, which directly controls the sparsity of latent semantic patterns by imposing sparsity-inducing regularization and learns the topical dictionary by an online algorithm.
|TopRec: domain-specific recommendation through community topic mining in social network
|Xi Zhang, Jian Cheng, Ting Yuan, Biao Niu, Hanqing Lu
|In this paper, we propose a unified framework, TopRec, which detects topical communities to construct interpretable domains for domain-specific collaborative filtering.
|Localized matrix factorization for recommendation based on matrix block diagonal forms
|Yongfeng Zhang, Min Zhang, Yiqun Liu, Shaoping Ma, Shi Feng
|In this paper, we present the Localized Matrix Factorization (LMF) framework, which attempts to meet the challenges of sparsity and scalability by factorizing Block Diagonal Form (BDF) matrices.
|Predicting purchase behaviors from social media
|Yongzheng Zhang, Marco Pennacchiotti
|This paper presents a system for predicting a user’s purchase behaviors on e-commerce websites from the user’s social media profile.
|Anatomy of a web-scale resale market: a data mining approach
|Yuchen Zhao, Neel Sundaresan, Zeqian Shen, Philip S. Yu
|In this paper, we study an instance of such markets that affords interesting data at large scale for mining purposes to understand the properties and patterns of this online market. As part of knowledge discovery of such a market, we first formally propose criteria to reveal unseen resale behaviors by elastic matching identification (EMI) based on the account transfer and item similarity properties of transactions.
|Questions about questions: an empirical analysis of information needs on Twitter
|Zhe Zhao, Qiaozhu Mei
|In this study, we take the initiative to extract and analyze information needs from billions of online conversations collected from Twitter.
|Which vertical search engines are relevant?
|Ke Zhou, Ronan Cummins, Mounia Lalmas, Joemon M. Jose
|To address this, we present a formal analysis and a set of extensive user studies to investigate the effects of various assumptions made for assessing query vertical relevance.
|Making the most of your triple store: query answering in OWL 2 using an RL reasoner
|Yujiao Zhou, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, Jay Banerjee
|In this paper, we propose novel techniques that allow us (in many cases) to compute exact query answers using an off-the-shelf RL reasoner, even when the ontology is outside the RL profile.
|Security implications of password discretization for click-based graphical passwords
|Bin B. Zhu, Dongchen Wei, Maowei Yang, Jeff Yan
|In this paper, we show for the first time that two representative discretization schemes leak a significant amount of password information, undermining the security of such graphical passwords.