Most Influential KDD Papers
The ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) is one of the top data mining conferences in the world. The Paper Digest Team analyzes all papers published at KDD in past years and presents the 15 most influential papers for each year. This ranking list is automatically constructed based on citations from both research papers and granted patents, and will be updated frequently to reflect the most recent changes. To find the most influential papers from other conferences/journals, visit the Best Paper Digest page. Note: the most influential papers may or may not include the papers that won the best paper awards. (Version: 2021-05)
If you do not want to miss any interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. To search for papers with highlights, related papers, patents, grants, experts and organizations, please visit our search console. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: Most Influential KDD Papers
Year | Rank | Paper | Author(s) |
---|---|---|---|
2020 | 1 | GCC: Graph Contrastive Coding For Graph Neural Network Pre-Training (IF:3). Highlight: We design GCC’s pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn the intrinsic and transferable structural representations. | Jiezhong Qiu et al. |
2020 | 2 | Exploring Automatic Diagnosis Of COVID-19 From Crowdsourced Respiratory Sound Data (IF:3). Highlight: In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. | Chloë Brown et al. |
2020 | 3 | LayoutLM: Pre-training Of Text And Layout For Document Image Understanding (IF:3). Highlight: In this paper, we propose the LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents. | Yiheng Xu et al. |
2020 | 4 | Towards Physics-informed Deep Learning For Turbulent Flow Prediction (IF:3). Highlight: In this paper, we aim to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to turbulence modeling and climate modeling. | Rui Wang; Karthik Kashinath; Mustafa Mustafa; Adrian Albert; Rose Yu |
2020 | 5 | Graph Structure Learning For Robust Graph Neural Networks (IF:3). Highlight: Therefore, in this paper, we explore these properties to defend adversarial attacks on graphs. | Wei Jin et al. |
2020 | 6 | Towards Deeper Graph Neural Networks (IF:3). Highlight: In this work, we study this observation systematically and develop new insights towards deeper graph neural networks. | Meng Liu; Hongyang Gao; Shuiwang Ji |
2020 | 7 | Connecting The Dots: Multivariate Time Series Forecasting With Graph Neural Networks (IF:3). Highlight: In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. | Zonghan Wu et al. |
2020 | 8 | GPT-GNN: Generative Pre-Training Of Graph Neural Networks (IF:3). Highlight: In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. | Ziniu Hu; Yuxiao Dong; Kuansan Wang; Kai-Wei Chang; Yizhou Sun |
2020 | 9 | AutoFIS: Automatic Feature Interaction Selection In Factorization Models For Click-Through Rate Prediction (IF:3). Highlight: In this work, we propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS). | Bin Liu et al. |
2020 | 10 | On Sampled Metrics For Item Recommendation (IF:3). Highlight: We show that it is possible to improve the quality of the sampled metrics by applying a correction, obtained by minimizing different criteria such as bias or mean squared error. | Walid Krichene; Steffen Rendle |
2020 | 11 | Neural Input Search For Large Scale Recommendation Models (IF:3). Highlight: We present Neural Input Search (NIS), a technique for learning the optimal vocabulary sizes and embedding dimensions for categorical features. | Manas R. Joglekar et al. |
2020 | 12 | Scaling Graph Neural Networks With Approximate PageRank (IF:3). Highlight: We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs resulting in significant speed gains while maintaining state-of-the-art prediction performance. | Aleksandar Bojchevski et al. |
2020 | 13 | Taming Pretrained Transformers For Extreme Multi-label Text Classification (IF:3). Highlight: In this paper, we propose X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem. | Wei-Cheng Chang; Hsiang-Fu Yu; Kai Zhong; Yiming Yang; Inderjit S. Dhillon |
2020 | 14 | AutoShuffleNet: Learning Permutation Matrices Via An Exact Lipschitz Continuous Penalty In Deep Convolutional Neural Networks (IF:3). Highlight: In this paper, we propose to automate channel shuffling by learning permutation matrices in network training. | Jiancheng Lyu; Shuai Zhang; Yingyong Qi; Jack Xin |
2020 | 15 | Certifiable Robustness Of Graph Convolutional Networks Under Structure Perturbations (IF:3). Highlight: In this work we close this gap and propose the first method to certify robustness of Graph Convolutional Networks (GCNs) under perturbations of the graph structure. | Daniel Zügner; Stephan Günnemann |
2019 | 1 | Optuna: A Next-generation Hyperparameter Optimization Framework (IF:5). Highlight: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. | Takuya Akiba; Shotaro Sano; Toshihiko Yanase; Takeru Ohta; Masanori Koyama |
2019 | 2 | KGAT: Knowledge Graph Attention Network For Recommendation (IF:4). Highlight: In this work, we investigate the utility of knowledge graph (KG), which breaks down the independent interaction assumption by linking items with their attributes. We release the codes and datasets at https://github.com/xiangwang1223/knowledge_graph_attention_network. | Xiang Wang; Xiangnan He; Yixin Cao; Meng Liu; Tat-Seng Chua |
2019 | 3 | Auto-Keras: An Efficient Neural Architecture Search System (IF:4). Highlight: In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search. | Haifeng Jin; Qingquan Song; Xia Hu |
2019 | 4 | Cluster-GCN: An Efficient Algorithm For Training Deep And Large Graph Convolutional Networks (IF:4). Highlight: In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clustering structure. To test the scalability of our algorithm, we create a new Amazon2M data with 2 million nodes and 61 million edges which is more than 5 times larger than the previous largest publicly available dataset (Reddit). | Wei-Lin Chiang et al. |
2019 | 5 | Heterogeneous Graph Neural Network (IF:4). Highlight: In this paper, we propose HetGNN, a heterogeneous graph neural network model, to resolve this issue. | Chuxu Zhang; Dongjin Song; Chao Huang; Ananthram Swami; Nitesh V. Chawla |
2019 | 6 | DEFEND: Explainable Fake News Detection (IF:3). Highlight: In this paper, therefore, we study the explainable detection of fake news. | Kai Shu; Limeng Cui; Suhang Wang; Dongwon Lee; Huan Liu |
2019 | 7 | Representation Learning For Attributed Multiplex Heterogeneous Network (IF:3). Highlight: In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. | Yukuo Cen et al. |
2019 | 8 | Robust Graph Convolutional Networks Against Adversarial Attacks (IF:3). Highlight: To address this problem, we propose Robust GCN (RGCN), a novel model that "fortifies" GCNs against adversarial attacks. | Dingyuan Zhu; Ziwei Zhang; Peng Cui; Wenwu Zhu |
2019 | 9 | Graph Convolutional Networks With EigenPooling (IF:3). Highlight: In this paper, we introduce a pooling operator, EigenPooling, based on graph Fourier transform, which can utilize the node features and local structures during the pooling process. | Yao Ma; Suhang Wang; Charu C. Aggarwal; Jiliang Tang |
2019 | 10 | Fairness In Recommendation Ranking Through Pairwise Comparisons (IF:3). Highlight: In this paper we offer a set of novel metrics for evaluating algorithmic fairness concerns in recommender systems. | Alex Beutel et al. |
2019 | 11 | Knowledge-aware Graph Neural Networks With Label Smoothness Regularization For Recommender Systems (IF:3). Highlight: Here we propose Knowledge-aware Graph Neural Networks with Label Smoothness regularization (KGNN-LS) to provide better recommendations. | Hongwei Wang et al. |
2019 | 12 | Urban Traffic Prediction From Spatio-Temporal Data Using Deep Meta Learning (IF:3). Highlight: To tackle these challenges, we propose a deep-meta-learning based model, entitled ST-MetaNet, to collectively predict traffic in all locations at once. | Zheyi Pan et al. |
2019 | 13 | Chainer: A Deep Learning Framework For Accelerating The Research Cycle (IF:3). Highlight: In this paper, we introduce the Chainer framework, which intends to provide a flexible, intuitive, and high performance means of implementing the full range of deep learning models needed by researchers and practitioners. | Seiya Tokui et al. |
2019 | 14 | Predicting Dynamic Embedding Trajectory In Temporal Interaction Networks (IF:3). Highlight: Here we propose JODIE, a coupled recurrent neural network model that learns the embedding trajectories of users and items. | Srijan Kumar; Xikun Zhang; Jure Leskovec |
2019 | 15 | TF-Ranking: Scalable TensorFlow Library For Learning-to-Rank (IF:3). Highlight: We introduce TensorFlow Ranking, the first open source library for solving large-scale ranking problems in a deep learning framework. | Rama Kumar Pasumarthi et al. |
2018 | 1 | Graph Convolutional Neural Networks For Web-Scale Recommender Systems (IF:7). Highlight: Here we describe a large-scale deep recommendation engine that we developed and deployed at Pinterest. | Rex Ying et al. |
2018 | 2 | Deep Interest Network For Click-Through Rate Prediction (IF:6). Highlight: In this paper, we propose a novel model: Deep Interest Network (DIN) which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. | Guorui Zhou et al. |
2018 | 3 | XDeepFM: Combining Explicit And Implicit Feature Interactions For Recommender Systems (IF:5). Highlight: In this paper, we propose a novel Compressed Interaction Network (CIN), which aims to generate feature interactions in an explicit fashion and at the vector-wise level. | Jianxun Lian et al. |
2018 | 4 | Adversarial Attacks On Neural Networks For Graph Data (IF:5). Highlight: In this work, we introduce the first study of adversarial attacks on attributed graphs, specifically focusing on models exploiting ideas of graph convolutions. | Daniel Zügner; Amir Akbarnejad; Stephan Günnemann |
2018 | 5 | Large-Scale Learnable Graph Convolutional Networks (IF:4). Highlight: To enable model training on large-scale graphs, we propose a sub-graph training method to reduce the excessive memory and computational resource requirements suffered by prior methods on graph convolutions. | Hongyang Gao; Zhengyang Wang; Shuiwang Ji |
2018 | 6 | STAMP: Short-Term Attention/Memory Priority Model For Session-based Recommendation (IF:4). Highlight: In this study, we argue that a long-term memory model may be insufficient for modeling long sessions that usually contain user interest drift caused by unintended clicks. | Qiao Liu; Yifu Zeng; Refuoe Mokhosi; Haibin Zhang |
2018 | 7 | EANN: Event Adversarial Neural Networks For Multi-Modal Fake News Detection (IF:4). Highlight: In order to address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. | Yaqing Wang et al. |
2018 | 8 | Leveraging Meta-path Based Context For Top-N Recommendation With A Neural Co-Attention Model (IF:4). Highlight: To construct the meta-path based context, we propose to use a priority based sampling technique to select high-quality path instances. | Binbin Hu; Chuan Shi; Wayne Xin Zhao; Philip S. Yu |
2018 | 9 | Fairness Of Exposure In Rankings (IF:4). Highlight: To address these often conflicting responsibilities, we propose a conceptual and computational framework that allows the formulation of fairness constraints on rankings in terms of exposure allocation. | Ashudeep Singh; Thorsten Joachims |
2018 | 10 | Learning Structural Node Embeddings Via Diffusion Wavelets (IF:4). Highlight: In this paper, we develop GraphWave, a method that represents each node’s network neighborhood via a low-dimensional embedding by leveraging heat wavelet diffusion patterns. | Claire Donnat; Marinka Zitnik; David Hallac; Jure Leskovec |
2018 | 11 | Billion-scale Commodity Embedding For E-commerce Recommendation In Alibaba (IF:4). Highlight: In this paper, we present our technical solutions to address these three challenges. | Jizhe Wang et al. |
2018 | 12 | IntelliLight: A Reinforcement Learning Approach For Intelligent Traffic Light Control (IF:4). Highlight: In this paper, we propose a more effective deep reinforcement learning model for traffic light control. | Hua Wei; Guanjie Zheng; Huaxiu Yao; Zhenhui Li |
2018 | 13 | DeepInf: Social Influence Prediction With Deep Learning (IF:4). Abstract: Social and information networking activities such as on Facebook, Twitter, WeChat, and Weibo have become an indispensable part of our everyday life, where we can easily access … | Jiezhong Qiu et al. |
2018 | 14 | Detecting Spacecraft Anomalies Using LSTMs And Nonparametric Dynamic Thresholding (IF:4). Highlight: We demonstrate the effectiveness of Long Short-Term Memory (LSTMs) networks, a type of Recurrent Neural Network (RNN), in overcoming these issues using expert-labeled telemetry anomaly data from the Soil Moisture Active Passive (SMAP) satellite and the Mars Science Laboratory (MSL) rover, Curiosity. | Kyle Hundman; Valentino Constantinou; Christopher Laporte; Ian Colwell; Tom Soderstrom |
2018 | 15 | Multi-Pointer Co-Attention Networks For Recommendation (IF:4). Highlight: This paper proposes a new neural architecture for recommendation with reviews. | Yi Tay; Anh Tuan Luu; Siu Cheung Hui |
2017 | 1 | Metapath2vec: Scalable Representation Learning For Heterogeneous Networks (IF:7). Highlight: We study the problem of representation learning in heterogeneous networks. | Yuxiao Dong; Nitesh V. Chawla; Ananthram Swami |
2017 | 2 | Struc2vec: Learning Node Representations From Structural Identity (IF:6). Highlight: This work presents struc2vec, a novel and flexible framework for learning latent representations for the structural identity of nodes. | Leonardo F.R. Ribeiro; Pedro H.P. Saverese; Daniel R. Figueiredo |
2017 | 3 | Anomaly Detection With Robust Deep Autoencoders (IF:6). Highlight: Such "Group Robust Deep Autoencoders (GRDA)" give rise to novel anomaly detection approaches whose superior performance we demonstrate on a selection of benchmark problems. | Chong Zhou; Randy C. Paffenroth |
2017 | 4 | Google Vizier: A Service For Black-Box Optimization (IF:6). Highlight: In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. | Daniel Golovin et al. |
2017 | 5 | Algorithmic Decision Making And The Cost Of Fairness (IF:5). Highlight: To mitigate such disparities, several techniques have recently been proposed to achieve algorithmic fairness. | Sam Corbett-Davies; Emma Pierson; Avi Feller; Sharad Goel; Aziz Huq |
2017 | 6 | GRAM: Graph-based Attention Model For Healthcare Representation Learning (IF:5). Highlight: To address these challenges, we propose GRaph-based Attention Model (GRAM) that supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies. | Edward Choi; Mohammad Taha Bahadori; Le Song; Walter F. Stewart; Jimeng Sun |
2017 | 7 | ReasoNet: Learning To Stop Reading In Machine Comprehension (IF:5). Highlight: In this paper, we describe a novel neural network architecture called the Reasoning Network (ReasoNet) for machine comprehension tasks. | Yelong Shen; Po-Sen Huang; Jianfeng Gao; Weizhu Chen |
2017 | 8 | Local Higher-Order Graph Clustering (IF:5). Highlight: Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. | Hao Yin; Austin R. Benson; Jure Leskovec; David F. Gleich |
2017 | 9 | Dipole: Diagnosis Prediction In Healthcare Via Attention-based Bidirectional Recurrent Neural Networks (IF:5). Highlight: To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients’ future health information. | Fenglong Ma et al. |
2017 | 10 | TFX: A TensorFlow-Based Production-Scale Machine Learning Platform (IF:4). Highlight: We present TensorFlow Extended (TFX), a TensorFlow-based general-purpose machine learning platform implemented at Google. | Denis Baylor et al. |
2017 | 11 | Collaborative Variational Autoencoder For Recommender Systems (IF:4). Highlight: This paper proposes a Bayesian generative model called collaborative variational autoencoder (CVAE) that considers both rating and content for recommendation in multimedia scenario. | Xiaopeng Li; James She |
2017 | 12 | Patient Subtyping Via Time-Aware LSTM Networks (IF:4). Highlight: In the study of various diseases, heterogeneity among patients usually leads to different progression patterns and may require different types of therapeutic intervention. | Inci M. Baytas et al. |
2017 | 13 | Meta-Graph Based Recommendation Fusion Over Heterogeneous Information Networks (IF:4). Highlight: With different meta-graph based features, we propose to use FM with Group lasso (FMG) to automatically learn from the observed ratings to effectively select useful meta-graph based features. | Huan Zhao; Quanming Yao; Jianda Li; Yangqiu Song; Dik Lun Lee |
2017 | 14 | Bridging Collaborative Filtering And Semi-Supervised Learning: A Neural Approach For POI Recommendation (IF:4). Highlight: In this work, we propose to devise a general and principled SSL (semi-supervised learning) framework, to alleviate data scarcity via smoothing among neighboring users and POIs, and treat various context by regularizing user preference based on context graphs. | Carl Yang; Lanxiao Bai; Chao Zhang; Quan Yuan; Jiawei Han |
2017 | 15 | Embedding-based News Recommendation For Millions Of Users (IF:4). Highlight: Services that incorporated the method we propose are already open to all users and provide recommendations to over ten million individual users per day who make billions of accesses per month. | Shumpei Okura; Yukihiro Tagami; Shingo Ono; Akira Tajima |
2016 | 1 | XGBoost: A Scalable Tree Boosting System (IF:9). Highlight: We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. | Tianqi Chen; Carlos Guestrin |
2016 | 2 | Node2vec: Scalable Feature Learning For Networks (IF:8). Highlight: Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. | Aditya Grover; Jure Leskovec |
2016 | 3 | Structural Deep Network Embedding (IF:8). Highlight: To solve this problem, in this paper we propose a Structural Deep Network Embedding method, namely SDNE. | Daixin Wang; Peng Cui; Wenwu Zhu |
2016 | 4 | Asymmetric Transitivity Preserving Graph Embedding (IF:7). Highlight: To tackle this challenge, we propose the idea of preserving asymmetric transitivity by approximating high-order proximity, which is based on asymmetric transitivity. | Mingdong Ou; Peng Cui; Jian Pei; Ziwei Zhang; Wenwu Zhu |
2016 | 5 | Collaborative Knowledge Base Embedding For Recommender Systems (IF:6). Highlight: In this paper, we investigate how to leverage the heterogeneous information in a knowledge base to improve the quality of recommender systems. | Fuzheng Zhang; Nicholas Jing Yuan; Defu Lian; Xing Xie; Wei-Ying Ma |
2016 | 6 | Interpretable Decision Sets: A Joint Framework For Description And Prediction (IF:6). Highlight: Here we propose interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable. | Himabindu Lakkaraju; Stephen H. Bach; Jure Leskovec |
2016 | 7 | Recurrent Marked Temporal Point Processes: Embedding Event History To Vector (IF:5). Highlight: In this paper, we propose the Recurrent Marked Temporal Point Process (RMTPP) to simultaneously model the event timings and the markers. | Nan Du et al. |
2016 | 8 | Multi-layer Representation Learning For Medical Concepts (IF:5). Highlight: In this work, we propose Med2Vec, which not only learns the representations for both medical codes and visits from large EHR datasets with over a million visits, but also allows us to interpret the learned representations confirmed positively by clinical experts. | Edward Choi et al. |
2016 | 9 | CNTK: Microsoft’s Open-Source Deep-Learning Toolkit (IF:5). Highlight: This tutorial will introduce the Computational Network Toolkit, or CNTK, Microsoft’s cutting-edge open-source deep-learning toolkit for Windows and Linux. | Frank Seide; Amit Agarwal |
2016 | 10 | Algorithmic Bias: From Discrimination Discovery To Fairness-aware Data Mining (IF:5). Highlight: The aim of this tutorial is to survey algorithmic bias, presenting its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. | Sara Hajian; Francesco Bonchi; Carlos Castillo |
2016 | 11 | Smart Reply: Automated Response Suggestion For Email (IF:5). Highlight: In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. | Anjuli Kannan et al. |
2016 | 12 | Deep Crossing: Web-Scale Modeling Without Manually Crafted Combinatorial Features (IF:4). Highlight: This paper proposes the Deep Crossing model which is a deep neural network that automatically combines features to produce superior models. | Ying Shan et al. |
2016 | 13 | Point-of-Interest Recommendations: Learning Potential Check-ins From Friends (IF:4). Highlight: To cope with these challenges, we define three types of friends (i.e., social friends, location friends, and neighboring friends) in LBSN, and develop a two-step framework to leverage the information of friends to improve POI recommendation accuracy and address cold-start problem. | Huayu Li; Yong Ge; Richang Hong; Hengshu Zhu |
2016 | 14 | Convolutional Neural Networks For Steady Flow Approximation (IF:4). Highlight: We propose a general and flexible approximation model for real-time prediction of non-uniform steady laminar flow in a 2D or 3D domain based on convolutional neural networks (CNNs). | Xiaoxiao Guo; Wei Li; Francesco Iorio |
2016 | 15 | FRAUDAR: Bounding Graph Fraud In The Face Of Camouflage (IF:4). Highlight: We propose FRAUDAR, an algorithm that (a) is camouflage-resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. | Bryan Hooi et al. |
2015 | 1 | Collaborative Deep Learning For Recommender Systems (IF:8). Highlight: To address this problem, we generalize recent advances in deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix. | Hao Wang; Naiyan Wang; Dit-Yan Yeung |
2015 | 2 | Certifying And Removing Disparate Impact IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Instead of requiring access to the process, we propose making inferences based on the data it uses. |
Michael Feldman; Sorelle A. Friedler; John Moeller; Carlos Scheidegger; Suresh Venkatasubramanian; |
2015 | 3 | Intelligible Models For HealthCare: Predicting Pneumonia Risk And Hospital 30-day Readmission IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In the 30-day hospital readmission case study, we show that the same methods scale to large datasets containing hundreds of thousands of patients and thousands of attributes while remaining intelligible and providing accuracy comparable to the best (unintelligible) machine learning methods. |
RICH CARUANA et. al. |
2015 | 4 | Inferring Networks Of Substitutable And Complementary Products IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our goal in this paper is to learn the semantics of substitutes and complements from the text of online reviews. |
Julian McAuley; Rahul Pandey; Jure Leskovec; |
2015 | 5 | Deep Graph Kernels IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present Deep Graph Kernels, a unified framework to learn latent representations of sub-structures for graphs, inspired by latest advancements in language modeling and deep learning. |
Pinar Yanardag; S.V.N. Vishwanathan; |
2015 | 6 | PTE: Predictive Text Embedding Through Large-scale Heterogeneous Text Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we fill this gap by proposing a semi-supervised representation learning method for text data, which we call the predictive text embedding (PTE). |
Jian Tang; Meng Qu; Qiaozhu Mei; |
2015 | 7 | Heterogeneous Network Embedding Via Deep Architectures IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we examine the scenario of a heterogeneous network with nodes and content of various types. |
SHIYU CHANG et al. |
2015 | 8 | SEISMIC: A Self-Exciting Point Process Model For Predicting Tweet Popularity IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on predicting the final number of reshares of a given post. |
Qingyuan Zhao; Murat A. Erdogdu; Hera Y. He; Anand Rajaraman; Jure Leskovec; |
2015 | 9 | Collective Opinion Spam Detection: Bridging Review Networks And Metadata IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose a new holistic approach called SPEAGLE that utilizes clues from all metadata (text, timestamp, rating) as well as relational data (network), and harness them collectively under a unified framework to spot suspicious users and reviews, as well as products targeted by spam. |
Shebuti Rayana; Leman Akoglu; |
2015 | 10 | Forecasting Fine-Grained Air Quality Based On Big Data IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we forecast the reading of an air quality monitoring station over the next 48 hours, using a data-driven method that considers current meteorological data, weather forecasts, and air quality data of the station and that of other stations within a few hundred kilometers. |
YU ZHENG et al. |
2015 | 11 | Petuum: A New Platform For Distributed Machine Learning On Big Data IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by leveraging several fundamental properties underlying ML programs that make them different from conventional operation-centric programs: error tolerance, dynamic structure, and nonuniform convergence; all stem from the optimization-centric nature shared in ML programs’ mathematical definitions, and the iterative-convergent behavior of their algorithmic solutions. |
ERIC P. XING et al. |
2015 | 12 | From Group To Individual Labels Using Deep Features IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we focus on the problem of learning classifiers to make predictions at the instance level. |
Dimitrios Kotzias; Misha Denil; Nando de Freitas; Padhraic Smyth; |
2015 | 13 | Generic And Scalable Framework For Automated Time-series Anomaly Detection IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a generic and scalable framework for automated anomaly detection on large scale time-series data. |
Nikolay Laptev; Saeed Amizadeh; Ian Flint; |
2015 | 14 | Deep Computational Phenotyping IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose two novel modifications to standard neural net training that address challenges and exploit properties that are peculiar, if not exclusive, to medical data. |
Zhengping Che; David Kale; Wenzhe Li; Mohammad Taha Bahadori; Yan Liu; |
2015 | 15 | COSNET: Connecting Heterogeneous Social Networks With Local And Global Consistency IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks. |
Yutao Zhang; Jie Tang; Zhilin Yang; Jian Pei; Philip S. Yu; |
2014 | 1 | DeepWalk: Online Learning Of Social Representations IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present DeepWalk, a novel approach for learning latent representations of vertices in a network. |
Bryan Perozzi; Rami Al-Rfou; Steven Skiena; |
2014 | 2 | Knowledge Vault: A Web-scale Approach To Probabilistic Knowledge Fusion IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. |
XIN DONG et al. |
2014 | 3 | Efficient Mini-batch Training For Stochastic Optimization IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a technique based on approximate optimization of a conservatively regularized objective function within each minibatch. |
Mu Li; Tong Zhang; Yuqiang Chen; Alexander J. Smola; |
2014 | 4 | GeoMF: Joint Geographical Modeling And Matrix Factorization For Point-of-interest Recommendation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Besides, researchers have recently discovered a spatial clustering phenomenon in human mobility behavior on the LBSNs, i.e., individual visiting locations tend to cluster together, and also demonstrated its effectiveness in POI recommendation, thus we incorporate it into the factorization model. |
DEFU LIAN et al. |
2014 | 5 | Clustering And Projected Clustering With Adaptive Neighbors IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel clustering model to learn the data similarity matrix and clustering structure simultaneously. |
Feiping Nie; Xiaoqian Wang; Heng Huang; |
2014 | 6 | Open Question Answering Over Curated And Extracted Knowledge Bases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present OQA, the first approach to leverage both curated and extracted KBs. |
Anthony Fader; Luke Zettlemoyer; Oren Etzioni; |
2014 | 7 | Jointly Modeling Aspects, Ratings And Sentiments For Movie Recommendation (JMARS) IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we propose a probabilistic model based on collaborative filtering and topic modeling. |
QIMING DIAO et al. |
2014 | 8 | Travel Time Estimation Of A Path Using Sparse Trajectories IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories of vehicles received in current time slots and over a period of history as well as map data sources. |
Yilun Wang; Yu Zheng; Yexiang Xue; |
2014 | 9 | FastXML: A Fast, Accurate And Stable Tree-classifier For Extreme Multi-label Learning IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our objective, in this paper, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [2] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [35]. |
Yashoteja Prabhu; Manik Varma; |
2014 | 10 | A Dirichlet Multinomial Mixture Model-based Approach For Short Text Clustering IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we proposed a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model for short text clustering (abbr. |
Jianhua Yin; Jianyong Wang; |
2014 | 11 | Learning Time-series Shapelets IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast to the state-of-the-art, this paper proposes a novel perspective in terms of learning shapelets. |
Josif Grabocka; Nicolas Schilling; Martin Wistuba; Lars Schmidt-Thieme; |
2014 | 12 | Streaming Submodular Maximization: Massive Data Summarization On The Fly IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address the problem of extracting representative elements from a large stream of data. |
Ashwinkumar Badanidiyuru; Baharan Mirzasoleiman; Amin Karbasi; Andreas Krause; |
2014 | 13 | Inferring Gas Consumption And Pollution Emission Of Vehicles Throughout A City IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As many road segments are not traversed by trajectories (i.e., data sparsity), we propose a Travel Speed Estimation (TSE) model based on a context-aware matrix factorization approach. |
Jingbo Shang; Yu Zheng; Wenzhu Tong; Eric Chang; Yong Yu; |
2014 | 14 | ‘Beating The News’ With EMBERS: Forecasting Civil Unrest Using Open Source Indicators IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe the design, implementation, and evaluation of EMBERS, an automated, 24×7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. |
NAREN RAMAKRISHNAN et al. |
2014 | 15 | Reducing The Sampling Complexity Of Topic Models IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose an algorithm which scales linearly with the number of actually instantiated topics k_d in the document. |
Aaron Q. Li; Amr Ahmed; Sujith Ravi; Alexander J. Smola; |
2013 | 1 | Auto-WEKA: Combined Selection And Hyperparameter Optimization Of Classification Algorithms IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately. |
Chris Thornton; Frank Hutter; Holger H. Hoos; Kevin Leyton-Brown; |
2013 | 2 | U-Air: When Urban Air Quality Inference Meets Big Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we infer the real-time and fine-grained air quality information throughout a city, based on the (historical and real-time) air quality data reported by existing monitor stations and a variety of data sources we observed in the city, such as meteorology, traffic flow, human mobility, structure of road networks, and points of interest (POIs). |
Yu Zheng; Furui Liu; Hsun-Ping Hsieh; |
2013 | 3 | Ad Click Prediction: A View From The Trenches IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system. |
H. BRENDAN MCMAHAN et al. |
2013 | 4 | FISM: Factored Item Similarity Models For Top-N Recommender Systems IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To alleviate this problem, we present an item-based method for generating top-N recommendations that learns the item-item similarity matrix as the product of two low dimensional latent factor matrices. |
Santosh Kabbur; Xia Ning; George Karypis; |
2013 | 5 | Learning Geographical Preferences For Point-of-interest Recommendation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, in this paper, we propose a novel geographical probabilistic factor analysis framework which strategically takes various factors into consideration. |
Bin Liu; Yanjie Fu; Zijun Yao; Hui Xiong; |
2013 | 6 | Spotting Opinion Spammers Using Behavioral Footprints IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This work proposes a novel angle to the problem by modeling spamicity as latent. |
ARJUN MUKHERJEE et al. |
2013 | 7 | LCARS: A Location-content-aware Recommender System IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose LCARS, a location-content-aware recommender system that offers a particular user a set of venues (e.g., restaurants) or events (e.g., concerts and exhibitions) by giving consideration to both personal interest and local preference. |
Hongzhi Yin; Yizhou Sun; Bin Cui; Zhiting Hu; Ling Chen; |
2013 | 8 | Why People Hate Your App: Making Sense Of User Feedback In A Mobile App Store IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose Wiscom, a system that can analyze tens of millions of user ratings and comments in mobile app markets at three different levels of detail. |
BIN FU et al. |
2013 | 9 | Connecting Users Across Social Media Sites: A Behavioral-modeling Approach IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper aims to address the cross-media user identification problem. |
Reza Zafarani; Huan Liu; |
2013 | 10 | Simple And Deterministic Matrix Sketching IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we adapt a well known streaming algorithm for approximating item frequencies to the matrix sketching setting. |
Edo Liberty; |
2013 | 11 | Fast And Scalable Polynomial Kernels Via Explicit Feature Maps IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel … |
Ninh Pham; Rasmus Pagh; |
2013 | 12 | Online Controlled Experiments At Large Scale IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We discuss why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits. |
RON KOHAVI et al. |
2013 | 13 | TurboGraph: A Fast Parallel Graph Engine Handling Billion-scale Graphs In A Single PC IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a general, disk-based graph engine called TurboGraph to process billion-scale graphs very efficiently by using modern hardware on a single PC. |
WOOK-SHIN HAN et al. |
2013 | 14 | Who, Where, When And What: Discover Spatio-temporal Topics For Twitter Users IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a probabilistic model W4 (short for Who+Where+When+What) to exploit such data to discover individual users’ mobility behaviors from spatial, temporal and activity aspects. |
Quan Yuan; Gao Cong; Zongyang Ma; Aixin Sun; Nadia Magnenat-Thalmann; |
2013 | 15 | Denser Than The Densest Subgraph: Extracting Optimal Quasi-cliques With Quality Guarantees IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we define a novel density function, which gives subgraphs of much higher quality than densest subgraphs: the graphs found by our method are compact, dense, and with smaller diameter. |
Charalampos Tsourakakis; Francesco Bonchi; Aristides Gionis; Francesco Gullo; Maria Tsiarli; |
2012 | 1 | Discovering Regions Of Different Functions In A City Using Human Mobility And POIs IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a framework (titled DRoF) that Discovers Regions of different Functions in a city using both human mobility among regions and points of interest (POIs) located in a region. |
Jing Yuan; Yu Zheng; Xing Xie; |
2012 | 2 | Searching And Mining Trillions Of Time Series Subsequences Under Dynamic Time Warping IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we show that by using a combination of four novel ideas we can search and mine truly massive time series for the first time. |
THANAWIN RAKTHANMANON et al. |
2012 | 3 | Open Domain Event Extraction From Twitter IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes TwiCal, the first open-domain event-extraction and categorization system for Twitter. |
Alan Ritter; Mausam; Oren Etzioni; Sam Clark; |
2012 | 4 | Information Diffusion And External Influence In Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a model in which information can reach a node via the links of the social network or through the influence of external sources. |
Seth A. Myers; Chenguang Zhu; Jure Leskovec; |
2012 | 5 | Streaming Graph Partitioning For Large Distributed Graphs IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose natural, simple heuristics and compare their performance to hashing and METIS, a fast, offline heuristic. |
Isabelle Stanton; Gabriel Kliot; |
2012 | 6 | Circle-based Recommendation In Online Social Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an effort to develop circle-based RS. |
Xiwang Yang; Harald Steck; Yong Liu; |
2012 | 7 | Review Spam Detection Via Temporal Pattern Discovery IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a hierarchical algorithm to robustly detect the time windows where such attacks are likely to have happened. |
Sihong Xie; Guan Wang; Shuyang Lin; Philip S. Yu; |
2012 | 8 | Towards Social User Profiling: Unified And Discriminative Influence Model For Inferring Home Locations IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the problem of profiling users’ home locations in the context of social network (Twitter). |
Rui Li; Shengjie Wang; Hongbo Deng; Rui Wang; Kevin Chen-Chuan Chang; |
2012 | 9 | Discovering Value From Community Activity On Focused Question Answering Sites: A Case Study Of Stack Overflow IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To better understand this shift in focus from one-off answers to a group knowledge-creation process, we consider a question together with its entire set of corresponding answers as our fundamental unit of analysis, in contrast with the focus on individual question-answer pairs that characterized previous work. |
Ashton Anderson; Daniel Huttenlocher; Jon Kleinberg; Jure Leskovec; |
2012 | 10 | Constructing Popular Routes From Uncertain Trajectories IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a Route Inference framework based on Collective Knowledge (abbreviated as RICK) to construct the popular routes from uncertain trajectories. |
Ling-Yin Wei; Yu Zheng; Wen-Chih Peng; |
2012 | 11 | Rise And Fall Patterns Of Information Diffusion: Model And Implications IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose SpikeM, a concise yet flexible analytical model for the rise and fall patterns of influence propagation. |
Yasuko Matsubara; Yasushi Sakurai; B. Aditya Prakash; Lei Li; Christos Faloutsos; |
2012 | 12 | Event-based Social Networks: Linking The Online And Offline Social Worlds IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We subsequently studied the heterogeneous nature (co-existence of both online and offline social interactions) of EBSNs on two challenging problems: community detection and information flow. |
XINGJIE LIU et al. |
2012 | 13 | GigaTensor: Scaling Tensor Analysis Up By 100 Times – Algorithms And Discoveries IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose GIGATENSOR, a scalable distributed algorithm for large scale tensor decomposition. |
U. Kang; Evangelos Papalexakis; Abhay Harpale; Christos Faloutsos; |
2012 | 14 | Cross-domain Collaboration Recommendation IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we analyze the cross-domain collaboration data from research publications and confirm the above patterns. |
Jie Tang; Sen Wu; Jimeng Sun; Hang Su; |
2012 | 15 | A Shapelet Transform For Time Series Classification IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose disconnecting the process of finding shapelets from the classification algorithm by proposing a shapelet transformation. |
Jason Lines; Luke M. Davis; Jon Hills; Anthony Bagnall; |
2011 | 1 | Friendship And Mobility: User Movement In Location-based Social Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using cell phone location data, as well as data from two online location-based social networks, we aim to understand what basic laws govern human motion and dynamics. |
Eunjoon Cho; Seth A. Myers; Jure Leskovec; |
2011 | 2 | Collaborative Topic Modeling For Recommending Scientific Articles IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop an algorithm to recommend scientific articles to users of an online community. |
Chong Wang; David M. Blei; |
2011 | 3 | Large-scale Matrix Factorization With Distributed Stochastic Gradient Descent IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe the practical techniques used to optimize performance in our DSGD implementation. |
Rainer Gemulla; Erik Nijkamp; Peter J. Haas; Yannis Sismanis; |
2011 | 4 | Human Mobility, Social Ties, And Link Prediction IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here we address this challenge for the first time by tracking the trajectories and communication records of 6 Million mobile phone users. |
Dashun Wang; Dino Pedreschi; Chaoming Song; Fosca Giannotti; Albert-Laszlo Barabasi; |
2011 | 5 | Driving With Knowledge From The Physical World IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a Cloud-based system computing customized and practically fast driving routes for an end user using (historical and real-time) traffic conditions and driver behavior. |
Jing Yuan; Yu Zheng; Xing Xie; Guangzhong Sun; |
2011 | 6 | Exploiting Place Features In Link Prediction On Location-based Social Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study the problem of designing a link prediction system for online location-based social networks. |
Salvatore Scellato; Anastasios Noulas; Cecilia Mascolo; |
2011 | 7 | User-level Sentiment Analysis Incorporating Social Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Employing Twitter as a source for our experimental data, and working within a semi-supervised framework, we propose models that are induced either from the Twitter follower/followee network or from the network in Twitter formed by users referring to each other using "@" mentions. |
CHENHAO TAN et al. |
2011 | 8 | Differentially Private Data Release For Data Mining IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the first anonymization algorithm for the non-interactive setting based on the generalization technique. |
Noman Mohammed; Rui Chen; Benjamin C.M. Fung; Philip S. Yu; |
2011 | 9 | Democrats, Republicans And Starbucks Afficionados: User Classification In Twitter IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a general and robust machine learning framework for large-scale classification of social media users according to dimensions of interest. |
Marco Pennacchiotti; Ana-Maria Popescu; |
2011 | 10 | Discovering Spatio-temporal Causal Interactions In Traffic Data Streams IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose algorithms which construct outlier causality trees based on temporal and spatial properties of detected outliers. |
Wei Liu; Yu Zheng; Sanjay Chawla; Jing Yuan; Xing Xie; |
2011 | 11 | Latent Aspect Rating Analysis Without Aspect Keyword Supervision IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a unified generative model for LARA, which does not need pre-specified aspect keywords and simultaneously mines 1) latent topical aspects, 2) ratings on each identified aspect, and 3) weights placed on different aspects by a reviewer. |
Hongning Wang; Yue Lu; ChengXiang Zhai; |
2011 | 12 | Logical-shapelets: An Expressive Primitive For Time Series Classification IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the latter problem by introducing a novel algorithm that finds shapelets in less time than current methods by an order of magnitude. |
Abdullah Mueen; Eamonn Keogh; Neal Young; |
2011 | 13 | Integrating Low-rank And Group-sparse Structures For Robust Multi-task Learning IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a robust multi-task learning (RMTL) algorithm which learns multiple tasks simultaneously as well as identifies the irrelevant (outlier) tasks. |
Jianhui Chen; Jiayu Zhou; Jieping Ye; |
2011 | 14 | On The Semantic Annotation Of Places In Location-based Social Networks IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop a semantic annotation technique for location-based social networks to automatically annotate all places with category tags which are a crucial prerequisite for location search, recommendation services, or data cleaning. |
Mao Ye; Dong Shou; Wang-Chien Lee; Peifeng Yin; Krzysztof Janowicz; |
2011 | 15 | Partially Labeled Topic Models For Interpretable Text Mining IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present two new partially supervised generative models of labeled text, Partially Labeled Dirichlet Allocation (PLDA) and the Partially Labeled Dirichlet Process (PLDP). |
Daniel Ramage; Christopher D. Manning; Susan Dumais; |
2010 | 1 | Scalable Influence Maximization For Prevalent Viral Marketing In Large-scale Social Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we design a new heuristic algorithm that is easily scalable to millions of nodes and edges in our experiments. |
Wei Chen; Chi Wang; Yajun Wang; |
2010 | 2 | Unsupervised Feature Selection For Multi-cluster Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider the feature selection problem in unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. |
Deng Cai; Chiyuan Zhang; Xiaofei He; |
2010 | 3 | New Perspectives And Methods In Link Prediction IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider these factors by first motivating the use of a supervised framework through a careful investigation of issues such as network observational period, generality of existing methods, variance reduction, topological causes and degrees of imbalance, and sampling approaches. |
Ryan N. Lichtenwalter; Jake T. Lussier; Nitesh V. Chawla; |
2010 | 4 | Latent Aspect Rating Analysis On Review Text Data: A Rating Regression Approach IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we define and study a new opinionated text data analysis problem called Latent Aspect Rating Analysis (LARA), which aims at analyzing opinions expressed about an entity in an online review at the level of topical aspects to discover each individual reviewer’s latent opinion on each aspect as well as the relative emphasis on different aspects when forming the overall judgment of the entity. |
Hongning Wang; Yue Lu; Chengxiang Zhai; |
2010 | 5 | Community-based Greedy Algorithm For Mining Top-K Influential Nodes In Mobile Social Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a new algorithm called Community-based Greedy algorithm for mining top-K influential nodes. |
Yu Wang; Gao Cong; Guojie Song; Kunqing Xie; |
2010 | 6 | Data Mining With Differential Privacy IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of data mining with formal privacy guarantees, given a data access interface based on the differential privacy framework. |
Arik Friedman; Assaf Schuster; |
2010 | 7 | Multi-label Learning By Exploiting Label Dependency IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to use a Bayesian network structure to efficiently encode the conditional dependencies of the labels as well as the feature set, with the feature set as the common parent of all labels. |
Min-Ling Zhang; Kun Zhang; |
2010 | 8 | UP-Growth: An Efficient Algorithm For High Utility Itemset Mining IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an efficient algorithm, namely UP-Growth (Utility Pattern Growth), for mining high utility itemsets with a set of techniques for pruning candidate itemsets. |
Vincent S. Tseng; Cheng-Wei Wu; Bai-En Shie; Philip S. Yu; |
2010 | 9 | An Energy-efficient Mobile Recommender System IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To that end, in this paper, we provide a focused study of extracting energy-efficient transportation patterns from location traces. |
YONG GE et al. |
2010 | 10 | Temporal Recommendation On Graphs Via Long- And Short-term Preference Fusion IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on the STG model framework, we propose a novel recommendation algorithm Injected Preference Fusion (IPF) and extend the personalized Random Walk for temporal recommendation. |
LIANG XIANG et al. |
2010 | 11 | The Community-search Problem And How To Plan A Successful Cocktail Party IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study a query-dependent variant of the community-detection problem, which we call the community-search problem: given a graph G, and a set of query nodes in the graph, we seek to find a subgraph of G that contains the query nodes and is densely connected. |
Mauro Sozio; Aristides Gionis; |
2010 | 12 | Mining Periodic Behaviors For Moving Objects IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address the problem of mining periodic behaviors for moving objects. |
Zhenhui Li; Bolin Ding; Jiawei Han; Roland Kays; Peter Nye; |
2010 | 13 | Combining Predictions For Accurate Recommender Systems IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For our analysis we use a set of diverse state-of-the-art collaborative filtering (CF) algorithms, which include: SVD, Neighborhood Based Approaches, Restricted Boltzmann Machine, Asymmetric Factor Model and Global Effects. |
Michael Jahrer; Andreas Töscher; Robert Legenstein; |
2010 | 14 | Suggesting Friends Using The Implicit Social Graph IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe the implicit social graph which is formed by users’ interactions with contacts and groups of contacts, and which is distinct from explicit social graphs in which users explicitly add other individuals as their "friends". |
MAAYAN ROTH et. al. |
2010 | 15 | Training And Testing Of Recommender Systems On Data Missing Not At Random IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As to test recommender systems, we present two performance measures that can be estimated, under mild assumptions, without bias from data even when ratings are missing not at random (MNAR). |
Harald Steck; |
2009 | 1 | Efficient Influence Maximization In Social Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the efficient influence maximization from two complementary directions. |
Wei Chen; Yajun Wang; Siyu Yang; |
2009 | 2 | Collaborative Filtering With Temporal Dynamics IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, modeling temporal dynamics should be a key when designing recommender systems or general customer preference models. |
Yehuda Koren; |
2009 | 3 | Meme-tracking And The Dynamics Of The News Cycle IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis. |
Jure Leskovec; Lars Backstrom; Jon Kleinberg; |
2009 | 4 | Social Influence Analysis In Large-scale Networks IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these fundamental questions, we propose Topical Affinity Propagation (TAP) to model the topic-level social influence on large networks. |
Jie Tang; Jimeng Sun; Chi Wang; Zi Yang; |
2009 | 5 | TrustWalker: A Random Walk Model For Combining Trust-based And Item-based Recommendation IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In order to find a good trade-off, we propose a random walk model combining the trust-based and the collaborative filtering approach for recommendation. |
Mohsen Jamali; Martin Ester; |
2009 | 6 | Beyond Blacklists: Learning To Detect Malicious Web Sites From Suspicious URLs IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe an approach to this problem based on automated URL classification, using statistical methods to discover the tell-tale lexical and host-based properties of malicious Web site URLs. |
Justin Ma; Lawrence K. Saul; Stefan Savage; Geoffrey M. Voelker; |
2009 | 7 | Time Series Shapelets: A New Primitive For Data Mining IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. |
Lexiang Ye; Eamonn Keogh; |
2009 | 8 | Finding A Team Of Experts In Social Networks IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given a task T, a pool of individuals X with different skills, and a social network G that captures the compatibility among these individuals, we study the problem of finding X′, a subset of X, to perform the task. |
Theodoros Lappas; Kun Liu; Evimaria Terzi; |
2009 | 9 | WhereNext: A Location Predictor On Trajectory Pattern Mining IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose WhereNext, which is a method aimed at predicting with a certain level of accuracy the next location of a moving object. In addition, we propose a set of other measures, that evaluate a priori the predictive power of a set of Trajectory Patterns. |
Anna Monreale; Fabio Pinelli; Roberto Trasarti; Fosca Giannotti; |
2009 | 10 | New Ensemble Methods For Evolving Data Streams IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a new experimental data stream framework for studying concept drift, and two new variants of Bagging: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. |
Albert Bifet; Geoff Holmes; Bernhard Pfahringer; Richard Kirkby; Ricard Gavaldà; |
2009 | 11 | Relational Learning Via Latent Social Dimensions IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We conduct extensive experiments on social media data (one from a real-world blog site and the other from a popular content sharing site). |
Lei Tang; Huan Liu; |
2009 | 12 | Regression-based Latent Factor Models IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel latent factor model to accurately predict response for large scale dyadic data in the presence of features. |
Deepak Agarwal; Bee-Chung Chen; |
2009 | 13 | Sentiment Analysis Of Blogs By Combining Lexical Knowledge With Text Classification IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a unified framework in which one can use background lexical information in terms of word-class associations, and refine this information for specific domains using any available training examples. |
Prem Melville; Wojciech Gryc; Richard D. Lawrence; |
2009 | 14 | Collective Annotation Of Wikipedia Entities In Web Text IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a general collective disambiguation approach. |
Sayali Kulkarni; Amit Singh; Ganesh Ramakrishnan; Soumen Chakrabarti; |
2009 | 15 | Ranking-based Clustering Of Heterogeneous Information Networks With Star Network Schema IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study clustering of multi-typed heterogeneous networks with a star network schema and propose a novel algorithm, NetClus, that utilizes links across multityped objects to generate high-quality net-clusters. |
Yizhou Sun; Yintao Yu; Jiawei Han; |
2008 | 1 | Factorization Meets The Neighborhood: A Multifaceted Collaborative Filtering Model IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we introduce some innovations to both approaches. |
Yehuda Koren; |
2008 | 2 | ArnetMiner: Extraction And Mining Of Academic Social Networks IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We integrate publications from online Web databases and propose a probabilistic framework to deal with the name ambiguity problem. |
JIE TANG et. al. |
2008 | 3 | Get Another Label? Improving Data Quality And Data Mining Using Multiple, Noisy Labelers IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: (iv) Repeatedly labeling a carefully chosen set of points is generally preferable, and we present a robust technique that combines different notions of uncertainty to select data points for which quality should be improved. |
Victor S. Sheng; Foster Provost; Panagiotis G. Ipeirotis; |
2008 | 4 | Relational Learning Via Collective Matrix Factorization IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. |
Ajit P. Singh; Geoffrey J. Gordon; |
2008 | 5 | Microscopic Evolution Of Social Networks IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a detailed study of network evolution by analyzing four large online social networks with full temporal information about node and edge arrivals. |
Jure Leskovec; Lars Backstrom; Ravi Kumar; Andrew Tomkins; |
2008 | 6 | Influence And Correlation In Social Networks IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study this problem systematically. |
Aris Anagnostopoulos; Ravi Kumar; Mohammad Mahdian; |
2008 | 7 | Learning Classifiers From Only Positive And Unlabeled Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The problem solved in this paper is how to learn a standard binary classifier given a nontraditional training set of this nature. |
Charles Elkan; Keith Noto; |
2008 | 8 | Feedback Effects Between Similarity And Social Influence In Online Communities IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop techniques for identifying and modeling the interactions between social influence and selection, using data from online communities where both social interaction and changes in behavior over time can be measured. |
David Crandall; Dan Cosley; Daniel Huttenlocher; Jon Kleinberg; Siddharth Suri; |
2008 | 9 | Context-aware Query Suggestion By Mining Click-through And Session Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel context-aware query suggestion approach which is in two steps. |
HUANHUAN CAO et. al. |
2008 | 10 | Fast Collapsed Gibbs Sampling For Latent Dirichlet Allocation IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we introduce a novel collapsed Gibbs sampling method for the widely used latent Dirichlet allocation (LDA) model. |
IAN PORTEOUS et. al. |
2008 | 11 | Angle-based Outlier Detection In High-dimensional Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel approach named ABOD (Angle-Based Outlier Detection) and some variants assessing the variance in the angles between the difference vectors of a point to the other points. |
Hans-Peter Kriegel; Matthias Schubert; Arthur Zimek; |
2008 | 12 | Discrimination-aware Data Mining IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, the notion of discriminatory classification rules is introduced and studied. |
Dino Pedreschi; Salvatore Ruggieri; Franco Turini; |
2008 | 13 | Joint Latent Topic Models For Text And Citations IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we address the problem of joint modeling of text and citations in the topic modeling framework. |
Ramesh M. Nallapati; Amr Ahmed; Eric P. Xing; William W. Cohen; |
2008 | 14 | The Cost Of Privacy: Destruction Of Data-mining Utility In Anonymized Data Publishing IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we ask whether generalization and suppression of quasi-identifiers offer any benefits over trivial sanitization which simply separates quasi-identifiers from sensitive attributes. |
Justin Brickell; Vitaly Shmatikov; |
2008 | 15 | Composition Attacks And Auxiliary Information In Data Privacy IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. |
Srivatsava Ranjit Ganta; Shiva Prasad Kasiviswanathan; Adam Smith; |
2007 | 1 | Learning Bayesian Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Richard E. Neapolitan; |
2007 | 2 | Cost-effective Outbreak Detection In Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a general methodology for near optimal sensor placement in these and related problems. |
JURE LESKOVEC et. al. |
2007 | 3 | Trajectory Pattern Mining IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we move towards this direction and develop an extension of the sequential pattern mining paradigm that analyzes the trajectories of moving objects. |
Fosca Giannotti; Mirco Nanni; Fabio Pinelli; Dino Pedreschi; |
2007 | 4 | SCAN: A Structural Clustering Algorithm For Networks IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we proposed a novel algorithm called SCAN (Structural Clustering Algorithm for Networks), which detects clusters, hubs and outliers in networks. |
Xiaowei Xu; Nurcan Yuruk; Zhidan Feng; Thomas A. J. Schweiger; |
2007 | 5 | GraphScope: Parameter-free Mining Of Large Time-evolving Graphs IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose GraphScope, that addresses both problems, using information theoretic principles. |
Jimeng Sun; Christos Faloutsos; Spiros Papadimitriou; Philip S. Yu; |
2007 | 6 | Truth Discovery With Multiple Conflicting Information Providers On The Web IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a new problem called Veracity, i.e., conformity to truth, which studies how to find true facts from a large amount of conflicting information on many subjects that is provided by various web sites. |
Xiaoxin Yin; Jiawei Han; Philip S. Yu; |
2007 | 7 | A Framework For Community Identification In Dynamic Social Networks IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose frameworks and algorithms for identifying communities in social networks that change over time. |
Chayant Tantipathananandh; Tanya Berger-Wolf; David Kempe; |
2007 | 8 | Density-based Clustering For Real-time Stream Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these issues, this paper proposes D-Stream, a framework for clustering stream data using a density-based approach. |
Yixin Chen; Li Tu; |
2007 | 9 | Modeling Relationships At Multiple Scales To Improve Accuracy Of Large Recommender Systems IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose novel algorithms for predicting user ratings of items by integrating complementary models that focus on patterns at different scales. |
Robert Bell; Yehuda Koren; Chris Volinsky; |
2007 | 10 | Extracting Semantic Relations From Query Logs IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study a large query log of more than twenty million queries with the goal of extracting the semantic relations that are implicitly captured in the actions of users submitting queries and clicking answers. |
Ricardo Baeza-Yates; Alessandro Tiberi; |
2007 | 11 | Evolutionary Spectral Clustering By Incorporating Temporal Smoothness IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose two frameworks that incorporate temporal smoothness in evolutionary spectral clustering. |
Yun Chi; Xiaodan Song; Dengyong Zhou; Koji Hino; Belle L. Tseng; |
2007 | 12 | Automatic Labeling Of Multinomial Topic Models IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose probabilistic approaches to automatically labeling multinomial topic models in an objective way. |
Qiaozhu Mei; Xuehua Shen; ChengXiang Zhai; |
2007 | 13 | Co-clustering Based Classification For Out-of-domain Documents IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address this problem for a text-mining task, where the labeled data are under one distribution in one domain known as in-domain data, while the unlabeled data are under a related but different domain known as out-of-domain data. |
Wenyuan Dai; Gui-Rong Xue; Qiang Yang; Yong Yu; |
2007 | 14 | Practical Guide To Controlled Experiments On The Web: Listen To Your Customers Not To The Hippo IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe common architectures for experimentation systems and analyze their advantages and disadvantages. |
Ron Kohavi; Randal M. Henne; Dan Sommerfield; |
2007 | 15 | Show Me The Money!: Deriving The Pricing Power Of Product Features By Mining Consumer Reviews IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we use techniques that decompose the reviews into segments that evaluate the individual characteristics of a product (e.g., image quality and battery life for a digital camera). |
Nikolay Archak; Anindya Ghose; Panagiotis G. Ipeirotis; |
2006 | 1 | Training Linear SVMs In Linear Time IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a Cutting Plane Algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems. |
Thorsten Joachims; |
2006 | 2 | Group Formation In Large Social Networks: Membership, Growth, And Evolution IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We use decision-tree techniques to identify the most significant structural determinants of these properties. |
Lars Backstrom; Dan Huttenlocher; Jon Kleinberg; Xiangyang Lan; |
2006 | 3 | Structure And Evolution Of Online Social Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider the evolution of structure within large online social networks. |
Ravi Kumar; Jasmine Novak; Andrew Tomkins; |
2006 | 4 | Topics Over Time: A Non-Markov Continuous-time Model Of Topical Trends IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. |
Xuerui Wang; Andrew McCallum; |
2006 | 5 | YALE: Rapid Prototyping For Complex Data Mining Tasks IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: These case studies cover tasks like feature engineering, text mining, data stream mining and tracking drifting concepts, ensemble methods and distributed data mining. |
Ingo Mierswa; Michael Wurst; Ralf Klinkenberg; Martin Scholz; Timm Euler; |
2006 | 6 | Sampling From Large Graphs IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider several sampling methods, propose novel methods to check the goodness of sampling, and develop a set of scaling laws that describe relations between the properties of the original and the sample. In addition to the theoretical contributions, the practical conclusions from our work are: Sampling strategies based on edge selection do not perform well; simple uniform random node selection performs surprisingly well. |
Jure Leskovec; Christos Faloutsos; |
2006 | 7 | Model Compression IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance. |
Cristian Buciluǎ; Rich Caruana; Alexandru Niculescu-Mizil; |
2006 | 8 | Orthogonal Nonnegative Matrix T-factorizations For Clustering IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the orthogonality constraint because it leads to rigorous clustering interpretation. |
Chris Ding; Tao Li; Wei Peng; Haesun Park; |
2006 | 9 | Evolutionary Clustering IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a generic framework for this problem, and discuss evolutionary versions of two widely-used clustering algorithms within this framework: k-means and agglomerative hierarchical clustering. |
Deepayan Chakrabarti; Ravi Kumar; Andrew Tomkins; |
2006 | 10 | Very Sparse Random Projections IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: There has been considerable interest in random projections, an approximate algorithm for estimating distances between pairs of points in a high-dimensional vector space. Let A in … |
Ping Li; Trevor J. Hastie; Kenneth W. Church; |
2006 | 11 | Beyond Streams And Graphs: Dynamic Tensor Analysis IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. |
Jimeng Sun; Dacheng Tao; Christos Faloutsos; |
2006 | 12 | GPLAG: Detection Of Software Plagiarism By Program Dependence Graph Analysis IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we develop a new plagiarism detection tool, called GPLAG, which detects plagiarism by mining program dependence graphs (PDGs). |
Chao Liu; Chen Chen; Jiawei Han; Philip S. Yu; |
2006 | 13 | Utility-based Anonymization Using Local Recoding IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The framework covers both numeric and categorical data. |
JIAN XU et. al. |
2006 | 14 | Mining Long-term Search History To Improve Search Accuracy IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study statistical language modeling based methods to mine contextual information from long-term search history and exploit it for a more accurate estimate of the query language model. |
Bin Tan; Xuehua Shen; ChengXiang Zhai; |
2006 | 15 | Center-piece Subgraphs: Problem Definition And Fast Solutions IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? |
Hanghang Tong; Christos Faloutsos; |
2005 | 1 | Graphs Over Time: Densification Laws, Shrinking Diameters And Possible Explanations IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a new graph generator, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. |
Jure Leskovec; Jon Kleinberg; Christos Faloutsos; |
2005 | 2 | Query Chains: Learning To Rank From Implicit Feedback IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a novel approach for using clickthrough data to learn ranked retrieval functions for web search results. |
Filip Radlinski; Thorsten Joachims; |
2005 | 3 | Discovering Evolutionary Theme Patterns From Text: An Exploration Of Temporal Text Mining IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study a particular TTM task — discovering and summarizing the evolutionary patterns of themes in a text stream. |
Qiaozhu Mei; ChengXiang Zhai; |
2005 | 4 | Adversarial Learning IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce the adversarial classifier reverse engineering (ACRE) learning problem, the task of learning sufficient information about a classifier to construct adversarial attacks. |
Daniel Lowd; Christopher Meek; |
2005 | 5 | Feature Bagging For Outlier Detection IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, a novel feature bagging approach for detecting outliers in very large, high dimensional and noisy databases is proposed. |
Aleksandar Lazarevic; Vipin Kumar; |
2005 | 6 | The Predictive Power Of Online Chatter IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: An increasing fraction of the global discourse is migrating online in the form of blogs, bulletin boards, web pages, wikis, editorials, and a dizzying array of new collaborative … |
Daniel Gruhl; R. Guha; Ravi Kumar; Jasmine Novak; Andrew Tomkins; |
2005 | 7 | Privacy-preserving Distributed K-means Clustering Over Arbitrarily Partitioned Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our paper makes two contributions in privacy-preserving data mining. |
Geetha Jagannathan; Rebecca N. Wright; |
2005 | 8 | Evaluating Similarity Measures: A Large-scale Study In The Orkut Social Network IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an extensive empirical comparison of six distinct measures of similarity for recommending online communities to members of the Orkut social network. |
Ellen Spertus; Mehran Sahami; Orkut Buyukkokten; |
2005 | 9 | Density-based Clustering Of Uncertain Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to express the similarity between two fuzzy objects by distance probability functions. |
Hans-Peter Kriegel; Martin Pfeifle; |
2005 | 10 | Deriving Marketing Intelligence From Online Discussion IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents such a system that gathers and annotates online discussion relating to consumer products using a wide variety of state-of-the-art techniques, including crawling, wrapping, search, text classification and computational linguistics. |
NATALIE GLANCE et. al. |
2005 | 11 | Dynamic Syslog Mining For Network Failure Monitoring IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a new methodology of dynamic syslog mining in order to detect failure symptoms with higher confidence and to discover sequential alarm patterns among computer devices. |
Kenji Yamanishi; Yuko Maruyama; |
2005 | 12 | On Mining Cross-graph Quasi-cliques IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Such clusters may be potential pathways. In this paper, we investigate a novel data mining problem, mining cross-graph quasi-cliques, which is generalized from several interesting applications such as cross-market customer segmentation and joint mining of gene expression data and protein interaction data. |
Jian Pei; Daxin Jiang; Aidong Zhang; |
2005 | 13 | Summarizing Itemset Patterns: A Profile-based Approach IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on the restoration error, we propose a quality measure function to determine the optimal value of parameter K. Polynomial time algorithms are developed together with several optimization heuristics for efficiency improvement. |
Xifeng Yan; Hong Cheng; Jiawei Han; Dong Xin; |
2005 | 14 | Model-based Overlapping Clustering IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we interpret an overlapping clustering model proposed by Segal et al. [23] as a generalization of Gaussian mixture models, and we extend it to an overlapping clustering model based on mixtures of any regular exponential family distribution and the corresponding Bregman divergence. |
Arindam Banerjee; Chase Krumpelman; Joydeep Ghosh; Sugato Basu; Raymond J. Mooney; |
2005 | 15 | An Approach To Spacecraft Anomaly Detection Problem Using Kernel Feature Space IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: A reasonable alternative to this conventional anomaly detection method is to reuse a vast amount of telemetry data which is multi-dimensional time-series continuously produced from a number of system components in the spacecraft. This paper proposes a novel "knowledge-free" anomaly detection method for spacecraft based on Kernel Feature Space and directional distribution, which constructs a system behavior model from the past normal telemetry data from a set of telemetry data in normal operation and monitors the current system status by checking incoming data with the model. In this method, we regard anomaly phenomena as unexpected changes of causal associations in the spacecraft system, and hypothesize that the significant causal associations inside the system will appear in the form of principal component directions in a high-dimensional non-linear feature space which is constructed by a kernel function and a set of data. We have confirmed the effectiveness of the proposed anomaly detection method by applying it to the telemetry data obtained from a simulator of an orbital transfer vehicle designed to make a rendezvous maneuver with the International Space Station. |
Ryohei Fujimaki; Takehisa Yairi; Kazuo Machida; |
2004 | 1 | Mining And Summarizing Customer Reviews IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this research, we aim to mine and to summarize all the customer reviews of a product. |
Minqing Hu; Bing Liu; |
2004 | 2 | Regularized Multi-task Learning IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present an approach to multi-task learning based on the minimization of regularization functionals similar to existing ones, such as the one for Support Vector Machines (SVMs), that have been successfully used in the past for single-task learning. |
Theodoros Evgeniou; Massimiliano Pontil; |
2004 | 3 | Kernel K-means: Spectral Clustering And Normalized Cuts IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we give an explicit theoretical connection between them. |
Inderjit S. Dhillon; Yuqiang Guan; Brian Kulis; |
2004 | 4 | A Probabilistic Framework For Semi-supervised Clustering IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a probabilistic model for semi-supervised clustering based on Hidden Markov Random Fields (HMRFs) that provides a principled framework for incorporating supervision into prototype-based clustering. |
Sugato Basu; Mikhail Bilenko; Raymond J. Mooney; |
2004 | 5 | Adversarial Classification IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we develop a formal framework and algorithms for this problem. |
Nilesh Dalvi; Pedro Domingos; Mausam; Sumit Sanghai; Deepak Verma; |
2004 | 6 | Probabilistic Author-topic Models For Information Discovery IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new unsupervised learning technique for extracting information from large text collections. |
Mark Steyvers; Padhraic Smyth; Michal Rosen-Zvi; Thomas Griffiths; |
2004 | 7 | Towards Parameter-free Data Mining IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we show that recent results in bioinformatics and computational theory hold great promise for a parameter-free data-mining paradigm. |
Eamonn Keogh; Stefano Lonardi; Chotirat Ann Ratanamahatana; |
2004 | 8 | A Quickstart In Frequent Structure Mining Can Make A Difference IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce the GrAph/Sequence/Tree extractiON (Gaston) algorithm that implements this idea by searching first for frequent paths, then frequent free trees and finally cyclic graphs. |
Siegfried Nijssen; Joost N. Kok; |
2004 | 9 | Learning To Detect Malicious Executables In The Wild IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe the development of a fielded application for detecting malicious executables in the wild. |
Jeremy Z. Kolter; Marcus A. Maloof; |
2004 | 10 | Automatic Multimedia Cross-modal Correlation Discovery IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, no user-determined constants; it can be applied to any multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. |
Jia-Yu Pan; Hyung-Jeong Yang; Christos Faloutsos; Pinar Duygulu; |
2004 | 11 | A Generalized Maximum Entropy Approach To Bregman Co-clustering And Matrix Approximation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. |
Arindam Banerjee; Inderjit Dhillon; Joydeep Ghosh; Srujana Merugu; Dharmendra S. Modha; |
2004 | 12 | Fast Discovery Of Connection Subgraphs IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a formal definition of this problem, and an ideal solution based on electricity analogues. |
Christos Faloutsos; Kevin S. McCurley; Andrew Tomkins; |
2004 | 13 | SPIN: Mining Maximal Frequent Subgraphs From Graph Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new algorithm that mines only maximal frequent subgraphs, i.e. subgraphs that are not a part of any other frequent subgraphs. |
Jun Huan; Wei Wang; Jan Prins; Jiong Yang; |
2004 | 14 | Cyclic Pattern Kernels For Predictive Graph Mining IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast to these approaches, we propose a kernel function based on a natural set of cyclic and tree patterns independent of their frequency, and discuss its computational aspects. |
Tamás Horváth; Thomas Gärtner; Stefan Wrobel; |
2004 | 15 | Mining, Indexing, And Querying Historical Spatiotemporal Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on this observation, we propose a framework that analyzes, manages, and queries object movements that follow such patterns. |
NIKOS MAMOULIS et al. |
2003 | 1 | Maximizing The Spread Of Influence Through A Social Network IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of "word of mouth" in the promotion of new products. |
David Kempe; Jon Kleinberg; Éva Tardos; |
2003 | 2 | Mining Concept-drifting Data Streams Using Ensemble Classifiers IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a general framework for mining concept-drifting data streams using weighted ensemble classifiers. |
Haixun Wang; Wei Fan; Philip S. Yu; Jiawei Han; |
2003 | 3 | Information-theoretic Co-clustering IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages. |
Inderjit S. Dhillon; Subramanyam Mallela; Dharmendra S. Modha; |
2003 | 4 | Adaptive Duplicate Detection Using Learnable String Similarity Measures IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. |
Mikhail Bilenko; Raymond J. Mooney; |
2003 | 5 | CloseGraph: Mining Closed Frequent Graph Patterns IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. |
Xifeng Yan; Jiawei Han; |
2003 | 6 | CLOSET+: Searching For The Best Strategies For Mining Frequent Closed Itemsets IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we answer the above questions by a systematic study of the search strategies and develop a winning algorithm CLOSET+. |
Jianyong Wang; Jiawei Han; Jian Pei; |
2003 | 7 | Mining Distance-based Outliers In Near Linear Time With Randomization And A Simple Pruning Rule IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. |
Stephen D. Bay; Mark Schwabacher; |
2003 | 8 | Fast Vertical Mining Using Diffsets IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The main problem with these approaches is when intermediate results of vertical tid lists become too large for memory, thus affecting the algorithm scalability. In this paper we present a novel vertical data representation called Diffset, that only keeps track of differences in the tids of a candidate pattern from its generating frequent patterns. |
Mohammed J. Zaki; Karam Gouda; |
2003 | 9 | Mining Data Records In Web Pages IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a more effective technique to perform the task. |
Bing Liu; Robert Grossman; Yanhong Zhai; |
2003 | 10 | Probabilistic Discovery Of Time Series Motifs IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Two limitations of this work were the poor scalability of the motif discovery algorithm, and the inability to discover motifs in the presence of noise. Here we address these limitations by introducing a novel algorithm inspired by recent advances in the problem of pattern discovery in biosequences. |
Bill Chiu; Eamonn Keogh; Stefano Lonardi; |
2003 | 11 | Graph-based Anomaly Detection IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce two techniques for graph-based anomaly detection. |
Caleb C. Noble; Diane J. Cook; |
2003 | 12 | Algorithms For Estimating Relative Importance In Networks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the problem of answering such queries in this paper, focusing in particular on defining and computing the importance of nodes in a graph relative to one or more root nodes. |
Scott White; Padhraic Smyth; |
2003 | 13 | Weighted Association Rule Mining Using Weighted Support And Significance Framework IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address the issues of discovering significant binary relationships in transaction datasets in a weighted setting. |
Feng Tao; Fionn Murtagh; Mohsen Farid; |
2003 | 14 | Indexing Multi-dimensional Time-series With Support For Multiple Distance Measures IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Although most time-series data mining research has concentrated on providing solutions for a single distance function, in this work we motivate the need for a single index … |
Michail Vlachos; Marios Hadjieleftheriou; Dimitrios Gunopulos; Eamonn Keogh; |
2003 | 15 | Finding Recent Frequent Itemsets Adaptively Over Online Data Streams IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a data mining method for finding recent frequent itemsets adaptively over an online data stream. |
Joong Hyuk Chang; Won Suk Lee; |
2002 | 1 | Optimizing Search Engines Using Clickthrough Data IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. |
Thorsten Joachims; |
2002 | 2 | SimRank: A Measure Of Structural-context Similarity IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. |
Glen Jeh; Jennifer Widom; |
2002 | 3 | Mining Knowledge-sharing Sites For Viral Marketing IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. |
Matthew Richardson; Pedro Domingos; |
2002 | 4 | On The Need For Time Series Data Mining Benchmarks: A Survey And Empirical Demonstration IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we make the following claim. |
Eamonn Keogh; Shruti Kasetty; |
2002 | 5 | Sequential PAttern Mining Using A Bitmap Representation IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new algorithm for mining sequential patterns. |
Jay Ayres; Jason Flannick; Johannes Gehrke; Tomi Yiu; |
2002 | 6 | Selecting The Right Interestingness Measure For Association Patterns IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present an overview of various measures proposed in the statistics, machine learning and data mining literature. |
Pang-Ning Tan; Vipin Kumar; Jaideep Srivastava; |
2002 | 7 | Bursty And Hierarchical Structure In Streams IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Underlying much of the text mining work in this area is the following intuitive premise: that the appearance of a topic in a document stream is signaled by a "burst of activity," with certain features rising sharply in frequency as the topic emerges. The goal of the present work is to develop a formal approach for modeling such "bursts," in such a way that they can be robustly and efficiently identified, and can provide an organizational framework for analyzing the underlying content. |
Jon Kleinberg; |
2002 | 8 | Transforming Data To Satisfy Privacy Constraints IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper addresses the important issue of preserving the anonymity of the individuals or entities during the data dissemination process. |
Vijay S. Iyengar; |
2002 | 9 | Privacy Preserving Association Rule Mining In Vertically Partitioned Data IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a two-party algorithm for efficiently discovering frequent itemsets with minimum support levels, without either site revealing individual transaction values. |
Jaideep Vaidya; Chris Clifton; |
2002 | 10 | Transforming Classifier Scores Into Accurate Multiclass Probability Estimates IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Here, we show how to obtain accurate probability estimates for multiclass problems by combining calibrated binary probability estimates. |
Bianca Zadrozny; Charles Elkan; |
2002 | 11 | Interactive Deduplication Using Active Learning IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate various design issues that arise in building a system to provide interactive response, fast convergence, and interpretable output. |
Sunita Sarawagi; Anuradha Bhamidipaty; |
2002 | 12 | Discovering Word Senses From Text IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a clustering algorithm called CBC (Clustering By Committee) that automatically discovers word senses from text. |
Patrick Pantel; Dekang Lin; |
2002 | 13 | Privacy Preserving Mining Of Association Rules IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a framework for mining association rules from transactions consisting of categorical items where the data has been randomized to preserve privacy of individual transactions. |
Alexandre Evfimievski; Ramakrishnan Srikant; Rakesh Agrawal; Johannes Gehrke; |
2002 | 14 | Efficiently Mining Frequent Trees In A Forest IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present TREEMINER, a novel algorithm to discover all frequent subtrees in a forest, using a new data structure called scope-list. |
Mohammed J. Zaki; |
2002 | 15 | Frequent Term-based Text Clustering IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a novel approach which uses frequent item (term) sets for text clustering. |
Florian Beil; Martin Ester; Xiaowei Xu; |
2001 | 1 | Mining The Network Value Of Customers IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: So far, work in this area has considered only the intrinsic value of the customer (i.e., the expected profit from sales to her). |
Pedro Domingos; Matt Richardson; |
2001 | 2 | Co-clustering Documents And Words Using Bipartite Spectral Graph Partitioning IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present the novel idea of modeling the document collection as a bipartite graph between documents and words, using which the simultaneous clustering problem can be posed as a bipartite graph partitioning problem. |
Inderjit S. Dhillon; |
2001 | 3 | Mining Time-changing Data Streams IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose an efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner. |
Geoff Hulten; Laurie Spencer; Pedro Domingos; |
2001 | 4 | Random Projection In Dimensionality Reduction: Applications To Image And Text Data IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present experimental results on using random projection as a dimensionality reduction tool in a number of cases, where the high dimensionality of the data would otherwise lead to burdensome computations. |
Ella Bingham; Heikki Mannila; |
2001 | 5 | A Streaming Ensemble Algorithm (SEA) For Large-scale Classification IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The methods presented in this paper take advantage of plentiful data, building separate classifiers on sequential chunks of training points. |
W. Nick Street; YongSeog Kim; |
2001 | 6 | Proximal Support Vector Machine Classifiers IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Instead of a standard support vector machine (SVM) that classifies points by assigning them to one of two disjoint half-spaces, points are classified by assigning them to the … |
Glenn Fung; Olvi L. Mangasarian; |
2001 | 7 | A Robust And Scalable Clustering Algorithm For Mixed Type Attributes In Large Database Environment IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a distance measure that enables clustering data with both continuous and categorical attributes. |
Tom Chiu; DongPing Fang; John Chen; Yao Wang; Christopher Jeris; |
2001 | 8 | Real World Performance Of Association Rule Algorithms IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This study compares five well-known association rule algorithms using three real-world datasets and an artificial dataset. |
Zijian Zheng; Ron Kohavi; Llew Mason; |
2001 | 9 | Learning And Making Decisions When Costs And Probabilities Are Both Unknown IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: After discussing how to make optimal decisions given cost and probability estimates, we present decision tree and naive Bayesian learning methods for obtaining well-calibrated probability estimates. |
Bianca Zadrozny; Charles Elkan; |
2001 | 10 | Mining Top-n Local Outliers In Large Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel method to efficiently find the top-n local outliers in large databases. |
Wen Jin; Anthony K. H. Tung; Jiawei Han; |
2001 | 11 | Empirical Bayes Screening For Multi-item Associations IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper considers the framework of the so-called "market basket problem", in which a database of transactions is mined for the occurrence of unusually frequent item sets. |
William DuMouchel; Daryl Pregibon; |
2001 | 12 | Visualizing Multi-dimensional Clusters, Trends, And Outliers Using Star Coordinates IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Interactive visualizations are effective tools in mining scientific, engineering, and business data to support decision-making activities. Star Coordinates is proposed as a new … |
Eser Kandogan; |
2001 | 13 | Molecular Feature Mining In HIV Data IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present the application of Feature Mining techniques to the Developmental Therapeutics Program’s AIDS antiviral screen database. |
Stefan Kramer; Luc De Raedt; Christoph Helma; |
2001 | 14 | Experimental Comparisons Of Online And Batch Versions Of Bagging And Boosting IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In previous work, we presented online bagging and boosting algorithms that only require one pass through the training data and presented experimental results on some relatively small datasets. |
Nikunj C. Oza; Stuart Russell; |
2001 | 15 | Mining Web Logs For Prediction Models In WWW Caching And Prefetching IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present an application of web log mining to obtain web-document access patterns and use these patterns to extend the well-known GDSF caching policies and prefetching policies. |
Qiang Yang; Haining Henry Zhang; Tianyi Li; |
2000 | 1 | Mining High-speed Data Streams IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Pedro Domingos; Geoff Hulten; |
2000 | 2 | Efficient Clustering Of High-dimensional Data Sets With Application To Reference Matching IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Andrew McCallum; Kamal Nigam; Lyle H. Ungar; |
2000 | 3 | Agglomerative Clustering Of A Search Engine Query Log IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Doug Beeferman; Adam Berger; |
2000 | 4 | FreeSpan: Frequent Pattern-projected Sequential Pattern Mining IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
JIAWEI HAN et al. |
2000 | 5 | Efficient Identification Of Web Communities IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Gary William Flake; Steve Lawrence; C. Lee Giles; |
2000 | 6 | Scaling Up Dynamic Time Warping For Datamining Applications IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Eamonn J. Keogh; Michael J. Pazzani; |
2000 | 7 | Generating Non-redundant Association Rules IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Mohammed J. Zaki; |
2000 | 8 | On-line Unsupervised Outlier Detection Using Finite Mixtures With Discounting Learning Algorithms IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Kenji Yamanishi; Jun-Ichi Takeuchi; Graham Williams; Peter Milne; |
2000 | 9 | Depth First Generation Of Long Patterns IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Ramesh C. Agarwal; Charu C. Aggarwal; V. V. V. Prasad; |
2000 | 10 | Visualization Of Navigation Patterns On A Web Site Using Model-based Clustering IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Igor Cadez; David Heckerman; Christopher Meek; Padhraic Smyth; Steven White; |
2000 | 11 | Feature Selection In Unsupervised Learning Via Evolutionary Search IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
YongSeog Kim; W. Nick Street; Filippo Menczer; |
2000 | 12 | Efficient Mining Of Weighted Association Rules (WAR) IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Wei Wang; Jiong Yang; Philip S. Yu; |
2000 | 13 | Can We Push More Constraints Into Frequent Pattern Mining? IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Jian Pei; Jiawei Han; |
2000 | 14 | Deformable Markov Model Templates For Time-series Pattern Matching IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Xianping Ge; Padhraic Smyth; |
2000 | 15 | Hancock: A Language For Extracting Signatures From Data Streams IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Corinna Cortes; Kathleen Fisher; Daryl Pregibon; Anne Rogers; |
1999 | 1 | MetaCost: A General Method For Making Classifiers Cost-sensitive IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Pedro Domingos; |
1999 | 2 | Efficient Mining Of Emerging Patterns: Discovering Trends And Differences IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Guozhu Dong; Jinyan Li; |
1999 | 3 | Fast And Effective Text Mining Using Linear-time Document Clustering IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Bjornar Larsen; Chinatsu Aone; |
1999 | 4 | Mining Association Rules With Multiple Minimum Supports IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Bing Liu; Wynne Hsu; Yiming Ma; |
1999 | 5 | Mining The Most Interesting Rules IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Roberto J. Bayardo; Rakesh Agrawal; |
1999 | 6 | Entropy-based Subspace Clustering For Mining Numerical Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Chun-Hung Cheng; Ada Waichee Fu; Yi Zhang; |
1999 | 7 | Event Detection From Time Series Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Valery Guralnik; Jaideep Srivastava; |
1999 | 8 | Activity Monitoring: Noticing Interesting Changes In Behavior IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Tom Fawcett; Foster Provost; |
1999 | 9 | Trajectory Clustering With Mixtures Of Regression Models IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Scott Gaffney; Padhraic Smyth; |
1999 | 10 | Pruning And Summarizing The Discovered Associations IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Bing Liu; Wynne Hsu; Yiming Ma; |
1999 | 11 | Horting Hatches An Egg: A New Graph-theoretic Approach To Collaborative Filtering IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Charu C. Aggarwal; Joel L. Wolf; Kun-Lung Wu; Philip S. Yu; |
1999 | 12 | Using Association Rules For Product Assortment Decisions: A Case Study IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Tom Brijs; Gilbert Swinnen; Koen Vanhoof; Geert Wets; |
1999 | 13 | Efficient Progressive Sampling IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Foster Provost; David Jensen; Tim Oates; |
1999 | 14 | Mining In A Data-flow Environment: Experience In Network Intrusion Detection IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Wenke Lee; Salvatore J. Stolfo; Kui W. Mok; |
1999 | 15 | Handling Concept Drifts In Incremental Learning With Support Vector Machines IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Nadeem Ahmed Syed; Huan Liu; Kah Kay Sung; |