Most Influential SIGMOD Papers
The ACM Special Interest Group on Management of Data (SIGMOD) hosts one of the top conferences on database management systems and data management technology. The Paper Digest Team analyzes all papers published at SIGMOD in past years and presents the 15 most influential papers for each year. This ranking list is automatically constructed from citations in both research papers and granted patents, and will be updated frequently to reflect the most recent changes. To find the most influential papers from other conferences and journals, visit the Best Paper Digest page. Note: the most influential papers may or may not include the papers that won best paper awards. (Version: 2021-05)
If you do not want to miss any interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. To search for papers with highlights, related papers, patents, grants, experts and organizations, please visit our search console. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: Most Influential SIGMOD Papers
Year | Rank | Paper | Author(s) |
---|---|---|---|
2020 | 1 | ALEX: An Updatable Adaptive Learned Index (IF:3). Highlight: In this paper, we present a new learned index called ALEX which addresses practical issues that arise when implementing learned indexes for workloads that contain a mix of point lookups, short range queries, inserts, updates, and deletes. | Jialin Ding et al. |
2020 | 2 | Learning Multi-Dimensional Indexes (IF:3). Highlight: In this paper, we introduce Flood, a multi-dimensional in-memory read-optimized index that automatically adapts itself to a particular dataset and workload by jointly optimizing the index structure and data storage layout. | Vikram Nathan; Jialin Ding; Mohammad Alizadeh; Tim Kraska |
2020 | 3 | IDEBench: A Benchmark For Interactive Data Exploration (IF:3). Highlight: In this paper we argue that this is due to the fact that the workloads and metrics of popular analytical benchmarks such as TPC-H or TPC-DS were designed for traditional performance reporting scenarios, and do not capture distinctive IDE characteristics. | Philipp Eichmann; Emanuel Zgraggen; Carsten Binnig; Tim Kraska |
2020 | 4 | QuickSel: Quick Selectivity Learning With Mixture Models (IF:3). Highlight: In this paper, we propose a selectivity learning framework, called QuickSel, which falls into the query-driven paradigm but does not use histograms. | Yongjoo Park; Shucheng Zhong; Barzan Mozafari |
2020 | 5 | Cryptε: Crypto-Assisted Differential Privacy On Untrusted Servers (IF:3). Highlight: In this work, we propose Cryptε, a system and programming framework that (1) achieves the accuracy guarantees and algorithmic expressibility of the central model (2) without any trusted data collector like in the local model. | Amrita Roy Chowdhury; Chenghong Wang; Xi He; Ashwin Machanavajjhala; Somesh Jha |
2020 | 6 | Elastic Machine Learning Algorithms In Amazon SageMaker (IF:3). Highlight: We discuss such challenges and derive requirements for an industrial-scale ML platform. Next, we describe the computational model behind Amazon SageMaker, which is designed to meet such challenges. | Edo Liberty et al. |
2020 | 7 | Creating Embeddings Of Heterogeneous Relational Datasets For Data Integration Tasks (IF:3). Highlight: We propose algorithms for obtaining local embeddings that are effective for data integration tasks on relational databases. | Riccardo Cappuzzo; Paolo Papotti; Saravanan Thirumuruganathan |
2020 | 8 | CockroachDB: The Resilient Geo-Distributed SQL Database (IF:3). Highlight: This paper presents the design of CockroachDB and its novel transaction model that supports consistent geo-distributed transactions on commodity hardware. | Rebecca Taft et al. |
2020 | 9 | Cheetah: Accelerating Database Queries With Switch Pruning (IF:3). Highlight: In this paper, we leverage programmable switches in the network to partially offload query computation to the switch. | Muhammad Tirmazi; Ran Ben Basat; Jiaqi Gao; Minlan Yu |
2020 | 10 | The Machine Learning Bazaar: Harnessing The ML Ecosystem For Effective System Development (IF:3). Highlight: To address these problems, we introduce the Machine Learning Bazaar, a new framework for developing machine learning and automated machine learning software systems. | Micah J. Smith; Carles Sala; James Max Kanter; Kalyan Veeramachaneni |
2020 | 11 | Realistic Re-evaluation Of Knowledge Graph Completion Methods: An Experimental Study (IF:3). Highlight: This paper is the first systematic study with the main objective of assessing the true effectiveness of embedding models when the unrealistic triples are removed. | Farahnaz Akrami; Mohammed Samiul Saeef; Qingheng Zhang; Wei Hu; Chengkai Li |
2020 | 12 | Deep Learning Models For Selectivity Estimation Of Multi-Attribute Queries (IF:3). Highlight: In this paper, we propose two complementary approaches that are effective for this scenario. | Shohedul Hasan; Saravanan Thirumuruganathan; Jees Augustine; Nick Koudas; Gautam Das |
2020 | 13 | A Comprehensive Benchmark Framework For Active Learning Methods In Entity Matching (IF:3). Highlight: In this paper, we build a unified active learning benchmark framework for EM that allows users to easily combine different learning algorithms with applicable example selection algorithms. | Venkata Vamsikrishna Meduri; Lucian Popa; Prithviraj Sen; Mohamed Sarwat |
2020 | 14 | Finding Related Tables In Data Lakes For Interactive Data Science (IF:3). Highlight: We develop search and management solutions for the Jupyter Notebook data science platform, to enable scientists to augment training data, find potential features to extract, clean data, and find joinable or linkable tables. | Yi Zhang; Zachary G. Ives |
2020 | 15 | Estimating Numerical Distributions Under Local Differential Privacy (IF:3). Highlight: We introduce a new reporting mechanism, called the square wave (SW) mechanism, which exploits the numerical nature in reporting. | Zitao Li; Tianhao Wang; Milan Lopuhaä-Zwakenberg; Ninghui Li; Boris Škoric |
2019 | 1 | Towards Scaling Blockchain Systems Via Sharding (IF:3). Highlight: This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale. | Hung Dang et al. |
2019 | 2 | An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning (IF:3). Highlight: To address these challenges, we design an end-to-end automatic CDB tuning system, CDBTune, using deep reinforcement learning (RL). | Ji Zhang et al. |
2019 | 3 | Interventional Fairness: Causal Database Repair For Algorithmic Fairness (IF:3). Highlight: In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables as opposed to a complete causal model. | Babak Salimi; Luke Rodriguez; Bill Howe; Dan Suciu |
2019 | 4 | Designing Fair Ranking Schemes (IF:3). Highlight: In this paper, we develop a system that helps users choose criterion weights that lead to greater fairness. | Abolfazl Asudeh; H. V. Jagadish; Julia Stoyanovich; Gautam Das |
2019 | 5 | FITing-Tree: A Data-aware Index Structure (IF:3). Highlight: In this paper, we present a novel data-aware index structure called FITing-Tree which approximates an index using piece-wise linear functions with a bounded error specified at construction time. | Alex Galakatos; Michael Markovitch; Carsten Binnig; Rodrigo Fonseca; Tim Kraska |
2019 | 6 | Snorkel DryBell: A Case Study In Deploying Weak Supervision At Industrial Scale (IF:3). Highlight: We present a first-of-its-kind study showing how existing knowledge resources from across an organization can be used as weak supervision in order to bring development time and cost down by an order of magnitude, and introduce Snorkel DryBell, a new weak supervision management system for this setting. | Stephen H. Bach et al. |
2019 | 7 | VChain: Enabling Verifiable Boolean Range Queries Over Blockchain Databases (IF:3). Highlight: In this paper, we take the first step toward investigating the problem of verifiable query processing over blockchain databases. | Cheng Xu; Ce Zhang; Jianliang Xu |
2019 | 8 | Responsible Data Science (IF:3). Highlight: There is a pressing need to integrate algorithmic and statistical principles, social science theories, and basic humanist concepts so that we can think critically and constructively about the socio-technical systems we are building. | Lise Getoor |
2019 | 9 | SkinnerDB: Regret-Bounded Query Evaluation Via Reinforcement Learning (IF:3). Highlight: Along with SkinnerDB, we introduce a new quality criterion for query execution strategies. | Immanuel Trummer et al. |
2019 | 10 | HoloDetect: Few-Shot Learning For Error Detection (IF:3). Highlight: We introduce a few-shot learning framework for error detection. | Alireza Heidari; Joshua McGrath; Ihab F. Ilyas; Theodoros Rekatsinas |
2019 | 11 | Blurring The Lines Between Blockchains And Database Systems: The Case Of Hyperledger Fabric (IF:3). Highlight: To tackle these questions, we first explore Fabric from the perspective of database research, where we observe weaknesses in the transaction pipeline. We then solve these issues by transitioning well-understood database concepts to Fabric, namely transaction reordering as well as early transaction abort. | Ankur Sharma; Felix Martin Schuhknecht; Divya Agrawal; Jens Dittrich |
2019 | 12 | Democratizing Data Science Through Interactive Curation Of ML Pipelines (IF:3). Highlight: In this paper we present Alpine Meadow, a first Interactive Automated Machine Learning tool. | Zeyuan Shang et al. |
2019 | 13 | CECI: Compact Embedding Cluster Index For Scalable Subgraph Matching (IF:3). Highlight: In this paper, we propose a novel framework for subgraph listing based on Compact Embedding Cluster Index (CECI), which divides the data graph into multiple embedding clusters for parallel processing. | Bibek Bhattarai; Hang Liu; H. Howie Huang |
2019 | 14 | Answering Multi-Dimensional Analytical Queries Under Local Differential Privacy (IF:3). Highlight: In this paper, we study the problem of answering MDA queries under local differential privacy (LDP). | Tianhao Wang et al. |
2019 | 15 | AI Meets AI: Leveraging Query Executions To Improve Index Recommendations (IF:3). Highlight: We present a study of the design space for this classification problem. | Bailu Ding et al. |
2018 | 1 | The Case For Learned Index Structures (IF:6). Highlight: In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. | Tim Kraska; Alex Beutel; Ed H. Chi; Jeffrey Dean; Neoklis Polyzotis |
2018 | 2 | Deep Learning For Entity Matching: A Design Space Exploration (IF:4). Highlight: In this paper we examine applying deep learning (DL) to EM, to understand DL’s benefits and limitations. | Sidharth Mudgal et al. |
2018 | 3 | Cypher: An Evolving Query Language For Property Graphs (IF:4). Highlight: We compare the features of Cypher to other property graph query languages, and describe extensions, at an advanced stage of development, which will form part of Cypher 10, turning the language into a compositional language which supports graph projections and multiple named graphs. | Nadime Francis et al. |
2018 | 4 | Structured Streaming: A Declarative API For Real-Time Applications In Apache Spark (IF:3). Highlight: We describe the system’s design and use cases from several hundred production deployments on Databricks, the largest of which process over 1 PB of data per month. | Michael Armbrust et al. |
2018 | 5 | G-CORE: A Core For Future Graph Query Languages (IF:3). Highlight: We report on a community effort between industry and academia to shape the future of graph query languages. | Renzo Angles et al. |
2018 | 6 | Query-based Workload Forecasting For Self-Driving Database Management Systems (IF:3). Highlight: We present a robust forecasting framework called QueryBot 5000 that allows a DBMS to predict the expected arrival rate of queries in the future based on historical data. | Lin Ma et al. |
2018 | 7 | Marginal Release Under Local Differential Privacy (IF:4). Highlight: In this paper, we provide a set of algorithms for materializing marginal statistics under the strong model of local differential privacy. | Graham Cormode; Tejas Kulkarni; Divesh Srivastava |
2018 | 8 | Privacy At Scale: Local Differential Privacy In Practice (IF:3). Highlight: This tutorial aims to introduce the key technical underpinnings of these deployed systems, to survey current research that addresses related problems within the LDP model, and to identify relevant open problems and research directions for the community. | Graham Cormode et al. |
2018 | 9 | SuRF: Practical Range Query Filtering With Fast Succinct Tries (IF:3). Highlight: We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. | Huanchen Zhang et al. |
2018 | 10 | VerdictDB: Universalizing Approximate Query Processing (IF:3). Highlight: Therefore, we argue that a universal solution is needed: a database-agnostic approximation engine that will widen the reach of this emerging technology across various platforms. | Yongjoo Park; Barzan Mozafari; Joseph Sorenson; Junhao Wang |
2018 | 11 | Fonduer: Knowledge Base Construction From Richly Formatted Data (IF:3). Highlight: We introduce Fonduer, a machine-learning-based KBC system for richly formatted data. | Sen Wu et al. |
2018 | 12 | FASTER: A Concurrent Key-Value Store With In-Place Updates (IF:4). Highlight: This paper presents FASTER, a new key-value store for point read, blind update, and read-modify-write operations. | Badrish Chandramouli et al. |
2018 | 13 | Dostoevsky: Better Space-Time Trade-Offs For LSM-Tree Based Key-Value Stores Via Adaptive Removal Of Superfluous Merging (IF:3). Highlight: In this paper, we show that all mainstream LSM-tree based key-value stores in the literature and in industry are suboptimal with respect to how they trade off among the I/O costs of updates, point lookups, range lookups, as well as the cost of storage, measured as space-amplification. | Niv Dayan; Stratos Idreos |
2018 | 14 | Dynamic Pricing In Spatial Crowdsourcing: A Matching-Based Approach (IF:3). Highlight: In the paper, we formally define this Global Dynamic Pricing (GDP) problem in spatial crowdsourcing. | Yongxin Tong et al. |
2018 | 15 | A Nutritional Label For Rankings (IF:3). Highlight: In this demonstration we present Ranking Facts, a Web-based application that generates a "nutritional label" for rankings. | Ke Yang et al. |
2017 | 1 | BLOCKBENCH: A Framework For Analyzing Private Blockchains (IF:6). Highlight: This paper concerns recent private blockchain systems designed with stronger security (trust) assumption and performance requirement. | Tien Tuan Anh Dinh et al. |
2017 | 2 | Automatic Database Management System Tuning Through Large-scale Machine Learning (IF:5). Highlight: To overcome these challenges, we present an automated approach that leverages past experience and collects new information to tune DBMS configurations: we use a combination of supervised and unsupervised machine learning methods to (1) select the most impactful knobs, (2) map unseen database workloads to previous workloads from which we can transfer experience, and (3) recommend knob settings. | Dana Van Aken; Andrew Pavlo; Geoffrey J. Gordon; Bohan Zhang |
2017 | 3 | Bolt-on Differential Privacy For Scalable Stochastic Gradient Descent-based Analytics (IF:4). Highlight: We address this challenge by providing a novel analysis of the L2-sensitivity of SGD, which allows, under the same privacy guarantees, better convergence of SGD when only a constant number of passes can be made over the data. | Xi Wu et al. |
2017 | 4 | Heterogeneity-aware Distributed Parameter Servers (IF:4). Highlight: We study distributed machine learning in heterogeneous environments in this work. | Jiawei Jiang; Bin Cui; Ce Zhang; Lele Yu |
2017 | 5 | Monkey: Optimal Navigable Key-Value Store (IF:3). Highlight: In this paper, we show that key-value stores backed by an LSM-tree exhibit an intrinsic trade-off between lookup cost, update cost, and main memory footprint, yet all existing designs expose a suboptimal and difficult to tune trade-off among these metrics. | Niv Dayan; Manos Athanassoulis; Stratos Idreos |
2017 | 6 | Amazon Aurora: Design Considerations For High Throughput Cloud-Native Relational Databases (IF:3). Highlight: In this paper, we describe the architecture of Aurora and the design considerations leading to that architecture. | Alexandre Verbitski et al. |
2017 | 7 | A General-Purpose Counting Filter: Making Every Bit Count (IF:4). Highlight: This paper proposes a new general-purpose AMQ, the counting quotient filter (CQF). | Prashant Pandey; Michael A. Bender; Rob Johnson; Rob Patro |
2017 | 8 | Approximate Query Processing: No Silver Bullet (IF:3). Highlight: In this paper, we reflect on the state of the art of Approximate Query Processing. | Surajit Chaudhuri; Bolin Ding; Srikanth Kandula |
2017 | 9 | Data Management Challenges In Production Machine Learning (IF:3). Highlight: The goal of the tutorial is to bring forth these issues, draw connections to prior work in the database literature, and outline the open research questions that are not addressed by prior art. | Neoklis Polyzotis; Sudip Roy; Steven Euijong Whang; Martin Zinkevich |
2017 | 10 | MacroBase: Prioritizing Attention In Fast Data (IF:4). Highlight: In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. | Peter Bailis et al. |
2017 | 11 | Pufferfish Privacy Mechanisms For Correlated Data (IF:3). Highlight: Since this mechanism may be computationally inefficient, we provide an additional mechanism that applies to some practical cases such as physical activity measurements across time, and is computationally efficient. | Shuang Song; Yizhen Wang; Kamalika Chaudhuri |
2017 | 12 | Debunking The Myths Of Influence Maximization: An In-Depth Benchmarking Study (IF:3). Highlight: In this paper, we perform an in-depth benchmarking study of IM techniques on social networks. | Akhil Arora; Sainyam Galhotra; Sayan Ranu |
2017 | 13 | Accelerating Pattern Matching Queries In Hybrid CPU-FPGA Architectures (IF:4). Highlight: Taking advantage of recently released hybrid multicore architectures, such as the Intel’s Xeon+FPGA machine, where the FPGA has coherent access to the main memory through the QPI bus, we explore the benefits of specializing operators to hardware. | David Sidler; Zsolt István; Muhsen Owaida; Gustavo Alonso |
2017 | 14 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching To Build Cloud Services (IF:3). Highlight: We propose Falcon, a solution that scales up the hands-off crowdsourced EM approach of Corleone, using RDBMS-style query execution and optimization over a Hadoop cluster. | Sanjib Das et al. |
2017 | 15 | Azure Data Lake Store: A Hyperscale Distributed File Service For Big Data Analytics (IF:4). Highlight: We present an overview of ADLS architecture, design points, and performance. | Raghu Ramakrishnan et al. |
2016 | 1 | Stop-and-Stare: Optimal Sampling Algorithms For Viral Marketing In Billion-scale Networks (IF:5). Highlight: In this paper, we propose SSA and D-SSA, two novel sampling frameworks for IM-based viral marketing problems. | Hung T. Nguyen; My T. Thai; Thang N. Dinh |
2016 | 2 | FPTree: A Hybrid SCM-DRAM Persistent And Concurrent B-Tree For Storage Class Memory (IF:5). Highlight: In this paper we propose a novel hybrid SCM-DRAM persistent and concurrent B-Tree, named Fingerprinting Persistent Tree (FPTree) that achieves similar performance to DRAM-based counterparts. | Ismail Oukid; Johan Lasperas; Anisoara Nica; Thomas Willhalm; Wolfgang Lehner |
2016 | 3 | Simba: Efficient In-Memory Spatial Analytics (IF:4). Highlight: We present the Simba (Spatial In-Memory Big data Analytics) system that offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. | Dong Xie et al. |
2016 | 4 | EmptyHeaded: A Relational Engine For Graph Processing (IF:4). Highlight: We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. | Christopher R. Aberger; Susan Tu; Kunle Olukotun; Christopher Ré |
2016 | 5 | Data Cleaning: Overview And Emerging Challenges (IF:4). Abstract: Detecting and repairing dirty data is one of the perennial challenges in data analytics, and failure to do so can result in inaccurate analytics and unreliable decisions. … | Xu Chu; Ihab F. Ilyas; Sanjay Krishnan; Jiannan Wang |
2016 | 6 | Goods: Organizing Google’s Datasets (IF:4). Highlight: In this paper, we present GOODS, a project to rethink how we organize structured datasets at scale, in a setting where teams use diverse and often idiosyncratic ways to produce the datasets and where there is no centralized system for storing and querying them. | Alon Halevy et al. |
2016 | 7 | Constance: An Intelligent Data Lake System (IF:4). Highlight: To avoid this, we propose Constance, a Data Lake system with sophisticated metadata management over raw data extracted from heterogeneous data sources. | Rihan Hai; Sandra Geisler; Christoph Quix |
2016 | 8 | Dynamic Prefetching Of Data Tiles For Interactive Visualization (IF:4). Highlight: In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. | Leilani Battle; Remco Chang; Michael Stonebraker |
2016 | 9 | Learning Linear Regression Models Over Factorized Joins (IF:4). Highlight: We propose a new paradigm for computing batch gradient descent that exploits the factorized computation and representation of the training datasets, a rewriting of the regression objective function that decouples the computation of cofactors of model parameters from their convergence, and the commutativity of cofactor computation with relational union and projection. | Maximilian Schleich; Dan Olteanu; Radu Ciucanu |
2016 | 10 | Quickr: Lazily Approximating Complex AdHoc Queries In BigData Clusters (IF:4). Highlight: We present a system that approximates the answer to complex ad-hoc queries in big-data clusters by injecting samplers on-the-fly and without requiring pre-existing samples. | Srikanth Kandula et al. |
2016 | 11 | Principled Evaluation Of Differentially Private Algorithms Using DPBench (IF:3). Highlight: In this paper we propose a set of evaluation principles which we argue are essential for sound evaluation. | Michael Hay; Ashwin Machanavajjhala; Gerome Miklau; Yan Chen; Dan Zhang |
2016 | 12 | Data Blocks: Hybrid OLTP And OLAP On Compressed Storage Using Both Vectorization And Compilation (IF:3). Highlight: This work aims at reducing the main-memory footprint in high performance hybrid OLTP & OLAP databases, while retaining high query performance and transactional throughput. | Harald Lang et al. |
2016 | 13 | Wander Join: Online Aggregation Via Random Walks (IF:3). Highlight: This paper proposes a new approach, the wander join algorithm, to the online aggregation problem by performing random walks over the underlying join graph. | Feifei Li; Bin Wu; Ke Yi; Zhuoyue Zhao |
2016 | 14 | Efficient Subgraph Matching By Postponing Cartesian Products (IF:3). Highlight: In this paper, we study the problem of subgraph matching that extracts all subgraph isomorphic embeddings of a query graph q in a large data graph G. | Fei Bi; Lijun Chang; Xuemin Lin; Lu Qin; Wenjie Zhang |
2016 | 15 | PrivTree: A Differentially Private Algorithm For Hierarchical Decompositions (IF:3). Highlight: To remedy the deficiency of existing solutions, we present PrivTree, a histogram construction algorithm that adopts hierarchical decomposition but completely eliminates the dependency on a pre-defined h. | Jun Zhang; Xiaokui Xiao; Xing Xie |
2015 | 1 | Spark SQL: Relational Data Processing In Spark IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Using Catalyst, we have built a variety of features (e.g. schema inference for JSON, machine learning types, and query federation to external databases) tailored for the complex needs of modern data analysis. |
MICHAEL ARMBRUST et. al. |
2015 | 2 | Twitter Heron: Stream Processing At Scale IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents the design and implementation of this new system, called Heron. |
SANJEEV KULKARNI et. al. |
2015 | 3 | Influence Maximization In Near-Linear Time: A Martingale Approach IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an influence maximization algorithm that provides the same worst-case guarantees as the state of the art, but offers significantly improved empirical efficiency. |
Youze Tang; Yanchen Shi; Xiaokui Xiao; |
2015 | 4 | The LDBC Social Network Benchmark: Interactive Workload IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the LDBC Social Network Benchmark (SNB), and presents database benchmarking innovation in terms of graph query functionality tested, correlated graph generation techniques, as well as a scalable benchmark driver on a workload with complex graph dependencies. |
ORRI ERLING et. al. |
2015 | 5 | KATARA: A Data Cleaning System Powered By Knowledge Bases And Crowdsourcing IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose KATARA, a knowledge base and crowd powered data cleaning system that, given a table, a KB, and a crowd, interprets table semantics to align it with the KB, identifies correct and incorrect data, and generates top-k possible repairs for incorrect data. |
XU CHU et. al. |
2015 | 6 | Overview Of Data Exploration Techniques IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this tutorial, we survey recent developments in the emerging area of database systems tailored for data exploration. |
Stratos Idreos; Olga Papaemmanouil; Surajit Chaudhuri; |
2015 | 7 | Design And Implementation Of The LogicBlox System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we discuss the design considerations behind the LogicBlox system and give an overview of its implementation, highlighting innovative aspects. |
MOLHAM AREF et. al. |
2015 | 8 | k-Shape: Efficient And Accurate Clustering Of Time Series IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present k-Shape, a novel algorithm for time-series clustering. |
John Paparrizos; Luis Gravano; |
2015 | 9 | Apache Tez: A Unifying Framework For Modeling And Building Data Processing Applications IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce Apache Tez, an open-source framework designed to build data-flow driven processing runtimes. |
BIKAS SAHA et. al. |
2015 | 10 | Rethinking SIMD Vectorization For In-Memory Databases IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present novel vectorized designs and implementations of database operators, based on advanced SIMD operations, such as gathers and scatters. |
Orestis Polychroniou; Arun Raghavan; Kenneth A. Ross; |
2015 | 11 | Fast Serializable Multi-Version Concurrency Control For Main-Memory Database Systems IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel MVCC implementation for main-memory database systems that has very little overhead compared to serial execution with single-version concurrency control, even when maintaining serializability guarantees. |
Thomas Neumann; Tobias Mühlbauer; Alfons Kemper; |
2015 | 12 | QASCA: A Quality-Aware Task Assignment System For Crowdsourcing Applications IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the online task assignment problem: Given a pool of n questions, which of the k questions should be assigned to a worker? |
Yudian Zheng; Jiannan Wang; Guoliang Li; Reynold Cheng; Jianhua Feng; |
2015 | 13 | iCrowd: An Adaptive Crowdsourcing Framework IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this end, we propose an adaptive crowdsourcing framework, called iCrowd. |
Ju Fan; Guoliang Li; Beng Chin Ooi; Kian-lee Tan; Jianhua Feng; |
2015 | 14 | Mining Quality Phrases From Massive Text Corpora IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new framework that extracts quality phrases from text corpora integrated with phrasal segmentation. |
Jialu Liu; Jingbo Shang; Chi Wang; Xiang Ren; Jiawei Han; |
2015 | 15 | DBSCAN Revisited: Mis-Claim, Un-Fixability, And Approximation IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we prove that for d ≥ 3, the DBSCAN problem requires Ω(n^(4/3)) time to solve, unless very significant breakthroughs, ones widely believed to be impossible, could be made in theoretical computer science. |
Junhao Gan; Yufei Tao; |
2014 | 1 | Storm@twitter IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the use of Storm at Twitter. |
ANKIT TOSHNIWAL et. al. |
2014 | 2 | Influence Maximization: Near-optimal Time Complexity Meets Practical Efficiency IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents TIM, an algorithm that aims to bridge the theory and practice in influence maximization. |
Youze Tang; Xiaokui Xiao; Yanchen Shi; |
2014 | 3 | Resolving Conflicts In Heterogeneous Data By Truth Discovery And Source Reliability Estimation IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose to resolve conflicts among multiple sources of heterogeneous data types. |
QI LI et. al. |
2014 | 4 | Querying K-truss Community In Large And Dynamic Graphs IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel community model based on the k-truss concept, which brings nice structural and computational properties. |
Xin Huang; Hong Cheng; Lu Qin; Wentao Tian; Jeffrey Xu Yu; |
2014 | 5 | Corleone: Hands-off Crowdsourcing For Entity Matching IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe Corleone, a HOC solution for EM, which uses the crowd in all major steps of the EM process. |
CHAITANYA GOKHALE et. al. |
2014 | 6 | HYDRA: Large-scale Social Identity Linkage Via Heterogeneous Behavior Modeling IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes HYDRA, a solution framework which consists of three key steps: (I) modeling heterogeneous behavior by long-term behavior distribution analysis and multi-resolution temporal information matching; (II) constructing structural consistency graph to measure the high-order structure consistency on users’ core social structures across different platforms; and (III) learning the mapping function by multi-objective optimization composed of both the supervised learning on pair-wise ID linkage information and the cross-platform structure consistency maximization. |
Siyuan Liu; Shuhui Wang; Feida Zhu; Jinbo Zhang; Ramayya Krishnan; |
2014 | 7 | Natural Language Question Answering Over RDF: A Graph Data Driven Approach IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a systematic framework to answer natural language questions over RDF repository (RDF Q/A) from a graph data-driven perspective. |
LEI ZOU et. al. |
2014 | 8 | Morsel-driven Parallelism: A NUMA-aware Query Evaluation Framework For The Many-core Age IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In response, we present the morsel-driven query execution framework, where scheduling becomes a fine-grained run-time task that is NUMA-aware. |
Viktor Leis; Peter Boncz; Alfons Kemper; Thomas Neumann; |
2014 | 9 | TriAD: A Distributed Shared-nothing RDF Engine Based On Asynchronous Message Passing IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate a new approach to the design of distributed, shared-nothing RDF engines. |
Sairam Gurajada; Stephan Seufert; Iris Miliaraki; Martin Theobald; |
2014 | 10 | PrivBayes: Private Data Release Via Bayesian Networks IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address the deficiency of the existing methods, this paper presents PrivBayes, a differentially private method for releasing high-dimensional data. |
Jun Zhang; Graham Cormode; Cecilia M. Procopiuc; Divesh Srivastava; Xiaokui Xiao; |
2014 | 11 | Navigating The Maze Of Graph Analytics Frameworks Using Massive Graph Datasets IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we offer a quantitative roadmap for improving the performance of all these frameworks and bridging the "ninja gap". |
NADATHUR SATISH et. al. |
2014 | 12 | Knowing When You’re Wrong: Building Fast And Reliable Approximate Query Processing Systems IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show that it is possible to implement a query approximation pipeline that produces approximate answers and reliable error bars at interactive speeds. |
SAMEER AGARWAL et. al. |
2014 | 13 | Local Search Of Communities In Large Graphs IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a local search strategy, which searches in the neighborhood of a vertex to find the best community for the vertex. |
Wanyun Cui; Yanghua Xiao; Haixun Wang; Wei Wang; |
2014 | 14 | Blowfish Privacy: Tuning Privacy-utility Trade-offs Using Policies IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present Blowfish, a class of privacy definitions inspired by the Pufferfish framework, that provides a rich interface for this trade-off. |
Xi He; Ashwin Machanavajjhala; Bolin Ding; |
2014 | 15 | Scalable Atomic Visibility With RAMP Transactions IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we identify a new isolation model—Read Atomic (RA) isolation—that matches the requirements of these use cases by ensuring atomic visibility: either all or none of each transaction’s updates are observed by other transactions. |
Peter Bailis; Alan Fekete; Joseph M. Hellerstein; Ali Ghodsi; Ion Stoica; |
2013 | 1 | Shark: SQL And Rich Analytics At Scale IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Shark is a new data analysis system that marries query processing with complex analytics on large clusters. It leverages a novel distributed memory abstraction to provide a … |
REYNOLD S. XIN et. al. |
2013 | 2 | Trinity: A Distributed Graph Engine On A Memory Cloud IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce Trinity, a general purpose graph engine over a distributed memory cloud. |
Bin Shao; Haixun Wang; Yatao Li; |
2013 | 3 | Hekaton: SQL Server’s Memory-optimized OLTP Engine IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To achieve this it uses only latch-free data structures and a new optimistic, multiversion concurrency control technique. |
CRISTIAN DIACONU et. al. |
2013 | 4 | Inter-media Hashing For Large-scale Retrieval From Heterogeneous Data Sources IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a new multimedia retrieval paradigm to innovate large-scale search of heterogeneous multimedia data. |
Jingkuan Song; Yang Yang; Yi Yang; Zi Huang; Heng Tao Shen; |
2013 | 5 | BigBench: Towards An Industry Standard Benchmark For Big Data Analytics IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present BigBench, an end-to-end big data benchmark proposal. |
AHMAD GHAZAL et. al. |
2013 | 6 | Integrating Scale Out And Fault Tolerance In Stream Processing Using Operator State Management IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on them, we describe an integrated approach for dynamic scale out and recovery of stateful operators. |
Raul Castro Fernandez; Matteo Migliavacca; Evangelia Kalyvianaki; Peter Pietzuch; |
2013 | 7 | LinkBench: A Database Benchmark Based On The Facebook Social Graph IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a new synthetic benchmark called LinkBench. |
Timothy G. Armstrong; Vamsi Ponnekanti; Dhruba Borthakur; Mark Callaghan; |
2013 | 8 | Fast Exact Shortest-path Distance Queries On Large Networks By Pruned Landmark Labeling IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new exact method for shortest-path distance queries on large-scale networks. |
Takuya Akiba; Yoichi Iwata; Yuichi Yoshida; |
2013 | 9 | NADEEF: A Commodity Data Cleaning System IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present NADEEF, an extensible, generalized and easy-to-deploy data cleaning platform. |
MICHELE DALLACHIESA et. al. |
2013 | 10 | Leveraging Transitive Relations For Crowdsourced Joins IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the crowdsourced join query which aims to utilize humans to find all pairs of matching objects from two collections. |
Jiannan Wang; Guoliang Li; Tim Kraska; Michael J. Franklin; Jianhua Feng; |
2013 | 11 | Building An Efficient RDF Store Over A Relational Database IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe a novel storage and query mechanism for RDF which works on top of existing relational representations. |
MIHAELA A. BORNEA et. al. |
2013 | 12 | Bolt-on Causal Consistency IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of separating consistency-related safety properties from availability and durability in distributed data stores via the application of a "bolt-on" shim layer that upgrades the safety of an underlying general-purpose data store. |
Peter Bailis; Ali Ghodsi; Joseph M. Hellerstein; Ion Stoica; |
2013 | 13 | Query Processing On Smart SSDs: Opportunities And Challenges IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We have implemented an initial prototype of Microsoft SQL Server running on a Samsung Smart SSD. |
JAEYOUNG DO et. al. |
2013 | 14 | Finding Time Period-based Most Frequent Path In Big Trajectory Data IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study a new path finding query which finds the most frequent path (MFP) during user-specified time periods in large-scale historical trajectory data. |
Wuman Luo; Haoyu Tan; Lei Chen; Lionel M. Ni; |
2013 | 15 | Split Query Processing In Polybase IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents Polybase, a feature of SQL Server PDW V2 that allows users to manage and query data stored in a Hadoop cluster using the standard SQL query language. |
DAVID J. DEWITT et. al. |
2012 | 1 | Probase: A Probabilistic Taxonomy For Text Understanding IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a universal, probabilistic taxonomy that is more comprehensive than any existing ones. |
Wentao Wu; Hongsong Li; Haixun Wang; Kenny Q. Zhu; |
2012 | 2 | SkewTune: Mitigating Skew In MapReduce Applications IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an existing MapReduce implementation. |
YongChul Kwon; Magdalena Balazinska; Bill Howe; Jerome Rolia; |
2012 | 3 | Calvin: Fast Distributed Transactions For Partitioned Database Systems IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high … |
ALEXANDER THOMSON et. al. |
2012 | 4 | Skew-aware Automatic Database Partitioning In Shared-nothing, Parallel OLTP Systems IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To this purpose, we present a novel approach to automatically partitioning databases for enterprise-class OLTP systems that significantly extends the state of the art by: (1) minimizing the number of distributed transactions, while concurrently mitigating the effects of temporal skew in both the data distribution and accesses, (2) extending the design space to include replicated secondary indexes, (3) organically handling stored procedure routing, and (4) scaling of schema complexity, data size, and number of partitions. |
Andrew Pavlo; Carlo Curino; Stanley Zdonik; |
2012 | 5 | BLSM: A General Purpose Log Structured Merge Tree IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In contrast, existing log structured techniques improve write throughput but sacrifice read performance and exhibit unacceptable latency spikes. |
Russell Sears; Raghu Ramakrishnan; |
2012 | 6 | A Model-based Approach To Attributed Graph Clustering IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we consider an alternative view and propose a model-based approach to attributed graph clustering. |
Zhiqiang Xu; Yiping Ke; Yi Wang; Hong Cheng; James Cheng; |
2012 | 7 | GUPT: Privacy Preserving Data Analysis Made Easy IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents the design and evaluation of a new system, GUPT, that overcomes these challenges. |
Prashanth Mohan; Abhradeep Thakurta; Elaine Shi; Dawn Song; David Culler; |
2012 | 8 | CrowdScreen: Algorithms For Filtering Data With Humans IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Given a large set of data items, we consider the problem of filtering them based on a set of properties that can be verified by humans. |
ADITYA G. PARAMESWARAN et. al. |
2012 | 9 | InfoGather: Entity Augmentation And Attribute Discovery By Holistic Matching With Web Tables IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present three core operations, namely entity augmentation by attribute name, entity augmentation by example and attribute discovery, that are useful for "information gathering" tasks (e.g., researching for products or stocks). |
Mohamed Yakout; Kris Ganjam; Kaushik Chakrabarti; Surajit Chaudhuri; |
2012 | 10 | Large-scale Machine Learning At Twitter IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a case study of Twitter’s integration of machine learning tools into its existing Hadoop-based, Pig-centric analytics platform. |
Jimmy Lin; Alek Kolcz; |
2012 | 11 | Efficient Transaction Processing In SAP HANA Database: The End Of A Column Store Myth IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In summary, the paper aims at illustrating how the SAP HANA database is able to efficiently work in analytical as well as transactional workload environments. |
VISHAL SIKKA et. al. |
2012 | 12 | Can We Beat The Prefix Filtering?: An Adaptive Framework For Similarity Join And Search IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a cost model to judiciously select an appropriate prefix for each object. |
Jiannan Wang; Guoliang Li; Jianhua Feng; |
2012 | 13 | So Who Won?: Dynamic Max Discovery With The Crowd IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on one such function, maximum, that finds the highest ranked object or tuple in a set. |
Stephen Guo; Aditya Parameswaran; Hector Garcia-Molina; |
2012 | 14 | Query Preserving Graph Compression IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: (2) We provide techniques for maintaining compressed graph Gr in response to changes ΔG to the original graph G. |
Wenfei Fan; Jianzhong Li; Xin Wang; Yinghui Wu; |
2012 | 15 | Towards A Unified Architecture For In-RDBMS Analytics IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our main contribution in this work is to take a step towards such a unified architecture. |
Xixuan Feng; Arun Kumar; Benjamin Recht; Christopher Ré; |
2011 | 1 | CrowdDB: Answering Queries With Crowdsourcing IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe the design of CrowdDB, report on an initial set of experiments using Amazon Mechanical Turk, and outline important avenues for future work in the development of crowdsourced query processing systems. |
Michael J. Franklin; Donald Kossmann; Tim Kraska; Sukriti Ramesh; Reynold Xin; |
2011 | 2 | No Free Lunch In Data Privacy IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we critically analyze the privacy protections offered by differential privacy. |
Daniel Kifer; Ashwin Machanavajjhala; |
2011 | 3 | Collective Spatial Keyword Querying IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present empirical studies that offer insight into the efficiency and accuracy of the solutions. |
Xin Cao; Gao Cong; Christian S. Jensen; Beng Chin Ooi; |
2011 | 4 | Apache Hadoop Goes Realtime At Facebook IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the reasons why Facebook chose Hadoop and HBase over other systems such as Apache Cassandra and Voldemort and discusses the application’s requirements for consistency, availability, partition tolerance, data model and scalability. |
DHRUBA BORTHAKUR et. al. |
2011 | 5 | Processing Theta-joins Using MapReduce IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We study the problem of how to map arbitrary join conditions to Map and Reduce functions, i.e., a parallel infrastructure that controls data flow based on key-equality only. |
Alper Okcan; Mirek Riedewald; |
2011 | 6 | Zephyr: Live Migration In Shared Nothing Databases For Elastic Cloud Platforms IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose Zephyr, a technique to efficiently migrate a live database in a shared nothing transactional database architecture. |
Aaron J. Elmore; Sudipto Das; Divyakant Agrawal; Amr El Abbadi; |
2011 | 7 | Design And Evaluation Of Main Memory Hash Join Algorithms For Multi-core CPUs IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The focus of this paper is on investigating efficient hash join algorithms for modern multi-core processors in main memory environments. |
Spyros Blanas; Yinan Li; Jignesh M. Patel; |
2011 | 8 | A Platform For Scalable One-pass Analytics Using MapReduce IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these limitations, we propose a new data analysis platform that employs hash techniques to enable fast in-memory processing, and a new frequent key based technique to extend such processing to workloads that require a large key-state space. |
Boduo Li; Edward Mazur; Yanlei Diao; Andrew McGregor; Prashant Shenoy; |
2011 | 9 | Apples And Oranges: A Comparison Of RDF Benchmarks And Real RDF Datasets IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we compare data generated with existing RDF benchmarks and data found in widely used real RDF datasets. |
Songyun Duan; Anastasios Kementsietsidis; Kavitha Srinivas; Octavian Udrea; |
2011 | 10 | SkimpyStash: RAM Space Skimpy Key-value Store On Flash-based Storage IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present SkimpyStash, a RAM space skimpy key-value store on flash-based storage, designed for high throughput, low latency server applications. |
Biplob Debnath; Sudipta Sengupta; Jin Li; |
2011 | 11 | Workload-aware Database Monitoring And Consolidation IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We formalize the consolidation problem as a non-linear optimization program, aiming to minimize the number of servers and balance load, while achieving near-zero performance degradation. |
Carlo Curino; Evan P.C. Jones; Samuel Madden; Hari Balakrishnan; |
2011 | 12 | Graph Cube: On Warehousing And OLAP Multidimensional Networks IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce Graph Cube, a new data warehousing model that supports OLAP queries effectively on large multidimensional networks. |
Peixiang Zhao; Xiaolei Li; Dong Xin; Jiawei Han; |
2011 | 13 | iReduct: Differential Privacy With Reduced Relative Errors IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces iReduct, a differentially private algorithm for computing answers with reduced relative error. |
Xiaokui Xiao; Gabriel Bender; Michael Hay; Johannes Gehrke; |
2011 | 14 | Reverse Spatial And Textual K Nearest Neighbor Search IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we define Reverse Spatial Textual k Nearest Neighbor (RSTkNN) query, i.e., finding objects that take the query object as one of their k most spatial-textual similar objects. |
Jiaheng Lu; Ying Lu; Gao Cong; |
2011 | 15 | Schedule Optimization For Data Processing Flows On The Cloud IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study scheduling of dataflows that involve arbitrary data processing operators in the context of three different problems: 1) minimize completion time given a fixed budget, 2) minimize monetary cost given a deadline, and 3) find trade-offs between completion time and monetary cost without any a-priori constraints. |
Herald Kllapi; Eva Sitaridi; Manolis M. Tsangaris; Yannis Ioannidis; |
2010 | 1 | Pregel: A System For Large-scale Graph Processing IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a computational model suitable for this task. |
GRZEGORZ MALEWICZ et. al. |
2010 | 2 | TwitterMonitor: Trend Detection Over The Twitter Stream IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present TwitterMonitor, a system that performs trend detection over the Twitter stream. |
Michael Mathioudakis; Nick Koudas; |
2010 | 3 | Efficient Parallel Set-similarity Joins Using MapReduce IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study how to efficiently perform set-similarity joins in parallel using the popular MapReduce framework. |
Rares Vernica; Michael J. Carey; Chen Li; |
2010 | 4 | A Comparison Of Join Algorithms For Log Processing In MapReduce IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe crucial implementation details of a number of well-known join strategies in MapReduce, and present a comprehensive experimental comparison of these join techniques on a 100-node Hadoop cluster. |
SPYROS BLANAS et. al. |
2010 | 5 | Differentially Private Aggregation Of Distributed Time-series With Transformation And Encryption IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose the first differentially private aggregation algorithm for distributed time-series data that offers good practical utility without any trusted server. |
Vibhor Rastogi; Suman Nath; |
2010 | 6 | Data Warehousing And Analytics Infrastructure At Facebook IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we will present how these systems have come together and enabled us to implement a data warehouse that stores more than 15PB of data (2.5PB after compression) and loads more than 60TB of new data (10TB after compression) every day. |
ASHISH THUSOO et. al. |
2010 | 7 | Overview Of SciDB: Large Scale Array Storage, Processing And Analysis IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this talk we will describe our set of motivating examples and use them to explain the features of SciDB. |
Paul G. Brown; |
2010 | 8 | K-isomorphism: Privacy Preserving Network Publication Against Structural Attacks IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We follow this line of work and identify two realistic targets of attacks, namely, NodeInfo and LinkInfo. |
James Cheng; Ada Wai-chee Fu; Jia Liu; |
2010 | 9 | An Evaluation Of Alternative Architectures For Transaction Processing In The Cloud IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper lists alternative architectures to effect cloud computing for database applications and reports on the results of a comprehensive evaluation of existing commercial cloud services that have adopted these architectures. |
Donald Kossmann; Tim Kraska; Simon Loesing; |
2010 | 10 | FAST: Fast Architecture Sensitive Tree Search On Modern CPUs And GPUs IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present FAST, an extremely fast architecture sensitive layout of the index tree. |
CHANGKYU KIM et. al. |
2010 | 11 | Fast Sort On CPUs And GPUs: A Case For Bandwidth Oblivious SIMD Sort IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a competitive analysis of comparison and non-comparison based sorting algorithms on two modern architectures – the latest CPU and GPU architectures. |
NADATHUR SATISH et. al. |
2010 | 12 | Analyzing The Energy Efficiency Of A Database Server IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the role of database software in affecting, and, ultimately, improving the energy efficiency of a server. |
Dimitris Tsirogiannis; Stavros Harizopoulos; Mehul A. Shah; |
2010 | 13 | Ricardo: Integrating R And Hadoop IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The ability to apply sophisticated statistical analysis methods to this data is becoming essential for marketplace competitiveness. |
SUDIPTO DAS et. al. |
2010 | 14 | IBM InfoSphere Streams For Scalable, Real-time, Intelligent Transportation Services IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we demonstrate the use of IBM InfoSphere Streams, a scalable stream processing platform, for tackling these challenges. |
ALAIN BIEM et. al. |
2010 | 15 | Searching Trajectories By Locations: An Efficiency Study IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we study a new problem of searching trajectories by locations, in which context the query is only a small set of locations with or without an order specified, while the target is to find the k Best-Connected Trajectories (k-BCT) from a database such that the k-BCT best connect the designated locations geographically. |
Zaiben Chen; Heng Tao Shen; Xiaofang Zhou; Yu Zheng; Xing Xie; |
2009 | 1 | A Comparison Of Approaches To Large-scale Data Analysis IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe and compare both paradigms. |
ANDREW PAVLO et. al. |
2009 | 2 | Privacy Integrated Queries: An Extensible Platform For Privacy-preserving Data Analysis IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We report on the design and implementation of the Privacy Integrated Queries (PINQ) platform for privacy-preserving data analysis. |
Frank D. McSherry; |
2009 | 3 | Secure KNN Computation On Encrypted Databases IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we discuss the general problem of secure computation on an encrypted database and propose a SCONEDB (Secure Computation ON an Encrypted DataBase) model, which captures the execution and security requirements. |
Wai Kit Wong; David Wai-lok Cheung; Ben Kao; Nikos Mamoulis; |
2009 | 4 | The Design Of The Force.com Multitenant Internet Application Development Platform IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper defines multitenancy, explains its benefits, and demonstrates why metadata-driven architectures are the premier choice for implementing multitenancy. |
Craig D. Weissman; Steve Bobrowski; |
2009 | 5 | Scalable Join Processing On Very Large RDF Graphs IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present two contributions for scalable join processing. |
Thomas Neumann; Gerhard Weikum; |
2009 | 6 | Entity Resolution With Iterative Blocking IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an iterative blocking framework where the ER results of blocks are reflected to subsequently processed blocks. |
Steven Euijong Whang; David Menestrina; Georgia Koutrika; Martin Theobald; Hector Garcia-Molina; |
2009 | 7 | Quality And Efficiency In High Dimensional Nearest Neighbor Search IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast high-dimensional NN search with excellent quality. |
Yufei Tao; Ke Yi; Cheng Sheng; Panos Kalnis; |
2009 | 8 | Keyword Search On Structured And Semi-structured Data IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this tutorial, we give an overview of the state-of-the-art techniques for supporting keyword search on structured and semi-structured data, including query result definition, ranking functions, result generation and top-k query processing, snippet generation, result clustering, query cleaning, performance optimization, and search quality evaluation. |
Yi Chen; Wei Wang; Ziyang Liu; Xuemin Lin; |
2009 | 9 | 3-HOP: A High-compression Indexing Scheme For Reachability Query IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new 3-hop indexing scheme for directed graphs with higher density. |
Ruoming Jin; Yang Xiang; Ning Ruan; David Fuhry; |
2009 | 10 | ZStream: A Cost-based Query Processor For Adaptively Detecting Composite Events IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a CEP system called ZStream to efficiently process such sequential patterns. |
Yuan Mei; Samuel Madden; |
2009 | 11 | Attacks On Privacy And DeFinetti’s Theorem IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a method for reasoning about privacy using the concepts of exchangeability and deFinetti’s theorem. |
Daniel Kifer; |
2009 | 12 | Self-organizing Tuple Reconstruction In Column-stores IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple reconstruction cost in a self-organizing way. |
Stratos Idreos; Martin L. Kersten; Stefan Manegold; |
2009 | 13 | Why Not? IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show through a user study the usefulness of our answers, and describe two algorithms for finding the manipulation that discarded the data item of interest. |
Adriane Chapman; H. V. Jagadish; |
2009 | 14 | MayBMS: A Probabilistic Database Management System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: As a proof of concept for its ease of use, we have built on top of MayBMS a Web-based application that offers NBA-related information based on what-if analysis of team dynamics using data available at www.nba.com. |
Jiewen Huang; Lyublena Antova; Christoph Koch; Dan Olteanu; |
2009 | 15 | Combining Keyword Search And Forms For Ad Hoc Querying Of Databases IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate combining the two with the hopes of creating an approach that provides the best of both. |
Eric Chu; Akanksha Baid; Xiaoyong Chai; AnHai Doan; Jeffrey Naughton; |
2008 | 1 | Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Freebase is a practical, scalable tuple database used to structure general human knowledge. The data in Freebase is collaboratively created, structured, and maintained. Freebase … |
Kurt Bollacker; Colin Evans; Praveen Paritosh; Tim Sturge; Jamie Taylor; |
2008 | 2 | Pig Latin: A Not-so-foreign Language For Data Processing IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a new language called Pig Latin that we have designed to fit in a sweet spot between the declarative style of SQL, and the low-level, procedural style of map-reduce. |
Christopher Olston; Benjamin Reed; Utkarsh Srivastava; Ravi Kumar; Andrew Tomkins; |
2008 | 3 | Private Queries In Location Based Services: Anonymizers Are Not Necessary IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel framework to support private location-dependent queries, based on the theoretical work on Private Information Retrieval (PIR). |
Gabriel Ghinita; Panos Kalnis; Ali Khoshgozaran; Cyrus Shahabi; Kian-Lee Tan; |
2008 | 4 | Towards Identity Anonymization On Graphs IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this issue, we study a specific graph-anonymization problem. |
Kun Liu; Evimaria Terzi; |
2008 | 5 | SPADE: The System S Declarative Stream Processing Engine IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present Spade – the System S declarative stream processing engine. |
Bugra Gedik; Henrique Andrade; Kun-Lung Wu; Philip S. Yu; Myungcheol Doo; |
2008 | 6 | Provenance And Scientific Workflows: Challenges And Opportunities IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We will (1) provide a general overview of scientific workflows, (2) describe research on provenance for scientific workflows and show in detail how provenance is supported in existing systems; (3) discuss emerging applications that are enabled by provenance; and (4) outline open problems and new directions for database-related research. |
Susan B. Davidson; Juliana Freire; |
2008 | 7 | Column-stores Vs. Row-stores: How Different Are They Really? IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we demonstrate that this assumption is false. |
Daniel J. Abadi; Samuel R. Madden; Nabil Hachem; |
2008 | 8 | Efficient Pattern Matching Over Event Streams IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a formal evaluation model that offers precise semantics for this new class of queries and a query evaluation framework permitting optimizations in a principled way. |
Jagrati Agrawal; Yanlei Diao; Daniel Gyllstrom; Neil Immerman; |
2008 | 9 | Efficient Aggregation For Graph Summarization IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce two database-style operations to summarize graphs. |
Yuanyuan Tian; Richard A. Hankins; Jignesh M. Patel; |
2008 | 10 | EASE: An Effective 3-in-1 Keyword Search Method For Unstructured, Semi-structured And Structured Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an efficient and adaptive keyword search method, called EASE, for indexing and querying large collections of heterogeneous data. |
Guoliang Li; Beng Chin Ooi; Jianhua Feng; Jianyong Wang; Lizhu Zhou; |
2008 | 11 | Graphs-at-a-time: Query Language And Access Methods For Graph Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a query language for graph databases that supports arbitrary attributes on nodes, edges, and graphs. |
Huahai He; Ambuj K. Singh; |
2008 | 12 | A Case For Flash Memory SSD In Enterprise Database Applications IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The objective of this work is to understand the applicability and potential impact that flash memory SSD (Solid State Drive) has for certain types of storage spaces of a database server where sequential writes and random reads are prevalent. |
Sang-Won Lee; Bongki Moon; Chanik Park; Jae-Myung Kim; Sang-Woo Kim; |
2008 | 13 | OLTP Through The Looking Glass, And What We Found There IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Rather than simply profiling Shore, we progressively modified it so that after every feature removal or optimization, we had a (faster) working system that fully ran our workload. |
Stavros Harizopoulos; Daniel J. Abadi; Samuel Madden; Michael Stonebraker; |
2008 | 14 | Relational Joins On Graphics Processors IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). |
BINGSHENG HE et. al. |
2008 | 15 | Multi-tenant Databases For Software As A Service: Schema-mapping Techniques IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes a new schema-mapping technique for multi-tenancy called Chunk Folding, where the logical tables are vertically partitioned into chunks that are folded together into different physical multi-tenant tables and joined as needed. |
Stefan Aulbach; Torsten Grust; Dean Jacobs; Alfons Kemper; Jan Rittinger; |
2007 | 1 | Trajectory Clustering: A Partition-and-group Framework IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For the second phase, we present a density-based line-segment clustering algorithm. |
Jae-Gil Lee; Jiawei Han; Kyu-Young Whang; |
2007 | 2 | Map-reduce-merge: Simplified Relational Data Processing On Large Clusters IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We improve Map-Reduce into a new model called Map-Reduce-Merge. |
Hung-chih Yang; Ali Dasdan; Ruey-Lung Hsiao; D. Stott Parker; |
2007 | 3 | BLINKS: Ranked Keyword Searches On Graphs IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address these problems, we propose BLINKS, a bi-level indexing and query processing scheme for top-k keyword search on graphs. |
Hao He; Haixun Wang; Jun Yang; Philip S. Yu; |
2007 | 4 | M-invariance: Towards Privacy Preserving Re-publication Of Dynamic Datasets IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Based on rigorous theoretical analysis, we develop a new generalization principle m-invariance that effectively limits the risk of privacy disclosure in re-publication. |
Xiaokui Xiao; Yufei Tao; |
2007 | 5 | Design Of Flash-based DBMS: An In-page Logging Approach IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a new design called in-page logging (IPL) for flash memory based database servers. |
Sang-Won Lee; Bongki Moon; |
2007 | 6 | Model Management 2.0: Manipulating Richer Mappings IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We review what has been learned from recent experience, explain the revised model management vision based on that experience, and identify the research problems that the revised vision opens up. |
Philip A. Bernstein; Sergey Melnik; |
2007 | 7 | Hiding The Presence Of Individuals From Shared Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a metric, δ-presence, that clearly links the quality of anonymization to the risk posed by inadequate anonymization. |
Mehmet Ercan Nergiz; Maurizio Atzori; Chris Clifton; |
2007 | 8 | JouleSort: A Balanced Energy-efficiency Benchmark IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose and motivate JouleSort, an external sort benchmark, for evaluating the energy efficiency of a wide range of computer systems from clusters to handhelds. |
Suzanne Rivoire; Mehul A. Shah; Parthasarathy Ranganathan; Christos Kozyrakis; |
2007 | 9 | Identifying Meaningful Return Information For XML Keyword Search IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this challenge, we present an XML keyword search engine, XSeek, to infer the semantics of the search and identify return nodes effectively. |
Ziyang Liu; Yi Chen; |
2007 | 10 | Fg-index: Towards Verification-free Query Processing On Graph Databases IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel indexing technique that constructs a nested inverted-index, called FG-index, based on the set of Frequent subGraphs (FGs). |
James Cheng; Yiping Ke; Wilfred Ng; An Lu; |
2007 | 11 | Making Database Systems Usable IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study why database systems today are so difficult to use. |
H. V. JAGADISH et. al. |
2007 | 12 | Fast And Practical Indexing And Querying Of Very Large Graphs IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present the GRIPP index structure (GRaph Indexing based on Pre- and Postorder numbering) for answering reachability queries in graphs. |
Silke Trißl; Ulf Leser; |
2007 | 13 | Cayuga: A High-performance Event Processing Engine IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a demonstration of Cayuga, a complex event monitoring system for high speed data streams. |
LARS BRENNA et. al. |
2007 | 14 | Sketching Probabilistic Data Streams IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the first space- and time-efficient algorithms for approximating complex aggregate queries (including the number of distinct values and join/self-join sizes) over probabilistic data streams. |
Graham Cormode; Minos Garofalakis; |
2007 | 15 | MashMaker: Mashups For The Masses IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Robert J. Ennals; Minos N. Garofalakis; |
2006 | 1 | High-performance Complex Event Processing Over Streams IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. |
Eugene Wu; Yanlei Diao; Shariq Rizvi; |
2006 | 2 | Personalized Privacy Preservation IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Motivated by this, we present a new generalization framework based on the concept of personalized anonymity. |
Xiaokui Xiao; Yufei Tao; |
2006 | 3 | Integrating Compression And Execution In Column-oriented Database Systems IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The ability to compress many adjacent tuples at once lowers the per-tuple cost of compression, both in terms of CPU and space overheads. In this paper, we discuss how we extended C-Store (a column-oriented DBMS) with a compression sub-system. |
Daniel Abadi; Samuel Madden; Miguel Ferreira; |
2006 | 4 | VisTrails: Visualization Meets Data Management IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In VisTrails, we address the problem of visualization from a data management perspective: VisTrails manages the data and metadata of a visualization product. |
STEVEN P. CALLAHAN et. al. |
2006 | 5 | GPUTeraSort: High Performance Graphics Co-processor Sorting For Large Database Management IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel external sorting algorithm using graphics processors (GPUs) on large databases composed of billions of records and wide keys. |
Naga Govindaraju; Jim Gray; Ritesh Kumar; Dinesh Manocha; |
2006 | 6 | Finding K-dominant Skylines In High Dimensional Space IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To find more important and meaningful skyline points in high dimensional space, we propose a new concept, called k-dominant skyline which relaxes the idea of dominance to k-dominance. |
Chee-Yong Chan; H. V. Jagadish; Kian-Lee Tan; Anthony K. H. Tung; Zhenjie Zhang; |
2006 | 7 | Effective Keyword Search In Relational Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel IR ranking strategy for effective keyword search. |
Fang Liu; Clement Yu; Weiyi Meng; Abdur Chowdhury; |
2006 | 8 | LINQ: Reconciling Object, Relations And XML In The .NET Framework IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: LINQ, an extension to the .NET framework, approaches this problem by defining a pattern of general-purpose standard query operators for traversal, filter, and projection. |
Erik Meijer; Brian Beckman; Gavin Bierman; |
2006 | 9 | Dynamic Authenticated Index Structures For Outsourced Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our findings exhibit that the proposed solutions improve performance substantially over existing approaches, both for static and dynamic environments. |
Feifei Li; Marios Hadjieleftheriou; George Kollios; Leonid Reyzin; |
2006 | 10 | Efficient Query Processing In Geographic Web Search Engines IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of efficient query processing in scalable geographic search engines. |
Yen-Yu Chen; Torsten Suel; Alexander Markowetz; |
2006 | 11 | MonetDB/XQuery: A Fast XQuery Processor Powered By A Relational Engine IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the main features, key contributions, and lessons learned while implementing such a system. |
PETER BONCZ et. al. |
2006 | 12 | Provenance Management In Curated Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe an approach in which we track the user’s actions while browsing source databases and copying data into a curated database, in order to record the user’s actions in a convenient, queryable form. |
Peter Buneman; Adriane Chapman; James Cheney; |
2006 | 13 | Record Linkage: Similarity Measures And Algorithms IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This tutorial provides a comprehensive and cohesive overview of the key research results in the area of record linkage methodologies and algorithms for identifying approximate duplicate records, and available tools for this purpose. |
Nick Koudas; Sunita Sarawagi; Divesh Srivastava; |
2006 | 14 | Injecting Utility Into Anonymized Datasets IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we will discuss the shortcomings of current heuristic approaches to measuring utility and we will introduce a formal approach to measuring utility. |
Daniel Kifer; Johannes Gehrke; |
2006 | 15 | Declarative Networking: Language, Execution And Optimization IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we address fundamental database issues in this domain. |
BOON THAU LOO et. al. |
2005 | 1 | Incognito: Efficient Full-domain K-anonymity IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: A number of organizations publish microdata for purposes such as public health and demographic research. We introduce a set of algorithms for producing minimal full-domain generalizations, and show that these algorithms perform up to an order of magnitude faster than previous algorithms on two real-life databases. Besides full-domain generalization, numerous other models have also been proposed for k-anonymization. |
Kristen LeFevre; David J. DeWitt; Raghu Ramakrishnan; |
2005 | 2 | Robust And Fast Similarity Search For Moving Object Trajectories IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a novel distance function, Edit Distance on Real sequence (EDR) which is robust against these data imperfections. |
Lei Chen; M. Tamer Özsu; Vincent Oria; |
2005 | 3 | Schema And Ontology Matching With COMA++ IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We demonstrate the schema and ontology matching tool COMA++. |
David Aumueller; Hong-Hai Do; Sabine Massmann; Erhard Rahm; |
2005 | 4 | Reference Reconciliation In Complex Information Spaces IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our experiments show that (1) we considerably improve precision and recall over standard methods on a diverse set of personal information datasets, and (2) there are advantages to using our algorithm even on a standard citation dataset benchmark. |
Xin Dong; Alon Halevy; Jayant Madhavan; |
2005 | 5 | Efficient Keyword Search For Smallest LCAs In XML Databases IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose keyword search in XML documents, modeled as labeled trees, and describe corresponding efficient algorithms. |
Yu Xu; Yannis Papakonstantinou; |
2005 | 6 | Deriving Private Information From Randomized Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose two data reconstruction methods that are based on data correlations. |
Zhengli Huang; Wenliang Du; Biao Chen; |
2005 | 7 | A Cost-based Model And Effective Heuristic For Repairing Constraints By Value Modification IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this context, we introduce a novel cost framework that allows for the application of techniques from record-linkage to the search for good repairs. |
Philip Bohannon; Wenfei Fan; Michael Flaster; Rajeev Rastogi; |
2005 | 8 | Conceptual Partitioning: An Efficient Method For Continuous Nearest Neighbor Monitoring IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose conceptual partitioning (CPM), a comprehensive technique for the efficient monitoring of continuous NN queries. |
Kyriakos Mouratidis; Dimitris Papadias; Marios Hadjieleftheriou; |
2005 | 9 | Substructure Similarity Search In Graph Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Since exact matching is often too restrictive, similarity search of complex structures becomes a vital operation that must be supported efficiently. In this paper, we investigate the issues of substructure similarity search using indexed features in graph databases. |
Xifeng Yan; Philip S. Yu; Jiawei Han; |
2005 | 10 | Clio Grows Up: From Research Prototype To Industrial Tool IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we revisit the architecture and algorithms behind Clio. |
Laura M. Haas; Mauricio A. Hernández; Howard Ho; Lucian Popa; Mary Roth; |
2005 | 11 | RankSQL: Query Algebra And Optimization For Relational Top-k Queries IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To optimize top-k queries, we propose a dimensional enumeration algorithm to explore the extended plan space by enumerating plans along two dual dimensions: ranking and membership. |
Chengkai Li; Kevin Chen-Chuan Chang; Ihab F. Ilyas; Sumin Song; |
2005 | 12 | A Generic Framework For Monitoring Continuous Spatial Queries Over Moving Objects IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper proposes a generic framework for monitoring continuous spatial queries over moving objects. |
Haibo Hu; Jianliang Xu; Dik Lun Lee; |
2005 | 13 | Semantics And Evaluation Techniques For Window Aggregates In Data Streams IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To address this problem, we propose a framework for defining window semantics, which can be used to express almost all types of windows of which we are aware, and which is easily extensible to other types of windows that may occur in the future. |
Jin Li; David Maier; Kristin Tufte; Vassilis Papadimos; Peter A. Tucker; |
2005 | 14 | Verifying Completeness Of Relational Query Results In Data Publishing IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce a scheme for users to verify that their query results are complete (i.e., no qualifying tuples are omitted) and authentic (i.e., all the result values originated from the owner). |
HweeHwa Pang; Arpit Jain; Krithi Ramamritham; Kian-Lee Tan; |
2005 | 15 | Tributaries And Deltas: Efficient And Robust Aggregation In Sensor Network Streams IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce Tributary-Delta, a novel approach that combines the advantages of the tree and multi-path approaches by running them simultaneously in different regions of the network. |
Amit Manjhi; Suman Nath; Phillip B. Gibbons; |
2004 | 1 | Order Preserving Encryption For Numeric Data IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an order-preserving encryption scheme for numeric data that allows any comparison operation to be directly applied on encrypted data. |
Rakesh Agrawal; Jerry Kiernan; Ramakrishnan Srikant; Yirong Xu; |
2004 | 2 | Graph Indexing: A Frequent Structure-based Approach IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the issues of indexing graphs and propose a novel solution by applying a graph mining technique. |
Xifeng Yan; Philip S. Yu; Jiawei Han; |
2004 | 3 | ORDPATHs: Insert-friendly XML Node Labels IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a hierarchical labeling scheme called ORDPATH that is implemented in the upcoming version of Microsoft® SQL Server™. |
PATRICK O’NEIL et. al. |
2004 | 4 | Integrating Vertical And Horizontal Partitioning Into Automated Physical Database Design IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present novel techniques for designing a scalable solution to this integrated physical design problem that takes both performance and manageability into account. |
Sanjay Agrawal; Vivek Narasayya; Beverly Yang; |
2004 | 5 | SINA: Scalable Incremental Processing Of Continuous Queries In Spatio-temporal Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce two types of updates, namely positive and negative updates. |
Mohamed F. Mokbel; Xiaopeng Xiong; Walid G. Aref; |
2004 | 6 | iMAP: Discovering Complex Semantic Matches Between Database Schemas IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe the iMAP system which semi-automatically discovers both 1-1 and complex matches. |
Robin Dhamankar; Yoonkyong Lee; AnHai Doan; Alon Halevy; Pedro Domingos; |
2004 | 7 | Extending Query Rewriting Techniques For Fine-grained Access Control IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel fine-grained access control model based on authorization views that allows "authorization-transparent" querying; that is, user queries can be phrased in terms of the database relations, and are valid if they can be answered using only the information contained in these authorization views. |
Shariq Rizvi; Alberto Mendelzon; S. Sudarshan; Prasan Roy; |
2004 | 8 | Efficient Set Joins On Similarity Predicates IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present an efficient, scalable and general algorithm for performing set joins on predicates involving various similarity measures like intersect size, Jaccard-coefficient, cosine similarity, and edit-distance. |
Sunita Sarawagi; Alok Kirpal; |
2004 | 9 | Adaptive Stream Resource Management Using Kalman Filters IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we focus on minimization of communication overhead for both synthetic and real-world streams. |
Ankur Jain; Edward Y. Chang; Yuan-Fang Wang; |
2004 | 10 | Indexing Spatio-temporal Trajectories With Chebyshev Polynomials IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we attempt to approximate and index a d-dimensional (d ≥ 1) spatio-temporal trajectory with a low order continuous polynomial. |
Yuhan Cai; Raymond Ng; |
2004 | 11 | CORDS: Automatic Discovery Of Correlations And Soft Functional Dependencies IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce CORDS, an efficient and scalable tool for automatic discovery of correlations and soft functional dependencies between columns. |
Ihab F. Ilyas; Volker Markl; Peter Haas; Paul Brown; Ashraf Aboulnaga; |
2004 | 12 | An Interactive Clustering-based Approach To Integrating Source Query Interfaces On The Deep Web IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose an interactive, clustering-based approach to matching query interfaces. |
Wensheng Wu; Clement Yu; AnHai Doan; Weiyi Meng; |
2004 | 13 | Identifying Similarities, Periodicities And Bursts For Online Search Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present several methods for mining knowledge from the query logs of the MSN search engine. |
Michail Vlachos; Christopher Meek; Zografoula Vagena; Dimitrios Gunopulos; |
2004 | 14 | Prediction And Indexing Of Moving Objects With Unknown Motion Patterns IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our second contribution is a novel recursive motion function that supports a broad class of non-linear motion patterns. |
Yufei Tao; Christos Faloutsos; Dimitris Papadias; Bin Liu; |
2004 | 15 | Secure XML Querying With Security Views IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose an efficient algorithm for deriving security view definitions from security policies (defined on the original document DTD) for different user groups. |
Wenfei Fan; Chee-Yong Chan; Minos Garofalakis; |
2003 | 1 | The Design Of An Acquisitional Query Processor For Sensor Networks IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We evaluate these issues in the context of TinyDB, a distributed query processor for smart sensor devices, and show how acquisitional techniques can provide significant reductions in power consumption on our sensor devices. |
Samuel Madden; Michael J. Franklin; Joseph M. Hellerstein; Wei Hong; |
2003 | 2 | Winnowing: Local Algorithms For Document Fingerprinting IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We prove a novel lower bound on the performance of any local algorithm. |
Saul Schleimer; Daniel S. Wilkerson; Alex Aiken; |
2003 | 3 | Gigascope: A Stream Database For Network Applications IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we describe our motivation for and constraints in developing Gigascope, the Gigascope architecture and query language, and performance issues. |
Chuck Cranor; Theodore Johnson; Oliver Spataschek; Vladislav Shkapenyuk; |
2003 | 4 | An Optimal And Progressive Algorithm For Skyline Queries IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we develop BBS (branch-and-bound skyline), a progressive algorithm also based on nearest neighbor search, which is IO optimal, i.e., it performs a single access only to those R-tree nodes that may contain skyline points. |
Dimitris Papadias; Yufei Tao; Greg Fu; Bernhard Seeger; |
2003 | 5 | TelegraphCQ: Continuous Dataflow Processing IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
SIRISH CHANDRASEKARAN et. al. |
2003 | 6 | Information Sharing Across Private Databases IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We formalize the notion of minimal information sharing across private databases, and develop protocols for intersection, equijoin, intersection size, and equijoin size. |
Rakesh Agrawal; Alexandre Evfimievski; Ramakrishnan Srikant; |
2003 | 7 | Evaluating Probabilistic Queries Over Imprecise Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study probabilistic query evaluation based upon uncertain data. |
Reynold Cheng; Dmitri V. Kalashnikov; Sunil Prabhakar; |
2003 | 8 | XRANK: Ranked Keyword Search Over XML Documents IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present the XRANK system that is designed to handle these novel features of XML keyword search. |
Lin Guo; Feng Shao; Chavdar Botev; Jayavel Shanmugasundaram; |
2003 | 9 | Robust And Efficient Fuzzy Match For Online Data Cleaning IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a new similarity function which overcomes limitations of commonly used similarity functions, and develop an efficient fuzzy match algorithm. |
Surajit Chaudhuri; Kris Ganjam; Venkatesh Ganti; Rajeev Motwani; |
2003 | 10 | Extracting Structured Data From Web Pages IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study the problem of automatically extracting the database values from such template-generated web pages without any learning examples or other similar human input. |
Arvind Arasu; Hector Garcia-Molina; |
2003 | 11 | Adaptive Filters For Continuous Queries Over Distributed Data Streams IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Significant communication overhead is incurred in the presence of rapid update streams, and we propose a new technique for reducing the overhead. |
Chris Olston; Jing Jiang; Jennifer Widom; |
2003 | 12 | Distributed Top-k Monitoring IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We show that transmitting entire data streams is unnecessary to support these queries and present an alternative approach that reduces communication significantly. |
Brian Babcock; Chris Olston; |
2003 | 13 | Spectral Bloom Filters IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present novel methods for reducing the probability and magnitude of errors. |
Saar Cohen; Yossi Matias; |
2003 | 14 | Efficient Similarity Search And Classification Via Rank Aggregation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a novel approach to performing efficient similarity search and classification in high dimensional data. |
Ronald Fagin; Ravi Kumar; D. Sivakumar; |
2003 | 15 | XQuery: A Query Language For XML IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This tutorial will provide an overview of the syntax and semantics of XQuery, as well as insight into the principles that guided the design of the language. |
Don Chamberlin; |
2002 | 1 | Storing And Querying Ordered XML Using A Relational Database System IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper shows that XML’s ordered data model can indeed be efficiently supported by a relational database system. |
IGOR TATARINOV et. al. |
2002 | 2 | Executing SQL Over Encrypted Data In The Database-service-provider Model IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we focus on the second challenge. |
Hakan Hacigümüş; Bala Iyer; Chen Li; Sharad Mehrotra; |
2002 | 3 | Holistic Twig Joins: Optimal XML Pattern Matching IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: A limitation of this approach for matching twig patterns is that intermediate result sizes can get large, even when the input and output sizes are more manageable. In this paper, we propose a novel holistic twig join algorithm, TwigStack, for matching an XML query twig pattern. |
Nicolas Bruno; Nick Koudas; Divesh Srivastava; |
2002 | 4 | Continuously Adaptive Continuous Queries Over Streams IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. |
Samuel Madden; Mehul Shah; Joseph M. Hellerstein; Vijayshankar Raman; |
2002 | 5 | Clustering By Pattern Similarity In Large Data Sets IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explore a more general type of similarity. |
Haixun Wang; Wei Wang; Jiong Yang; Philip S. Yu; |
2002 | 6 | Accelerating XPath Location Steps IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Despite its flexibility, the new index can be implemented and queried using purely relational techniques, but it performs especially well if the underlying database host provides support for R-trees. |
Torsten Grust; |
2002 | 7 | A Monte Carlo Algorithm For Fast Projective Clustering IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a mathematical formulation for the notion of optimal projective cluster, starting from natural requirements on the density of points in subspaces. |
Cecilia M. Procopiuc; Michael Jones; Pankaj K. Agarwal; T. M. Murali; |
2002 | 8 | Processing Complex Aggregate Queries Over Data Streams IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Providing (perhaps approximate) answers to queries over such continuous data streams is a crucial requirement for many application environments; examples include large telecom and IP network installations where performance data from different parts of the network needs to be continuously collected and analyzed. In this paper, we consider the problem of approximately answering general aggregate SQL queries over continuous data streams with limited memory. |
Alin Dobra; Minos Garofalakis; Johannes Gehrke; Rajeev Rastogi; |
2002 | 9 | Covering Indexes For Branching Path Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we ask if the traditional relational query acceleration techniques of summary tables and covering indexes have analogs for branching path expression queries over tree- or graph-structured XML data. |
Raghav Kaushik; Philip Bohannon; Jeffrey F Naughton; Henry F Korth; |
2002 | 10 | APEX: An Adaptive Path Index For XML Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose APEX, an adaptive path index for XML data. |
Chin-Wan Chung; Jun-Ki Min; Kyuseok Shim; |
2002 | 11 | Minimal Probing: Supporting Expensive Predicates For Top-k Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper addresses the problem of evaluating ranked top-k queries with expensive predicates. |
Kevin Chen-Chuan Chang; Seung-won Hwang; |
2002 | 12 | Querying And Mining Data Streams: You Only Get One Look A Tutorial IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Minos Garofalakis; Johannes Gehrke; Rajeev Rastogi; |
2002 | 13 | Rate-based Query Optimization For Streaming Information Sources IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In view of this, we propose shifting from a cardinality-based approach to a rate-based approach, and give an optimization framework that aims at maximizing the output rate of query evaluation plans. |
Stratis D. Viglas; Jeffrey F. Naughton; |
2002 | 14 | Implementing Database Operations Using SIMD Instructions IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present techniques for implementing these using SIMD instructions. |
Jingren Zhou; Kenneth A. Ross; |
2002 | 15 | Time-parameterized Queries In Spatio-temporal Databases IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a general framework that covers time-parameterized variations of the most common spatial queries, namely window queries, k-nearest neighbors and spatial joins. |
Yufei Tao; Dimitris Papadias; |
2001 | 1 | Outlier Detection For High Dimensional Data IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we discuss new techniques for outlier detection which find the outliers by studying the behavior of projections from the data set. |
Charu C. Aggarwal; Philip S. Yu; |
2001 | 2 | On Supporting Containment Queries In Relational Database Management Systems IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explore some performance implications of both options using native implementations in two commercial relational database systems and in a special purpose inverted list engine. |
Chun Zhang; Jeffrey Naughton; David DeWitt; Qiong Luo; Guy Lohman; |
2001 | 3 | Reconciling Schemas Of Disparate Data Sources: A Machine-learning Approach IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe LSD, a system that employs and extends current machine-learning techniques to semi-automatically find such mappings. |
AnHai Doan; Pedro Domingos; Alon Y. Halevy; |
2001 | 4 | Locally Adaptive Dimensionality Reduction For Indexing Large Time Series Databases IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work we introduce a new dimensionality reduction technique which we call Adaptive Piecewise Constant Approximation (APCA). |
Eamonn Keogh; Kaushik Chakrabarti; Michael Pazzani; Sharad Mehrotra; |
2001 | 5 | Filtering Algorithms And Implementation For Very Fast Publish/subscribe Systems IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes an attempt at the construction of such algorithms and its implementation. |
FRANÇOISE FABRET et. al. |
2001 | 6 | Space-efficient Online Computation Of Quantile Summaries IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new online algorithm for computing ε-approximate quantile summaries of very large data sequences. |
Michael Greenwald; Sanjeev Khanna; |
2001 | 7 | Updating XML IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Clearly, in order to fully evolve XML into a universal data representation and sharing format, we must allow users to specify updates to XML documents and must develop techniques to process them efficiently. |
Igor Tatarinov; Zachary G. Ives; Alon Y. Halevy; Daniel S. Weld; |
2001 | 8 | On Computing Correlated Aggregates Over Continual Data Streams IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose single-pass techniques for approximate computation of correlated aggregates over both landmark and sliding window views of a data stream of tuples, using a very limited amount of space. |
Johannes Gehrke; Flip Korn; Divesh Srivastava; |
2001 | 9 | Efficient Computation Of Iceberg Cubes With Complex Measures IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning search space. |
Jiawei Han; Jian Pei; Guozhu Dong; Ke Wang; |
2001 | 10 | STHoles: A Multidimensional Workload-aware Histogram IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce STHoles, a “workload-aware” histogram that allows bucket nesting to capture data regions with reasonably uniform tuple density. |
Nicolas Bruno; Surajit Chaudhuri; Luis Gravano; |
2001 | 11 | Optimizing Queries Using Materialized Views: A Practical, Scalable Solution IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a fast and scalable algorithm for determining whether part or all of a query can be computed from materialized views and describes how it can be incorporated in transformation-based optimizers. |
Jonathan Goldstein; Per-Åke Larson; |
2001 | 12 | PREFER: A System For The Efficient Execution Of Multi-parametric Ranked Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We have implemented the algorithms proposed in this paper in a prototype system called PREFER, which operates on top of a commercial database management system. |
Vagelis Hristidis; Nick Koudas; Yannis Papakonstantinou; |
2001 | 13 | Selectivity Estimation Using Probabilistic Models IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we show how probabilistic graphical models can be effectively used for this task as an accurate and compact approximation of the joint frequency distribution of multiple attributes across multiple relations. |
Lise Getoor; Benjamin Taskar; Daphne Koller; |
2001 | 14 | Automatic Segmentation Of Text Into Structured Records IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present a method for automatically segmenting unformatted text records into structured elements. |
Vinayak Borkar; Kaustubh Deshmukh; Sunita Sarawagi; |
2001 | 15 | Minimization Of Tree Pattern Queries IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study tree pattern minimization both in the absence and in the presence of integrity constraints (ICs) on the underlying tree-structured database. |
Sihem Amer-Yahia; SungRan Cho; Laks V. S. Lakshmanan; Divesh Srivastava; |
2000 | 1 | Mining Frequent Patterns Without Candidate Generation IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. |
Jiawei Han; Jian Pei; Yiwen Yin; |
2000 | 2 | LOF: Identifying Density-based Local Outliers IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. |
Markus M. Breunig; Hans-Peter Kriegel; Raymond T. Ng; Jörg Sander; |
2000 | 3 | Efficient Algorithms For Mining Outliers From Large Data Sets IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor. |
Sridhar Ramaswamy; Rajeev Rastogi; Kyuseok Shim; |
2000 | 4 | Privacy-preserving Data Mining IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models without access to precise information in individual data records? |
Rakesh Agrawal; Ramakrishnan Srikant; |
2000 | 5 | NiagaraCQ: A Scalable Continuous Query System For Internet Databases IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents the design of NiagaraCQ system and gives some experimental results on the system’s performance and scalability. |
Jianjun Chen; David J. DeWitt; Feng Tian; Yuan Wang; |
2000 | 6 | Eddies: Continuously Adaptive Query Processing IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we introduce a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs. |
Ron Avnur; Joseph M. Hellerstein; |
2000 | 7 | Indexing The Positions Of Continuously Moving Objects IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The present paper proposes a novel, R*-tree based indexing technique that supports the efficient querying of the current and projected future positions of such moving objects. |
Simonas Šaltenis; Christian S. Jensen; Scott T. Leutenegger; Mario A. Lopez; |
2000 | 8 | XMill: An Efficient Compressor For XML Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a tool for compressing XML data, with applications in data exchange and archiving, which usually achieves about twice the compression ratio of gzip at roughly the same speed. |
Hartmut Liefke; Dan Suciu; |
2000 | 9 | Finding Generalized Projected Clusters In High Dimensional Spaces IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We discuss very general techniques for projected clustering which are able to construct clusters in arbitrarily aligned subspaces of lower dimensionality. |
Charu C. Aggarwal; Philip S. Yu; |
2000 | 10 | Influence Sets Based On Reverse Nearest Neighbor Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a general approach for solving RNN queries and an efficient R-tree based method for large data sets, based on this approach. |
Flip Korn; S. Muthukrishnan; |
2000 | 11 | Synchronizing A Database To Improve Freshness IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we study how to refresh a local copy of an autonomous data source to maintain the copy up-to-date. |
Junghoo Cho; Hector Garcia-Molina; |
2000 | 12 | Efficient And Extensible Algorithms For Multi Query Optimization IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we demonstrate that multi-query optimization using heuristics is practical, and provides significant benefits. |
Prasan Roy; S. Seshadri; S. Sudarshan; Siddhesh Bhobe; |
2000 | 13 | Making B+-Trees Cache Conscious In Main Memory IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new indexing technique called “Cache Sensitive B+-Trees” (CSB+-Trees). |
Jun Rao; Kenneth A. Ross; |
2000 | 14 | A Data Model And Data Structures For Moving Objects Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We formally define a data model for such databases that includes complex evolving spatial structures such as line networks or multi-component regions with holes. |
Luca Forlizzi; Ralf Hartmut Güting; Enrico Nardelli; Markus Schneider; |
2000 | 15 | A Framework For Expressing And Combining Preferences IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a formal framework for expressing and combining user preferences to address this problem. |
Rakesh Agrawal; Edward L. Wimmers; |
1999 | 1 | OPTICS: Ordering Points To Identify The Clustering Structure IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. |
Mihael Ankerst; Markus M. Breunig; Hans-Peter Kriegel; Jörg Sander; |
1999 | 2 | Fast Algorithms For Projected Clustering IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We develop an algorithmic framework for solving the projected clustering problem, and test its performance on synthetic data. |
Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu; Cecilia Procopiuc; Jong Soo Park; |
1999 | 3 | Storing Semistructured Data With STORED IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a technique that can use relational database management systems to store and manage semistructured data. |
Alin Deutsch; Mary Fernandez; Dan Suciu; |
1999 | 4 | Bottom-up Computation Of Sparse And Iceberg CUBE IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new algorithm (BUC) for Iceberg-CUBE computation. We introduce the Iceberg-CUBE problem as a reformulation of the datacube (CUBE) problem. |
Kevin Beyer; Raghu Ramakrishnan; |
1999 | 5 | An Adaptive Query Execution System For Data Integration IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. |
Zachary G. Ives; Daniela Florescu; Marc Friedman; Alon Levy; Daniel S. Weld; |
1999 | 6 | Ripple Joins For Online Aggregation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a new family of join algorithms, called ripple joins, for online processing of multi-table aggregation queries in a relational database management system (DBMS). |
Peter J. Haas; Joseph M. Hellerstein; |
1999 | 7 | Approximate Computation Of Multidimensional Aggregates Of Sparse Data Using Wavelets IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a novel method that provides approximate answers to high-dimensional OLAP aggregation queries in massive sparse data sets in a time-efficient and space-efficient manner. |
Jeffrey Scott Vitter; Min Wang; |
1999 | 8 | Join Synopses For Approximate Query Answering IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we demonstrate the difficulty of providing good approximate answers for join-queries using only statistics (in particular, samples) from the base relations. |
Swarup Acharya; Phillip B. Gibbons; Viswanath Poosala; Sridhar Ramaswamy; |
1999 | 9 | Record-boundary Discovery In Web Documents IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we describe a heuristic approach to discovering record boundaries in Web documents. |
D. W. Embley; Y. Jiang; Y.-K. Ng; |
1999 | 10 | On Random Sampling Over Joins IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present theoretical results explaining the difficulty of this problem and setting limits on the efficiency that can be achieved. |
Surajit Chaudhuri; Rajeev Motwani; Vivek Narasayya; |
1999 | 11 | Self-tuning Histograms: Building Histograms Without Looking At Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we introduce self-tuning histograms. |
Ashraf Aboulnaga; Surajit Chaudhuri; |
1999 | 12 | XML-based Information Mediation With MIX IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: The MIX mediator system, MIXm, is developed as part of the MIX Project at the San Diego Supercomputer Center, and the University of California, San Diego. … |
CHAITAN BARU et. al. |
1999 | 13 | Online Association Rule Mining IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a novel algorithm to compute large itemsets online. |
Christian Hidber; |
1999 | 14 | DynaMat: A Dynamic View Management System For Data Warehouses IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present DynaMat, a system that dynamically materializes information at multiple levels of granularity in order to match the demand (workload) but also takes into account the maintenance restrictions for the warehouse, such as down time to update the views and space availability. |
Yannis Kotidis; Nick Roussopoulos; |
1999 | 15 | Selectivity Estimation In Spatial Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we examine selectivity estimation in the context of Geographic Information Systems, which manage spatial data such as points, lines, poly-lines and polygons. |
Swarup Acharya; Viswanath Poosala; Sridhar Ramaswamy; |
1998 | 1 | CURE: An Efficient Clustering Algorithm For Large Databases IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a new clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes and wide variances in size. |
Sudipto Guha; Rajeev Rastogi; Kyuseok Shim; |
1998 | 2 | Automatic Subspace Clustering Of High Dimensional Data For Data Mining Applications IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present CLIQUE, a clustering algorithm that satisfies each of these requirements. |
Rakesh Agrawal; Johannes Gehrke; Dimitrios Gunopulos; Prabhakar Raghavan; |
1998 | 3 | Efficiently Mining Long Patterns From Databases IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. |
Roberto J. Bayardo; |
1998 | 4 | Enhanced Hypertext Categorization Using Hyperlinks IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our contribution is to propose robust statistical models and a relaxation labeling technique for better classification by exploiting link information in a small neighborhood around documents. |
Soumen Chakrabarti; Byron Dom; Piotr Indyk; |
1998 | 5 | Exploratory Mining And Pruning Optimizations Of Constrained Associations Rules IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose, in this paper, an architecture that opens up the black-box, and supports constraint-based, human-centered exploratory mining of associations. |
Raymond T. Ng; Laks V. S. Lakshmanan; Jiawei Han; Alex Pang; |
1998 | 6 | Optimal Multi-step K-nearest Neighbor Search IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: After revealing the strong performance shortcomings of the state-of-the-art algorithm for k-nearest neighbor search [Korn et al. 1996], we present a novel multi-step algorithm which is guaranteed to produce the minimum number of candidates. |
Thomas Seidl; Hans-Peter Kriegel; |
1998 | 7 | New Sampling-based Summary Statistics For Improving Approximate Query Answers IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces two new sampling-based summary statistics, concise samples and counting samples, and presents new techniques for their fast incremental maintenance regardless of the data distribution. |
Phillip B. Gibbons; Yossi Matias; |
1998 | 8 | The Pyramid-technique: Towards Breaking The Curse Of Dimensionality IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose the Pyramid-Technique, a new indexing method for high-dimensional data spaces. |
Stefan Berchtold; Christian Böhm; Hans-Peter Kriegel; |
1998 | 9 | Integration Of Heterogeneous Databases Without Common Domains Using Queries Based On Textual Similarity IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we reject the assumption that global domains can be easily constructed, and assume instead that the names are given in natural language text. |
William W. Cohen; |
1998 | 10 | Wavelet-based Histograms For Selectivity Estimation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation. |
Yossi Matias; Jeffrey Scott Vitter; Min Wang; |
1998 | 11 | Efficient Mid-query Re-optimization Of Sub-optimal Query Execution Plans IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe an algorithm that detects sub-optimality of a query execution plan during query execution and attempts to correct the problem. |
Navin Kabra; David J. DeWitt; |
1998 | 12 | Bitmap Index Design And Evaluation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a general framework to study the design space of bitmap indexes for selection queries and examine the disk-space and time characteristics that the various alternative index choices offer. |
Chee-Yong Chan; Yannis E. Ioannidis; |
1998 | 13 | Your Mediators Need Data Conversion! IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present the YAT system for data conversion. |
Sophie Cluet; Claude Delobel; Jérôme Siméon; Katarzyna Smaga; |
1998 | 14 | Catching The Boat With Strudel: Experiences With A Web-site Management System IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe Strudel’s key characteristics, report on our experiences using Strudel, and present the technical problems that arose from our experience. |
Mary Fernández; Daniela Florescu; Jaewoo Kang; Alon Levy; Dan Suciu; |
1998 | 15 | Integrating Association Rule Mining With Relational Database Systems: Alternatives And Implications IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. |
Sunita Sarawagi; Shiby Thomas; Rakesh Agrawal; |
1997 | 1 | Dynamic Itemset Counting And Implication Rules For Market Basket Data IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We consider the problem of analyzing market-basket data and present several important contributions. |
Sergey Brin; Rajeev Motwani; Jeffrey D. Ullman; Shalom Tsur; |
1997 | 2 | Beyond Market Baskets: Generalizing Association Rules To Correlations IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose measuring significance of associations via the chi-squared test for correlation from classical statistics. |
Sergey Brin; Rajeev Motwani; Craig Silverstein; |
1997 | 3 | Online Aggregation IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a new online aggregation interface that permits users to both observe the progress of their aggregation queries and control execution on the fly. |
Joseph M. Hellerstein; Peter J. Haas; Helen J. Wang; |
1997 | 4 | The SR-tree: An Index Structure For High-dimensional Nearest Neighbor Queries IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: To overcome this drawback, we propose a new index structure called the SR-tree (Sphere/Rectangle-tree) which integrates bounding spheres and bounding rectangles. |
Norio Katayama; Shin’ichi Satoh; |
1997 | 5 | Improved Query Performance With Variant Indexes IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The paper concludes by introducing a new method whereby multi-dimensional group-by queries, reminiscent of OLAP/Datacube queries but with more flexibility, can be very efficiently performed. |
Patrick O’Neil; Dallan Quass; |
1997 | 6 | InfoSleuth: Agent-based Semantic Integration Of Information In Open And Dynamic Environments IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The goal of the InfoSleuth project at MCC is to exploit and synthesize new technologies into a unified system that retrieves and processes information in an ever-changing network of information sources. |
R. J. BAYARDO et. al. |
1997 | 7 | Balancing Push And Pull For Data Broadcast IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we study how to augment the push-only model with a “pull-based” approach of using a backchannel to allow clients to send explicit requests for data to the server. We propose and investigate a set of three techniques that can delay the onset of saturation and thus, enhance the performance and scalability of the system. |
Swarup Acharya; Michael Franklin; Stanley Zdonik; |
1997 | 8 | Efficiently Supporting Ad Hoc Queries In Large Datasets Of Time Sequences IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we consider a very large dataset comprising multiple distinct time sequences. |
Flip Korn; H. V. Jagadish; Christos Faloutsos; |
1997 | 9 | Infomaster: An Information Integration System IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Infomaster is an information integration system that provides integrated access to multiple distributed heterogeneous information sources on the Internet, thus giving the illusion … |
Michael R. Genesereth; Arthur M. Keller; Oliver M. Duschka; |
1997 | 10 | Scalable Parallel Data Mining For Association Rules IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present two new parallel algorithms for mining association rules. |
Eui-Hong Han; George Karypis; Vipin Kumar; |
1997 | 11 | An Array-based Algorithm For Simultaneous Multidimensional Aggregates IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. |
Yihong Zhao; Prasad M. Deshpande; Jeffrey F. Naughton; |
1997 | 12 | Similarity-based Queries For Time Series Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a query processing algorithm that uses the underlying R-tree index of a multidimensional data set to answer similarity queries efficiently. |
Davood Rafiei; Alberto Mendelzon; |
1997 | 13 | Meaningful Change Detection In Structured Data IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we focus on detecting meaningful changes in hierarchically structured data, such as nested-object data. |
Sudarshan S. Chawathe; Hector Garcia-Molina; |
1997 | 14 | Range Queries In OLAP Data Cubes IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present fast algorithms for range queries for two types of aggregation operations: SUM and MAX. |
Ching-Tien Ho; Rakesh Agrawal; Nimrod Megiddo; Ramakrishnan Srikant; |
1997 | 15 | Maintenance Of Data Cubes And Summary Tables In A Warehouse IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose a method of maintaining aggregate views (the summary-delta table method), and use it to solve two problems in maintaining summary tables in a warehouse: (1) how to efficiently maintain a summary table while minimizing the batch window needed for maintenance, and (2) how to maintain a large set of summary tables defined over the same base tables. |
Inderpal Singh Mumick; Dallan Quass; Barinderpal Singh Mumick; |
1996 | 1 | BIRCH: An Efficient Data Clustering Method For Very Large Databases IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. |
Tian Zhang; Raghu Ramakrishnan; Miron Livny; |
1996 | 2 | Mining Quantitative Association Rules In Large Relational Tables IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce the problem of mining association rules in large relational tables containing both quantitative and categorical attributes. |
Ramakrishnan Srikant; Rakesh Agrawal; |
1996 | 3 | Implementing Data Cubes Efficiently IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we investigate the issue of which cells (views) to materialize when it is too expensive to materialize all views. |
Venky Harinarayan; Anand Rajaraman; Jeffrey D. Ullman; |
1996 | 4 | The Dangers Of Replication And A Solution IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: A new two-tier replication algorithm is proposed that allows mobile (disconnected) applications to propose tentative update transactions that are later applied to a master copy. |
Jim Gray; Pat Helland; Patrick O’Neil; Dennis Shasha; |
1996 | 5 | Data Mining Techniques IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Techniques for mining knowledge in different kinds of databases, including relational, transaction, object-oriented, spatial, and active databases, as well as global information systems, will be examined. |
Jiawei Han; |
1996 | 6 | A Query Language And Optimization Techniques For Unstructured Data IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe new optimization techniques for the deep or "vertical" dimension of UnQL queries. |
Peter Buneman; Susan Davidson; Gerd Hillebrand; Dan Suciu; |
1996 | 7 | Improved Histograms For Selectivity Estimation Of Range Predicates IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we provide a taxonomy of histograms that captures all previously proposed histogram types and indicates many new possibilities. |
Viswanath Poosala; Peter J. Haas; Yannis E. Ioannidis; Eugene J. Shekita; |
1996 | 8 | Change Detection In Hierarchically Structured Information IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Since in many cases changes must be computed from old and new versions of the data, we define the hierarchical change detection problem as the problem of finding a "minimum-cost edit script" that transforms one data tree to another, and we present efficient algorithms for computing such an edit script. |
Sudarshan S. Chawathe; Anand Rajaraman; Hector Garcia-Molina; Jennifer Widom; |
1996 | 9 | Query Caching And Optimization In Distributed Mediator Systems IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. |
S. Adali; K. S. Candan; Y. Papakonstantinou; V. S. Subrahmanian; |
1996 | 10 | Partition Based Spatial-merge Join IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes PBSM (Partition Based Spatial-Merge), a new algorithm for performing spatial join operation. |
Jignesh M. Patel; David J. DeWitt; |
1996 | 11 | Data Mining Using Two-dimensional Optimized Association Rules: Scheme, Algorithms, And Visualization IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: For each class, we propose efficient algorithms for computing the regions that give optimal association rules for gain, support, and confidence, respectively. |
Takeshi Fukuda; Yasuhiko Morimoto; Shinichi Morishita; Takeshi Tokuyama; |
1996 | 12 | A Framework For Supporting Data Integration Using The Materialized And Virtual Approaches IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a framework for data integration currently under development in the Squirrel project. |
Richard Hull; Gang Zhou; |
1996 | 13 | Materialized View Maintenance And Integrity Constraint Checking: Trading Space For Time IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the problem of incremental maintenance of an SQL view in the face of database updates, and show that it is possible to reduce the total time cost of view maintenance by materializing (and maintaining) additional views. |
Kenneth A. Ross; Divesh Srivastava; S. Sudarshan; |
1996 | 14 | Algorithms For Deferred View Maintenance IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present new algorithms to incrementally refresh a view during deferred maintenance. |
Latha S. Colby; Timothy Griffin; Leonid Libkin; Inderpal Singh Mumick; Howard Trickey; |
1996 | 15 | Spatial Hash-joins IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We examine how to apply the hash-join paradigm to spatial joins, and define a new framework for spatial hash-joins. |
Ming-Ling Lo; Chinya V. Ravishankar; |
1995 | 1 | An Effective Hash-based Algorithm For Mining Association Rules IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. |
Jong Soo Park; Ming-Syan Chen; Philip S. Yu; |
1995 | 2 | Nearest Neighbor Queries IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: A frequently encountered type of query in Geographic Information Systems is to find the k nearest neighbor objects to a given point in space. Processing such queries requires … |
Nick Roussopoulos; Stephen Kelley; Frédéric Vincent; |
1995 | 3 | FastMap: A Fast Algorithm For Indexing, Data-mining And Visualization Of Traditional And Multimedia Datasets IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a fast algorithm to map objects into points in some k-dimensional space (k is user-defined), such that the dissimilarities are preserved. |
Christos Faloutsos; King-Ip Lin; |
1995 | 4 | A Critique Of ANSI SQL Isolation Levels IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper shows that these phenomena and the ANSI SQL definitions fail to properly characterize several popular isolation levels, including the standard locking implementations of the levels covered. |
HAL BERENSON et. al. |
1995 | 5 | The Merge/purge Problem For Large Databases IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we detail the sorted neighborhood method that is used by some to solve merge/purge and present experimental results that demonstrate this approach may work well in practice but at great expense. |
Mauricio A. Hernández; Salvatore J. Stolfo; |
1995 | 6 | Copy Detection Mechanisms For Digital Documents IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe algorithms for such detection, and metrics required for evaluating detection mechanisms (covering accuracy, efficiency, and security). |
Sergey Brin; James Davis; Héctor García-Molina; |
1995 | 7 | Broadcast Disks: Data Management For Asymmetric Communication Environments IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a new technique called "Broadcast Disks" for structuring the broadcast in a way that provides improved performance for non-uniformly accessed data. |
Swarup Acharya; Rafael Alonso; Michael Franklin; Stanley Zdonik; |
1995 | 8 | View Maintenance In A Warehousing Environment IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We introduce a new algorithm, ECA (for "Eager Compensating Algorithm"), that eliminates the anomalies. |
Yue Zhuge; Héctor García-Molina; Joachim Hammer; Jennifer Widom; |
1995 | 9 | Balancing Histogram Optimality And Practicality For Query Result Size Estimation IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present both theoretical and experimental results on several issues related to this trade-off. |
Yannis E. Ioannidis; Viswanath Poosala; |
1995 | 10 | Incremental Maintenance Of Views With Duplicates IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an algorithm that propagates changes from base relations to materialized views. |
Timothy Griffin; Leonid Libkin; |
1995 | 11 | Efficient Optimistic Concurrency Control Using Loosely Synchronized Clocks IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes an efficient optimistic concurrency control scheme for use in distributed database systems in which objects are cached and manipulated at client machines while persistent storage and transactional support are provided by servers. |
Atul Adya; Robert Gruber; Barbara Liskov; Umesh Maheshwari; |
1995 | 12 | Topological Relations In The World Of Minimum Bounding Rectangles: A Study With R-trees IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper is concerned with the retrieval of topological relations in Minimum Bounding Rectangle-based data structures. |
Dimitris Papadias; Timos Sellis; Yannis Theodoridis; Max J. Egenhofer; |
1995 | 13 | Applying Update Streams In A Soft Real-time Database System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we discuss the various properties of updates and views (including staleness) that affect this tradeoff. |
B. Adelberg; H. Garcia-Molina; B. Kao; |
1995 | 14 | Adaptive Parallel Aggregation Algorithms IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this work, we propose new algorithms that dynamically adapt, at query evaluation time, in response to observed grouping selectivities. |
Ambuj Shatdal; Jeffrey F. Naughton; |
1995 | 15 | An Online Video Placement Policy Based On Bandwidth To Space Ratio (BSR) IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a dynamic placement policy (called the Bandwidth to Space Ratio (BSR) Policy) that creates and/or deletes replicas of a video, and mixes hot and cold videos so as to make the best use of bandwidth and space of a storage device. |
Asit Dan; Dinkar Sitaram; |
1994 | 1 | Fast Subsequence Matching In Time-series Databases IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. |
Christos Faloutsos; M. Ranganathan; Yannis Manolopoulos; |
1994 | 2 | Sleepers And Workaholics: Caching Strategies In Mobile Environments IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a taxonomy of different cache invalidation strategies and study the impact of client’s disconnection times on their performance. |
Daniel Barbará; Tomasz Imieliński; |
1994 | 3 | Shoring Up Persistent Applications IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we give the goals and motivation for SHORE, and describe how SHORE provides features of both technologies. |
MICHAEL J. CAREY et. al. |
1994 | 4 | From Structured Documents To Novel Query Facilities IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes a natural mapping from SGML documents into OODB’s and a formal extension of two OODB query languages (one SQL-like and the other calculus) in order to deal with SGML document retrieval. |
V. Christophides; S. Abiteboul; S. Cluet; M. Scholl; |
1994 | 5 | Energy Efficient Indexing On Air IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe two methods, (1,m) Indexing and Distributed Indexing, for organizing and accessing broadcast data. |
Tomasz Imieliński; S. Viswanathan; B. R. Badrinath; |
1994 | 6 | XSB As An Efficient Deductive Database Engine IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the XSB system, and its use as an in-memory deductive database engine. |
Konstantinos Sagonas; Terrance Swift; David S. Warren; |
1994 | 7 | Quickly Generating Billion-record Synthetic Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents several database generation techniques. |
Jim Gray; Prakash Sundaresan; Susanne Englert; Ken Baclawski; Peter J. Weinberger; |
1994 | 8 | Staggered Striping In Multimedia Information Systems IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes staggered striping as a novel technique to provide effective support for multiple users accessing the different objects in the database. |
Steven Berson; Shahram Ghandeharizadeh; Richard Muntz; Xiangyu Ju; |
1994 | 9 | The Effectiveness Of GlOSS For The Text Database Discovery Problem IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The first part of this paper presents a practical solution based on estimating the result size of a query and a database. |
Luis Gravano; Héctor García-Molina; Anthony Tomasic; |
1994 | 10 | Data Replication For Mobile Computers IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present and analyze various static and dynamic data allocation methods. |
Yixiu Huang; Prasad Sistla; Ouri Wolfson; |
1994 | 11 | Optimization Of Dynamic Query Evaluation Plans IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Instead, we propose a novel optimization model that assigns the bulk of the optimization effort to compile-time and delays carefully selected optimization decisions until run-time. |
Richard L. Cole; Goetz Graefe; |
1994 | 12 | Adaptive Selectivity Estimation Using Query Feedback IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we propose a novel approach for estimating the record selectivities of database queries. |
Chungmin Melvin Chen; Nick Roussopoulos; |
1994 | 13 | Incremental Updates Of Inverted Lists For Text Document Retrieval IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, the problem of incremental updates of inverted lists is addressed using a new dual-structure index. |
Anthony Tomasic; Héctor García-Molina; Kurt Shoens; |
1994 | 14 | Combinatorial Pattern Discovery For Scientific Data: Some Preliminary Results IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents an example of combinatorial pattern discovery: the discovery of patterns in protein databases. |
JASON TSONG-LI WANG et. al. |
1994 | 15 | Spatial Joins Using Seeded Trees IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we explore a spatial join method that dynamically constructs index trees called seeded trees at join time. |
Ming-Ling Lo; Chinya V. Ravishankar; |
1993 | 1 | Mining Association Rules Between Sets Of Items In Large Databases IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an efficient algorithm that generates all significant association rules between items in the database. |
Rakesh Agrawal; Tomasz Imieliński; Arun Swami; |
1993 | 2 | The LRU-K Page Replacement Algorithm For Database Disk Buffering IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces a new approach to database disk buffering, called the LRU-K method. |
Elizabeth J. O’Neil; Patrick E. O’Neil; Gerhard Weikum; |
1993 | 3 | Maintaining Views Incrementally IF:8 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present incremental evaluation algorithms to compute changes to materialized views in relational and deductive database systems, in response to changes (insertions, deletions, and updates) to the relations. |
Ashish Gupta; Inderpal Singh Mumick; V. S. Subrahmanian; |
1993 | 4 | Efficient Processing Of Spatial Joins Using R-trees IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Starting from a straightforward approach, we present several techniques for improving its execution time with respect to both CPU- and I/O-time. |
Thomas Brinkhoff; Hans-Peter Kriegel; Bernhard Seeger; |
1993 | 5 | Predicate Migration: Optimizing Queries With Expensive Predicates IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we develop a theory for moving expensive predicates in a query plan so that the total cost of the plan — including the costs of both joins and restrictions — is minimal. |
Joseph M. Hellerstein; Michael Stonebraker; |
1993 | 6 | Practical Prefetching Via Data Compression IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we analyze the practical aspects of using data compression techniques for prefetching. |
Kenneth M. Curewitz; P. Krishnan; Jeffrey Scott Vitter; |
1993 | 7 | The OO7 Benchmark IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we describe the benchmark and present performance results from its implementation in three OODBMS systems. |
Michael J. Carey; David J. DeWitt; Jeffrey F. Naughton; |
1993 | 8 | Intelligent Integration Of Information IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes and classifies methods to transform data to information in a three-layer, mediated architecture. |
Gio Wiederhold; |
1993 | 9 | The SEQUOIA 2000 Storage Benchmark IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a benchmark that concisely captures the data base requirements of a collection of Earth Scientists working in the SEQUOIA 2000 project on various aspects of global change research. |
Michael Stonebraker; Jim Frew; Kenn Gardels; Jeff Meredith; |
1993 | 10 | Database System Issues In Nomadic Computing IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper discusses in some detail the impact of nomadic computing on a number of traditional database system concepts. |
Rafael Alonso; Henry F. Korth; |
1993 | 11 | LH*: Linear Hashing For Distributed Files IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: LH* generalizes Linear Hashing to parallel or distributed RAM and disk files. An LH* file can be created from objects provided by any number of distributed and autonomous clients. … |
Witold Litwin; Marie-Anne Neimat; Donovan A. Schneider; |
1993 | 12 | A Modeling Study Of The TPC-C Benchmark IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we present results from a modelling study of the TPC-C benchmark for both single node and distributed database management systems. |
Scott T. Leutenegger; Daniel Dias; |
1993 | 13 | Hy+: A Hygraph-based Query And Visualization System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Mariano Consens; Alberto Mendelzon; |
1993 | 14 | Experiences Building The Open OODB Query Optimizer IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper reports our experiences building the query optimizer for TI’s Open OODB system. |
José A. Blakeley; William J. McKenna; Goetz Graefe; |
1993 | 15 | Loading Data Into Description Reasoners IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present the architecture and algorithms of a system that converts most of the inferences made by the KBMS into a collection of SQL queries, thereby relying on the optimization facilities of existing DBMS to gain efficiency, while maintaining an object-centered view of the world with a substantive semantics and significantly different reasoning facilities than those provided by Relational DBMS and their deductive extensions. |
Alex Borgida; Ronald J. Brachman; |
1992 | 1 | Continuous Queries Over Append-only Databases IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the techniques used in Tapestry, which do not depend on triggers and thus can be implemented on any commercial database that supports SQL. |
Douglas Terry; David Goldberg; David Nichols; Brian Oki; |
1992 | 2 | Querying Object-oriented Databases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Michael Kifer; Won Kim; Yehoshua Sagiv; |
1992 | 3 | Extensible/rule Based Query Rewrite Optimization In Starburst IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes the Query Rewrite facility of the Starburst extensible database system, a novel phase of query optimization. |
Hamid Pirahesh; Joseph M. Hellerstein; Waqar Hasan; |
1992 | 4 | Event Specification In An Active Object-oriented Database IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a model and a language for specifying basic and composite trigger events in the context of an object-oriented database. |
N. H. Gehani; H. V. Jagadish; O. Shmueli; |
1992 | 5 | ARIES/IM: An Efficient And High Concurrency Index Management Method Using Write-ahead Logging IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a method, called ARIES/IM (Algorithm for Recovery and Isolation Exploiting Semantics for Index Management), for concurrency control and recovery of B+-trees. |
C. Mohan; Frank Levine; |
1992 | 6 | Behavior Of Database Production Rules: Termination, Confluence, And Observable Determinism IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The analysis methods are presented in the context of the Starburst Rule System; they will form the basis of an interactive development environment for Starburst rule programmers. |
Alexander Aiken; Jennifer Widom; Joseph M. Hellerstein; |
1992 | 7 | Rule Condition Testing And Action Execution In Ariel IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper describes testing of rule conditions and execution of rule actions in Ariel active DBMS. |
Eric N. Hanson; |
1992 | 8 | Sequential Sampling Procedures For Query Size Estimation IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We provide a procedure, based on random sampling, for estimation of the size of a query result. |
Peter J. Haas; Arun N. Swami; |
1992 | 9 | Parallel R-trees IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our goal is to design a server for spatial data so as to maximize the throughput of range queries. |
Ibrahim Kamel; Christos Faloutsos; |
1992 | 10 | A General Framework For The Optimization Of Object-oriented Queries IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The goal of this work is to integrate in a general framework the different query optimization techniques that have been proposed in the object-oriented context. |
Sophie Cluet; Claude Delobel; |
1992 | 11 | Efficient And Flexible Methods For Transient Versioning Of Records To Avoid Locking By Read-only Transactions IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present efficient and flexible methods which permit read-only transactions that do not mind reading a possibly slightly old, but still consistent, version of the data base to execute without acquiring locks. |
C. Mohan; Hamid Pirahesh; Raymond Lorie; |
1992 | 12 | Query Optimization For Parallel Execution IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We address this novel problem in the context of Select-Project-Join queries by extending the execution space, cost model and search algorithm that are widely used in commercial DBMSs. |
Sumit Ganguly; Waqar Hasan; Ravi Krishnamurthy; |
1992 | 13 | On The Performance Of Object Clustering Techniques IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We investigate the performance of some of the best-known object clustering algorithms on four different workloads based upon the Tektronix benchmark. |
Manolis M. Tsangaris; Jeffrey F. Naughton; |
1992 | 14 | The Performance Of Three Database Storage Structures For Managing Large Objects IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This study analyzes the performance of the storage structures and algorithms employed in three experimental database storage systems – EXODUS, Starburst, and EOS – for managing large unstructured general-purpose objects. |
Alexandros Biliris; |
1992 | 15 | DOODLE: A Visual Language For Object-oriented Databases IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we introduce DOODLE, a new visual and declarative language for object-oriented databases. |
Isabel F. Cruz; |
1991 | 1 | Objects And Views IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Serge Abiteboul; Anthony Bonner; |
1991 | 2 | A Retrieval Technique For Similar Shapes IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
H. V. Jagadish; |
1991 | 3 | On The Propagation Of Errors In The Size Of Join Results IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Yannis E. Ioannidis; Stavros Christodoulakis; |
1991 | 4 | Aspects: Extending Objects To Support Multiple, Independent Roles IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Joel Richardson; Peter Schwarz; |
1991 | 5 | Replica Control In Distributed Systems: An Asynchronous Approach IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Calton Pu; Avraham Leff; |
1991 | 6 | Data Caching Tradeoffs In Client-server DBMS Architectures IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Michael J. Carey; Michael J. Franklin; Miron Livny; Eugene J. Shekita; |
1991 | 7 | Language Features For Interoperability Of Databases With Schematic Discrepancies IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Ravi Krishnamurthy; Witold Litwin; William Kent; |
1991 | 8 | Toward A Multilevel Secure Relational Data Model IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Sushil Jajodia; Ravi Sandhu; |
1991 | 9 | Left-deep Vs. Bushy Trees: An Analysis Of Strategy Spaces And Its Implications For Query Optimization IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Yannis E. Ioannidis; Younkyung Cha Kang; |
1991 | 10 | Cache Consistency And Concurrency Control In A Client/server DBMS Architecture IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Yongdong Wang; Lawrence A. Rowe; |
1991 | 11 | Segment Indexes: Dynamic Indexing Techniques For Multi-dimensional Interval Data IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Curtis P. Kolovson; Michael Stonebraker; |
1991 | 12 | Updating Relational Databases Through Object-based Views IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Thierry Barsalou; Niki Siambela; Arthur M. Keller; Gio Wiederhold; |
1991 | 13 | Algebraic Support For Complex Objects With Arrays, Identity, And Inheritance IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Scott L. Vandenberg; David J. DeWitt; |
1991 | 14 | An Optimistic Commit Protocol For Distributed Transaction Management IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Eliezer Levy; Henry F. Korth; Abraham Silberschatz; |
1991 | 15 | Dynamic File Allocation In Disk Arrays IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
Gerhard Weikum; Peter Zabback; Peter Scheuermann; |
1990 | 1 | The R*-tree: An Efficient And Robust Access Method For Points And Rectangles IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: The R-tree, one of the most popular access methods for rectangles, is based on the heuristic optimization of the area of the enclosing rectangle in each inner node. By running … |
Norbert Beckmann; Hans-Peter Kriegel; Ralf Schneider; Bernhard Seeger; |
1990 | 2 | Encapsulation Of Parallelism In The Volcano Query Processing System IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe the reasons for not choosing the bracket model, introduce the novel operator model, and provide details of Volcano’s exchange operator that parallelizes all other operators. |
Goetz Graefe; |
1990 | 3 | Organizing Long-running Activities With Triggers And Transactions IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe a model based on event-condition-action rules and coupling modes. |
Umeshwar Dayal; Meichun Hsu; Rivka Ladin; |
1990 | 4 | Randomized Algorithms For Optimizing Large Join Queries IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We have adapted these algorithms to the optimization of project-select-join queries. |
Y. E. Ioannidis; Younkyung Kang; |
1990 | 5 | Linear Clustering Of Objects With Multiple Attributes IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we discuss what the desired properties of such a mapping are, and evaluate, through analysis and simulation, several mappings that have been proposed in the past. |
H. V. Jagadish; |
1990 | 6 | Practical Selectivity Estimation Through Adaptive Sampling IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Recently we have proposed an adaptive, random sampling algorithm for general query size estimation. |
Richard J. Lipton; Jeffrey F. Naughton; Donovan A. Schneider; |
1990 | 7 | Set-oriented Production Rules In Relational Database Systems IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose incorporating a production rules facility into a relational database system. |
Jennifer Widom; S. J. Finkelstein; |
1990 | 8 | ACTA: A Framework For Specifying And Reasoning About Transaction Structure And Behavior IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: The ACTA framework is not yet another transaction model, but is intended to unify the existing models. |
Panayiotis K. Chrysanthis; Krithi Ramamritham; |
1990 | 9 | On Rules, Procedure, Caching And Views In Data Base Systems IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper demonstrates that a simple rule system can be constructed that supports a more powerful view system than available in current commercial systems. |
Michael Stonebraker; Anant Jhingran; Jeffrey Goh; Spyros Potamianos; |
1990 | 10 | Access Support In Object Bases IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present several alternative extensions of access support relations for a given path expression, the best of which has to be determined according to the application-specific database usage profile. |
Alfons Kemper; Guido Moerkotte; |
1990 | 11 | Integrating Object-oriented Data Modelling With A Rule-based Programming Paradigm IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: LOGRES is a new project for the development of extended database systems which is based on the integration of the object-oriented data modelling paradigm and of the rule-based … |
F. Cacace; S. Ceri; S. Crespi-Reghizzi; L. Tanca; R. Zicari; |
1990 | 12 | The Performance Of A Multiversion Access Method IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: The Time-Split B-tree is an integrated index structure for a versioned timestamped database. It gradually migrates data from a current database to an historical database, … |
David Lomet; Betty Salzberg; |
1990 | 13 | Reliable Transaction Management In A Multidatabase System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We design a fault tolerant transaction management algorithm and recovery procedures that retain global database consistency. |
Yuri Breitbart; Avi Silberschatz; Glenn R. Thompson; |
1990 | 14 | Implementing Recoverable Requests Using Queues IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We discuss how to implement these protocols using transactions and recoverable queuing systems. |
Philip A. Bernstein; Meichun Hsu; Bruce Mann; |
1990 | 15 | Magic Is Relevant IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We define the magic-sets transformation for traditional relational systems (with duplicates, aggregation and grouping), as well as for relational systems extended with recursion. |
I. S. Mumick; S. J. Finkelstein; Hamid Pirahesh; Raghu Ramakrishnan; |
1989 | 1 | Concurrency Control In Groupware Systems IF:9 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper distinguishes real-time groupware systems from other multi-user systems and discusses their concurrency control requirements. |
C. A. Ellis; S. J. Gibbs; |
1989 | 2 | CLASSIC: A Structural Data Model For Objects IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: CLASSIC is a data model that encourages the description of objects not only in terms of their relations to other known objects, but in terms of a level of intensional structure as … |
Alexander Borgida; Ronald J. Brachman; Deborah L. McGuinness; Lori Alperin Resnick; |
1989 | 3 | The Architecture Of An Active Database Management System IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we propose an architecture for an active DBMS that supports ECA rules. |
Dennis McCarthy; Umeshwar Dayal; |
1989 | 4 | Object Identity As A Query Language Primitive IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our main contribution is the operational part of the data model, the query language IQL, which uses oid’s for three critical purposes: (1) to represent data-structures with sharing and cycles, (2) to manipulate sets and (3) to express any computable database query. |
Serge Abiteboul; Paris C. Kanellakis; |
1989 | 5 | F-logic: A Higher-order Language For Reasoning About Objects, Inheritance, And Scheme IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We propose a database logic which accounts in a clean declarative fashion for most of the “object-oriented” features such as object identity, complex objects, inheritance, methods, etc. |
Michael Kifer; Georg Lausen; |
1989 | 6 | Efficient Management Of Transitive Relationships In Large Data And Knowledge Bases IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present a transitive closure compression technique, based on labeling spanning trees with numeric intervals, and provide both analytical and empirical evidence of its efficacy, including a proof of optimality. |
R. Agrawal; A. Borgida; H. V. Jagadish; |
1989 | 7 | A Performance Evaluation Of Four Parallel Join Algorithms In A Shared-nothing Multiprocessor Environment IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper we analyze and compare four parallel join algorithms. |
Donovan A. Schneider; David J. DeWitt; |
1989 | 8 | Extensible Query Processing In Starburst IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe the design of Starburst’s query language processor and discuss the ways in which the language processor can be extended to achieve Starburst’s goals. |
L. M. Haas; J. C. Freytag; G. M. Lohman; H. Pirahesh; |
1989 | 9 | Composite Objects Revisited IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: An earlier paper [KIM87b] presented a model of composite objects which has been implemented in the ORION object-oriented database system at MCC. |
Won Kim; Elisa Bertino; Jorge F. Garza; |
1989 | 10 | ODE (Object Database And Environment): The Language And The Data Model IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents the linguistic facilities provided in O++ and the data model it supports. |
R. Agrawal; N. H. Gehani; |
1989 | 11 | Access Methods For Multiversion Data IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We present an access method designed to provide a single integrated index structure for a versioned timestamped database with a non-deletion policy. |
David Lomet; Betty Salzberg; |
1989 | 12 | Vertical Partitioning For Database Design: A Graphical Algorithm IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This algorithm starts from the attribute affinity matrix by considering it as a complete graph. |
Shamkant B. Navathe; Mingyoung Ra; |
1989 | 13 | Optimization Of Large Join Queries: Combining Heuristics And Combinatorial Techniques IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Examples of such heuristics are the augmentation and local improvement heuristics described in this paper and a heuristic proposed by Krishnamurthy et al. |
A. Swami; |
1989 | 14 | Dynamic Query Evaluation Plans IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: Our work aims at developing criteria when reoptimization is required, how these criteria can be implemented efficiently, and how reoptimization can be avoided by using a new technique called dynamic query evaluation plans. |
G. Graefe; K. Ward; |
1989 | 15 | Redundancy In Spatial Databases IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: Spatial objects other than points and boxes can be stored in spatial indexes, but the techniques usually require the use of approximations that can be arbitrarily bad. This leads … |
J. A. Orenstein; |
1988 | 1 | A Case For Redundant Arrays Of Inexpensive Disks (RAID) IF:10 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper introduces five levels of RAIDs, giving their relative cost/performance, and compares RAID to an IBM 3380 and a Fujitsu Super Eagle. |
David A. Patterson; Garth Gibson; Randy H. Katz; |
1988 | 2 | Data Placement In Bubba IF:7 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: We describe our heuristic approach to solving the data placement problem in Bubba. |
George Copeland; William Alexander; Ellen Boughter; Tom Keller; |
1988 | 3 | A Data Model And Query Language For EXODUS IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we present the design of the EXTRA data model and the EXCESS query language for the EXODUS extensible database system. |
Michael J. Carey; David J. DeWitt; Scott L. Vandenberg; |
1988 | 4 | O2, An Object-oriented Data Model IF:6 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: This paper presents a formal description of the object-oriented data model of this system. |
C. Lecluse; P. Richard; F. Velez; |
1988 | 5 | Optimization Of Large Join Queries IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we apply these general algorithms to the large join query optimization problem. |
Arun Swami; Anoop Gupta; |
1988 | 6 | Equi-depth Multidimensional Histograms IF:5 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Abstract: No abstract available. … |
M. Muralikrishna; David J. DeWitt; |
1988 | 7 | Transaction Management In An Object-oriented Database System IF:4 Related Papers Related Patents Related Grants Related Orgs Related Experts Details Highlight: In this paper, we describe transaction management in ORION, an object-oriented database system. |