PAPER DIGEST
Most Influential SIGMOD 2004 Paper · 2026-03 edition

iMAP: Discovering Complex Semantic Matches Between Database Schemas

Robin Dhamankar; Yoonkyong Lee; AnHai Doan; Alon Halevy; Pedro Domingos

Venue
ACM SIGMOD Conference (SIGMOD) 2004
Recognition
Most Influential SIGMOD 2004 Paper (Rank No. 6)
Edition
2026-03
Impact factor
6
Certificate ID
61b16a22510b7a07

Abstract

Creating semantic matches between disparate data sources is fundamental to numerous data sharing efforts. Manually creating matches is extremely tedious and error-prone. Hence many recent works have focused on automating the matching process. To date, however, virtually all of these works deal only with one-to-one (1-1) matches, such as <b>address = location</b>. They do not consider the important class of more complex matches, such as <b>address</b> = concat(<b>city, state</b>) and <b>room-price</b> = <b>room-rate * (1 + tax-rate)</b>. We describe the <b>iMAP</b> system, which semi-automatically discovers both 1-1 and complex matches. <b>iMAP</b> reformulates schema matching as a <i>search</i> in an often very large or infinite match space. To search effectively, it employs a set of searchers, each discovering specific types of complex matches. To further improve matching accuracy, <b>iMAP</b> exploits a variety of domain knowledge, including past complex matches, domain integrity constraints, and overlap data. Finally, <b>iMAP</b> introduces a novel feature that generates explanations of predicted matches, to provide insight into the matching process and suggest actions to converge on correct matches quickly. We apply <b>iMAP</b> to several real-world domains to match relational tables, and show that it discovers both 1-1 and complex matches with high accuracy.
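
To make the "matching as search" idea in the abstract concrete, here is a toy Python sketch. It is not the iMAP implementation: the function names, the separator set, and the exact-match score are all assumptions for illustration, and it assumes the source and target rows are aligned (a simple form of overlap data). It enumerates 1-1 candidates and two-column concat candidates, scores each against the target column, and ranks them.

```python
from itertools import permutations


def score(candidate_values, target_values):
    """Fraction of rows where the candidate reproduces the target value exactly."""
    if not target_values:
        return 0.0
    hits = sum(c == t for c, t in zip(candidate_values, target_values))
    return hits / len(target_values)


def search_concat_matches(source, target_values, separators=(" ", ", ")):
    """Enumerate 1-1 and two-column concat candidates, returned best-first.

    Hypothetical helper: a stand-in for one text-oriented searcher,
    not the search strategy used by the paper.
    """
    results = []
    # 1-1 candidates: a single source column equals the target.
    for name, values in source.items():
        results.append((score(values, target_values), name))
    # Complex candidates: concat(a, sep, b) over ordered column pairs.
    for a, b in permutations(source, 2):
        for sep in separators:
            values = [x + sep + y for x, y in zip(source[a], source[b])]
            label = f"concat({a}, {sep!r}, {b})"
            results.append((score(values, target_values), label))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


# Aligned sample rows (assumed overlap data between the two schemas).
source = {
    "city": ["Seattle", "Madison"],
    "state": ["WA", "WI"],
}
address = ["Seattle, WA", "Madison, WI"]

best_score, best_match = search_concat_matches(source, address)[0]
print(best_match, best_score)
```

On this sample the top-ranked candidate is the concatenation of city and state, mirroring the address example in the abstract. A realistic searcher would need to bound the candidate space and tolerate noisy or non-overlapping data, which exhaustive enumeration with exact matching does not.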
