PAPER DIGEST
Most Influential CIKM 1999 Paper · 2026-03 edition
Extracting Semi-structured Data Through Examples
Abstract
<i>In this paper, we describe an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use this information to extract new objects from new pages or texts. To perform the extraction of new objects, we introduce a bottom-up extraction strategy and, through experimentation, demonstrate that it works quite effectively with distinct Web sources, even if only a few examples are provided by the user.</i>