PAPER DIGEST
Most Influential CIKM 1999 Paper · 2026-03 edition

Extracting Semi-structured Data Through Examples

Berthier Ribeiro-Neto; Alberto H. F. Laender; Altigran S. da Silva

Venue: ACM Conference on Information and Knowledge Management (CIKM) 1999
Recognition: Most Influential CIKM 1999 Paper (Rank No. 11)
Edition: 2026-03
Impact factor: 4
Certificate ID: f3522d78a71f48a3

Abstract

In this paper, we describe an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use this information to extract new objects from new pages or texts. To perform the extraction of new objects, we introduce a bottom-up extraction strategy and, through experimentation, demonstrate that it works quite effectively with distinct Web sources, even if only a few examples are provided by the user.
