PAPER DIGEST
Most Influential CIKM 1999 Paper · 2026-03 edition

Extracting Significant Time Varying Features From Text

Russell Swan; James Allan

Venue: ACM Conference on Information and Knowledge Management (CIKM) 1999
Recognition: Most Influential CIKM 1999 Paper (Rank No. 10)
Edition: 2026-03
Impact factor: 4
Certificate ID: 58672b90346e0569

Abstract

We propose a simple statistical model for the frequency of occurrence of features in a stream of text. Adoption of this model allows us to use classical significance tests to filter the stream for interesting events. We tested the model by building a system and running it on a news corpus. By a subjective evaluation, the system worked remarkably well: almost all of the groups of identified tokens corresponded to news stories and were appropriately placed in time. A preliminary objective evaluation was also used to measure the quality of the system and it showed some of the weaknesses and the power of our approach.
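
To make the idea of filtering a text stream with a classical significance test concrete, here is a minimal sketch. The abstract does not say which test or which features the system uses, so everything below is an assumption made for illustration: a Pearson chi-square test on a 2x2 contingency table (documents with and without the feature, on a given day versus all other days), a critical value of 10.828 (1 degree of freedom, p = 0.001), and the hypothetical names `chi_square_2x2` and `significant_days`. It should not be read as the authors' implementation.

```python
# Illustrative sketch only: the test, threshold, and function names are
# assumptions, not details taken from the abstract.

CRITICAL_1DF_P001 = 10.828  # chi-square critical value, 1 d.o.f., p = 0.001


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero row or column total carries no evidence either way.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def significant_days(docs_with_feature, docs_total, threshold=CRITICAL_1DF_P001):
    """Return indices of days on which the feature is significantly over-represented.

    docs_with_feature[i] is the number of documents on day i containing the
    feature; docs_total[i] is the number of documents published on day i.
    """
    total_with = sum(docs_with_feature)
    total_docs = sum(docs_total)
    flagged = []
    for day, (k, n) in enumerate(zip(docs_with_feature, docs_total)):
        a = k                        # this day, feature present
        b = n - k                    # this day, feature absent
        c = total_with - k           # other days, feature present
        d = (total_docs - n) - c     # other days, feature absent
        stat = chi_square_2x2(a, b, c, d)
        # Keep only bursts: the rate on this day must exceed the rate elsewhere.
        if stat > threshold and a * d > b * c:
            flagged.append(day)
    return flagged


# Toy data: a feature that is rare except for a burst on day 3.
counts = [1, 2, 1, 30, 2]
totals = [100, 100, 100, 100, 100]
print(significant_days(counts, totals))  # prints [3]
```

Consecutive flagged days for the same feature could then be merged into a single interval, which is one way to obtain features that are "appropriately placed in time"; that grouping step is likewise an assumption here.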