PAPER DIGEST
Most Influential CIKM 2011 Paper · 2026-03 edition

LogSig: Generating System Events From Raw Textual Logs

Liang Tang; Tao Li; Chang-Shing Perng

Venue: ACM Conference on Information and Knowledge Management (CIKM) 2011
Recognition: Most Influential CIKM 2011 Paper (Rank No. 3)
Edition: 2026-03
Impact factor: 5
Certificate ID: b66156e1f1e514f8

Abstract

Modern computing systems generate large amounts of log data. System administrators or domain experts use this log data to understand and optimize system behavior. Most system logs are raw, unstructured text. A fundamental challenge in automated log analysis is generating system events from raw textual logs. Log messages are relatively short but may have a large vocabulary, which often results in poor performance when traditional text clustering techniques are applied to log data. Other related methods have various limitations and work well only for particular system logs. In this paper, we propose logSig, a message-signature-based algorithm that generates system events from textual log messages. By searching for the most representative message signatures, logSig categorizes log messages into a set of event types. logSig can handle various types of log data and can incorporate human domain knowledge to achieve high performance. We conduct experiments on five real system log datasets. The experiments show that logSig outperforms alternative algorithms in overall performance.
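
To make the idea of categorizing messages by signature concrete, here is a minimal Python sketch. It is not the logSig algorithm from the paper: logSig searches for the signatures, whereas this sketch assumes they are supplied by hand. The function names (`lcs_length`, `assign_to_signatures`), the sample messages, and the use of longest-common-subsequence length as the matching score are all assumptions made for illustration.

```python
from collections import defaultdict


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def assign_to_signatures(messages, signatures):
    """Group each message under the signature it shares the most tokens with.

    Hypothetical helper: signatures are given, not searched for.
    Ties go to the signature listed first.
    """
    groups = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        best = max(signatures, key=lambda sig: lcs_length(tokens, sig.split()))
        groups[best].append(msg)
    return dict(groups)


# Made-up log lines and hand-written signatures, for illustration only.
messages = [
    "Connection from 10.0.0.1 closed",
    "User alice logged in",
    "Connection from 10.0.0.2 closed",
    "User bob logged in",
]
signatures = ["Connection from closed", "User logged in"]

groups = assign_to_signatures(messages, signatures)
for signature, members in groups.items():
    print(signature, "->", members)
```

Each signature then plays the role of an event type: variable fields such as IP addresses or user names fall outside the signature, so messages that differ only in those fields end up in the same group.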
