PAPER DIGEST
Most Influential CIKM 2003 Paper · 2026-03 edition

Categorizing Web Queries According To Geographical Locality

Luis Gravano; Vasileios Hatzivassiloglou; Richard Lichtenstein

Venue: ACM Conference on Information and Knowledge Management (CIKM) 2003
Recognition: Most Influential CIKM 2003 Paper (Rank No. 7)
Edition: 2026-03
Impact factor: 5
Certificate ID: 568fe8a68426319e

Abstract

Web pages (and resources, in general) can be characterized according to their geographical locality. For example, a web page with general information about wildflowers could be considered a global page, likely to be of interest to a geographically broad audience. In contrast, a web page with listings on houses for sale in a specific city could be regarded as a local page, likely to be of interest only to an audience in a relatively narrow region. Similarly, some search engine queries (implicitly) target global pages, while other queries are after local pages. For example, the best results for query [wildflowers] are probably global pages about wildflowers such as the one discussed above. However, local pages that are relevant to, say, San Francisco are likely to be good matches for a query [houses for sale] that was issued by a San Francisco resident or by somebody moving to that city. Unfortunately, search engines do not analyze the geographical locality of queries and users, and hence often produce sub-optimal results. Thus query [wildflowers] might return pages that discuss wildflowers in specific U.S. states (and not general information about wildflowers), while query [houses for sale] might return pages with real estate listings for locations other than that of interest to the person who issued the query. Deciding whether an unseen query should produce mostly local or global pages, without placing this burden on the search engine users, is an important and challenging problem, because queries are often ambiguous or underspecify the information they are after. In this paper, we address this problem by first defining how to categorize queries according to their (often implicit) geographical locality. We then introduce several alternatives for automatically and efficiently categorizing queries in our scheme, using a variety of state-of-the-art machine learning tools. We report a thorough evaluation of our classifiers using a large sample of queries from a real web search engine, and conclude by discussing how our query categorization approach can help improve query result quality.
