Data Mining Using Two-dimensional Optimized Association Rules: Scheme, Algorithms, And Visualization
Abstract
We discuss data mining based on association rules for two numeric attributes and one Boolean attribute. For example, in a database of bank customers, "Age" and "Balance" are two numeric attributes, and "CardLoan" is a Boolean attribute. Taking the pair (Age, Balance) as a point in two-dimensional space, we consider an association rule of the form ((<i>Age, Balance</i>) ∈ <i>P</i>) ⇒ (<i>CardLoan</i> = <i>Yes</i>), which implies that bank customers whose ages and balances fall in a planar region <i>P</i> tend to use card loans with high probability. We consider two classes of regions: rectangles and <i>admissible</i> (i.e., connected and <i>x</i>-monotone) regions. For each class, we propose efficient algorithms for computing the regions that give optimal association rules with respect to <i>gain, support,</i> and <i>confidence,</i> respectively. We have implemented the algorithms for admissible regions and constructed a system for visualizing the rules.
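As an illustration of the rule form above, the following minimal sketch evaluates one fixed rectangular region P against a small set of customers. It is not one of the proposed optimization algorithms; it only scores a given region. The abstract does not define the three measures, so the sketch assumes the usual definitions: support is the fraction of all customers falling in P, confidence is the fraction of those customers with CardLoan = Yes, and gain is the number of such hits minus a confidence threshold theta times the number of customers in P. The names `Customer`, `rule_stats`, and `theta`, as well as the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical record: two numeric attributes and one Boolean attribute."""
    age: float
    balance: float
    card_loan: bool


def rule_stats(customers, age_range, balance_range, theta):
    """Score the rule ((Age, Balance) in P) => (CardLoan = Yes) for a rectangle P.

    Assumed definitions (not taken from the abstract):
      support    = |customers in P| / |all customers|
      confidence = |customers in P with CardLoan = Yes| / |customers in P|
      gain       = hits - theta * |customers in P|
    """
    lo_age, hi_age = age_range
    lo_bal, hi_bal = balance_range
    inside = [
        c for c in customers
        if lo_age <= c.age <= hi_age and lo_bal <= c.balance <= hi_bal
    ]
    hits = sum(c.card_loan for c in inside)
    support = len(inside) / len(customers) if customers else 0.0
    confidence = hits / len(inside) if inside else 0.0
    gain = hits - theta * len(inside)
    return support, confidence, gain


# Made-up sample data, for illustration only.
customers = [
    Customer(25, 1000, False),
    Customer(35, 5000, True),
    Customer(40, 7000, True),
    Customer(45, 6000, False),
    Customer(60, 20000, False),
]

# Rectangle P: 30 <= Age <= 50 and 4000 <= Balance <= 8000.
support, confidence, gain = rule_stats(
    customers, age_range=(30, 50), balance_range=(4000, 8000), theta=0.5
)
print(support, confidence, gain)
```

In this sample, three of the five customers fall in P and two of them use card loans, so the rule has support 0.6, confidence 2/3, and gain 0.5 at theta = 0.5. An optimization algorithm would search over regions in the chosen class rather than score a single one.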