A Data Mining Framework for Semantically Distributed Databases

M. Kantardzic and A. Badia (USA)


Data Mining, Distributed Artificial Intelligence, Association Rules, Distributed Systems


We propose a new approach to mining semantically distributed data, i.e., distributed data sets in which the distribution carries semantic significance. We argue that past approaches to distributed mining have focused on performance, disregarding the characteristics of the data distribution. Our approach proceeds in several steps. In the first step, we mine in a distributed fashion and build a local model (for each distributed site) and a global model (for the whole set), as is traditionally done. We then present novel methods to compare the global model to each local model and determine the degree of similarity among them. Based on this analysis, local models can be clustered and outliers detected. Further mining of the resulting clusters then reveals additional information that traditional approaches may have overlooked. We apply our approach to the case of association rule mining.
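
The steps outlined above can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the comparison measure, so the sketch assumes Jaccard similarity between sets of frequent itemsets (used here as a simple stand-in for association rule models) and a fixed threshold for flagging outlier sites. All function names, site names, and parameter values are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.5, max_size=2):
    """Return the itemsets (as frozensets) whose support is >= min_support.

    Brute-force enumeration up to max_size items; adequate for a toy example.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = set()
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t) / n
            if support >= min_support:
                result.add(itemset)
    return result


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the paper)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_sites(sites, min_support=0.5, outlier_threshold=0.5):
    """Build local and global models, then score each site against the global one."""
    # Step 1: one local model per site, one global model over all transactions.
    local_models = {
        name: frequent_itemsets(tx, min_support) for name, tx in sites.items()
    }
    all_transactions = [t for tx in sites.values() for t in tx]
    global_model = frequent_itemsets(all_transactions, min_support)

    # Step 2: similarity of each local model to the global model.
    similarity = {
        name: jaccard(model, global_model) for name, model in local_models.items()
    }

    # Step 3: sites whose model differs strongly from the global one.
    outliers = sorted(
        name for name, score in similarity.items() if score < outlier_threshold
    )
    return similarity, outliers


# Hypothetical data: two sites with similar buying patterns, one that differs.
sites = {
    "site_a": [frozenset(t) for t in (
        ["bread", "milk"], ["bread", "milk"], ["bread", "milk", "eggs"], ["milk"])],
    "site_b": [frozenset(t) for t in (
        ["bread", "milk"], ["bread", "milk"], ["bread"], ["bread", "milk"])],
    "site_c": [frozenset(t) for t in (
        ["beer", "chips"], ["beer", "chips"], ["beer"], ["beer", "chips"])],
}

similarity, outliers = compare_sites(sites)
print(similarity)
print(outliers)
```

In this toy data the global model is dominated by the two larger, similar sites, so the patterns of the third site do not appear in it at all. This is the kind of information a purely global analysis would overlook, and the flagged site (or a cluster of such sites) could then be mined separately.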
