Subspace Clustering with Self-Organizing Maps
Subspace clustering is the task of identifying clusters that exist in subspaces of the input dimensions of a given dataset. Noisy values in some attributes cause difficulties for traditional clustering algorithms, because the large discrepancies in those attributes can make objects appear too different to be grouped in the same cluster. Methods designed specifically for subspace clustering are therefore required.
Figure panels: Example of a 3D dataset | 2D projection
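The effect of a noisy attribute on distances can be illustrated with a minimal sketch. The two objects, their values and the helper `euclidean` are hypothetical and chosen only for illustration; they are not taken from the benchmark datasets.

```python
import math

def euclidean(a, b, dims=None):
    """Euclidean distance restricted to the given dimension indices (all dimensions if None)."""
    if dims is None:
        dims = range(len(a))
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in dims))

# Two hypothetical objects: close in dimensions 0 and 1,
# far apart in dimension 2, which is assumed to carry only noise.
p = [0.50, 0.50, 0.05]
q = [0.52, 0.49, 0.95]

full_space = euclidean(p, q)             # dominated by the noisy dimension
subspace = euclidean(p, q, dims=[0, 1])  # only the assumed relevant dimensions

print(f"full space: {full_space:.3f}, subspace: {subspace:.3f}")
```

In the full space the two objects are about 0.90 apart, while in the subspace spanned by dimensions 0 and 1 they are only about 0.02 apart, so a method that ignores the noisy dimension would group them together.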
Simulated and real-world benchmark datasets, varying in the number of dimensions, the number of samples and the noise level, are available in the OpenSubspace Framework from the Data Management and Data Exploration Group of RWTH Aachen University. An additional benchmark dataset varying the number of irrelevant dimensions is available here.
Local Adaptive Receptive Field Dimension Selective Self-Organizing Maps - LARFDSSOM
LARFDSSOM is a Self-Organizing Map (SOM) with a time-varying topology that improves on DSSOM (see below) in clustering quality, computational cost and parameterization. The time-varying topology enables the method to identify the correct number of clusters and their respective relevant dimensions. It presents nearly perfect results on synthetic datasets and surpasses our previous method on most of the real-world datasets considered.
LARFDSSOM Resources
| linux-32bits | linux-64bits | Mac OS X (soon) | Source code |
Related Publications
>> Bassani, H.F.; Araujo, A.F.R., "Dimension Selective Self-Organizing Maps With Time-Varying Structure for Subspace and Projected Clustering," IEEE Transactions on Neural Networks and Learning Systems, vol. PP, no. 99, pp. 1,1 (link).
Dimension Selective Self-Organizing Maps - DSSOM
DSSOM is a fixed-structure Self-Organizing Map (SOM) that finds clusters and identifies their relevant dimensions simultaneously during the self-organization process. The method employs a locally weighted distance metric to adjust the relevances of the input dimensions. It also includes a simple outlier detection mechanism that improves results on noisy datasets and allows input patterns to be clustered by more than one node, which characterizes subspace clustering.
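The idea of a locally weighted distance can be sketched as follows. This is a generic relevance-weighted Euclidean distance in which each node carries its own relevance vector; the exact DSSOM formulation is given in the publications listed below, and the node values, the dictionary layout and the function names here are hypothetical.

```python
import math

def weighted_distance(x, center, relevance):
    """Euclidean distance in which each dimension is scaled by a relevance in [0, 1]."""
    return math.sqrt(sum(r * (xi - ci) ** 2
                         for xi, ci, r in zip(x, center, relevance)))

# Hypothetical nodes: each has its own center and its own relevance vector,
# so the distance is weighted locally (per node) rather than globally.
nodes = [
    {"center": [0.5, 0.5, 0.5], "relevance": [1.0, 1.0, 0.0]},  # ignores dimension 2
    {"center": [0.1, 0.9, 0.9], "relevance": [1.0, 1.0, 1.0]},  # uses all dimensions
]

def best_matching_node(x, nodes):
    """Index of the node with the smallest relevance-weighted distance to x."""
    return min(range(len(nodes)),
               key=lambda j: weighted_distance(x, nodes[j]["center"],
                                               nodes[j]["relevance"]))

x = [0.52, 0.49, 0.95]  # hypothetical input pattern
winner = best_matching_node(x, nodes)
print("winner:", winner)
```

Because the first node assigns zero relevance to dimension 2, the large difference in that dimension does not count against it, and it wins the competition for this input even though its center is far from the input in that dimension.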
DSSOM Resources
| linux-32bits | linux-64bits | Mac OS X (soon) | Source code |
Related Publications
>> Araujo, F.R.B.; Bassani, H.F.; Araujo, A.F.R., "Learning vector quantization with local adaptive weighting for relevance determination in Genome-Wide association studies," in The 2013 International Joint Conference on Neural Networks (IJCNN 2013 Dallas), Dallas, 2013, pp. 1-8 (link).
>> Bassani, H.F.; Araujo, A.F.R., "Dimension Selective Self-Organizing Maps for clustering high dimensional data," in The 2012 International Joint Conference on Neural Networks (IJCNN 2012 Brisbane), Brisbane, 2012, pp. 1-8 (link).