Comparative Performance Of Using PCA With K-Means And Fuzzy C Means Clustering For Customer Segmentation
AUTHOR(S)
Fahmida Afrin, Md. Al-Amin, Mehnaz Tabassum
KEYWORDS
Index Terms: Data Mining, Clustering, K-means, Principal component analysis, Fuzzy C means, Customer segmentation, Crisp Set
ABSTRACT
Abstract: Data mining is the process of analyzing data to discover useful information; it is sometimes called knowledge discovery. Clustering groups data so that items within one cluster are similar to each other and items in different clusters are dissimilar. Many data mining techniques have been developed for customer segmentation, and many clustering methods have been applied to it. PCA serves as a preprocessing step for Fuzzy C-means and K-means, reducing high-dimensional, noisy data. This paper analyzes the performance of Fuzzy C-means and K-means after applying Principal Component Analysis, using a standard dataset. The results indicate that PCA-based fuzzy clustering produces better results than PCA-based K-means and is a more stable method for customer segmentation.
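The pipeline described in the abstract (PCA as a preprocessor, followed by K-means or Fuzzy C-means) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the synthetic two-group data, the number of retained components, the fuzzifier m = 2, and the iteration limits are all assumptions chosen for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components, computed via SVD."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means: returns hard (crisp) labels and centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)             # assign to nearest centroid
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-means: returns a membership matrix U and centroids."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)           # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Synthetic stand-in for customer data (assumption: two groups, five features).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(6.0, 1.0, (50, 5))])

Z = pca_reduce(X, n_components=2)                # PCA as the preprocessor
labels, km_centers = kmeans(Z, k=2)              # crisp assignment
U, fcm_centers = fuzzy_c_means(Z, c=2)           # graded memberships
fuzzy_labels = U.argmax(axis=1)                  # harden only for comparison
```

K-means places each customer in exactly one segment (a crisp set), whereas Fuzzy C-means gives each customer a degree of membership in every segment. Comparing the two on real data would also need a quality measure, which this sketch leaves out.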