International Journal of Scientific & Technology Research



IJSTR >> Volume 9 - Issue 11, November 2020 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Performance Enhancement Of Customer Segmentation Using A Distributed Python Framework, Ray




Debajit Datta, Rishav Agarwal, Preetha Evangeline David



Accuracy Metrics, Classification, Clustering, CPU, GPU, Parallel Computing, Recommendation, Segmentation, Speedup.



Recommender systems have become widely popular in recent years. They have been deployed across many domains, from video and movie recommendations to product and application recommendations. The algorithms behind recommender systems segment customers based on several attributes. These algorithms are time-consuming and demand comparatively high computational power. This work parallelizes several algorithms for simple customer segmentation in Python using the Ray framework. The dataset consists of a large list of purchases made by 4000 customers over one year. The computation is parallelized across multiple CPU cores and across GPU cores. The work also reports the speedup obtained after parallelization in order to analyze the overall gain in performance.
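The idea of splitting customers into chunks, segmenting each chunk in a separate worker, and then measuring speedup can be sketched as follows. This is a minimal illustration and not the authors' implementation: the purchase records, the spend threshold, and the function names are all hypothetical, a simple threshold rule stands in for the clustering and classification algorithms the paper parallelizes, and the standard-library `ThreadPoolExecutor` stands in for Ray so that the sketch runs without extra dependencies.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical purchase records as (customer_id, amount); not the paper's dataset.
PURCHASES = [
    (1, 20.0), (2, 5.0), (1, 35.0), (3, 120.0),
    (4, 8.0), (2, 12.0), (3, 60.0), (4, 2.0),
]


def total_spend(purchases):
    """Sum the purchase amounts per customer."""
    totals = {}
    for customer_id, amount in purchases:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


def segment_chunk(chunk, threshold=50.0):
    """Label each (customer_id, total) pair as 'high' or 'low' spender.

    The threshold rule is an assumed stand-in for a real segmentation model.
    """
    return {cid: ("high" if total >= threshold else "low") for cid, total in chunk}


def segment_parallel(totals, n_workers=2):
    """Split customers into chunks and segment each chunk in its own worker."""
    items = sorted(totals.items())
    # Round-robin split so every worker gets a similar number of customers.
    chunks = [items[i::n_workers] for i in range(n_workers)]
    segments = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(segment_chunk, chunks):
            segments.update(partial)
    return segments


def speedup(serial_seconds, parallel_seconds):
    """Speedup is the serial run time divided by the parallel run time."""
    return serial_seconds / parallel_seconds


segments = segment_parallel(total_spend(PURCHASES))
```

With Ray, the same pattern is usually written by decorating `segment_chunk` with `@ray.remote`, launching one task per chunk with `segment_chunk.remote(chunk)`, and collecting the results with `ray.get(...)`. Ray tasks run in separate worker processes, so CPU-bound work can actually use several cores, whereas Python threads mainly illustrate the structure. For example, `speedup(10.0, 2.5)` describes a run that took 10 seconds serially and 2.5 seconds in parallel.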


