International Journal of Scientific & Technology Research



IJSTR >> Volume 8 - Issue 7, July 2019 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

A Review On: Finding Outlier Points On Real Dimensional Data Sets




Bhagyashri Karkhanis, Sanjay Sharma



Intrinsic dimension, k nearest neighbours (k-NN), local outlier factor (LOF), local projection-based outlier detection (LPOD), local projection score (LPS), outlier detection, resolution based outlier factor (ROF).



Outlier detection has been studied broadly in data mining and machine learning, and research on it continues to grow. However, with the appearance of very high-dimensional data sets in real-life applications, outlier detection now faces a series of new challenges. Detecting outliers means identifying the objects that deviate considerably from the common distribution of the real data. Such objects may be regarded as suspicious because they appear to have been generated by a different mechanism. Various algorithms already work well at finding outlier points in such environments. Developing up-to-date outlier detection methods with machine learning has consequently become an urgent task.
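As a rough illustration of what "deviating from the common distribution" can mean in practice, the sketch below scores each point by its distance to its k-th nearest neighbour, one simple way to use the k-NN idea named in the keywords. It is not the method of any particular paper reviewed here; the function name `knn_outlier_scores`, the toy data, and the choice `k=2` are assumptions made only for this example.

```python
import math


def knn_outlier_scores(points, k=2):
    """Return, for each point, the distance to its k-th nearest neighbour.

    A larger score means the point lies farther from its neighbours and is
    therefore a stronger outlier candidate (illustrative scoring only).
    """
    if not 1 <= k <= len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Euclidean distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Toy data (hypothetical): four clustered points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(points, k=2)

# Index of the point with the largest score, i.e. the top outlier candidate.
most_outlying = max(range(len(points)), key=scores.__getitem__)
print(most_outlying, scores)
```

In high-dimensional data, pairwise distances tend to become less discriminative, which is one reason a plain distance score like this one may not be enough and more specialised methods are studied.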


