International Journal of Advanced Engineering, Management and Science (IJAEMS) https://dx.doi.org/10.22161/ijaems.612.13
[Vol-6, Issue-12, Dec-2020] ISSN: 2454-1311
An Analysis of Outlier Detection through clustering method T. Chandrakala1, S. Nirmala Sugirtha Rajini2 1Assistant
Professor, Department of Computer Applications, Jawahar Science College, Neyveli, TamilNadu, India Department of Computer Applications, Dr. M.G.R. Educational & Research Institute, Chennai, TamilNadu, India
2Professor,
Received: 08 Nov 2020; Received in revised form: 03 Dec 2020; Accepted: 12 Dec 2020; Available online: 30 Dec 2020 Š2020 The Author(s). Published by Infogain Publication. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/).
Abstract— This research paper deals with an outlier which is known as an unusual behavior of any substance present in the spot. This is a detection process that can be employed for both anomaly detection and abnormal observation. This can be obtained through other members who belong to that data set. The deviation present in the outlier process can be attained by measuring certain terms like range, size, activity, etc. By detecting outlier one can easily reject the negativity present in the field. For instance, in healthcare, the health condition of a person can be determined through his latest health report or his regular activity. When found the person being inactive there may be a chance for that person to be sick. Many approaches have been used in this research paper for detecting outliers. The approaches used in this research are 1) Centroid based approach based on K-Means and Hierarchical Clustering algorithm and 2) through Clustering based approach. This approach may help in detecting outlier by grouping all similar elements in the same group. For grouping, the elements clustering method paves a way for it. This research paper will be based on the above mentioned 2 approaches. Keywords— detection of an outlier, data set, clustering approach, abnormality.
I.
INTRODUCTION
Mining, in general, is termed as the intrinsic methodology of discovering interesting, formerly unknown data patterns. Outlier detection has important applications in the field of data mining, such as fraud detection, customer behavior analysis, and intrusion detection. A number of approaches are used in the process of detecting the outlier (Bezerra et al., 2016). Clustering can be termed as a set-grouping task where similar objects are being grouped together. Clustering, a primitive anthropological method is a vital method in exploratory data mining for statistical data analysis, machine learning, and image analysis and in many other predominant branches of supervised and unsupervised learning. Outlier detection is related to unwanted noise in the data. As far as the analysts are concerned, noise in data is not important but acts as a hindrance to data analysis. Noise removal is the process of removing unwanted objects before
www.ijaems.com
any data analysis is performed on the data(Bhattacharya et al., 2015). A large number of domains apply Outlier detection directly. This results in the development of innumerable outlier detection techniques. A lot of these techniques have been developed to solve focused problems pertaining to a particular application domain, while others have been developed in a more generic fashion(Pimentel et al., 2014).
II. DATA IN DATA MINING Generally, we are drowned in information, but starving for knowledge. Data can be collected from multiple sources. The purposes can be categorized as Business, Science, and Society. Business purposes can use the data for Web, ECom, Transaction, and Stock Marketing. Scientific data can be used for Remote sensing, Bio-informatics, and Scientific
Page | 571