Intrusion Detection Overview
By Jingzhi Yang
An intrusion detection system (IDS) is a type of detection system that plays a key role in protecting cyber security by monitoring the states of software and hardware running in a network. The first intrusion detection system was proposed in 1980. Generally speaking, there are two ways to classify IDSs: by detection method and by data source. Detection-based methods are further sub-classified into misuse-based and anomaly-based detection.
Misuse-based Detection
The idea behind misuse-based detection is to keep known or previously seen attack behaviors as signatures in a database and to flag activity that matches them. These techniques have been shown to produce a low rate of false alarms but are not as effective as anomaly detection in detecting novel attacks. Researchers also realized that hackers can easily find ways to bypass such obstacles. For example, a system might have a handcrafted rule such as: if the login time is outside business hours and there are 5 consecutive failed attempts, then suspend the account for 24 hours. After some experimentation, a hacker will learn this rule and either try to log in at a different time or add a ‘cooling’ period before the next attempt to bypass it. Rule-based systems can be effective, but they require constant manual updates, which is time-consuming and error-prone, and they can be easily defeated by a knowledgeable adversary.
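The handcrafted rule described above can be sketched in a few lines of Python. This is only an illustration: the `LoginAttempt` type, the `suspension_end` function, and the definition of business hours (weekdays, 09:00 to 16:59) are assumptions made for the example, not part of any real product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Thresholds taken from the example rule in the text.
MAX_FAILED_ATTEMPTS = 5
SUSPENSION = timedelta(hours=24)
# Assumed definition of business hours; the text does not specify one.
BUSINESS_HOURS = range(9, 17)


@dataclass
class LoginAttempt:
    user: str
    timestamp: datetime
    success: bool


def is_business_hours(ts: datetime) -> bool:
    """True on weekdays between 09:00 and 16:59 (hypothetical policy)."""
    return ts.weekday() < 5 and ts.hour in BUSINESS_HOURS


def suspension_end(attempts: List[LoginAttempt]) -> Optional[datetime]:
    """Return when the suspension ends if the rule fires, else None.

    `attempts` is one user's attempts in chronological order.
    """
    streak = 0
    for attempt in attempts:
        # A success, or any attempt during business hours, resets the streak.
        if attempt.success or is_business_hours(attempt.timestamp):
            streak = 0
            continue
        streak += 1
        if streak >= MAX_FAILED_ATTEMPTS:
            return attempt.timestamp + SUSPENSION
    return None


# Five failures late on a Monday night trigger the rule...
night = [LoginAttempt("alice", datetime(2023, 1, 2, 23, m), False) for m in range(5)]
# ...but the same five failures at 10:00 slip past it.
day = [LoginAttempt("alice", datetime(2023, 1, 2, 10, m), False) for m in range(5)]
```

Here `suspension_end(night)` returns a suspension end time, while `suspension_end(day)` returns `None`. That is exactly the weakness the text describes: an attacker who learns the rule only has to change when the attempts are made.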
Anomaly-based Detection
Anomaly detection can be defined as a key element of intrusion detection and other detection systems in which perturbations of normal behavior suggest the presence of intentionally or unintentionally induced attacks, faults, defects, etc. The key phrase in this definition is ‘normal behavior’ or, as some prefer, ‘baseline’. The baseline represents how the system normally behaves, and all network activity is then compared to that baseline.
Rather than searching for known indicators of compromise (IOCs), an anomaly-based IDS simply identifies any out-of-the-ordinary behavior and triggers an alert. For example, if a user usually uses macOS and logs in from New Orleans but suddenly logs in from a Windows machine in Russia, it is very unlikely that this login attempt is coming from the user. Because anomaly-based systems can detect misuse based on network and system behavior, the type of misuse does not need to be previously known.
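To make the idea of a baseline concrete, here is a minimal sketch that records which operating system and location each user normally logs in from and scores a new login by how unusual it is. The `LoginBaseline` class, its `min_history` parameter, and the scoring formula are assumptions chosen for illustration; a real system would use far richer features.

```python
from collections import Counter, defaultdict


class LoginBaseline:
    """Per-user frequency baseline over (operating system, location)."""

    def __init__(self, min_history=5):
        # Hypothetical minimum number of logins before we trust the baseline.
        self.min_history = min_history
        self._os = defaultdict(Counter)
        self._location = defaultdict(Counter)

    def observe(self, user, os_name, location):
        """Record one legitimate login to build up the baseline."""
        self._os[user][os_name] += 1
        self._location[user][location] += 1

    def anomaly_score(self, user, os_name, location):
        """Return a score in [0, 1], or None if there is too little history.

        0.0 means both attributes always match the user's history;
        1.0 means neither attribute has ever been seen for this user.
        """
        total = sum(self._os[user].values())
        if total < self.min_history:
            return None
        os_rarity = 1 - self._os[user][os_name] / total
        location_rarity = 1 - self._location[user][location] / total
        return (os_rarity + location_rarity) / 2


baseline = LoginBaseline()
for _ in range(10):
    baseline.observe("bob", "macOS", "New Orleans")

usual = baseline.anomaly_score("bob", "macOS", "New Orleans")   # 0.0
suspicious = baseline.anomaly_score("bob", "Windows", "Russia")  # 1.0
```

Note that nothing in this sketch describes what an attack looks like. It only describes what the user normally does, which is why an anomaly-based approach can flag misuse that was never seen before.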
Machine learning is an intuitive way to implement anomaly-based detection. The most common supervised algorithms are supervised neural networks, support vector machines (SVM), k-nearest neighbors, Bayesian networks, and decision trees. The most common unsupervised algorithms are K-means, self-organizing maps (SOM), C-means, the expectation-maximization (EM) meta-algorithm, adaptive resonance theory (ART), unsupervised niche clustering (UNC), and the one-class support vector machine.
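As a small illustration of the unsupervised route, the sketch below uses K-means, one of the algorithms listed above, written in plain Python so that it has no dependencies. The two features (login hour, number of failed attempts), the tiny training set, and the threshold of three times the largest training distance are all assumptions made for the example, not recommendations.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids are seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: attach each point to its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def anomaly_score(point, centroids):
    """Distance to the nearest centroid; larger means more unusual."""
    return min(math.dist(point, c) for c in centroids)


# Hypothetical "normal" logins: (hour of day, failed attempts before success).
normal = [(9, 0), (14, 0), (10, 1), (15, 1), (9, 1), (14, 1), (10, 0), (15, 0)]
centroids = kmeans(normal, k=2)

# Assumed heuristic: flag anything three times farther out than any training point.
threshold = 3 * max(anomaly_score(p, centroids) for p in normal)


def is_anomalous(point):
    return anomaly_score(point, centroids) > threshold
```

With this data, a login at 10:00 with no failures stays under the threshold, while a login at 03:00 after eight failures is flagged. In practice the features would need scaling and the threshold would have to be tuned against real traffic.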
Anomaly-based and misuse-based methods each have their advantages and disadvantages. Is it possible to devise a system that utilizes both methods to achieve better performance? The answer is yes. Depren proposed a novel Intrusion Detection System (IDS) architecture utilizing both anomaly and misuse detection approaches to find anomalous connections in network traffic (Depren et al., 2005). In his implementation, anomaly and misuse detection are two separate modules that make independent predictions, and a rule-based decision support system (DSS) then makes the final decision. The decision-making is a straightforward process whose logic can be divided into three parts:
1. If the misuse approach predicts an attack, then the final decision is an attack, regardless of the anomaly approach's prediction.
2. If the misuse approach predicts benign but the anomaly approach predicts an attack, then the final decision is unclassified.
3. If both the misuse and anomaly approaches predict benign, then the final decision is benign.
Using the KDD Cup 99 Data Set, which consists of network traffic records (e.g. 0, TCP, HTTP, SF, 181, 5450, 0, 0, 0, 0, 0, 1, normal), Depren achieved a detection rate of 99.90%, a classification rate of 99.84%, and a false positive rate of 1.25%. Overall, this outperformed both the anomaly-based and the misuse-based detection methods on their own: for anomaly detection, the detection rate is 98.96% and the false positive rate is 1.01%, and for misuse detection, the classification rate is 99.61% and the false positive rate is 0.2% (Depren et al., 2005).
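The three decision rules described above amount to a small truth table. The following sketch expresses them in Python; the `Verdict` enum and the `decide` function are names made up for this illustration and are not taken from Depren's implementation.

```python
from enum import Enum


class Verdict(Enum):
    ATTACK = "attack"
    BENIGN = "benign"
    UNCLASSIFIED = "unclassified"


def decide(misuse_says_attack: bool, anomaly_says_attack: bool) -> Verdict:
    """Combine the two modules' independent predictions into a final decision."""
    # Rule 1: the misuse module's attack verdict always wins.
    if misuse_says_attack:
        return Verdict.ATTACK
    # Rule 2: only the anomaly module raised an alarm, so leave it unclassified.
    if anomaly_says_attack:
        return Verdict.UNCLASSIFIED
    # Rule 3: both modules agree that the connection is benign.
    return Verdict.BENIGN
```

For example, `decide(False, True)` returns `Verdict.UNCLASSIFIED`: the connection matches no known signature but deviates from the baseline, so it is set aside rather than labelled as an attack outright.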
Although much of the literature, like Depren’s work, uses raw packet network traffic as the dataset for developing an IDS, very few studies use user sign-in logs. This is because user sign-in logs are sensitive data that are normally available only to each organization’s IT department. However, user authentication is a serious issue in the cybersecurity field right now. Many organizations rely on Microsoft Azure Active Directory to authenticate users before allowing them to use resources within the organization, such as email services. Unfortunately, Microsoft Azure Active Directory authentication is not as effective as expected. What makes it worse is that, since the implementation is a black box, there is no way for users to improve the detection. Hopefully, an open-source solution will address this issue in the future.