Characteristics of Internet Background Radiation
Authors: Ruoming Pang, Vinod Yegneswaran, Paul Barford, Vern Paxson, & Larry Peterson
Publisher: ACM Internet Measurement Conference (IMC), 2004
Presented by: Chowdhury, Abu Rahat
Today’s Outline
• The Authors and their Problem Statements
• Objective & Terminology
• The Study and Network Telescope
• Measurement Methodology:
  – Passive Measurement
  – Active Measurement
• Comments
The Authors
Vinod Yegneswaran
Vern Paxson
Grad Student
Associate Professor
Ruoming Pang
Computer Science and Statistics
EECS Department of UC Berkeley,
Software engineer Google NY
University of Wisconsin
Paul Barford
Assistant Professor, Department of Computer Sciences, University of Wisconsin-Madison
Larry L. Peterson
Professor, Department of Computer Science, Princeton, NJ 08544
Current Research Projects and Thrusts: measurement, analysis, and security of wide area networked systems and network protocols
The Problem
• Background radiation reflects fundamentally nonproductive traffic, either malicious or benign. While the general presence of background radiation is well known to the network operator community, its nature has yet to be broadly characterized.
• Goals of Characterization:
  – What is all this nonproductive traffic trying to do?
  – How can we filter it out to detect new types of malicious activity?
Objective
• To characterize Background Radiation based on:
  – Types of attack, behavior, traffic composition, frequency, target networks, etc.
• Secondary objectives:
  – Development of an effective traffic filtering system
  – Use of active responders to effectively identify the objective of attacks
Natural Background Radiation We are all exposed to ionizing radiation from natural sources at all times. This radiation is called natural background radiation, and its main sources are the following: • Radioactive substances in the earth's crust • Emanation of radioactive gas from the earth • Cosmic rays from outer space which bombard the earth
Source: Google Earth
Internet Background Radiation
• The baseline “noise” of Internet traffic
  – Every IP address, even an unused one, receives packets constantly, so this traffic is fundamentally nonproductive.
  – Traffic sent to unused addresses.
  – Nonproductive traffic: malicious (flooding backscatter, hostile scans, spam) or benign (misconfigurations).
  – Pervasive nature (hence “background”).
Backscatter
Source: [MVS01]
Background Radiation
• The volume of this traffic is not minor. For example, traffic logs from LBL for an arbitrarily chosen day show a total of about 8 million connection attempts (2/3 of the total).
Background Radiation
• Benign
  – Misconfiguration
• Malicious
  – Backscatter
  – Scans for vulnerabilities
  – Worms
The Study
• Why do we study it?
  – To understand Internet malware in action
• This paper is the first broad characterization of Internet background radiation
• Focus: traffic semantics
  – What is the traffic trying to do at the application level?
• Measurement methodology
  – How to extract the meaning of background radiation?
Measurement Apparatus: Network Telescope
• Unused but globally reachable IP addresses
• Their main telescopes:
  – Lawrence Berkeley National Lab
  – Size: 1,280 addresses
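As a rough illustration of how a telescope separates radiation from ordinary traffic, the sketch below checks whether a packet's destination falls inside the unused address space. The prefixes are hypothetical stand-ins (private address space, chosen only so that five /24 blocks add up to 1,280 addresses); they are not the actual LBL ranges, and the function names are made up for this example.

```python
import ipaddress

# Hypothetical telescope prefixes (private space used as a stand-in).
# Five /24 blocks give 5 * 256 = 1,280 addresses.
TELESCOPE_PREFIXES = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(5)]


def telescope_size(prefixes):
    """Total number of addresses covered by the telescope prefixes."""
    return sum(net.num_addresses for net in prefixes)


def is_radiation(dst_ip, prefixes=TELESCOPE_PREFIXES):
    """True if the destination is an unused (telescope) address.

    Any packet sent to such an address is unsolicited by construction,
    so it counts as background radiation.
    """
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in prefixes)


if __name__ == "__main__":
    print(telescope_size(TELESCOPE_PREFIXES))
    print(is_radiation("10.0.3.77"), is_radiation("10.0.5.1"))
```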
Measurement Methodology: Passive Hit Pattern
What is the type and volume of observed traffic without actively responding to any packet?
How Often Do We See a Packet?
• Feb 2006 at Lawrence Berkeley Lab (average over 1,280 IP’s during one week): 342 packets / destination IP / day
  ⇒ A packet roughly every 4 minutes on any IP
• But how are radiation packets distributed:
  – Among destination IP’s? (Hotspots?)
  – Over time?
Source: Ruoming Pang
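The "every 4 minutes" figure follows directly from the per-day rate. A small sketch of that arithmetic, using only the numbers quoted on the slide (function names are illustrative):

```python
MINUTES_PER_DAY = 24 * 60


def minutes_between_packets(packets_per_ip_per_day):
    """Average gap, in minutes, between packets arriving at one IP."""
    return MINUTES_PER_DAY / packets_per_ip_per_day


def packets_per_week(packets_per_ip_per_day, num_addresses):
    """Total packets seen by the whole telescope in seven days."""
    return packets_per_ip_per_day * num_addresses * 7


if __name__ == "__main__":
    gap = minutes_between_packets(342)
    total = packets_per_week(342, 1280)
    print(f"{gap:.1f} minutes between packets per IP")
    print(f"{total:,} packets per week across 1,280 addresses")
```

At 342 packets per destination per day the gap works out to about 4.2 minutes, and the telescope as a whole sees roughly three million packets in a week.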
Distribution over Destination IP’s
[Figure: number of packets per destination IP received over a week]
• Packets are in general evenly distributed among destinations
• The biggest hotspot receives < 1% of packets
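One way to check for hotspots in a trace is to compute the share of packets received by the busiest destination and compare it with the uniform share. The sketch below assumes the trace has already been reduced to a list of destination IPs; the function names are illustrative. For scale, a perfectly uniform spread over 1,280 addresses would give each address about 0.078% of the packets.

```python
from collections import Counter


def hotspot_share(dst_ips):
    """Return (busiest destination, its fraction of all packets)."""
    counts = Counter(dst_ips)
    if not counts:
        raise ValueError("empty trace")
    total = sum(counts.values())
    ip, hits = counts.most_common(1)[0]
    return ip, hits / total


def uniform_share(num_addresses):
    """Fraction each address would receive under a perfectly even spread."""
    return 1 / num_addresses


if __name__ == "__main__":
    sample = ["10.0.0.1"] * 3 + ["10.0.0.2"]
    print(hotspot_share(sample))
    print(f"{uniform_share(1280):.5%}")
```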
Variation of Number of Source IP’s
[Figure: number of source IP’s per hour]
• The number of source IP’s also varies over time
• But it is not correlated with packet volume
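A claim like "not correlated with packet volume" can be tested by bucketing the trace per hour and computing a correlation coefficient between the two series. The sketch below is one such check, not the authors' analysis; it assumes each record is a hypothetical `(timestamp_seconds, src_ip)` pair and uses a plain Pearson coefficient.

```python
from collections import defaultdict
from math import sqrt


def hourly_stats(records):
    """records: iterable of (timestamp_seconds, src_ip).

    Returns two parallel lists, ordered by hour:
    packets per hour and distinct source IPs per hour.
    """
    packets = defaultdict(int)
    sources = defaultdict(set)
    for ts, src in records:
        hour = int(ts // 3600)
        packets[hour] += 1
        sources[hour].add(src)
    hours = sorted(packets)
    return [packets[h] for h in hours], [len(sources[h]) for h in hours]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)
```

A coefficient near zero over the week would support the slide's observation; a value near 1 would contradict it.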
Other Figures
Summary of Passive Observation
• TCP dominates (99% of the TCP packets are TCP/SYN)
• Near uniformity among destinations
  – Hit pattern: sweeping or random
• Variation over time
• Considerable amount of ICMP traffic
• A smaller set of sources scans all possible IPs
• Most spoofed IPs are in Class A
• The sources are expecting replies!
The Big Picture
• Monitor network traffic to understand/track Internet attack activities
• Monitor incoming traffic to unused IP space
Active Measurement
[Figure: Internet, monitored traffic, local network, unused IP space]
Network Telescope
• Use a honeypot to keep the conversation going… (in the paper they used HoneyD and Active Sink)
  – Answer PING
  – Establish TCP connections
  – Reply to application (e.g., HTTP) requests …
• Until we find out what the intention is
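The escalation described above (answer pings, complete TCP handshakes, then answer application requests) can be sketched as a small decision function. This is a toy model, not the HoneyD or Active Sink interface: the event dictionaries, their keys, and the canned HTTP reply are all assumptions made for illustration.

```python
def respond(event):
    """Return the reply a responder would send for one inbound event.

    `event` is a hypothetical dict with a "kind" key and, for TCP
    events, a "port" key. Returns None when no responder applies,
    which corresponds to staying passive.
    """
    kind = event["kind"]
    if kind == "icmp_echo":
        # Answer PING so the source believes the host is alive.
        return {"kind": "icmp_echo_reply"}
    if kind == "tcp_syn":
        # Complete the handshake so the source sends its payload.
        return {"kind": "tcp_syn_ack", "port": event["port"]}
    if kind == "tcp_data" and event["port"] == 80:
        # Minimal canned HTTP answer to elicit the next request.
        body = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
        return {"kind": "tcp_data", "port": 80, "payload": body}
    return None


if __name__ == "__main__":
    for ev in ({"kind": "icmp_echo"},
               {"kind": "tcp_syn", "port": 80},
               {"kind": "tcp_data", "port": 80}):
        print(ev["kind"], "->", respond(ev))
```

Each reply exists only to draw out the next packet from the source, which is what eventually reveals its intent.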
Key Components
• Filter: taming the traffic volume
• Responding to application requests
• Analyzing traffic semantics
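The slides do not spell out how the filter tames the volume. One plausible reading, consistent with the assumption (noted under Weaknesses) that a source behaves the same way toward every destination, is to let each source talk to only its first N destinations and drop the rest. The class below sketches that idea; the class name, method name, and default N are hypothetical, and this should not be taken as the paper's exact filter.

```python
from collections import defaultdict


class SourceDestinationFilter:
    """Admit each source's traffic to at most `max_destinations` targets."""

    def __init__(self, max_destinations=1):
        self.max_destinations = max_destinations
        self._seen = defaultdict(set)  # src_ip -> admitted dst_ips

    def admit(self, src_ip, dst_ip):
        """Return True if this connection should reach the responder."""
        dests = self._seen[src_ip]
        if dst_ip in dests:
            return True  # conversation already admitted: keep it going
        if len(dests) < self.max_destinations:
            dests.add(dst_ip)
            return True
        return False  # source already has its quota of destinations


if __name__ == "__main__":
    f = SourceDestinationFilter(max_destinations=1)
    print(f.admit("198.51.100.7", "10.0.0.1"))
    print(f.admit("198.51.100.7", "10.0.0.2"))
```

The trade-off is the one the Weaknesses slide raises: a source that picks a different exploit per destination would be seen only once.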
Measurement Methodology (Application-Level Responders)
• Data-driven:
  – Which responders to build is decided by the observed traffic volumes
• Application-level responders:
  – Must not only adhere to the structure of the underlying protocol, but also know what to say
• New types of activity emerge over time, so responders also need to evolve
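The data-driven step can be illustrated by ranking destination ports by observed volume and picking the smallest set that covers a target share of the traffic. This is a sketch under assumptions: the 90% default coverage and the function name are illustrative, not values from the paper.

```python
from collections import Counter


def ports_to_cover(dst_ports, coverage=0.9):
    """Pick the busiest ports until `coverage` of traffic is handled.

    dst_ports: iterable of destination port numbers, one per connection.
    Returns ports in descending order of volume.
    """
    counts = Counter(dst_ports)
    total = sum(counts.values())
    chosen, covered = [], 0
    for port, n in counts.most_common():
        if covered / total >= coverage:
            break
        chosen.append(port)
        covered += n
    return chosen


if __name__ == "__main__":
    observed = [445] * 5 + [80] * 3 + [135] + [1434]
    print(ports_to_cover(observed, coverage=0.8))
```

Re-running this ranking on fresh traces is one way to notice when new activity warrants a new responder.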
Radiation Activity Classification
• Which malware is most active?
• What is the most popular application?
• Which vulnerability is most targeted?
A Rich Collection of Applications Is Targeted in the Background Radiation
• Windows RPC
• HTTP
• Netbios/CIFS/SMB
• Virus backdoors (MyDoom, Beagle, etc.)
• Dameware
• Universal PnP
• Microsoft SQL (Slammer)
• MySQL
• DNS
• BitTorrent
TCP Port 80 (HTTP)
• Targeted against Microsoft IIS servers.
• The dominant activity is a WebDAV buffer-overrun exploit.
[Figure: Port 80 activities]
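Once a responder has elicited the first HTTP request, a simple classifier can separate WebDAV activity from ordinary requests. The heuristic below is only a sketch and not the authors' classifier: it assumes that a buffer-overrun attempt shows up as a WebDAV method carrying an unusually long URI, and the 1,024-byte threshold is an arbitrary illustrative value.

```python
# Methods defined by the WebDAV extensions to HTTP.
WEBDAV_METHODS = {
    "PROPFIND", "PROPPATCH", "MKCOL", "COPY",
    "MOVE", "LOCK", "UNLOCK", "SEARCH",
}


def classify_request(payload, long_uri=1024):
    """Label the first line of an HTTP request payload (bytes)."""
    line = payload.split(b"\r\n", 1)[0]
    parts = line.split(b" ")
    if len(parts) < 2:
        return "not-http"
    method = parts[0].decode("ascii", "replace").upper()
    uri = parts[1]
    if method in WEBDAV_METHODS:
        # Overlong URI on a WebDAV method: treat as a likely overrun attempt.
        return "webdav-suspicious" if len(uri) >= long_uri else "webdav"
    return "other-http"


if __name__ == "__main__":
    print(classify_request(b"GET / HTTP/1.1\r\n\r\n"))
    print(classify_request(b"SEARCH /" + b"A" * 2000 + b" HTTP/1.1\r\n"))
```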
Other Figures
Summary of Active Observation
• Study dominant activities on the popular ports
• Same attacker on multiple networks
• Some sources avoid Class A
• Traffic is divided by ports:
  – Consider all connections between a source-destination pair on a given destination port
• Background radiation concentrates on a small number of ports:
  – Only look at the most popular ports.
  – Many popular ports are also used by normal traffic ⇒ analyze at the application-semantic level.
  – Many replies are needed to see what is happening
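The per-port unit of analysis described above (all connections between one source-destination pair on one destination port) amounts to a grouping step. A minimal sketch, assuming each connection has been reduced to a hypothetical `(src_ip, dst_ip, dst_port, payload)` tuple:

```python
from collections import defaultdict


def group_sessions(connections):
    """Group connections by (source, destination, destination port).

    connections: iterable of (src_ip, dst_ip, dst_port, payload).
    Returns a dict mapping each key to its payloads in arrival order,
    so the whole exchange on that port can be examined together.
    """
    sessions = defaultdict(list)
    for src, dst, port, payload in connections:
        sessions[(src, dst, port)].append(payload)
    return dict(sessions)


if __name__ == "__main__":
    trace = [
        ("198.51.100.7", "10.0.0.1", 80, b"OPTIONS / HTTP/1.1"),
        ("198.51.100.7", "10.0.0.1", 80, b"SEARCH / HTTP/1.1"),
        ("198.51.100.7", "10.0.0.1", 445, b"\x00"),
    ]
    for key, payloads in group_sessions(trace).items():
        print(key, len(payloads))
```

Keeping the payloads in order matters because, as noted, several replies are often needed before the activity becomes recognizable.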
Conclusion
• Background radiation is:
  – Complex in structure, highly automated, frequently malicious, potentially adversarial, and mutating at a rapid pace
• Passive measurement reveals only part of the story
• We need to interact with the traffic to see what the attacker’s actual objectives are
Strengths
• First attempt to characterize background radiation
• Good measurement methodology:
  – Detailed set of active responders for popular ports.
• Meaningful data analysis:
  – Passive analysis: activities concentrate on popular ports.
  – Active analysis: extreme dynamism in many aspects of background radiation.
Weaknesses
• The filtering could be biased.
  – It assumes a source shows the same kind of activity to all destination IP addresses.
  – It fails to capture multi-vector worms that pick one exploit per IP address.
• A significant number of connections did not proceed.
• DHCP makes the source IP address a less accurate source identity.
• To what extent can the development of application-level responders be automated?
Reference & Back up Slide
References
• [Barford2004] Paul Barford. Trends in Internet Measurement. PPT from U. of Wisconsin, Fall 2004.
• [MVS01] David Moore, Geoffrey M. Voelker, and Stefan Savage. Inferring Internet Denial-of-Service Activity. In Proceedings of the 10th USENIX Security Symposium, pages 9–22. USENIX, August 2001.
• Google Earth
Measurement Methodology (Experimental Setup)
• Two different systems: iSink and LBL Sink.
• Traces collected from three sites:
  – Class A network
  – UW campus
  – Lawrence Berkeley Lab (LBL)
• Same forms of application response.
• Different underlying mechanisms.
• Support two kinds of data analysis:
  – Passive analysis: no filter, no responder
  – Active analysis: with filter and responder
Experimental Setup: iSink
Experimental Setup: LBL Sink
Thank You