4 phases for virus construction

1-install to memory code

Placement of code where it may be executed so that it can install itself to the ram by :

1-mod. To exe or com files

2-mod or repl. Of the boot sector

3-mod or repl of the partition record

4-mod or repl. Of device drivers

5-mod of ovi (overlay) files

2-copy to disk code

Saving the virus code to disk to make to difficult to be detected or removed , the following areas can be used :

1-boot sector 0

2-sectors marked in FAT

3-track 41

4-intersector gaps

5-partition record

6-exe or com or ovl or sys files

7-hidden files

8-data files

3- test for condition code

This maybe special time or date or after specified number of copies of viruses has been made

4- action code : final phase

The component which offers real threat to the comp system

Consider a virus that is designed to infect an assembly language program . it must execute a sequence of steps to effectively plant the virus code for execution :

1-locate the fist executable instruction in the target program

2-replace the inst. With an inst. To jmp to the memory location next to the last inst. Of the target program

3-insert the virus code for execution at the end of the target program

4-insert an inst. at the end of the virus program to simulate the original first inst. Of the target program that the virus replaced in step2

5-add another inst. At the end of the virus code to jmp back to the second inst of the target program

Write an assembly program to design a virus to format a floppy disk

Executable Files Structure

INF: Executable-File Header Format                            [P_WinSDK]

PSSONLY | Windows 3 Developers Notes softlib ENDUSER


Note: This article is part of a set of seven articles, collectively
called the “Windows 3.00 Developer’s Notes.” More information about
the contents of the other articles, and procedures for ordering a
hard-copy set, can be found in the knowledge base article titled “INF:
The Windows 3.00 Developer’s Notes” (Q65260).

This article can be found in the Software/Data Library by searching on
the word EXEFMT or S12688. EXEFMT was archived using the PKware
file-compression utility.

More Information:

Microsoft defined the segmented executable file format for Windows
applications and dynamic-link libraries (DLLs). This file format is
also referred to as the New Executable Format. This new format is an
extension of the existing MS-DOS .EXE format (old-style format). The
purpose of the segmented executable format is to provide the
information needed to support the dynamic linking and segmentation
capabilities of the Windows environment.

An executable file contains Microsoft Windows code and data, or
Windows code, data, and resources. Specific fields have been added to
the old-style .EXE format header to indicate the existence of the
segmented file format. The old-style header may contain a valid
executable program, called a stub program, that will be executed if
the program is run on MS-DOS (without Windows). This stub program
usually prints a message indicating that Microsoft Windows is required
to run the program. The segmented executable format extensions also
begin with a header that describes the contents and location of the
executable image in the file. The loader uses this header information
when it loads the executable segments in memory.


The old-style header contains information the loader expects for a DOS
executable file. It describes a stub program (WINSTUB) the loader can
place in memory when necessary, it points to the new-style header, and
it contains the stub programs relocation table.

The following illustrates the distinct parts of the old-style
executable format:

00h |  Old-style header info  |
20h |        Reserved         |
3Ch |   Offset to segmented   |
|       .EXE header       |
40h |  Relocation table and   |
|    DOS stub program     |
|  Segmented .EXE Header  |
|           .             |
|           .             |
|           .             |

The word at offset 18h in the old-style .EXE header contains the
relative byte offset to the stub program’s relocation table. If this
offset is 40h, then the double word at offset 3Ch is assumed to be the
relative byte offset from the beginning of the file to the beginning
of the segmented executable header. A new-format .EXE file is
identified if the segmented executable header contains a valid
signature. If the signature is not valid, the file is assumed to be an
old-style format .EXE file. The remainder of the old-style format
header will describe a DOS program, the stub. The stub may be any
valid program but will typically be a program that displays an error


Because Windows executable files are often larger than one segment
(64K), additional information (that does not appear in the old-style
header) is required so that the loader can load each segment properly.
The segmented EXE format was developed to provide the loader with this

The segmented .EXE file has the following format:

00h |  Old-style EXE  |
|      Header     |
20h |    Reserved     |
3Ch |    Offset to    | —+
| Segmented Header|    |
+—————–+    |
40h | Relocation Table|    |
|  & Stub Program |    |
+—————–+    |
|                 |    |
+—————–+    |
xxh |  Segmented EXE  | <–+
|      Header     |
|  Segment Table  |
| Resource Table  |
|  Resident Name  |
|      Table      |
| Module Reference|
|      Table      |
| Imported Names  |
|      Table      |
|   Entry Table   |
|  Non-Resident   |
|   Name Table    |
|   Seg #1 Data   |
|   Seg #1 Info   |
|   Seg #n Data   |
|   Seg #n Info   |

The following sections describe each of the components that make up
the segmented EXE format. Each section contains a description of the
component and the fields in the structures that make up that

Note: All unused fields and flag bits are reserved for future use and
must contain 0 (zero) values.


The segmented EXE header contains general information about the EXE
file and contains information on the location and size of the other
sections. The Windows loader copies this section, along with other
data, into the module table in the system data. The module table is
internal data used by the loader to manage the loaded executable
modules in the system and to support dynamic linking.

The following describes the format of the segmented executable header.
For each field, the offset is given relative to the beginning of the
segmented header, the size of the field is defined, and a description
is given.

Offset Size Description
—— —- ———–

00h     DW  Signature word.
“N” is low-order byte.
“E” is high-order byte.

02h     DB  Version number of the linker.

03h     DB  Revision number of the linker.

04h     DW  Entry Table file offset, relative to the beginning of
the segmented EXE header.
06h     DW  Number of bytes in the entry table.

08h     DD  32-bit CRC of entire contents of file.
These words are taken as 00 during the calculation.

0Ch     DW  Flag word.
0001h = SINGLEDATA (Shared automatic data segment)
0002h = MULTIPLEDATA (Instanced automatic data
2000h = Errors detected at link time, module will not
8000h = Library module.
The SS:SP information is invalid, CS:IP points
to an initialization procedure that is called
with AX equal to the module handle. This
initialization procedure must perform a far
return to the caller, with AX not equal to
zero to indicate success, or AX equal to zero
to indicate failure to initialize. DS is set
to the library’s data segment if the
SINGLEDATA flag is set. Otherwise, DS is set
to the caller’s data segment.

A program or DLL can only contain dynamic
links to executable files that have this
library module flag set. One program cannot
dynamic-link to another program.

0Eh     DW  Segment number of automatic data segment.
This value is set to zero if SINGLEDATA and
MULTIPLEDATA flag bits are clear, NOAUTODATA is
indicated in the flags word.

A Segment number is an index into the module’s segment
table. The first entry in the segment table is segment
number 1.

10h     DW  Initial size, in bytes, of dynamic heap added to the
data segment. This value is zero if no initial local
heap is allocated.

12h     DW  Initial size, in bytes, of stack added to the data
segment. This value is zero to indicate no initial
stack allocation, or when SS is not equal to DS.

14h     DD  Segment number:offset of CS:IP.

18h     DD  Segment number:offset of SS:SP.
If SS equals the automatic data segment and SP equals
zero, the stack pointer is set to the top of the
automatic data segment just below the additional heap

| additional dynamic heap  |
+————————–+ <- SP
|    additional stack      |
| loaded auto data segment |
+————————–+ <- DS, SS

1Ch     DW  Number of entries in the Segment Table.

1Eh     DW  Number of entries in the Module Reference Table.
20h     DW  Number of bytes in the Non-Resident Name Table.

22h     DW  Segment Table file offset, relative to the beginning
of the segmented EXE header.

24h     DW  Resource Table file offset, relative to the beginning
of the segmented EXE header.

26h     DW  Resident Name Table file offset, relative to the
beginning of the segmented EXE header.

28h     DW  Module Reference Table file offset, relative to the
beginning of the segmented EXE header.

2Ah     DW  Imported Names Table file offset, relative to the
beginning of the segmented EXE header.

2Ch     DD  Non-Resident Name Table offset, relative to the
beginning of the file.

30h     DW  Number of movable entries in the Entry Table.

32h     DW  Logical sector alignment shift count, log(base 2) of
the segment sector size (default 9).

34h     DW  Number of resource entries.

36h     DB  Executable type, used by loader.

37h-3Fh DB  Reserved, currently 0’s.


The segment table contains an entry for each segment in the executable
file. The number of segment table entries are defined in the segmented
EXE header. The first entry in the segment table is segment number 1.
The following is the structure of a segment table entry.

Size Description
—- ———–

DW   Logical-sector offset (n byte) to the contents of the segment
data, relative to the beginning of the file. Zero means no
file data.

DW   Length of the segment in the file, in bytes. Zero means 64K.

DW   Flag word.
0007h = TYPE_MASK  Segment-type field.
0000h = CODE       Code-segment type.
0001h = DATA       Data-segment type.
0010h = MOVEABLE   Segment is not fixed.
0040h = PRELOAD    Segment will be preloaded; read-only if
this is a data segment.
0100h = RELOCINFO  Set if segment has relocation records.
F000h = DISCARD    Discard priority.

DW   Minimum allocation size of the segment, in bytes. Total size
of the segment. Zero means 64K.


The resource table follows the segment table and contains entries for
each resource in the executable file. The resource table consists of
an alignment shift count, followed by a table of resource records. The
resource records define the type ID for a set of resources. Each
resource record contains a table of resource entries of the defined
type. The resource entry defines the resource ID or name ID for the
resource. It also defines the location and size of the resource. The
following describes the contents of each of these structures:

Size Description
—- ———–

DW   Alignment shift count for resource data.

A table of resource type information blocks follows. The following
is the format of each type information block:

DW  Type ID. This is an integer type if the high-order bit is
set (8000h); otherwise, it is an offset to the type string,
the offset is relative to the beginning of the resource
table. A zero type ID marks the end of the resource type
information blocks.

DW  Number of resources for this type.

DD  Reserved.

A table of resources for this type follows. The following is
the format of each resource (8 bytes each):

DW  File offset to the contents of the resource data,
relative to beginning of file. The offset is in terms
of the alignment shift count value specified at
beginning of the resource table.

DW  Length of the resource in the file (in bytes).

DW  Flag word.
0010h = MOVEABLE  Resource is not fixed.
0020h = PURE      Resource can be shared.
0040h = PRELOAD   Resource is preloaded.

DW  Resource ID. This is an integer type if the high-order
bit is set (8000h), otherwise it is the offset to the
resource string, the offset is relative to the
beginning of the resource table.

DD  Reserved.

Resource type and name strings are stored at the end of the
resource table. Note that these strings are NOT null terminated and
are case sensitive.

DB   Length of the type or name string that follows. A zero value
indicates the end of the resource type and name string, also
the end of the resource table.

DB   ASCII text of the type or name string.


The resident-name table follows the resource table, and contains this
module’s name string and resident exported procedure name strings. The
first string in this table is this module’s name. These name strings
are case-sensitive and are not null-terminated. The following
describes the format of the name strings:

Size Description
—- ———–

DB   Length of the name string that follows. A zero value indicates
the end of the name table.

DB   ASCII text of the name string.

DW   Ordinal number (index into entry table). This value is ignored
for the module name.


The module-reference table follows the resident-name table. Each entry
contains an offset for the module-name string within the imported-
names table; each entry is 2 bytes long.

Size Description
—- ———–

DW   Offset within Imported Names Table to referenced module name


The imported-name table follows the module-reference table. This table
contains the names of modules and procedures that are imported by the
executable file. Each entry is composed of a 1-byte field that
contains the length of the string, followed by any number of
characters. The strings are not null-terminated and are case

Size Description
—- ———–

DB   Length of the name string that follows.

DB   ASCII text of the name string.


The entry table follows the imported-name table. This table contains
bundles of entry-point definitions. Bundling is done to save space in
the entry table. The entry table is accessed by an ordinal value.
Ordinal number one is defined to index the first entry in the entry
table. To find an entry point, the bundles are scanned searching for a
specific entry point using an ordinal number. The ordinal number is
adjusted as each bundle is checked. When the bundle that contains the
entry point is found, the ordinal number is multiplied by the size of
the bundle’s entries to index the proper entry.

The linker forms bundles in the most dense manner it can, under the
restriction that it cannot reorder entry points to improve bundling.
The reason for this restriction is that other .EXE files may refer to
entry points within this bundle by their ordinal number. The following
describes the format of the entry table bundles.

Size Description
—- ———–

DB   Number of entries in this bundle. All records in one bundle
are either moveable or refer to the same fixed segment. A zero
value in this field indicates the end of the entry table.

DB   Segment indicator for this bundle. This defines the type of
entry table entry data within the bundle. There are three
types of entries that are defined.

000h = Unused entries. There is no entry data in an unused
bundle. The next bundle follows this field. This is
used by the linker to skip ordinal numbers.

001h-0FEh = Segment number for fixed segment entries. A fixed
segment entry is 3 bytes long and has the following

DB  Flag word.
01h = Set if the entry is exported.
02h = Set if the entry uses a global (shared) data
The first assembly-language instruction in the
entry point prologue must be “MOV AX,data
segment number”. This may be set only for
SINGLEDATA library modules.

DW  Offset within segment to entry point.

0FFH = Moveable segment entries. The entry data contains the
segment number for the entry points. A moveable segment
entry is 6 bytes long and has the following format.

DB  Flag word.
01h = Set if the entry is exported.
02h = Set if the entry uses a global (shared) data


DB  Segment number.

DW  Offset within segment to entry point.


The nonresident-name table follows the entry table, and contains a
module description and nonresident exported procedure name strings.
The first string in this table is a module description. These name
strings are case-sensitive and are not null-terminated. The name
strings follow the same format as those defined in the resident name


The location and size of the per-segment data is defined in the
segment table entry for the segment. If the segment has relocation
fixups, as defined in the segment table entry flags, they directly
follow the segment data in the file. The relocation fixup information
is defined as follows:

Size Description
—- ———–

DW   Number of relocation records that follow.

A table of relocation records follows. The following is the format
of each relocation record.

DB  Source type.
00h = LOBYTE
03h = FAR_ADDR (32-bit pointer)
05h = OFFSET (16-bit offset)

DB  Flags byte.

DW  Offset within this segment of the source chain.
If the ADDITIVE flag is set, then target value is added to
the source contents, instead of replacing the source and
following the chain. The source chain is an 0FFFFh
terminated linked list within this segment of all
references to the target.

The target value has four types that are defined in the flag
byte field. The following are the formats for each target


DB  Segment number for a fixed segment, or 0FFh for a
movable segment.

DB  0

DW  Offset into segment if fixed segment, or ordinal
number index into Entry Table if movable segment.


DW  Index into module reference table for the imported

DW  Offset within Imported Names Table to procedure name


DW  Index into module reference table for the imported
DW  Procedure ordinal number.


DW  Operating system fixup type.
Floating-point fixups.
0004h = FIERQQ
0005h = FIDRQQ
0006h = FIWRQQ

DW  0

The Simplest COM Infector

The Simplest COM Infector:-


In the world of DOS viruses, the simplest and least threatening is the non-resident COM file infector. This type of virus infects only COM program files, which are just straight 80×86 machine code. They contain no data structures for the operating system to interpret – just code. The very simplicity of a COM file makes it easy to infect with a virus. Likewise, non-resident viruses leave no code in memory which goes on working after the host program (which the virus is attached to) is done working. That means as long as you’re sitting at the DOS prompt, you’re safe.

I say a non-resident COM infector is simple and non-threatening, I mean that in terms of its ability to reproduce and escape. There are some very nasty non-resident COM infectors floating around in the underground.


An example to infect COM file in your current directory if you run it

This virus operates as follows:

  1. An infected program is loaded and executed by DOS.

  2. The virus starts execution at offset 100H in the segment given to it by DOS.

  3. The virus searches the current directory for files with the wildcard "*.COM".

  4. For each file it finds, the virus opens it and writes its own 44 bytes of code to the start of that file.

  5. The virus terminates and returns control to DOS

At the end result is that every COM file in the current directory becomes infected, and the infected host program which was loaded executes the virus instead of the host.

There are three major types of COM infecting viruses

  1. Overwriting viruses

  2. Companion viruses

  3. Parasitic viruses

Infecting EXE Files:-


EXE viruses tend to be more complicated than COM infectors, simply because EXE files are more complex than COM files. The virus must be capable of manipulating the EXE file structure properly in order to infect a program. Fortunately, all is not more complicated, though. Because EXE files can be multi-segmented, some of the hoops we had to jump through to infect COM files – like code that handled relocating offsets – can be dispensed with.

The Structure of an EXE File:-


The EXE file is designed to allow DOS to execute programs that require more than 64 kilobytes of code, data and stack. When loading an EXE file, DOS makes no a priori assumptions about the size of the file. All of this information is stored in the EXE file itself, in the EXE Header at the beginning of the file. This header has two parts to it, a fixed-length portion, and a variable length table of pointers to segment references in the Load Module, called the Relocation Pointer Table. Since any virus which attacks EXE files must be able to manipulate the data in the EXE Header

When DOS loads the EXE file, it uses the Relocation Pointer Table to modify all segment references in the Load Module. After that, the segment references in the image of the program loaded into memory point to the correct memory locations.

A virus that is going to infect an EXE file will have to modify the EXE Header and the Relocation Pointer Table, as well as adding its own code to the Load Module. The Intruder-B virus will attach itself to the end of an EXE program and gain control when the program first starts. This will require a routine similar to that in Timid-II, which copies program code from memory to a file on disk, and then adjusts the file

Then, to infect a new file, the copy routine must perform the following steps:

  1. Read the EXE Header in the host program.
  2. Extend the size of the load module until it is an even multiple of 16 bytes, so cs:0000 will be the first byte of the virus.
  3. Write the virus code currently executing to the end of the EXE file being attacked.
  4. Write the initial value of ss:sp, as stored in the EXE Header, to the location of HOSTS on disk in the above code.
  5. Write the initial value of cs:ip in the EXE Header to the location of HOSTC on disk in the above code.
  6. Store Initial ss=SEG VSEG, Initial sp=OFFSET FINAL + STACK_SIZE, Initial cs=SEG VSEG, and Initial ip=OFFSET VIRUS in the EXE header in place of the old values.
  7. Add two to the Relocation Table Entries in the EXE header.
  8. Add two relocation pointers at the end of the Relocation Pointer Table in the EXE file on disk (the location of these pointers is calculated from the header). The first pointer must point to the segment part of HOSTS. The second should point to the segment part of HOSTC.
  9. Recalculate the size of the infected EXE file, and adjust the header fields Page Count and Last Page Size accordingly.
  10. Write the new EXE Header back out to disk.

Boot sectors:-


The boot sector virus can be the simplest or the most sophisticated of all computer viruses. On the one hand, the boot sector is always located in a very specific place on disk. Therefore, both the search and copy mechanisms can be extremely quick and simple. On the other hand, since the boot sector is the first code to gain control after the ROM startup code , it is very difficult to stop before it loads. If one writes a boot sector virus with sufficiently sophisticated anti-detection routines, it can also be very difficult to detect after it loads, making the virus nearly invincible.

A boot sector virus can be fairly simple – at least in principle. All that such a virus must do is take over the first sector on the disk. From there, it tries to find uninfected disks in the system. Problems arise when that virus becomes so complicated that it takes up too much room. Then the virus must become two or more sectors long, and the author must find a place to hide multiple sectors, load them, and copy them. This can be a messy and difficult job. However, it is not too difficult to design a virus that takes up only a single sector. This chapter and the next will deal with such viruses.

Rather than designing a virus that will infect a boot sector, it is much easier to design a virus that simply is a self-reproducing boot sector. Before we do that, though, let’s design a normal boot sector that can load DOS and run it. By doing that, we’ll learn just what a boot sector does. That will make it easier to see what a virus has to work around so as not to cause problems.

Infecting Device Drivers


COM, EXE and boot sector viruses are not the only possibilities for DOS executables. One could also infect SYS files.

Although infecting SYS files is perhaps not that important a vector for propagating viruses, simply because people don’t share SYS files the way they do COMs, EXEs and disks, I hope this exercise will be helpful in opening your mind up to the possibilities open to viruses. And certainly there are more than a few viruses out there that do infect device drivers already.

Let’s tackle this problem from a little bit different angle: suppose you are a virus writer for the U.S. Army, and you’re given the task of creating a SYS-infecting virus, because the enemy’s anti-virus has a weakness in this area. How would you go about tackling this job?

Step One: The File Structure

The first step in writing a virus when you don’t even know anything about the file structure you’re trying to infect is to learn about that file structure. You have to know enough about it to be able to:

  1. modify it without damaging it so that it will not be recognized by the operating system or fail to execute properly, and
  2. put code in it that will be executed when you want it to be.

A typical example of failure to fulfill condition (a) is messing up an EXE header. When a virus modifies an EXE header, it had better do it right, or any one of a variety of problems can occur.

A SYS file is coded as a straight binary program file, very similar to a COM file, except it starts at offset 0 instead of offset 100H. Unlike a COM file, the SYS file must have a very specific structure. It has a header, like an EXE file, though it is coded and assembled as a pure binary file, more like a COM file. It’s kind of like coding an EXE program by putting a bunch of DB’s at the start of it to define the EXE header, and then assembling it as a COM file, rather than letting the assembler and linker create the EXE header automatically.2.

Step Two: System Facilities

The next important question one must answer when building a virus like this is "What system facilities will be available when the code is up and running?" In the case of device driver viruses, this question is non-trivial simply because DOS has only partially loaded when the device driver executes for the first time. Not all of the DOS functions which an ordinary application program can call are available yet.

In the case of DOS device drivers, what will and will not work is fairly well documented, both by Microsoft in the references mentioned above, and in other places, like some of the books on DOS device drivers mentioned in the bibliography.

Remember that you can always assume that a particular system function is available at some low level, and program assuming that it is. Then, of course, if it is not, your program simply will not work, and you’ll have to go back to the drawing board.