[comp.sys.apple] Virus 101 - Chapter 1

woodside@ttidca.TTI.COM (George Woodside) (03/01/89)

Preface: The program VKILLER is specific to the ATARI ST. My apologies
for not making this clear in the previous posting, which went to
several newsgroups. I have recieved far too many requests for the
program from users of other systems to reply to each one individually,
and the mailer has bounced some of the replies I tried to send. If you
have an Atari, VKILLER was posted here a few weeks ago, and is
available in the archives, on GEnie, Compuserve, and from most public
domain disk distributors and User Group libraries. The current version
is 2.01. 

Initial postings will cover virus fundamentals, as they apply to the
area of the Atari ST and, similarly, to MS-DOS systems. The file
systems of the two machines are nearly identical. These general
information articles will be cross-posted to the newsgroups in which
this topic is now active. Future postings will be made only to the
Atari newsgroup, since they will deal with viruses (the plural,
according to Webster's, is viruses) known to exist in the ST world.
They would automatically be different than an IBM virus, since they
are in the 68000 instruction set, or from a Mac or Amiga virus, since
the file systems differ. Since all the viruses I have located are the
"BOOT SECTOR" type (far and away the most common), that's what I will
dwell upon. If and when the proposed newsgroup comp.virus becomes
active, it will be added to the list for all postings.

Your generic disclaimer: I just an old-school computer hacker, with 20
years in the software business. I built my first IMSAI many years ago,
and have had several different computers. That qualifies me to have
spent a lot of time on computers, but nothing further. I may be wrong
about some things, may have a different opinion than you or anybody
else, or most anything else you'd care to have disclaimed. What I
think is my own opinion, and in no way represents the opinion or
position of my employer or anyone else. I've written several articles
for magazines as well as software related to virus detection and
killing, but I have been known to be wrong (so they tell me :^)).

While posting any kind of information about viruses may trigger
someone to attempt creating one, I believe that the benefit of the
knowledge to potential victims outweighs that risk. I don't believe
that you can stop someone (who wishes to) from creating a virus by
withholding information - it is already available from many sources.
Since not all viruses act the same, or attempt to attack in the same
manner, it may help potential (or current) victims to learn about the
symptoms of the viruses known to exist, and how to protect themselves.

While the concept of viruses can be complex, I'll try to keep things
at a level that should be understandable by most anyone past the
casual user genre. However, since I've been at this sort of thing for
some time, what I consider basic knowledge may not be familiar to
everyone. Advance apologies are offered here for any invalid
assumptions, typos, smart alec remarks, grammatic errors, or whatever
offends you. 

Some basic terms, as they have come to be used in this area: 

A VIRUS is any program which spreads itself secretly. It may be
destructive, a prank, or even intended to be helpful, but it spreads. 

A TROJAN HORSE is a program which executes one function secretly while
appearing to be accomplishing some other task, or appearing to be some
other program entirely. One task a Trojan Horse may accomplish is to
install a virus, which would then spread itself.

A WORM is a program or function which imbeds itself inside another
program, be it an application, part of a system, a library or
whatever. It may or may not spread itself by some means, and may or
may not have destructive intents. 

Now, to the basics of a disk (specifically floppies, but true of most
hard disks as well): 

A DIRECTORY is a list of files and sub-directories. There is one
primary directory (called the root directory) on a disk. It contains
the entries for files, and other directories (called sub-directories,
or folders on the Atari). Sub-directories (folders) may contain
entries of other sub-directories, files, or both. Every file has one
entry in the disk directory (or in some sub-directory). That entry
contains, among other things, the file name, date and time of
creation, length, and the address of the first entry in the File
Allocation Table (FAT) for the file. 

A FAT is a File Allocation Table. It is a road map of how the
operating system will locate data on a disk. Essentially, it is a
series of pointers. The directory entry of a file points to the first
FAT entry of that file. That entry points to the next, and so on,
until the last entry, which contains a special value indicating end of
file. There are two copies of the FAT on the disk, since it is
absolutely critical. Lose the FAT, and the data on the disk becomes
un-accessable. 

A BOOT SECTOR is the first sector on a floppy disk. With the Atari
(and MS-DOS) system, it contains configuration information about the
disk. That information includes how many tracks are on the disk, how
many sectors per track, how many sides on the disk, how big the FATs
and directories are, where the data begins, etc. On the MS-DOS
systems, the boot sector contains the ID of the operating system under
which it was formatted. On the Atari, that value is not used, but
replaced (in part) by a number. That number should be different on
every disk, and is used as part of the mechanism by which disk changes
are detected. The boot sector may or may not contain executable code.
If it does contain executable code, it is normally executed only at system
powerup or system reset time. 

On all such disks, the boot sector is number 0, the first sector on the
first side of the first track. On a standard format Atari disk, the
next five sectors are the first copy of the FAT, the next five sectors
are the second copy of the FAT, the next seven sectors are the root
directory, and the remainder of the disk is available for data.

Now, on with the show: 

Floppy disks are changed on a regular basis while the computer is
being used. More so on systems with no hard disks, but periodically on
most all systems. This event, referred to as a "Media Change", is
detected by the computer's disk drive. The disk door is opened, the
status of the write protection changes as one disk is removed and
another is inserted, etc. When that happens, the operating system must
recognize that the disk has been changed before attempting to read or
write to the new disk. The operating system reads the disk's boot
sector to learn about the newly inserted disk. That instant, when the
operating system checks the new disk, is when nearly all the boot
sector viruses spread. We'll get to that in the next chapter, but first,
a more primary question: 

How did the virus get in there? 

When a computer is booted up from a power off state, or reset (in most
cases), it starts executing code from internal ROMs. Those ROMs set up
primary vectors, minimal configuration information, and perform some
fundamental tests. Then they start moving into uncharted waters. They
have to find out what devices are attached, and get them into
operating status. They also have to provide a means of expanding their
own capabilities to support new devices, functions, and whatever else
which may not have existed when the ROMs were created. One of the
means by which this is accomplished is by checking various addresses
for special codes, magic numbers, or any kind of response to a read
or write. Another function which may be enabled is checking the boot
sector on an inserted floppy disk for executable status. If that boot
sector has executable status, the code contained in the boot sector is
executed. That code may cause other portions of the disk to be loaded
and executed, set variables or vectors, or nearly anything imaginable.
That includes infecting the system with a virus, if that's what the
boot sector code contains. Executable status may be via a special flag
value in a reserved address, but it is normally determined by adding
up the value of all the data bytes in the boot sector. If the total
derived (called a checksum) is a specific value (a "magic" number),
then the boot sector is deemed executable. The code is usually executed
at that time. The code is not normally garanteed to be loaded at any
specific address in memory, so it must be "position independant",
or capable of executing no matter where it exists in memory.

The boot sector is of limited size, normally 512 bytes. While that is
enough for a small program, it may not be enough for whatever task it
is designed to accomplish. So, part of what the code in the boot sector
accomplishes must be to load the rest of the code it needs to get the job done.
This may be a normal data file, or hard coded to some other part of the
disk.

If the code from the boot sector is designed only to accomplish some task,
it will normally take the steps to do so, then return to the operating
system. This may be setting the screen resolution or colors, issuing
an initialization command to some device, or setting up some option
or feature. If the code is designed to remain available after the initial
execution (such as part of a device driver), it must inform the operating
system that it wishes to remain resident. The operating system then
alters the amount of RAM available to protect the space occupied by the
loaded code, so that subsequent programs do not tamper with the loaded
routine. Such a routine is referred to as a "Terminate and Stay Resident"
routine, or a TSR. Viruses must be TSR type programs. They have to remain
in the system, and active, to be able to accomplish their spread, and
eventually, their true goal. If the boot sector program was designed
to attack immediately, it may accomplish its destruction, but it would
never get the opportunity to spread, and the disk which caused the
attack would be easily identifiable.

Most viruses accomplish system infection by taking over a "vector". A
vector is a specific address in system memory which contains the
address of a routine or function. When an interrupt (such as pressing
a key, the clock ticking, or so on) occurs, processing is suspended,
and the system loads the address in some vector associated with that
event. It executes the routine at the address which was stored in the
vector, then resumes whatever it was up to when the interrupt
occurred. Other vectors are not associated with interrupts, but with
specific functions, such as display a character on the screen, read a
sector from the disk, write to the printer, and so on. 

To take over a vector, the steps are fairly simple. A RAMdisk, for
example, will usually take over a disk read/write vector. When it
installs itself, it removes the current address from the vector
assigned to the disk read/write function. It saves that address in
it's own code, and places the address of it's own code in the vector.
When a disk read/write call is made by the operating system, the
operating system loads the address found in the proper vector, and
starts executing the code found at that address. That address now
points to the executable code of the RAMdisk. The first thing the
RAMdisk does is check the function call's parameters to see if the
read/write is for the RAMdisk. If it is, the RAMdisk accomplishes the
read or write, and returns to the operating system. If the read/write
is for some other disk drive, the RAMdisk code passes the call on to
the address it removed from the vector, allowing the assigned device
to accomplish the task. 

There may be more than one alteratiion of the vector. Each new routine
which is installed will save the old vector, and insert itself. That
means that the routine installed last will get the first access to any
call which uses that vector. If it does not want the call, it passes
the call on to the address it found in the vector, and so on. The
significance of this sequencing is that a boot sector virus, if
present, will be one of the first "vector snatchers" to get installed.
Conversely, it will be one of the last routines in the sequence to get
executed when a vector is accessed. 

If the vector in question happens to be for floppy disk I/O, the virus
will be one of the last vectors before the real physical read/write
routine. So, if a program designed to detect a virus's floppy disk I/O
calls is executed as part of a startup procedure, it can easily be
fooled. The detect program will see only normal system I/O calls
passing through the vector. The virus resides in the vector list after
the anti-virus program, so the anti-virus will never see any activity
generated by the virus. The anti-virus thinks that things are
progressing well, while, in reality, the virus is either spreading or
doing damage behind the anti-virus's back. 

If the anti-virus gets installed first (say, by being in a boot sector
itself), it has a better chance of offering protection, but not an
absolute one. Some viruses check things like ROM version numbers, and
know the absolute addresses in the ROMs of the functions they want. By
using those addresses, they can bypass subsequent links in the vector
list, and still do their dirty work. They can also refuse to install
themselves if the addresses or version numbers do not correspond to
the environment they want. 

End of Chapter 1. 
-- 
*George R. Woodside - Citicorp/TTI - Santa Monica, CA 
*Path:       ..!{philabs|csun|psivax}!ttidca!woodside