[comp.virus] Virus naming

LEICHTER@Venus.YCC.Yale.Edu (Jerry Leichter) (08/22/89)

Every new virus report these days seems to lead to a debate about a
proper name for the beasts.  May I suggest that this matter be
settled, once and for all, by adopting long-established traditions
used in a variety of sciences, ranging from astronomy to biology to
medicine: The discoverer of, or the first person to describe, a
planet/microbe/disease has an essentially absolute right to choose a
name for it.  A poorly-chosen name for something that gets discussed
extensively will sometimes fall by the wayside, but that's the
exception.

The closest match from the traditional sciences is clearly with
medicine.  The person who gets to choose the name is the person who
publishes the first article which describes the disease in some
detail.  The tone of such articles is quite similar to the tone of the
recent analyses of viral code.  While the discoverer can choose any name
he likes, traditionally the names chosen reflect either some obvious
and distinctive mark or symptom of the disease (AIDS - Acquired Immune
Deficiency Syndrome), or the place where it was first noted (Lyme
Disease).  When the discoverer doesn't choose a name, the disease
often gets named after him (Wernicke's Aphasia).

Other fields of science have established their own traditions (names
of Roman gods for planets; Latin descriptive terms for species -
though this gets tempered by humor).  Biological viruses have pretty
arbitrary names: One large class, the Coxsackie viruses, are named
after a town in upstate New York where the first member of the class
was isolated; another, the Herpes viruses, I believe have a name
derived from Greek via a particular disease caused by one of them.
Others have names like "T4 phage".

                            -- Jerry

hollombe%sdcsvax@ucsd.edu (The Polymath) (08/26/89)

LEICHTER@Venus.YCC.Yale.Edu (Jerry Leichter) writes:
}The closest match from the traditional sciences is clearly with
}medicine.  The person who gets to choose the name is the person who
}publishes the first article which describes the disease in some
}detail.  ...
}... When the discoverer doesn't choose a name, the disease
}often gets named after him (Wernicke's Aphasia).

I think this is the way to go for simple psychological reasons.
Naming a virus for its discoverer is a strong discouragement to the
virus writers.  Imagine the frustration of writing what you think is a
really nifty virus, only to have someone else's name associated with
it.  Not much incentive there.

There's more than one way to fight this war.

--
The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com)  Illegitimati Nil
Citicorp(+)TTI                                                 Carborundum
3100 Ocean Park Blvd.   (213) 452-9191, x2483
Santa Monica, CA  90405 {csun|philabs|psivax}!ttidca!hollombe

dav@eleazar.dartmouth.edu (William David Haas) (09/04/89)

In article <0001.8909011255.AA07043@ge.sei.cmu.edu>
ttidca.TTI.COM!hollombe%sdcsvax@ucsd.edu (The Polymath) writes:
<LEICHTER@Venus.YCC.Yale.Edu (Jerry Leichter) writes:
<}... When the discoverer doesn't choose a name, the disease
<}often gets named after him (Wernicke's Aphasia).
<
<I think this is the way to go for simple psychological reasons.
<Naming a virus for its discoverer is a strong discouragement to the
<virus writers.  Imagine the frustration of writing what you think is a
<really nifty virus, only to have someone else's name associated with
<it.  Not much incentive there.
<
<There's more than one way to fight this war.

And then you will have virus writers 'discovering' their own work to
get their name on it.

VANVLECK_TOM@prune.bitnet (03/14/91)

I like the idea of a "virus hash ID" if one could be computed.  For
the popular name, suppose we named viruses like they name comets, with
computer type, discoverer's name, year, and a letter?  e.g. "PC
Skulason 1990c".  If you discover a new virus, you name it and report it
to a central place.  Earliest accepted report is the one that's
recognized.

Doing so would glorify the virus fighters instead of the jerks.  The
name wouldn't depend on sizes (which vary, and different viruses can
share a size), or
strings in the virus (sometimes offensive).
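
The comet-style scheme above could be sketched roughly as follows, in
Python.  This is only an illustration: the function name and the rule
that the letter counts accepted reports per platform, discoverer and
year are assumptions, since the proposal does not say how the letter
would be assigned.

```python
import string


def comet_style_name(platform, discoverer, year, index):
    """Build a name such as 'PC Skulason 1990c'.

    index is the zero-based position of the accepted report among
    those for the same platform, discoverer and year (an assumed
    rule; the proposal leaves the letter assignment open).
    """
    letters = string.ascii_lowercase
    if not 0 <= index < len(letters):
        raise ValueError("this sketch only supports 26 letters")
    return f"{platform} {discoverer} {year}{letters[index]}"


print(comet_style_name("PC", "Skulason", 1990, 2))
```

A central registry would still have to decide which report came first;
the sketch covers only the formatting.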

Old viruses: leave the names alone; they're cute.  Arguments about
priority, renaming when more info comes in: bio science puts up with
this.  Who's the central authority?  Some "virus researcher" who can
tell when 2 viruses are the same, and has safe storage for samples.
Volunteers?

Tom Van Vleck <vanvleck_tom @ tandem.com>

CHESS@YKTVMV.BITNET (David.M.Chess) (03/22/91)

The trouble with hash codes, or dates, or anything else semi-automatic
is that, when there get to be enough of them, the names start to
become useless.  At IBM, we tried to use number-names whenever
possible early on, but the disadvantages became apparent after not too
long.  If there's a 453 and a 435 virus, for instance, it's Real Hard
to remember which is which!  The same would apply to a #AR657XXL and
#AR567LXL, or a PC Smith 910004 and PC Smith 910014.

Our current rather tentative approach is to use a
generally-non-numeric stem for each virus family, and then tack on a
number or similar object to pin down exactly which object we're
discussing.  So we talk about the "Flip-2343" and the "Flip-2153" (if
I've remembered the numbers right).  The first part helps the human
remember which virus in general this is, and the second part pins it
down.  If it is desirable to have a distinct number of some kind for
each virus (and it might well be at some point), I'd suggest having a
technically-redundant-but-in-fact-very-very-helpful-to-us-finite-humans
human name for each one (or at least each strain) as well.
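
As a rough illustration of the stem-plus-qualifier idea above, here is
a minimal Python sketch.  The function names and the rule of splitting
at the last hyphen are assumptions made for the example, not a
description of any tool actually in use.

```python
from collections import defaultdict


def split_name(name):
    """Split 'Flip-2343' into ('Flip', '2343') at the last hyphen."""
    stem, sep, qualifier = name.rpartition("-")
    if not sep or not stem or not qualifier:
        raise ValueError(f"expected STEM-QUALIFIER, got {name!r}")
    return stem, qualifier


def group_by_family(names):
    """Map each family stem to the qualifiers seen for it, in order."""
    families = defaultdict(list)
    for name in names:
        stem, qualifier = split_name(name)
        families[stem].append(qualifier)
    return dict(families)


print(group_by_family(["Flip-2343", "Flip-2153"]))
```

The stem is what a person remembers; the qualifier only has to be
unique within its family.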

DC

PHYS169@csc.canterbury.ac.nz (Mark Aitchison, U of Canty; Physics) (03/27/91)

CHESS@YKTVMV.BITNET (David.M.Chess) writes:
> The trouble with hash codes, or dates, or anything else semi-automatic
> is that, when there get to be enough of them, the names start to
> become useless.  At IBM, we tried to use number-names whenever
> possible early on, but the disadvantages became apparent after not too
> long.  If there's a 453 and a 435 virus, for instance, it's Real Hard
> to remember which is which!

I agree, but there are two reasons for a virus name:

(1) To indicate in "human-friendly" terms roughly what it is, and
(2) To positively identify which virus it is.

In the first case, you usually aren't concerned if it is a slight
modification of a well known virus (so long as it does the same
things), and it is nice if there are just a few, easily pronounced
names to remember. To start with, that is what we had. Now, the system
is breaking down because there are so many modifications, often minor,
and a lack of communication or standardisation among anti-virus workers.
Having a lot of easy-to-remember but incorrectly applied virus names
is worse than useless. Hence my suggestion for a change.

Ideally, there should be a method of identification, given nothing but
the virus itself. So if people over the other side of the world also
find the same virus, you can definitely say "yes, this is the same
virus" without having to send a copy of the whole thing. It would be
nice if such a method for positive identification also helped with an
easily remembered name as well. Well, that is possible (e.g. with my
CHECKOUT program), and it partly involves the "family plus number"
method you mentioned.

This is how it works...
You create a hashcode that consists of two parts (see my BOOTID
program). One part has bit-flags that identify certain good and bad
things the boot code is doing. Similar viruses get similar codes here.
If you can't be bothered working out what this part of the code means,
the CHECKOUT program has an option that explains it all in English.
The other part of the code is a seemingly-random code derived from the
bytes in the boot sector. Two viruses that are similar but slightly
different will get totally different codes, so this part is of little
use to us humans. But the total code can be used to look up a list of
known good and bad boot sectors. This would have a "popular name",
that hopefully is assigned carefully, perhaps by one person or
organisation. So, if it is a known virus, you get two things, the
hashcode plus a sensible name. If it isn't in the list of known
viruses, you just get a hashcode; I, at least, find it easy to
identify the basic type of virus at first glance from its last 3
characters.
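
The two-part code described above might be sketched like this in
Python.  This is not the BOOTID or CHECKOUT algorithm: the flag names,
the use of CRC-32 for the content part, and the "checksum-flags" layout
are all assumptions chosen only to show the shape of the idea, and the
flags are passed in rather than derived from the boot code.

```python
import zlib

# Hypothetical behaviour flags; the post does not list the real ones.
HOOKS_DISK_INTERRUPT = 0x001
WRITES_TO_DISK = 0x002
RELOCATES_ORIGINAL_SECTOR = 0x004


def boot_code_id(sector, flags):
    """Return 'CCCCCCCC-FFF': content checksum, then behaviour flags.

    The checksum part changes completely for any small change in the
    bytes; the last 3 characters stay the same for similar behaviour.
    """
    checksum = zlib.crc32(sector) & 0xFFFFFFFF
    return f"{checksum:08X}-{flags & 0xFFF:03X}"


def describe(code, known):
    """Look the full code up in a table of known boot sectors."""
    name = known.get(code)
    return f"{code} ({name})" if name else code


sector = bytes(512)  # placeholder boot sector, all zero bytes
code = boot_code_id(sector, HOOKS_DISK_INTERRUPT | WRITES_TO_DISK)
print(describe(code, {code: "Example sector"}))
print(describe(boot_code_id(sector[:-1] + b"\x01", 0x003), {}))
```

Two sectors with the same flags but different bytes share their last
3 characters and differ in the checksum part, which is the property
the lookup table relies on.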

Now, this hasn't been extended to other types of virus yet, but I have
a plan in mind, which puts more emphasis on what the virus does, and
less on the code it uses to get there, but it is still determined only
from the contents of the virus, rather than some obscure historical
fact that gave it a name. As I have said, there is still a place for
"family" or "generic" names for viruses, *but* it should be a lot more
organised than at present, otherwise there will be more and more cases
of confusion - which can be dangerous since some variations of some
viruses have to be handled very differently.

By the way, BOOTID and CHECKOUT are both free from
cantva.canterbury.ac.nz, 132.181.30.3, in the pc subdirectory. There
will be a new version released within the next week, with better
analysis facilities in the CHECKOUT program, and a slight change to
hashcodes produced by both programs, to allow for some types of good
(e.g. "virus-immune") disks that gave "worrying" results. Keep sending
suggestions, though!

Mark Aitchison, Physics, University of Canterbury, New Zealand.

frisk@rhi.hi.is (Fridrik Skulason) (03/29/91)

CHESS@YKTVMV.BITNET (David.M.Chess) writes:

>Our current rather tentative approach is to use a
>generally-non-numeric stem for each virus family, and then tack on a
>number or similar object to pin down exactly which object we're
>discussing.  So we talk about the "Flip-2343" and the "Flip-2153"

This is the same as I do, but possibly with a letter added at the end, if
two variants have the same length, like

	Plastique/AntiCAD-4096A
	Plastique/AntiCAD-4096B alias "Invader"
	Plastique/AntiCAD-4096C alias "Invader B"
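
The letter-suffix convention shown above could be sketched as follows
in Python.  The function name and the rule that letters follow the
order in which variants are listed are assumptions for illustration.

```python
import string
from collections import Counter


def label_variants(family, lengths):
    """Name variants 'Family-LENGTH', adding A, B, C... when two or
    more variants share a length (at most 26 per length here)."""
    totals = Counter(lengths)
    used = Counter()
    names = []
    for length in lengths:
        if totals[length] == 1:
            names.append(f"{family}-{length}")
        else:
            letter = string.ascii_uppercase[used[length]]
            used[length] += 1
            names.append(f"{family}-{length}{letter}")
    return names


print(label_variants("Plastique/AntiCAD", [4096, 4096, 4096]))
```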

With 4-5 new viruses every week, the naming problem is getting pretty
bad...

-frisk