[comp.virus] programmable virus scanner

davidf@cs.heriot-watt.ac.uk (David.J.Ferbrache) (02/09/90)

Yes, the idea is an excellent one. The concept of a programmable virus
recognition system has already been adopted on the Mac, specifically in
release 3.0 and later of Jeff Shulman's Virus Detective package. This
desk accessory uses an abstract definition language for the detection
of viruses by resource patterns or code signatures.

Jeff's patterns allow quite complex expressions and subexpressions tied
with the cand operator (&). The product can test for the creator and
type of any file; the resource type, name and size; and a code string
to be searched for, which can be offset by a fixed value from the start
or end of the resource.

Jeff allows wildcarding in a limited form to occur in his search
strings. This takes the form of a skip over non-significant bytes, thus
the search string "3C#500" would match 3C, skip 5 bytes and then match
00, and so would match the string 3C12C9006A8000.
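
The skip wildcard can be sketched in a few lines of Python. This is only
an illustration of the idea, not Virus Detective's actual implementation:
the function name compile_skip_pattern is made up here, and the sketch
assumes the skip count is a single hex digit after '#' and that all other
characters come in two-digit hex pairs.

```python
import re

def compile_skip_pattern(pattern: str) -> re.Pattern:
    """Translate a hex search string with '#n' skips into a bytes regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "#":
            # Assumption: the skip count is one hex digit following '#'.
            count = int(pattern[i + 1], 16)
            parts.append(b".{%d}" % count)  # exactly `count` arbitrary bytes
            i += 2
        else:
            # Two hex digits give one literal byte to match.
            parts.append(re.escape(bytes.fromhex(pattern[i:i + 2])))
            i += 2
    # DOTALL so that '.' also matches the byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)

rx = compile_skip_pattern("3C#500")
data = bytes.fromhex("3C12C9006A8000")
print(rx.search(data) is not None)
```

Here 3C is matched, the five bytes 12 C9 00 6A 80 are skipped, and the
final 00 is matched.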

This adaptability has made Virus Detective one of the most useful
anti-virus utilities on the Mac. A new virus (such as the WDEF strain)
can be reported along with a Virus Detective identification string,
which the user can rapidly add to his own copy of Virus Detective.

Finally, Virus Detective incorporates generic detection patterns for
most Mac viruses, thus eliminating the problem caused by the regiment
of nVIR virus clones (most of which are produced by a simple binary or
resource edit of the infected file).

Obviously the general virus detection system may be less efficient
than the alternative specialist detection software; however, the use of
precise offsets may significantly speed up pattern
scanning. Other extensions could include test expressions for the file
alteration date and length information in the directory (catching the
648 seconds signature and the Oropax rounded file size signature); and
extended wildcard matching to deal with the 1260 decryptor scrambling
routine. With a possible syntax allowing ?X to match up to X (hex)
random bytes before a further match, we could have

?20B8#4?20B9#4?20BF#4?20310D?203105?2047?2040?20E2

This rather complex pattern would allow for up to 32 random bytes to be
inserted between each significant byte in the 1260 decryptor string and
would also skip the variable values in the MOV instructions. Naturally,
as the form of the expression becomes more complex (ending in full
regular expressions with exponential search time and memory
requirements), search times will increase.
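
As a rough sketch of how such a bounded-gap wildcard might behave, the
proposed syntax can be translated into a Python bytes regex. The function
name compile_pattern is hypothetical, and the sketch assumes that '?' is
followed by a two-digit hex upper bound while '#' is followed by a single
hex digit giving a fixed skip.

```python
import re

def compile_pattern(pattern: str) -> re.Pattern:
    """Translate '#n' (fixed skip) and '?XX' (bounded gap) into a bytes regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "#":
            # Assumption: one hex digit, skip exactly that many bytes.
            parts.append(b".{%d}" % int(pattern[i + 1], 16))
            i += 2
        elif ch == "?":
            # Assumption: two hex digits, allow 0..XX arbitrary bytes.
            parts.append(b".{0,%d}?" % int(pattern[i + 1:i + 3], 16))
            i += 3
        else:
            parts.append(re.escape(bytes.fromhex(pattern[i:i + 2])))
            i += 2
    return re.compile(b"".join(parts), re.DOTALL)

DECRYPTOR = "?20B8#4?20B9#4?20BF#4?20310D?203105?2047?2040?20E2"
rx1260 = compile_pattern(DECRYPTOR)

# Made-up test bytes with a few 0x90 filler bytes between the fixed parts;
# this is not real 1260 code.
sample = bytes.fromhex(
    "B811223344" "9090" "B955667788" "90" "BF99AABBCC"
    "310D" "90" "3105" "47" "9090" "40" "E2"
)
print(rx1260.search(sample) is not None)
```

Because every gap has a fixed upper bound, the amount of backtracking
stays limited; an unbounded gap would be where search cost starts to grow.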

In summary the requirements for filter expressions are:

1. Directory entry filters - file length (including modular value test, e.g.
   MOD 51), file alteration date, attribute settings and file extensions
2. File characteristic filters - possibly including the destination of the
   initial jump instruction, definitely including a form of code scanning
   using wildcarded expressions (either in a specified scan range or at
   a number of scan ranges based on offset from the start and end of the
   file).
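
One way to picture these two kinds of filter is as small predicates that
are combined with a logical AND. The sketch below is illustrative only:
FileInfo, length_mod, seconds_equal, extension_in, code_at and all_of are
hypothetical names, and the rule built at the end uses made-up values
rather than a real virus signature.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FileInfo:
    """Hypothetical record of what the scanner knows about one file."""
    name: str
    length: int
    mtime_seconds: int  # seconds field of the file alteration time
    data: bytes

Filter = Callable[[FileInfo], bool]

def length_mod(modulus: int, remainder: int = 0) -> Filter:
    """Directory filter: file length modulo a value (e.g. MOD 51)."""
    return lambda f: f.length % modulus == remainder

def seconds_equal(value: int) -> Filter:
    """Directory filter: seconds field of the alteration time."""
    return lambda f: f.mtime_seconds == value

def extension_in(*exts: str) -> Filter:
    """Directory filter: file extension, compared case-insensitively."""
    wanted = {e.upper() for e in exts}
    return lambda f: f.name.upper().rsplit(".", 1)[-1] in wanted

def code_at(offset: int, needle: bytes) -> Filter:
    """File filter: bytes at an offset from the start (>= 0) or end (< 0)."""
    def test(f: FileInfo) -> bool:
        start = offset if offset >= 0 else len(f.data) + offset
        return start >= 0 and f.data[start:start + len(needle)] == needle
    return test

def all_of(*filters: Filter) -> Filter:
    """Combine filters so that every one must pass."""
    return lambda f: all(flt(f) for flt in filters)

# Made-up rule: a .COM file whose length is a multiple of 51 and which
# ends in the two bytes CD 21.
rule = all_of(extension_in("COM"), length_mod(51), code_at(-2, b"\xCD\x21"))
sample = FileInfo("DEMO.COM", 102, 0, b"\x90" * 100 + b"\xCD\x21")
print(rule(sample))
```

Putting the cheap directory tests first means the code scan only runs on
files that have already passed them.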

Naturally such a virus detector could be extended to allow scanning of
memory or of a range of disk sectors, thus allowing one program to
deal with application, boot sector and partition record viruses (in a
similar manner to the way the Norton Utilities allow for a variety of
data sources - sector, file, cluster etc.).
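
A minimal sketch of that idea, assuming each data source can be presented
as a stream of labelled byte chunks so that the same scan routine serves
all of them. The names whole_source, sectors and scan are hypothetical,
and a plain substring test stands in for the pattern matcher.

```python
import io
from typing import BinaryIO, Iterable, Iterator, List, Tuple

Chunk = Tuple[str, bytes]

def whole_source(label: str, stream: BinaryIO) -> Iterator[Chunk]:
    """Present a file or memory image as a single chunk."""
    yield (label, stream.read())

def sectors(stream: BinaryIO, first: int, count: int,
            size: int = 512) -> Iterator[Chunk]:
    """Present a range of fixed-size sectors, one chunk per sector."""
    stream.seek(first * size)
    for n in range(first, first + count):
        block = stream.read(size)
        if not block:
            break  # ran off the end of the image
        yield (f"sector {n}", block)

def scan(chunks: Iterable[Chunk], signature: bytes) -> List[str]:
    """Return the labels of all chunks that contain the signature."""
    return [label for label, data in chunks if signature in data]

# A blank four-sector image with two marker bytes planted in sector 2.
image = bytearray(512 * 4)
image[2 * 512 + 10:2 * 512 + 12] = b"\xDE\xAD"
print(scan(sectors(io.BytesIO(bytes(image)), 0, 4), b"\xDE\xAD"))
```

Note that scanning sector by sector like this would miss a signature that
straddles a sector boundary; a real scanner would need to overlap chunks.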

Anyway just a few thoughts.