davidf@cs.heriot-watt.ac.uk (David.J.Ferbrache) (02/09/90)
Yes, the idea is an excellent one. The concept of a programmable virus recognition system has already been adopted on the Mac, specifically in release 3.0 and later of Jeff Shulman's Virus Detective package. This desk accessory uses an abstract definition language to detect viruses by resource patterns or code signatures.

Jeff's patterns allow quite complex expressions and subexpressions joined with the cand operator (&). The product can test the creator and type of any file; the resource type, name and size; and a code string to search for, which can be offset by a fixed value from the start or end of the resource.

Jeff allows a limited form of wildcarding in his search strings: a skip over non-significant bytes. The search string "3C#500" would match 3C, skip 5 bytes and then match 00, and so would match the byte string 3C12C9006A8000.

This adaptability has made Virus Detective one of the most useful anti-virus utilities on the Mac. A new virus (such as the WDEF strain) can be reported along with a Virus Detective identification string, which users can quickly add to their own copy. Finally, Virus Detective incorporates generic detection patterns for most Mac viruses, which eliminates the problem caused by the regiment of nVIR clones (most of which are produced by a simple binary or resource edit of the infected file).

Obviously a general virus detection system may be less efficient than specialist detection software; however, the use of precise offsets may significantly improve pattern scanning speed.

Other extensions could include test expressions for the file alteration date and length information in the directory (catching the seconds signature of the 648 virus and the Oropax rounded file size signature), and extended wildcard matching to deal with the 1260 decryptor scrambling routine.
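To make the fixed-skip wildcard from the "3C#500" example concrete, here is a minimal sketch in Python. It is not Virus Detective's actual implementation or full syntax; the function names are hypothetical, and it assumes that literals are two-digit hex bytes and that "#" is followed by a single decimal digit giving the number of bytes to skip.

```python
def parse_pattern(pattern):
    """Parse a string such as "3C#500" into a list of items:
    an int (0-255) for a literal byte, or ("skip", n) for "#n"."""
    items = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "#":
            # Assumption: the skip count is one decimal digit.
            items.append(("skip", int(pattern[i + 1])))
            i += 2
        else:
            # Assumption: literals are written as two hex digits.
            items.append(int(pattern[i:i + 2], 16))
            i += 2
    return items


def match_at(data, pos, items):
    """Return True if the parsed pattern matches data starting at pos."""
    for item in items:
        if isinstance(item, tuple):
            pos += item[1]            # skip non-significant bytes
        else:
            if pos >= len(data) or data[pos] != item:
                return False
            pos += 1
    return pos <= len(data)           # skipped bytes must exist too


def find(data, pattern):
    """Return the first offset at which pattern matches, or -1."""
    items = parse_pattern(pattern)
    for start in range(len(data)):
        if match_at(data, start, items):
            return start
    return -1
```

With this sketch, find(bytes.fromhex("3C12C9006A8000"), "3C#500") reports a match at offset 0. Restricting the range of start offsets tried in find() is where a precise offset from the start or end of a resource would save scanning time.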
(With a possible syntax in which ?X matches up to X (hex) random bytes before a further match, we could have

?20B8#4?20B9#4?20BF#4?20310D?203105?2047?2040?20E2

This rather complex pattern would allow up to 32 random bytes to be inserted between the significant bytes in the 1260 decryptor string, and would also skip the variable values in the MOV instructions.)

Naturally, as the form of the expression becomes more complex (ending in full regular expressions, with exponential search time and memory requirements), search times will increase.

In summary, the requirements for filter expressions are:

1. Directory entry filters - file length (including a modular value test, e.g. MOD 51), file alteration date, attribute settings and file extensions.

2. File characteristic filters - possibly including the destination of the initial jump instruction, and definitely including a form of code scanning using wildcarded expressions (either in a specified scan range or at a number of scan ranges based on offsets from the start and end of the file).

Naturally such a virus detector could be extended to scan memory or a range of disk sectors, thus allowing one program to deal with application, boot sector and partition record viruses (in much the same way that Norton Utilities allows a variety of data sources: sector, file, cluster, etc.).

Anyway, just a few thoughts.
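As a rough illustration of the proposed ?X syntax and of a directory entry filter, the sketch below translates a pattern into a Python bytes regular expression. The grammar is an assumption on my part: "?hh" takes two hex digits as the maximum gap, "#d" takes one decimal digit as an exact skip in bytes, and anything else is a two-digit hex literal. The names compile_pattern, DECRYPTOR_1260 and length_is_multiple are hypothetical, and treating "MOD 51" as "length divisible by 51" is likewise an assumption.

```python
import re

# The example pattern from the text, held in a hypothetical constant.
DECRYPTOR_1260 = "?20B8#4?20B9#4?20BF#4?20310D?203105?2047?2040?20E2"


def compile_pattern(pattern):
    """Translate the assumed pattern syntax into a compiled bytes regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "#":
            # "#d": skip exactly d bytes (one decimal digit assumed).
            parts.append(b".{%d}" % int(pattern[i + 1]))
            i += 2
        elif ch == "?":
            # "?hh": allow up to 0xhh arbitrary bytes (two hex digits assumed).
            parts.append(b".{0,%d}?" % int(pattern[i + 1:i + 3], 16))
            i += 3
        else:
            # Literal byte written as two hex digits.
            parts.append(re.escape(bytes([int(pattern[i:i + 2], 16)])))
            i += 2
    # DOTALL so that "." also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


def length_is_multiple(size, modulus):
    """Directory entry filter: True if the file length is a multiple of
    modulus (e.g. 51). Assumes the test is for a zero remainder."""
    return size % modulus == 0
```

A scanner could run the cheap directory test first and call compile_pattern(DECRYPTOR_1260).search(data) only on files that pass, or only over a chosen scan range of data. Bounded gaps like these keep the search far cheaper than unrestricted regular expressions, though a backtracking engine can still slow down on adversarial input.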