[comp.virus] Signature Programs

71435.1777@CompuServe.COM (Bob Bosen) (11/16/89)

As a member of the American National Standards Institute's (ANSI) X9E9
working group and an active participant in standards activities
regarding computer security and authentication, I have been reading
the various comments on "Checksum" programs with a lot of interest
ever since this forum became accessible to me about 2 weeks ago. If
the comments which follow are way off-base, please forgive my newness
to the forum; perhaps these things have been discussed in the less
recent past....

I've been surprised at the lack of content regarding sophisticated
authentication algorithms. In this forum of late, I've been reading
about checksums and CRCs but I haven't heard any mention of ANSI X9.9
or ISO 8731-2, which are both extremely relevant standards. Both of
these authentication algorithms have served the international banking
community well, having been used for years to secure billions of
dollars worth of daily wire-funds transfers without a single verified
case of compromise.

Checksum programs work by attempting to "authenticate" a program or
file by calculating a number, based upon the content of the file. That
number serves as a digital "signature" reflecting the exact status of
the file at the moment when the calculation was made. Unfortunately,
authentication in hostile environments is not a trivial subject, and
has been shown to be susceptible to forgery and compromise.
Furthermore, as Paul Kerchen and Y.  Radai have recently commented,
very serious attention must be paid to exactly where the signatures
(and any component parts critical to their calculation) are stored.
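
To make the comparison step concrete, here is a minimal sketch in
Python of what such a program does: compute a value from the exact
bytes of a file and compare it against a value recorded earlier. The
hash function used here is only a convenient stand-in; which algorithm
actually deserves that role is exactly what is argued below.

import hashlib

def file_signature(path):
    """Compute a value reflecting the exact contents of the file."""
    h = hashlib.sha256()          # stand-in; the thread debates the real choice
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def still_authentic(path, recorded_signature):
    """True only if the file still matches the signature recorded earlier."""
    return file_signature(path) == recorded_signature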

In my opinion, if properly implemented, signature programs have the
potential for being much more useful than "scanners" (or any other
known anti-viral technique) in most instances, since they don't
require any foreknowledge about the viruses which may attack in the
future.

Relying on simplistic algorithms to calculate these signatures suffers
from an obvious disadvantage: Future viruses can compensate for the
way the signature is calculated, or forge signatures that appear to be
valid.  Relying on supposedly "secret, proprietary" algorithms is very
risky: the annals of cryptography are littered with the bones of
algorithms that couldn't withstand the scrutiny of dedicated
adversaries. If the history of algorithmic research can teach us
anything, it is that we shouldn't trust any cryptographic algorithms
unless they've been examined by a very large population of experts.

There is a developing science of "authentication technology" that is
revealing important facts about the kinds of algorithms that can be
relied upon to resist the scrutiny of adversaries. It's amazing how
many people are unaware of these things, and it's DANGEROUS to base
virus detection products on insecure algorithms. As this knowledge
grows and becomes more easily available to the people that write
viruses, commercial vendors of virus detection programs will be forced
to learn about this stuff the hard way.

The American Bankers Association, in cooperation with the American
National Standards Institute (with representation from NSA, NIST,
Federal Reserve, the Vendor community, etc.) and the International
Standards Organization have endorsed standardized ways of calculating
digital signatures. There are powerful ways of using these respected,
standardized algorithms in the reliable detection of viral
contamination. It's complex and expensive to put together a practical
implementation, but once it's done it can provide a very reliable
first line of defense that won't need 49 different revisions per year
to keep up with identified threats.

By the way, for those of you that are wondering if performance will
suffer, the answer is that it need NOT suffer. Certainly,
unsophisticated implementations might turn out to be abysmally slow,
but it is quite possible and practical, with careful design and
implementation, to adapt combinations of these standards to the IBM PC
world, for example, in a way that users hardly notice. Practical
defenses based on ANSI X9.9, for example, can now authenticate a 100K
file in 3.2 seconds on an IBM "AT"-class machine running at 10 MHz
without any extra hardware or fancy disk drives. This is best done by
applying a judicious combination of DES encryption with CRC techniques
on a random sampling of file contents, rippling the cryptographic
residue through the entire calculation with a technique that crypto
people call "cipher-block chaining". Furthermore, files don't need to
be checked every single time they are used, so these modest delays
need not occur more than a few times per month per file.
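
The structure Bob describes can be sketched roughly as follows: sample
blocks of the file at key-dependent offsets and ripple a residue
through them CBC-style (residue = encrypt(block XOR residue)). In this
sketch the mix() routine is merely a placeholder standing in for DES,
and the sample count and block size are illustrative, not taken from
any actual product.

import os
import random

BLOCK = 8                             # DES works on 8-byte blocks

def mix(block8, key):
    # Placeholder for DES encryption of one block under a secret key.
    # Not cryptographically meaningful; it only shows where DES would sit.
    x = int.from_bytes(block8, "big") ^ key
    x = (x * 6364136223846793005 + 1442695040888963407) % (1 << 64)
    return x.to_bytes(8, "big")

def sampled_mac(path, key, samples=64):
    """Chain a residue through `samples` key-chosen blocks of the file."""
    size = os.path.getsize(path)
    rng = random.Random(key)          # sample offsets depend on the secret key
    offsets = sorted(rng.randrange(0, max(size - BLOCK, 1))
                     for _ in range(samples))
    residue = bytes(BLOCK)
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            block = f.read(BLOCK).ljust(BLOCK, b"\0")
            chained = bytes(a ^ b for a, b in zip(block, residue))
            residue = mix(chained, key)   # the cipher-block-chaining step
    return residue.hex()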

While I'm rambling on, I can't resist the urge to comment on a related
subject. Paul Kerchen writes:

>   where does one store these checksums and their keys? if they
>   are stored on disk, they are vulnerable to attack....

and Y. Radai comments on "static" versus "dynamic" invocation of
signature calculation, leading to discussion of the various advantages
and disadvantages of storing keys and signatures (and maybe even
protection logic) on an active hard disk versus off-line storage on a
diskette.

In my experience, all of these viewpoints have advantages and
disadvantages, and a sophisticated defense must allow users to pick
and choose from all of these techniques according to their own needs.
A hierarchy of interlocking defenses must be put together, with "daily"
or "dynamic" (continuous but random) checks acting as the first line
of defense based on an active hard disk, and with periodic (weekly or
monthly) off-line checks based on a "sterile kernel" stored on a
bootable diskette that's kept locked up when not in use. In essence,
the monthly checkup from the sterile kernel checks up on the defenses
that've been exposed to viruses in the "dirty" world....

So how 'bout it? Anybody against using respected industry standard
authentication algorithms? Anybody got a better idea?

(By the way, my comments should not be construed to represent any
official viewpoints of the American National Standards Institute. I'm
just a member of the working group....)

Bob Bosen
Vice President
Enigma Logic, Inc.
2151 Salvio Street #301
Concord, CA   94565
Tel: (415) 827-5707
Internet: 71435.1777@COMPUSERVE.COM

XRAYSROK@SBCCVM.BITNET (Steve Woronick) (11/17/89)

   Bob Bosen <71435.1777@CompuServe.COM> brings up some interesting
points, asking why programmers writing authentication programs are
utilizing CRC and checksum algorithms rather than more sophisticated
algorithms like ANSI X9.9, ISO 8731-2, or DES.  I think it depends on
what you are trying to do.  If your plan is to encrypt your program and
rely on difficulties in decryption for protection against infection, then
it probably makes sense to use something very sophisticated, because you
want to make certain that no one but yourself can do the decryption.
If you are leaving the encrypted form on your disk (where it might be
compared with the unencrypted form which is sure to appear either on
your disk or in memory at some future date if you use it), you don't
want to be using something so simple that it might give your algorithm
away.

   On the other hand, if you are not encrypting your program but are
simply trying to generate a number (or maybe several numbers) for
authentication purposes, I don't see that it is necessary to use
anything more sophisticated than a polynomial.  If the virus doesn't
know your polynomial, then its chances of guessing a sequence of
characters with which to "pad" your program file in order to generate
the same CRC value as the original unaltered program are quite
small.  Of course, everyone ought to be using a slightly different
algorithm (i.e. different polynomials) and ought to be hiding the
authentication algorithm.  Correct me if I'm wrong:  If the algorithm is
sophisticated enough that it is very hard for anyone to guess CRC values,
then there probably is no need to hide the values it calculates for each
of your program files; in principle, one might be able to deduce the
algorithm by comparing program files with the CRC values generated by the
algorithm, but this will work only if there is enough information
available for analysis (which will not be the case for sufficiently
high order polynomials).  The information in a CRC is small compared to
the information in an encrypted file, so CRC programs need not be
terribly sophisticated to foil discovery.
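
A rough sketch of that idea follows, assuming a straightforward
bitwise CRC whose generator polynomial is picked by the individual
user and kept secret. The polynomial value and the filename below are
examples only, not recommendations.

def crc(data, poly, width=32):
    """MSB-first CRC of `data` under the generator `poly` (top bit implicit)."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = 0
    for byte in data:
        reg ^= byte << (width - 8)
        for _ in range(8):
            reg = ((reg << 1) ^ poly) & mask if reg & top else (reg << 1) & mask
    return reg

MY_SECRET_POLY = 0x04C11DB7           # example value; each user picks their own
with open("PROGRAM.EXE", "rb") as f:  # hypothetical file to authenticate
    print(hex(crc(f.read(), MY_SECRET_POLY)))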

   It has been pointed out that doing a cold boot from a clean floppy
assures you that your system is running clean (i.e. there are no viruses
in memory --- there may be some on your hard disk, but these are dormant
until you run an infected program).  If you then run your
authentication program from the clean floppy disk on your clean system
to check your hard disk (or other), you can rest fully assured that no
virus etc. has had the opportunity to intercept your checking program
and fool you into thinking that an infected program is uninfected (unless
you were dumb and previously exposed the clean disk, though write-
protected, to the inquiring eyes of a virus).  And since there are no
viruses in memory, none can steal your checking algorithm or any of the
CRC values (which you probably are keeping on the clean disk; for that
matter you can keep your own personal polynomial coefficients on the
disk also).  You probably will wisely want to keep your clean disk
write-locked to prevent accidents, but infection is not the only threat
(so write protection does not fully protect one from accidents).  If one
runs the authentication program (or even accesses the disk it's on),
without first doing the cold clean boot, then one risks having the
authentication algorithm stolen by a virus.  And as has been stated
before, one cannot be certain of the authentication results if the
cold boot from the clean disk was not done.  Finally, you obviously have
to write to the clean disk once in a while to update the CRC-values list
for new programs/whatever, but this is no problem because you're not
going to access it without first doing the cold clean boot.  One of course
also assumes that your clean disk was really clean to start with.

   Any comments?  Here's a question:  What's a good reference for
finding out about ANSI X9.9 and ISO 8731-2?  I can give you one for DES
(Data Encryption Standard):  Numerical Recipes, The Art of Scientific
Computing, by W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T.
Vetterling, published by Cambridge University Press, (c)1986,
pp. 214-220.  Two and a half pages of highly inefficient FORTRAN code
are given which implement the DES algorithm (except that the standard
itself explicitly states that any implementation in software is not
secure and is therefore not DES).
- -----------------------------------------------------------------------
Steven C. Woronick     | Disclaimer:  These are my own opinions.
Physics Dept.          |     Always check it out for yourself...
SUNY at Stony Brook    |
Stony Brook, NY  11794 |
Acknowledge-To: <XRAYSROK@SBCCVM>

RADAI1@HBUNOS.BITNET (Y. Radai) (12/07/89)

  Bob Bosen has posted a couple of articles on "signature" programs
(I prefer to call them "checksum" programs).  I agree with some of
what he has written, but disagree with other portions.  In V2 #249 he
asks Steve Woronick:
>                                              Are you saying you could
>write or describe a virus that could infect a program but avoid
>detection by an off-line ANSI X9.9-based message authentication code?

  I don't know what Steve's answer is, but mine is definitely YES, and
I say that even though I know very little about the ANSI X9.9 algo-
rithm.  Bob and many others, particularly those with backgrounds in
cryptology, tend to emphasize the *algorithm*: X9.9 or DES or RSA
is considered by the experts to be more secure than CRC, and that's
all there is to it.  What they miss is the fact that what has to
ensure security on a computer is not simply an algorithm, but rather a
*program* which implements it in a given *operating system*.  And even
a program based on the most sophisticated checksum algorithm in the
world is circumventable if it is not written *very carefully*.
  Take, for example, the PC checksum programs in the directories
<MSDOS.TROJAN-PRO> and <MSDOS.FILUTL> of the Simtel20 archives.  They
all use a CRC (or, in a few cases, a more primitive) algorithm.  Suppose we
choose one of them and replace the CRC algorithm by the ANSI X9.9
algorithm.  Will that ensure security?  Far from it!  For one thing,
most of these programs have no provision for checksumming the boot
sector.  That means that despite the use of a sophisticated algorithm,
these programs would be totally ineffective against boot-sector virus-
es, and that includes a sizable percentage of existing viruses.
  Boot-sector checksumming is available in a few of these programs,
e.g. it was finally added to the FluShot+ program a few months ago.
But to the best of my knowledge this program still does not have
partition-record checksumming.  And that goes for almost all the other
programs in those directories also (Sentry is a welcome exception).
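
For readers unfamiliar with what boot-sector and partition-record
checksumming involves, here is a rough sketch: read those areas as raw
512-byte sectors and feed them into the same checksum as the files.
The Unix-style device paths are illustrative only; on a real PC the
reads would go through BIOS INT 13h instead.

SECTOR = 512

def read_sector(device_path, sector_number):
    """Return one raw 512-byte sector (requires appropriate privileges)."""
    with open(device_path, "rb") as dev:
        dev.seek(sector_number * SECTOR)
        return dev.read(SECTOR)

partition_record = read_sector("/dev/sda", 0)    # master boot record / partition table
boot_sector      = read_sector("/dev/sda1", 0)   # boot sector of the DOS partition
# The checksums of these two sectors would then be recorded and verified
# alongside the per-file checksums.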
  But is checksumming the BS and PR all we need to worry about?  Defi-
nitely not.  If we perform the checksumming when memory is infected by
a Brain-type virus, even X9.9 won't detect any modification.
  So now all we have to do is ensure that memory is uninfected when we
perform the checksumming (by booting from a clean diskette, etc.).
Right?  Wrong!  There are at least five other loopholes in PC-DOS/
MS-DOS which a virus writer could exploit if the program is not care-
fully written, all of which are independent of the checksum algorithm
and do not depend on memory being infected.  (These have apparently
never been used in any actual virus so far.)  Exploitation of such
loopholes is much more practical (from the point of view of the virus
writer) than the checksum-forging methods alluded to by several people
in this forum, since they are independent of the checksum program and
do not require any calculations (of checksums, polynomials, keys,
etc.).  True, all of these loopholes can be blocked if the author of
the checksum program thinks of them.  The trouble is not only that
most authors do not, but also that there may be other loopholes which
none of us has thought of yet.
  The conclusion is that even a program based on the most sophistica-
ted checksum algorithm in the world cannot be depended on to detect
all infections.  Whether a given algorithm is secure depends heavily
on how it's implemented as a *program* in a particular *system*.
  If it's relevant, Bob, I would suggest that you raise this issue
with the rest of the ANSI working group.  There's a small problem,
however:  I have not publicly specified what these additional, more
subtle loopholes are, since I feel it would be quite irresponsible of
me to do so.  But somewhere around No. 89 on my list of 927 things to
do is writing virus simulators to implement all, or at least most, of
these loopholes.  If Bob or anyone else is willing to send me a PC
program which implements X9.9 or any other signature algorithm which
he thinks is secure, that would raise the priority of my writing at
least one of these simulators, which I could then throw at the program
in order to test whether it really is secure.

  Bob also asks:
>                                                   Who can say whether
>the more sophisticated viruses of the future will attempt to analyze
>CRC signatures or target specific products that rely on CRC methods?

  Since he specifically mentions CRC methods, he is obviously not re-
ferring to the types of loopholes to which I alluded above.  In V2
#238 I gave arguments against the claim that CRC programs are circum-
ventable in practice by checksum-forging methods, provided certain ob-
vious precautions are taken.  Bob has given no reply to these argu-
ments and I don't see how emphasis on *future* viruses affects them
(except possibly as concerns the time required for the virus to do its
work).  While I obviously can't prove it, my personal feeling is that
*in practice* a CRC algorithm based on a randomly or personally chosen
generator is, and will remain, just as secure as any more sophistica-
ted algorithm (if the CRC base and program are kept offline) and pro-
bably a lot faster.  In any case, the most important thing is the pro-
gram, not the algorithm.

                                     Y. Radai
                                     Hebrew Univ. of Jerusalem, Israel
                                     RADAI1@HBUNOS.BITNET

71435.1777@CompuServe.COM (Bob Bosen) (12/20/89)

In his mailing of Dec 07 '89, Y. Radai seems to be taking the position
that since I am in favor of sophisticated authentication algorithms, I
must be against sophisticated program implementations. Nothing could
be further from the truth. A really reliable virus detection program
must have BOTH a trustworthy authentication algorithm and a
sophisticated implementation. I stressed the importance of
sophisticated authentication algorithms only because as a newcomer to
VIRUS-L, I was seeing a lot more discussion of implementation details
and scanner programs than of quality authentication techniques.

Please don't misinterpret me: PROGRAMS THAT PURPORT TO DEFEND AGAINST
VIRUSES MUST BE EXTREMELY CAREFULLY WRITTEN. In my view, they should
use the best and most sophisticated defenses available. Today, that
means authentication algorithms should be based on published standards
that have stood the test of time, such as ANSI X9.9. Obviously if a
clever virus writer is able to orchestrate a situation in which the
virus is never examined, then even a sophisticated authentication
algorithm is of no use.  What is needed is a well-written and
convenient program that applies a sophisticated authentication
algorithm across all program code without exception. Clearly this is
better than a well-written and convenient program that applies some
programmer's guess at an authentication algorithm across all program
code without exception!

The address where copies of ANSI X9.9 can be obtained didn't make it
into my last posting. Sorry about that. Copies of ANSI X9 standards
can be obtained through:

Secretariat: American Bankers Association
             Standards Department
             1120 Connecticut Avenue, N.W.
             Washington, D.C. 20036

I think the price is $15.00. I bet if you send a check and a mailing
label with your return address on it, you'll get quick response.

- -Bob Bosen-
Vice President
Enigma Logic Inc.
71435.1777@COMPUSERVE.COM

rsa@lll-crg.llnl.gov (RSA Data Security) (02/01/90)

          The following  paper is  presented for review and discussion.  It
          will be submitted to a number of conferences, and MD4 will be
          proposed to a number of standards organizations.  We encourage
          people to study and evaluate MD4.
          _________________________________________________________________

                          The MD4 Message Digest Algorithm
                          --------------------------------

                                 by Ronald L. Rivest
             MIT Laboratory for Computer Science, Cambridge, Mass. 02139
                                         and
               RSA Data Security, Inc., Redwood City, California 94065


                  (C) Copyright 1989, 1990 RSA Data Security, Inc.

                                 (Version 1/29/90)



          Abstract:
          ---------

          This note  describes the  MD4  message  digest  algorithm.    The
          algorithm takes as input a message of arbitrary length and
          produces  as   output  a  128-bit  ``fingerprint''  or  ``message
          digest''  of   the  input.   It  is   conjectured  that   it   is
          computationally infeasible  to produce  two messages  having  the
          same message  digest, or  to produce  any message  having a given
          prespecified target  message digest.   The  MD4 algorithm is thus
          ideal for digital signature applications, where a large file must
          be ``compressed'' in a secure manner before being signed with the
          RSA public-key cryptosystem.

          The MD4  algorithm  is  designed  to  be  quite  fast  on  32-bit
          machines.    On  a  SUN  Sparc  station,  it  runs  at  1,100,000
          bytes/second.     On  a  DEC  MicroVax  II,  it  runs  at  70,000
          bytes/second.   In addition,  the MD4  algorithm does not require
          any large  substitution tables;  the algorithm can be coded quite
          compactly.


[Ed. Due to the length of this paper, I've placed it on the
VIRUS-L/comp.virus document archive at cert.sei.cmu.edu, where it is
available for anonymous FTP.  The filename is:
pub/virus-l/docs/md4.rsa.paper.]

          (C) Copyright 1989, 1990 RSA Data Security, Inc.
          All rights reserved.

RADAI1@HBUNOS.BITNET (Y. Radai) (02/13/90)

  There have been a few responses to my postings on signature (or
checksum) programs and algorithms in V2 #256 and V3 #4, resp., mainly
by Bob Bosen.  Let me begin by briefly summarizing my earlier postings:

  In the first posting I pointed out that what has to ensure security
on a computer is not simply an algorithm, but rather a *program* which
implements it in a given operating system.  And even a program based
on the most sophisticated checksum algorithm in the world is circum-
ventable if it is not written very carefully, since there are liable
to be (and in PC-DOS/MS-DOS definitely *are*) loopholes in the system
which are independent of the checksum algorithm and which a virus
writer could exploit.
  Bob Bosen responded to this point only by saying that in the compa-
rison of algorithms, both implementing programs must be well-written
and convenient and "apply the algorithm across all program code
without exception".  Well, that last phrase may suggest to some people
that it's necessary to checksum the boot sector and partition record
also, but otherwise, this requirement is insufficient, at least as I
understand Bob's words.  And even if we interpret it in such a broad
manner that it becomes theoretically correct, this rule is rather
useless; it gives the checksum-program writer no clue whatsoever as
to how to plug most of the loopholes.

  I considered, and still consider, this emphasis on the implementing
program, rather than on the algorithm, to be fully justified.  One
way of putting it is that given a choice between a sophisticated
algorithm with poor implementation and a mediocre algorithm with good
implementation, I would prefer the latter.
  Of course, it's not really an either-or matter; there's nothing to
prevent us from striving for quality in *both* aspects.  And as long
as one is aware that a naive implementation of an algorithm is very
dangerous, and is aware of the great difficulty of conceiving of all
of the loopholes, I suppose it makes sense to ask which of several
algorithms is best, given the *assumption* that all are implemented
equally well.  So in my later posting, I suspended discussion of im-
plementation and restricted myself to algorithms.  I mentioned six of
them and expressed the opinion that for anti-viral purposes (which I
characterized by three properties but there are actually more), any
algorithm which satisfied a certain pair of conditions should be suf-
ficiently secure.  I considered the best algorithm to be the fastest
of those satisfying these two conditions, my guess being that this
would be CRC.  However, I specifically mentioned three points on which
I might be wrong.  I concluded by challenging people to supply evi-
dence and argumentation relative to the viral environment.

  Bob Bosen took up my challenge as far as argumentation is involved:
>             Specifically, in his opinion (1), Mr. Radai says that a
>virus must perform all its work ... "on the computer which it's attacking
>and in a very short time". That is not necessarily true. In networked
>environments with shared file systems, (and especially if remote
>execution is available), viruses could execute on different computers and
>take as much time as they needed. Also, as pointed out by Bill Murray,
>viral infections during the process of software distribution may be done
>off-line at the convenience of the attackers.

  But as Bill correctly pointed out, we have in mind two different ap-
plications of authentication software: (I) for recognizing contamina-
tion of the program in the target execution environment; (II) for en-
suring that programs are received as they are shipped.  (II) is un-
doubtedly important, but my articles were concerned only with (I)
(sorry I didn't state this explicitly), hence Bob's reference to in-
fection during software distribution is not relevant to my remarks.
  As for networked environments, he's right, there are *some* such
environments in which property (1) would not hold.  However, the
average user does *not* operate in such an environment.  Why should
*he* choose a slow program just because it adds security in *such* an
environment?

>I must also disagree with Mr. Radai's opinion (2), wherein he posits "By
>its very nature, a virus is designed to attack a large percentage of the
>users of a particular system." Why? What's to prevent a virus writer from
>launching a "surgical strike" against a small population of familiar
>computers that are known to control assets or information of great value?

I must say I find this response rather surprising, considering that
the answer to your question is contained in the continuation of the
very same paragraph from which you're quoting me.  In case it wasn't
clear, I'll rephrase that answer:  There's nothing at all to prevent
such a strike, but if such a strike is made, there's no reason why it
has to be a *viral* strike.  The great advantage of writing a virus,
as compared to an ordinary immediate-acting Trojan horse (which I'll
abbreviate here to simply "a Trojan") is that by delaying its damag-
ing effects and replicating itself, it can spread rapidly to a large
population of users and thus ultimately cause damage to many more
users.  But it has the disadvantage that the delay of damage gives
checksum programs a chance to detect the infection.  Now if you're
talking about performing damage to only a *small* population, then
there's not much advantage to writing a virus rather than a Trojan.
A Trojan is easier to write and can't be detected by a checksum pro-
gram.  So to your question of what's to prevent a viral strike against
a small population, I counterpose the question What's to prevent a
*Trojan* attack against a small population?  What could your sophisti-
cated algorithm do against that??  The answer is Nothing.  Even ISO
8731-2 raised to the X9.9-th power would be helpless against an imme-
diate-acting Trojan.

>As to Mr. Radai's opinion (3), he says that "a virus writer is not in a
>position to know what checksum algorithms may be in use on the computers
>on which his virus is unleashed." That's true TODAY. ....
>                                              .... But if our society is
>ever going to overcome its current vulnerability, we'll need reliable,
>low-cost defense mechanisms that are worthy of widespread use. This
>implies a necessity for economies of scale. Therefore, this opinion (3)
>will not necessarily be true for very long.

  I appreciate this looking to the future (which is why I mentioned
loopholes which have not yet been exploited in any existing virus).
However, I'm less enthusiastic about this particular point.  I presume
that when Bob talks of the need for "reliable low-cost mechanisms" and
"economies of scale" he is referring to his previously expressed opin-
ion that someday there may be only a few high-quality programs used
by millions of people.  Frankly, I find the "few" part of this rather
unlikely.  If anything, the trend seems to be in the opposite direc-
tion.  (For example, there's a new algorithm MD4 which has been de-
scribed by Ronald Rivest.)  And even if the situation envisioned by
Bob should someday arise, I think it would still be quite difficult to
exploit.

  In any case, properties (2) and (3) were less important to my case
than (1).  And there's an important fourth property of the environ-
ment which I neglected to mention.  Almost all types of attacks pro-
posed by cryptanalysts against signature algorithms involve *knowledge
of the signature* of one or more files, which is natural in Applica-
tion (II).  It therefore requires a very secure algorithm.  But in the
environment I am considering (Application (I)), such knowledge and
hence such attacks are impossible even with an unsophisticated algo-
rithm such as CRC, provided different users use different keys, and
the checksum base etc. are kept offline while a virus may be in memory.

>                                                          From what I've
>read in this forum of late, it does appear that Ross Greenberg and Y.
>Radai are at one end of this spectrum and that Bill Murray, Ralph Merkle,
>Jim Bidzos, Fred Cohen, and the others mentioned in Mr. Radai's Jan. 4
>posting are more or less at the other end with me. (If I've
>misrepresented your views here, gentlemen, I hope you'll forgive and
>correct me for it. I'm reading between the lines.)

First, don't forget that the "others" mentioned in my posting include
Prof. Michael Rabin, and that the algorithm which he advocated was the
same, relatively unsophisticated, algorithm which I consider (tenta-
tively) to be the best.  (I take this opportunity to point out that
the algorithm of Rabin's to which I am referring is from a little-
known paper of his, and is not to be confused with a better-known
algorithm of his which involves taking the square root of the message
[= the file] modulo the product of two secret primes.)
  As concerns your lumping me together with Ross Greenberg at the low
end of the sophistication spectrum, that picture is not far from cor-
rect only if you limit yourself to the *algorithm* involved.  As re-
gards security of the *implementing program*, I'm miles away from Ross.


  Although I have said that I tentatively consider CRC to be the best
algorithm for the purposes which I am considering, I wish to empha-
size that I am in no way committed to it.  If someone can produce
evidence that some other algorithm is both more secure in the environ-
ment with which I am concerned and is faster or almost as fast as CRC,
I'm willing to alter my opinion.  However, what it takes to convince
me (and I think a lot of other readers) is not a proclamation that
Committee X has adopted Algorithm Y as its standard, but *an actual
comparison of programs which implement the various algorithms*.
  In comparing speed, we will, of course, take into account what Bob
has pointed out, namely that a program can be written to implement a
sophisticated algorithm on a small part of the file and a fast algo-
rithm on the rest.  Obviously, such "hybrids" could also be among the
contenders.
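  One possible shape for such a hybrid is sketched below, with SHA-256
and CRC-32 standing in for whichever "sophisticated" and "fast"
algorithms are actually chosen, and the 4K head size picked
arbitrarily for illustration.

import hashlib, zlib

def hybrid_signature(path, head_bytes=4096):
    """Sophisticated algorithm on a small part of the file, fast one on the rest."""
    with open(path, "rb") as f:
        head = f.read(head_bytes)                    # strong check on the first few KB
        rest_crc = 0
        for chunk in iter(lambda: f.read(65536), b""):
            rest_crc = zlib.crc32(chunk, rest_crc)   # fast running CRC on the remainder
    return hashlib.sha256(head).hexdigest(), rest_crc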
  Comparing security is much trickier since the results will depend
strongly on what test viruses we can think of.  So no such test can
ever be conclusive.  But it should be more convincing than mere proc-
lamations and appeals to the authority of some committee or other.
(By the way, I made a challenge to compare the security of programs by
such tests over a month ago; why no response?)
  It is only when we have the results of such comparisons that each
user will be in a position to decide which program is best suited to
his needs.  My guess is that for Application (I) there won't be many
users who will choose a program which is based on a sophisticated
algorithm.

                                     Y. Radai
                                     Hebrew Univ. of Jerusalem, Israel
                                     RADAI1@HBUNOS.BITNET

71435.1777@CompuServe.COM (Bob Bosen) (04/10/90)

Several weeks ago, Ross Greenberg challenged me to obtain and post
descriptions of tests and user experiences involving use of
sophisticated authentication algorithms in the "real world" against
real viruses. Because I represent a commercial software vendor I
was hesitant to publish my own test results out of fear I would sound
biased. Most of my clients are rather secretive, and it took a while
before I was able to arrange for the following to be written and
cleared for posting. The following is a message forwarded from
Padgett Peterson, a well-known (in some circles) virus researcher,
employed by a well-known Defense Contractor. He speaks only for
himself.

Padgett conducted a detailed evaluation of a great many viral defense
products, subjecting them to a collection of viruses and stressing
them in other ways. I am posting his words for him because at the
moment, his internet access is rather awkward. He comments on valuable
ways to use authentication algorithms at all ends of the spectrum, and
I find his views similar to my own, inasmuch as my product offers
authentication algorithms at all ends of the spectrum and allows
users to "fine-tune" the sophistication of the algorithm to suit all
the extremes and norms Padgett discusses. But there are things in his
views that'll make a lot of folks happy. The following are his words:


FOR POSTING

                                                 A. Padgett Peterson


Recently, following a hiatus from the VIRUS-L forum, I have had the
opportunity to examine the continuing authentication (thank you
WordStar) saga.  All of the people involved appear to be knowledgeable
and concerned participants, yet they seem to be arguing the same side
of two different questions:

1)  Authentication of known software in a controlled, unique environment
    (Radai and Greenberg).

2)  Authentication of unknown, publicly transmitted software (Bosen and
    Murray).

The virus issue, while a valid concern, is just a complicating factor,
since, if the software were trusted, by definition it could not be
infected.  The focus of the issue is what level of authentication is
necessary for trust.  All of the participants agree that some is
necessary - the question is how much?

My personal feeling is that an authentication algorithm may be very
simple (CRC or less) provided that it is unknown (or unpredictable).
Since my 4.77 MHz/ST-412 museum piece is capable of a simple byte
count/XOR/ROR disk file check at 50k bytes/second (and could be faster
if done in RAM by a TSR between LOAD and EXECUTE), performance
concerns are unnecessary (quantum economics).  This method is suitable
for any physically controlled system.

Unfortunately, Mr. Greenberg's algorithm fails this test because it is
publicly known.  A mechanism designed to subvert his programs is
feasible (worm, trojan, virus, bomb, etc.).  However, given a small
number of different algorithms (ADD/SUB/XOR followed by ROL/ROR/NOP
give nine easily) generated by a machine-unique seed (time hack at
initial algorithm load would work), a non-resident intruder would have
a very hard time subverting a system without generating a few errors
first.

This is particularly effective if even the creator of such a program
cannot predict which algorithm/seed will be used on a particular
machine.
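
A rough sketch of that scheme in Python follows; the byte-level
operations and the way the seed selects one of the nine variants are
illustrative, and Padgett's actual implementation is not shown here.

def rol8(x): return ((x << 1) | (x >> 7)) & 0xFF
def ror8(x): return ((x >> 1) | (x << 7)) & 0xFF

COMBINE = {"ADD": lambda a, b: (a + b) & 0xFF,
           "SUB": lambda a, b: (a - b) & 0xFF,
           "XOR": lambda a, b: a ^ b}
ROTATE  = {"ROL": rol8, "ROR": ror8, "NOP": lambda a: a}

def pick_variant(machine_seed):
    """Select one of the nine combine/rotate pairs from a machine-unique seed."""
    variants = [(c, r) for c in COMBINE for r in ROTATE]   # 3 x 3 = 9 variants
    return variants[machine_seed % len(variants)]

def simple_signature(data, machine_seed):
    combine_name, rotate_name = pick_variant(machine_seed)
    combine, rotate = COMBINE[combine_name], ROTATE[rotate_name]
    acc = len(data) & 0xFF              # fold in the byte count, as described
    for byte in data:
        acc = rotate(combine(acc, byte))
    return acc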

A procedure such as this is even workable in a networked/server
environment: the file itself is stored en clair.  Each authorized user
has a unique signature file.  No two signatures match yet each will
authenticate the same file in the proper machine.  A nightmare for
intruders.

Alternatively, a publicly transmitted file for which the algorithm/key
is also public requires a much more rigorous algorithm to avoid
spoofing or infection by a determined intruder.  In this case ANSI or
DES is appropriate.

Taken together, the indication would be that for inter-machine
transmission, the more rigorous public-key methods would be
appropriate, while a much simpler one would be suitable for
intra-machine retrieval.  This would postulate a software package
that:

a:  Uses a simple (fast) but unique algorithm for known files whose
signatures are stored on the platform.

b:  Requires a much more rigorous authentication process for unknown
files (possibly also requiring authorization for load).

c:  Once (b) is satisfied allows a file to migrate to (a).

Considering the viral threat, if a virus is accompanied by a valid
signature, ANY authentication scheme will pass it; however, as soon as
a resident file is infected, the unique resident signature will become
invalid.

The point was raised concerning Boot and Partition Table Infectors
(Hidden Sector, FAT, Root, RAM-Resident, and Bad Sector Infectors are
also possible).  This is a different question from that of
authenticating a file.  At present I know of only one package that
provides complete coverage: Enigma-Logic's Virus-Safe which I use.

However, over 90% of all PC virii could have been caught early by a
CLI that occasionally compares the Top-Of-Memory, the end of DOS/TSR
memory, and the first byte of the Boot Sector against known values.
MS-DOS doesn't.

(END OF PADGETT PETERSON POSTING)

Thank You,


Bob Bosen
Enigma Logic Inc.

greenber@uunet.UU.NET (Ross M. Greenberg) (04/12/90)

>My personal feeling is that an authentication algorithm may be very
>simple (CRC or less) provided that it is unknown (or unpredictable).
>Since my 4.77 MHz/ST-412 museum piece is capable of a simple byte
>count/XOR/ROR disk file check at 50k bytes/second (and could be faster
>if done in RAM by a TSR between LOAD and EXECUTE), performance
>concerns are unnecessary (quantum economics).  This method is suitable
>for any physically controlled system.
>
>Unfortunately, Mr. Greenberg's algorithm fails this test because it is
>publicly known.  A mechanism designed to subvert his programs is
>feasible (worm, trojan, virus, bomb, etc.).  However, given a small
>number of different algorithms (ADD/SUB/XOR followed by ROL/ROR/NOP
>give nine easily) generated by a machine-unique seed (time hack at
>initial algorithm load would work), a non-resident intruder would have
>a very hard time subverting a system without generating a few errors
>first.

Sorry: although it would be easy to ascertain via disassembly the
particular method I use in my code for generating a signature, I would
hope that the bad guys are as easily fooled by someone using the word
"Checksum" or "CRC" as you were.  <Gotcha! Heheheh> My signature code
stuff is proprietary and has never been released to anyone.  I use the
word "checksum" or "CRC" loosely because it's easier to say than "an
assortment of instructions that merely generate a unique number based
upon a stream of input with no real formal basis for figuring out just
how good the particular algorithm is, since it seems good enough so
far."

>This is particularly effective if even the creator of such a program
>cannot predict which algorithm/seed will be used on a particular
>machine.

I may include such a random seed in the future, but it seems pretty
easy to determine that seed, so why bother?  Remember that DOS isn't
really an operating system anyway, and it would be pretty easy for
someone to subvert *any* signature-generating code
easily. Better still would be to use two differing algorithms that
combine into one unique signature.
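
One way to read that last suggestion: run two independent checks over
the same bytes and fold the results into a single signature, so that
forging either check alone is not enough. A small sketch follows; both
component checks here are stand-ins, not Ross's actual code.

import zlib

def combined_signature(data, seed=0x5A17):
    crc_part = zlib.crc32(data)          # first algorithm: a standard CRC-32
    acc = seed                           # second: a seeded add-and-rotate sum
    for byte in data:
        acc = (acc + byte) & 0xFFFF
        acc = ((acc << 1) | (acc >> 15)) & 0xFFFF
    return (crc_part << 16) | acc        # fold the two results into one number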

>However, over 90% of all PC virii could have been caught early by a
>CLI that occasionally compares the Top-Of-Memory, the end of DOS/TSR
>memory, and the first byte of the Boot Sector against known values.
>MS-DOS doesn't.

Fascinating number, that 90%.  No justification for it from what I can
see.  And your statement on the Boot Sector's first byte being the
important one to check is totally wrong.  If you could send me the
background on that number, I'd appreciate it.  I believe none of the
numbers I see bandied about regarding viruses.  Too easy to slip a
decimal point or two, or to extrapolate from a limited subset.

Mr. Peterson makes some interesting points.  They do not, however,
seem conclusive to me.  I stand by my earlier statements that a simple
algorithm for CRC/signature/checksum checks is "good enough".

Ross M. Greenberg,     Software Concepts Design,    greenber@utoday.UU.NET
             594 Third Avenue, New York, New York, 10016
 Voice:(212)-889-6431 BIX: greenber  MCI: greenber   CIS: 72461,3212
                            BBS:(212)-889-6438