[comp.sys.apple] viruses and checksums

AWCTTYPA@UIAMVS.BITNET ("David A. Lyons") (03/04/89)

>Date:         Fri, 3 Mar 89 11:46:00 -0800
>From:         Ryan Lanctot <rdlanctot@INSTR.OKANAGAN.BC.CA>
>Subject:      viruses
>
>As anyone knows, a checksum is only effective if the person who wrote
>the virus doesn't have the smarts to make the checksum add up after
>the virus has inserted itself.  Some really smart cookie would
>probably have the virus checksum the program itself before insertion,
>then rebalance the checksum......  Or they could corrupt the checksum
>program itself to produce the same result every time, no matter how
>the program looked.
>
> Ryan Lanctot
> <rdlanctot@instr.okanagan.bc.ca>

No!  It isn't anywhere near that simple.  A virus could make the
checksum come out the same if and _only_ if it knew what method of
checksumming was being done on the file!  Simple single-byte
additive checksums would indeed be very easy to bypass:  just make
sure the bytes added to the file add up to 0 (mod 256).
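
To make the single-byte case concrete, here is a minimal sketch in
Python.  The function names and the byte strings are made up for the
illustration; nothing here is taken from a real program or virus.

```python
def additive_checksum(data: bytes) -> int:
    """Single-byte additive checksum: sum of all bytes, modulo 256."""
    return sum(data) % 256


def pad_to_zero_sum(payload: bytes) -> bytes:
    """Append one byte so the payload's bytes add up to 0 (mod 256)."""
    fix = (-sum(payload)) % 256
    return payload + bytes([fix])


# Hypothetical stand-ins for a program image and for the inserted bytes.
original = bytes([0x4C, 0x00, 0x20, 0x60])
payload = pad_to_zero_sum(b"\xA9\x01\x8D\x00\x02")
altered = original + payload

# The added bytes sum to 0 (mod 256), so the checksum does not move.
assert additive_checksum(payload) == 0
assert additive_checksum(altered) == additive_checksum(original)
```

The file does get longer, but this particular checksum cannot tell.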

The number of possible checksumming schemes is immense.  (Start by
considering adding groups of N bytes rather than single bytes, and
rotating the checksum-in-progress by M bits after each N-byte group
was added in.)
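
One member of that family might look like the sketch below (Python).
The 16-bit width, the big-endian grouping, and the names rotl16 and
rotating_checksum are assumptions chosen for the example, not a
description of any particular product's scheme.

```python
def rotl16(value: int, bits: int) -> int:
    """Rotate a 16-bit value left by the given number of bits."""
    bits %= 16
    return ((value << bits) | (value >> (16 - bits))) & 0xFFFF


def rotating_checksum(data: bytes, n: int, m: int) -> int:
    """Add N-byte groups into a 16-bit sum, rotating left M bits after each."""
    total = 0
    for i in range(0, len(data), n):
        group = int.from_bytes(data[i:i + n], "big")
        total = (total + group) & 0xFFFF
        total = rotl16(total, m)
    return total


original = b"\x01\x02\x03\x04"
payload = b"\xFF\x01"          # these two bytes add up to 0 (mod 256)

before = rotating_checksum(original, 2, 1)
after = rotating_checksum(original + payload, 2, 1)

# The padding trick that beats a plain byte sum does not survive here.
assert sum(payload) % 256 == 0
assert before != after
```

Every choice of N and M gives a different function, and code that
balances one of them will generally not balance the others.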

Individuals can certainly bypass checksum protection on particular
pieces of software, but this is a very different thing from having a
virus do it.  Analysis of programs by other programs (for example, a
virus trying to determine what algorithm is being used in a
newly-encountered program) is something that is, in general, not
possible even in theory, and certainly not in practice.

By the way, a simpler and probably just as effective method for
software to detect that it has been altered (possibly infected by a
virus) is to check its own length on disk and compare it against the
correct length.

(ProDOS 8 programs can do an OPEN, GET_EOF, and CLOSE on the
pathname stored at $280 immediately after they start.  Be sure to
deal with errors on the OPEN appropriately so that you don't get
false virus-detector triggers when someone launches your program
from the rare program launcher that does not correctly put your
program's complete or partial pathname at $280.)
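
The shape of that check, sketched in Python rather than as ProDOS MLI
calls: os.path.getsize stands in for OPEN/GET_EOF/CLOSE, and the name
length_matches and the three-way result are assumptions made for the
illustration.

```python
import os
import tempfile
from typing import Optional


def length_matches(path: str, expected_length: int) -> Optional[bool]:
    """True/False if the length could be read, None if it could not."""
    try:
        actual = os.path.getsize(path)
    except OSError:
        # The file could not be found or read.  Treat this as "unknown",
        # not as evidence of tampering, to avoid false alarms.
        return None
    return actual == expected_length


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "program.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * 100)

    ok = length_matches(path, 100)        # length is as expected
    grown = length_matches(path, 90)      # file is longer than expected
    missing = length_matches(os.path.join(folder, "nope"), 100)
```

Only the False result should trip the detector; None means the check
could not be made at all.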

 --David A. Lyons              bitnet: awcttypa@uiamvs
   DAL Systems                 CompuServe:  72177,3233
   P.O. Box 287                GEnie mail:    D.LYONS2
   North Liberty, IA 52317     AppleLinkPE: Dave Lyons

rdlanctot@instr.okanagan.bc.ca (Ryan Lanctot) (03/07/89)

>No! It isn't anywhere near that simple. A virus could make the
>
>
True, I hadn't thought of that.  Although it is also true, to some extent,
that the checksum must be stored somewhere...... Perhaps a virus could
simply change the checksum on a random basis, thereby giving false (?)
reports to the system people.  Then, when it's getting to the point where
it seems there's something wrong with the checksum software, the virus could
insert itself, which would be taken as another false report.  Problem with
this scheme is finding the checksum byte(s)....

Ryan Lanctot
<rdlanctot@instr.okanagan.bc.ca>

AWCTTYPA@UIAMVS.BITNET ("David A. Lyons") (03/08/89)

>Date:         Mon, 6 Mar 89 09:27:00 -0800
>From:         Ryan Lanctot <rdlanctot@INSTR.OKANAGAN.BC.CA>
>Subject:      VIRUSES AND CHECKSUMS
>
>>No! It isn't anywhere near that simple. [...]
>
>True, I hadn't thought of that.  Although it is also true, to some
>extent, that the checksum must be stored somewhere...... Perhaps a
>virus could simply change the checksum on a random basis, thereby
>giving false (?) reports to the system people.  Then, when it's
>getting to the point where it seems there's something wrong with the
>checksum software, the virus could insert itself, which would be
>taken as another false report.  Problem with this scheme is finding
>the checksum byte(s)....
>
>Ryan Lanctot
><rdlanctot@instr.okanagan.bc.ca>

Again, finding the checksum, even if it is stored somewhere
explicitly, is not something that can be automated.

Anyway, I have a great deal of trouble imagining this "the checksum
routine that cried 'wolf'" scenario.  In many cases the algorithm
used would be trivial enough that the implementation's correctness
could be verified easily.  Even if a complicated scheme was used, it
would be a simple matter to determine that the implementation
computed some _function_ of a file's contents (always gives the same
result given a particular stream of bytes for input), so that you
could be sure when your detector complained that your files really
were being fiddled with in some way.

 --David A. Lyons              bitnet: awcttypa@uiamvs
   DAL Systems                 CompuServe:  72177,3233
   P.O. Box 287                GEnie mail:    D.LYONS2
   North Liberty, IA 52317     AppleLinkPE: Dave Lyons

ALBRO@NIEHS.BITNET (03/09/89)

I was under the impression that, in contrast to a checksum, which would give
the same number regardless of the order of the bytes in a set, a CRC would
come out differently if you had the same bytes in a different order.  If
that is the case, introducing new code into a file (even with a separate
checksum of zero) would always be detected by a CRC.  Is this right?

gwyn@smoke.BRL.MIL (Doug Gwyn) (03/10/89)

In article <8903081732.aa22924@SMOKE.BRL.MIL> ALBRO@NIEHS.BITNET writes:
>I was under the impression that, in contrast to a checksum, which would give
>the same number regardless of the order of the bytes in a set, a CRC would
>come out differently if you had the same bytes in a different order.

A true checksum is indeed simply the sum of the word values and thus does
not detect transposition of the words.  However, often the term "checksum"
is applied more generically, to include CRCs and other "summaries" of the
contents of a file used for similar purposes.
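
A small Python sketch of the difference, treating each byte as one
"word" for simplicity and using zlib.crc32 as the example CRC (an
assumption; any reasonable CRC would do for this purpose):

```python
import zlib


def additive_sum(data: bytes) -> int:
    """Plain 16-bit sum of the byte values; order does not matter."""
    return sum(data) & 0xFFFF


original = b"ABCD"
swapped = b"BACD"   # same bytes, first two transposed

same_sum = additive_sum(original) == additive_sum(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)

# The sum cannot see the transposition; the CRC can.
assert same_sum
assert not same_crc
```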

>If that is the case, introducing new code into a file (even with a
>separate checksum of zero) would always be detected by a CRC.
>Is this right?

There is only a small chance of such a change not being detected by a good
CRC, unless the code change took cognizance of the CRC and arranged to
reproduce the same CRC value.  If you know where the CRC is and how it's
computed, it isn't actually very hard to outwit it.
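
As a rough illustration in Python, assuming the CRC is the 16-bit
CCITT one that binascii.crc_hqx computes and that the altered file may
grow by two bytes (both are assumptions for the sketch, and
forge_suffix is a made-up name):

```python
import binascii


def crc16(data: bytes) -> int:
    """16-bit CRC-CCITT with an initial value of 0."""
    return binascii.crc_hqx(data, 0)


def forge_suffix(altered: bytes, target_crc: int) -> bytes:
    """Search for two trailing bytes that bring the CRC back to target_crc."""
    state = binascii.crc_hqx(altered, 0)
    for candidate in range(0x10000):
        suffix = candidate.to_bytes(2, "big")
        # Continue the CRC from the state reached after the altered data.
        if binascii.crc_hqx(suffix, state) == target_crc:
            return suffix
    raise ValueError("no two-byte suffix found")


# Hypothetical stand-ins for a program image and an altered copy of it.
original = b"PROGRAM IMAGE"
altered = original + b" + EXTRA CODE"
forged = altered + forge_suffix(altered, crc16(original))

assert forged != original
assert crc16(forged) == crc16(original)
```

The search needs at most 65536 tries, and only works because the
exact algorithm and starting value are known in advance.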

AWCTTYPA@UIAMVS.BITNET ("David A. Lyons") (03/10/89)

>Date:         Wed, 8 Mar 89 17:29:00 EST
>From:         ALBRO%NIEHS.BITNET@CUNYVM.CUNY.EDU
>Subject:      RE: viruses and checksums
>
>I was under the impression that, in contrast to a checksum, which
>would give the same number regardless of the order of the bytes in a
>set, a CRC would come out differently if you had the same bytes in a
>different order.

Interesting...I don't have a definition of "checksum" handy.  In my
previous messages, when I said "checksum" I meant a very broad class
of functions of ordered groups of bytes, which would include CRCs.

(By the way, "CRC" stands for Cyclic Redundancy Check, or
something much like that.)

>If that is the case, introducing new code into a file (even with a
>separate checksum of zero) would always be detected by a CRC.  Is
>this right?

A CRC is very _likely_ to detect a change, but it's a simple exercise
to show that it won't _always_.

The result of the CRC computation is going to be, say, 2 bytes long.
(It doesn't really matter... choose any reasonable length.)  The
checksum contains far fewer bits than the data being checked, so (by
the pigeonhole principle, I think?), it's obvious that the CRC
function cannot be one-to-one:  there is at least one CRC value that
will result from doing the same CRC calculation on two separate
inputs.
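
The counting argument can be run directly.  This Python sketch assumes
a 2-byte CRC (binascii.crc_hqx, initial value 0) and 3-byte inputs;
find_collision is a made-up name for the illustration.

```python
import binascii


def find_collision():
    """Return two different 3-byte inputs that share one 16-bit CRC."""
    seen = {}
    # 65537 distinct inputs but only 65536 possible CRC values, so at
    # least two of the inputs must land on the same value.
    for i in range(2**16 + 1):
        data = i.to_bytes(3, "big")
        crc = binascii.crc_hqx(data, 0)
        if crc in seen:
            return seen[crc], data
        seen[crc] = data
    raise AssertionError("unreachable: more inputs than CRC values")


a, b = find_collision()
assert a != b
assert binascii.crc_hqx(a, 0) == binascii.crc_hqx(b, 0)
```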

For the CRC to be useful, you want its results to be fairly evenly
distributed (if it usually gave the same value, transmission errors
would be less likely to be detected).  In practice, probably _all_
CRC values can be generated by zillions and zillions of different
inputs, but that still leaves you with a very good chance of
detecting a change _unless_ the person or program changing your data
"knows" the exact CRC calculation being used.

 --David A. Lyons              bitnet: awcttypa@uiamvs
   DAL Systems                 CompuServe:  72177,3233
   P.O. Box 287                GEnie mail:    D.LYONS2
   North Liberty, IA 52317     AppleLinkPE: Dave Lyons