[net.crypt] Decryption of corrupted files...

ramin@rtgvax.UUCP (04/30/86)

[ eat hot deathly cipher... aaarrrg


Here's a nice 'n juicy one...

I have a problem to which all suggestions and flames are welcome...
The program in question is a file encryption routine I wrote years ago
under VMS 3.x in VMS assembler. It incorporates a block AND record chaining
encryption method. The actual encryption engine uses the VAX
cyclic redundancy check (CRC) instruction with large prime coefficients
and a number of other convolutions (reversible, naturally...).
This was done at a time when VMS had absolutely NO encryption facilities
and one was sorely needed for private files. To dissuade people from
copying the ciphertext file and trying to break it by trial and error,
I also encrypted the file id and added a single-line header to the ciphertext
file (i.e. don't even TRY decoding if the file-id is different, while still
allowing for files that get backed up and recopied), along with a few other
pieces of information (a seed value taken from the system clock, a CRC
checksum on the ciphertext, the encryption mode for sizes less than one block,
etc...).
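
In C terms the header amounts to something like this (just a sketch; the
field names, widths, and ordering here are made up for illustration, not
the actual on-disk layout):

    /* hypothetical header layout; names and widths illustrative only */
    struct crypt_header {
        unsigned long  file_id;   /* encrypted copy of the file id          */
        unsigned long  seed;      /* seed value taken from the system clock */
        unsigned long  crc;       /* CRC checksum over the ciphertext       */
        unsigned short mode;      /* encryption mode for files < one block  */
    };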

The CRC scheme, though probably not anything as involved as the heavy-duty
encryption routines out there, was very fast and showed very little
convergence under alternate (but close) keywords. The distribution of
ciphertext ASCII bytes was also quite nice...

This system has worked quite well for some years. Then, a few weeks ago,
I encrypted some LISP, C, and MACRO source programs I had been working
on in my own time. I also decrypted them several times afterward to
make a few changes.

Yesterday I tried decrypting some files to start porting them to un*x,
but sadly the checksum failed. I have a command qualifier
that bypasses the checksum check, and with that the decryption ran to
completion. Looking at the resulting cleartext, I found spurious ASCII
characters, mostly in the high range (i.e. 8th bit set). About 80 percent
of the cleartext was fine, but then a series of garbage characters would
take over and, no more than two lines later, the cleartext would resume.
Very little correlation was observed between the cleartext and the garbage
it became (e.g. #include at the beginning of the file mapped to different
garbage than #include in the middle of the file). I suspect this is because
of the chaining...
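
To make the chaining effect concrete, here is a toy sketch (the "cipher"
is just a one-byte XOR, standing in for the real CRC engine; only the
chaining behavior matters): one corrupted ciphertext byte garbles a
bounded stretch of cleartext, and everything downstream comes back clean.

    #include <stdio.h>
    #include <string.h>

    #define KEY 0x5a
    #define IV  0x13

    /* toy chained cipher: XOR with a one-byte key, chained byte by byte */
    static void chain_crypt(unsigned char *buf, int len, int decrypt)
    {
        unsigned char prev = IV, tmp;
        int i;

        for (i = 0; i < len; i++) {
            if (decrypt) {
                tmp = buf[i];                /* remember the ciphertext byte */
                buf[i] = (buf[i] ^ KEY) ^ prev;
                prev = tmp;                  /* chain on the old ciphertext  */
            } else {
                buf[i] = (buf[i] ^ prev) ^ KEY;
                prev = buf[i];               /* chain on the new ciphertext  */
            }
        }
    }

    int main(void)
    {
        unsigned char text[] = "#include <stdio.h>  /* some cleartext */";
        int len = strlen((char *) text);

        chain_crypt(text, len, 0);      /* encrypt                        */
        text[10] ^= 0x80;               /* one byte corrupted on disk     */
        chain_crypt(text, len, 1);      /* decrypt                        */

        printf("%s\n", (char *) text);  /* bytes 10 and 11 come back with */
        return 0;                       /* the 8th bit set; rest is clean */
    }

With real block (rather than byte) chaining the garbled stretch is a block
or two, which matches the "couple of lines of garbage, then clean text"
symptom.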

Now my question is: what method would anyone suggest for recovering ciphertext
files that have been corrupted accidentally (through disk or
tape errors) or through malicious tampering (:-)? I presume this is the case
for most people: once one encrypts a file, one doesn't leave the cleartext
file sitting around, and usually there's only one copy of the ciphertext file...
(In my case, I took printouts of the final programs home, so I managed to
recover everything but my pride (:-)

The implication here is that, unlike network error-correcting schemes, there
is no possibility of a "retry" transmission. In my case, I could have made
the encryption program write to disk and read it back for verification, but
remember that the corruption happened several days after the file had
last been touched... I suppose I could keep the headers in a separate
"central" header-file facility, but keeping header and ciphertext together
makes the file easy to move around, and what if the header file itself
becomes corrupted (ouch!)

My current inclination is compartmentalization (phew!),
i.e. to separate a file into user-definable "pages"
of some chosen granularity (say, every n bytes), each of which would be
separately encrypted with its own checksum and all, with chaining starting
at a fixed point, ending at a fixed point, and then restarting with
a new value. That way damage would be confined to a page rather than being
carried forth, through the chaining, to the rest of the file.
I was lucky this time... what if someone gets ALL garbage and no printouts?
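
Roughly the structure I have in mind, again with a toy XOR chain standing
in for the real engine (the page size, the checksum, and the routine names
are all made up for this sketch):

    #include <stdio.h>
    #include <string.h>

    #define PAGE 16                    /* user-definable granularity      */
    #define KEY  0x5a

    /* toy 16-bit checksum over one ciphertext page */
    static unsigned page_sum(const unsigned char *p, int n)
    {
        unsigned s = 0;
        while (n-- > 0)
            s += *p++;
        return s & 0xffff;
    }

    /* toy chained cipher; the chain value restarts at every page boundary */
    static void page_crypt(unsigned char *p, int n, int decrypt)
    {
        unsigned char prev = 0x13, tmp;
        int i;

        for (i = 0; i < n; i++) {
            if (decrypt) {
                tmp = p[i];  p[i] = (p[i] ^ KEY) ^ prev;  prev = tmp;
            } else {
                p[i] = (p[i] ^ prev) ^ KEY;  prev = p[i];
            }
        }
    }

    int main(void)
    {
        unsigned char buf[] = "(defun f (x) (car x))  ;; more source...";
        unsigned sums[8];
        int len = strlen((char *) buf), off, n, pg;

        /* encrypt page by page, keeping a checksum of each ciphertext page */
        for (off = 0, pg = 0; off < len; off += PAGE, pg++) {
            n = (len - off < PAGE) ? len - off : PAGE;
            page_crypt(buf + off, n, 0);
            sums[pg] = page_sum(buf + off, n);
        }

        buf[20] ^= 0x80;               /* one ciphertext byte goes bad    */

        /* decrypt; only the page whose checksum fails is suspect */
        for (off = 0, pg = 0; off < len; off += PAGE, pg++) {
            n = (len - off < PAGE) ? len - off : PAGE;
            if (page_sum(buf + off, n) != sums[pg])
                printf("page %d: checksum failed, contents suspect\n", pg);
            page_crypt(buf + off, n, 1);
        }
        printf("%s\n", (char *) buf);  /* damage is confined to page 1    */
        return 0;
    }

The damaged page can then be reported (or pulled from a backup) without
taking the rest of the file down with it.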

This still fails to address a "recovery" scheme for such errors...
Any ideas out there? I think I can no longer enforce the original design
constraint that the ciphertext file be as close to the cleartext in size
as possible... i.e. not allowing control information into the file was
over-optimistic.

I certainly hope this is trivial (and if it is, it's back to the drawing
board...). Any references to sources would also be welcome... I will
repost any replies and/or references and, once I'm done reading the
references, the results of that reading if anyone is interested...

Sorry this is so long, but I thought the problem was general enough to
merit at least SOME communal brainstorming...

Looking forward to suggestions...


ramin






-- 
=--------------------------------------=-------------------------------------=
: alias: ramin firoozye'               :   USps: Systems Control Inc.        :
: uucp: ...!shasta \                   :         1801 Page Mill Road         :
:       ...!lll-lcc \                  :         Palo Alto, CA  94303        :
:       ...!ihnp4    \...!ramin@rtgvax :   ^G:   (415) 494-1165 x-1777       :
=--------------------------------------=-------------------------------------=

gnu@hoptoad.UUCP (05/03/86)

The traditional way to improve data reliability when there is no chance
to recover the data is called "forward error correction".  It adds some
redundant data to the message.  This kind of thing will be used in
Stargate for example, since you can't ask the satellite to send it
again please over a receive-only link.  Disk error correcting codes
are like this too (Fire codes).

But:  Any kind of redundancy you put in your cleartext (or enciphering
algorithm) will cause it to be easier to break.  A simple example.
Let's say you put a parity bit in each character before encrypting.  A
code-breaker who tried a key that produced bad parity characters would
know that that was the wrong key.  Knowing this, and the enciphering
algorithm, lets a codebreaker eliminate a lot of possible keys quickly.
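
To see how cheaply that eliminates keys, here's a toy sketch (a one-byte
additive "cipher" stands in for a real one; the point is only what known
parity in the cleartext gives away to a key search):

    #include <stdio.h>
    #include <string.h>

    static int even_parity(unsigned char c)
    {
        int bits = 0;

        while (c) {
            bits += c & 1;
            c >>= 1;
        }
        return (bits & 1) == 0;
    }

    static unsigned char add_parity(unsigned char c)  /* force even parity */
    {
        c &= 0x7f;
        return even_parity(c) ? c : (unsigned char) (c | 0x80);
    }

    int main(void)
    {
        const char *clear = "attack at dawn";
        unsigned char cipher[64];
        int i, k, ok, n = strlen(clear), key = 0xa7, survivors = 0;

        for (i = 0; i < n; i++)        /* add parity, then "encrypt"      */
            cipher[i] = (add_parity((unsigned char) clear[i]) + key) & 0xff;

        for (k = 0; k < 256; k++) {    /* the codebreaker's key search    */
            ok = 1;
            for (i = 0; i < n; i++)
                if (!even_parity((cipher[i] - k + 256) & 0xff)) {
                    ok = 0;
                    break;
                }
            if (ok)
                survivors++;
        }
        printf("%d of 256 keys survive the parity test\n", survivors);
        return 0;
    }

With a dozen characters of known parity, almost every wrong key is thrown
out immediately, and the codebreaker never had to recognize a single word
of cleartext to do it.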

There's probably no simple way to make your data both easier to recover
(for you) and harder to recover (for a codebreaker).
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa

desj@brahms.BERKELEY.EDU (David desJardins) (05/04/86)

In article <757@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>The traditional way to improve data reliability when there is no chance
>to recover the data is called "forward error correction".  It adds some
>redundant data to the message.
>.....
>But:  Any kind of redundancy you put in your cleartext (or enciphering
>algorithm) will cause it to be easier to break.  A simple example.
>Let's say you put a parity bit in each character before encrypting.  A
>code-breaker who tried a key that produced bad parity characters would
>know that that was the wrong key.
>.....
>There's probably no simple way to make your data both easier to recover
>(for you) and harder to recover (for a codebreaker).

   Well, all you need to do is apply any sort of error-correcting
scheme *after* encryption (i.e. on the ciphertext).  Then, if the file
is corrupted or you have transmission errors, you will hopefully
still be able to recover the ciphertext, which you can then decrypt.
Since the error correction is applied *after* the encryption, it gives
no additional information to the adversary.
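
   A toy sketch of that ordering, with the simplest redundancy imaginable
(every ciphertext byte stored three times, majority vote on the way back)
and a one-byte XOR standing in for the cipher:

    #include <stdio.h>
    #include <string.h>

    #define KEY 0x5a

    int main(void)
    {
        unsigned char clear[] = "int main() { return 0; }";
        unsigned char coded[3 * sizeof clear];
        unsigned char a, b, c;
        int i, j, n = strlen((char *) clear);

        for (i = 0; i < n; i++)            /* 1. encrypt                  */
            clear[i] ^= KEY;
        for (i = 0; i < n; i++)            /* 2. add redundancy to the    */
            for (j = 0; j < 3; j++)        /*    *ciphertext*             */
                coded[3 * i + j] = clear[i];

        coded[7] ^= 0xff;                  /* a storage error             */

        for (i = 0; i < n; i++) {          /* 3. bitwise majority vote    */
            a = coded[3*i];  b = coded[3*i + 1];  c = coded[3*i + 2];
            clear[i] = ((a & b) | (a & c) | (b & c)) ^ KEY;  /* 4. decrypt */
        }
        printf("%s\n", (char *) clear);    /* the original line, intact   */
        return 0;
    }

   The redundancy is computed from the ciphertext alone, so it tells a
codebreaker nothing new; the price is a file three times the original size.
A real code does far better; see the Hamming (7,4) sketch below.
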
   The field of error-correcting codes is far too large for me to go into
here, and algebraic coding in particular, which is what I am supposed to
be studying (sigh...), requires some knowledge of modern algebra.  There
seems to be a need for a book on "Error Correcting Codes for Programmers,"
which would describe how to implement some simple error-correcting codes,
but I don't know of any such book....
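
   For what it's worth, Hamming (7,4), about the simplest code such a book
would cover, fits in a few lines of C: each 4-bit nibble becomes a 7-bit
codeword, and any single flipped bit per codeword can be corrected.  This
is only a sketch; in the original problem you would run the *ciphertext*
through it, at a cost of 3 extra bits per 4:

    #include <stdio.h>

    /* pack data bits d1..d4 and parity bits p1..p3 into the standard
       Hamming positions: codeword bit k (k = 1..7) is stored in result
       bit k-1; positions 1, 2, 4 are parity, 3, 5, 6, 7 are data      */
    static unsigned hamming74_encode(unsigned nibble)
    {
        unsigned d1 = (nibble >> 0) & 1, d2 = (nibble >> 1) & 1;
        unsigned d3 = (nibble >> 2) & 1, d4 = (nibble >> 3) & 1;
        unsigned p1 = d1 ^ d2 ^ d4;
        unsigned p2 = d1 ^ d3 ^ d4;
        unsigned p3 = d2 ^ d3 ^ d4;

        return p1 | (p2 << 1) | (d1 << 2) | (p3 << 3)
                  | (d2 << 4) | (d3 << 5) | (d4 << 6);
    }

    static unsigned hamming74_decode(unsigned code)
    {
        unsigned b[8], s, k;

        for (k = 1; k <= 7; k++)
            b[k] = (code >> (k - 1)) & 1;

        /* the syndrome is the position of the single bad bit; 0 = none */
        s  =  b[1] ^ b[3] ^ b[5] ^ b[7];
        s |= (b[2] ^ b[3] ^ b[6] ^ b[7]) << 1;
        s |= (b[4] ^ b[5] ^ b[6] ^ b[7]) << 2;
        if (s)
            b[s] ^= 1;                       /* correct it in place */

        return b[3] | (b[5] << 1) | (b[6] << 2) | (b[7] << 3);
    }

    int main(void)
    {
        unsigned nib, code;

        for (nib = 0; nib < 16; nib++) {
            code = hamming74_encode(nib);
            code ^= 1u << (nib % 7);         /* flip one arbitrary bit */
            if (hamming74_decode(code) != nib)
                printf("correction failed for nibble %u\n", nib);
        }
        printf("done; every single-bit error corrected\n");
        return 0;
    }

   (Two flipped bits in the same codeword, on the other hand, get silently
mis-corrected; single-error-correcting codes only go so far.)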

   -- David desJardins

gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/05/86)

In article <13619@ucbvax.BERKELEY.EDU> desj@brahms.UUCP (David desJardins) writes:
>...  There
>seems to be a need for a book on "Error Correcting Codes for Programmers,"
>which would describe how to implement some simple error-correcting codes,
>but I don't know of any such book....

Richard Hamming wrote such a book; it was aimed at practicing
engineers rather than theoreticians.  I forget its title, but
it was something obvious.  Wish I had a copy.

putnam@steinmetz.UUCP (jefu) (05/10/86)

In article <529@brl-smoke.ARPA> gwyn@brl.ARPA writes:
>In article <13619@ucbvax.BERKELEY.EDU> desj@brahms.UUCP (David desJardins) writes:
>>...  There
>>seems to be a need for a book on "Error Correcting Codes for Programmers,"
>>which would describe how to implement some simple error-correcting codes,
>>but I don't know of any such book....
>
>Richard Hamming wrote such a book; it was aimed at practicing
>engineers rather than theoreticians.  I forget its title, but
>it was something obvious.  Wish I had a copy.

I have a copy.  It is :
   Coding and Information Theory
   R. W. Hamming
   Prentice-Hall
   1980

I, with only peripheral interest in such things, find this well worth having.  
Good reading as well as good information.  


-- 
               O                   -- jefu
       tell me all about           -- UUCP: {rochester,edison}!steinmetz!putnam
Anna Livia! I want to hear all.... -- ARPA: putnam@GE-CRD

king@kestrel.UUCP (05/13/86)

   From: gnu@hoptoad.uucp (John Gilmore)
   Newsgroups: net.crypt
   Date: 3 May 86 12:20:42 GMT

   The traditional way to improve data reliability when there is no chance
   to recover the data is called "forward error correction".  It adds some
   redundant data to the message.  This kind of thing will be used in
   Stargate for example, since you can't ask the satellite to send it
   again please over a receive-only link.  Disk error correcting codes
   are like this too (Fire codes).

   But:  Any kind of redundancy you put in your cleartext (or enciphering
   algorithm) will cause it to be easier to break.  A simple example.
   Let's say you put a parity bit in each character before encrypting.  A
   code-breaker who tried a key that produced bad parity characters would
   know that that was the wrong key.  Knowing this, and the enciphering
   algorithm, lets a codebreaker eliminate a lot of possible keys quickly.

   There's probably no simple way to make your data both easier to recover
   (for you) and harder to recover (for a codebreaker).
   -- 
   John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa

Wouldn't it be better to apply the error correction AFTER the
encryption?

-dick

abc@brl-smoke.ARPA (Brint Cooper ) (05/15/86)

Unfortunately (or perhaps fortunately for those in the business),
'simple error-correcting codes' fall into two classes: powerful codes
(those that correct many error patterns), which severely impede the rate
of information flow, and weak codes, which do not. Error control is not
yet a 'handbook' science.

Brint

In article <529@brl-smoke.ARPA> gwyn@brl.ARPA writes:
>In article <13619@ucbvax.BERKELEY.EDU> desj@brahms.UUCP (David desJardins) writes:
>>...  There
>>seems to be a need for a book on "Error Correcting Codes for Programmers,"
>>which would describe how to implement some simple error-correcting codes,
>>but I don't know of any such book....
>
>Richard Hamming wrote such a book; it was aimed at practicing
>engineers rather than theoreticians.  I forget its title, but
>it was something obvious.  Wish I had a copy.


-- 
Brint Cooper

	 ARPA:  abc@brl-bmd.arpa
	 UUCP:  ...{seismo,unc,decvax,cbosgd}!brl-bmd!abc