alanf@bruce.cs.monash.OZ.AU (Alan Grant Finlay) (06/07/91)
This is the Turbo-C source distribution of tangle 3.2, a file encryption program for MS-DOS to run on IBM-PC compatibles. Files are separated and preceded with ---------<file name>--------------- The file read.me refers to some files which are only required for an executable distribution, they can easily be manufactured. --------------------<read.me>----------------------- This file accompanies a set of programs to support a personal encryption system for use with IBM-PC compatibles running MS-DOS. The encryption/ decryption process is symmetric and takes only a few seconds for one or two pages of text. Very large files take a significant length of time to convert. The source files for "tangle.com" and "untangle.com" are "tangle.c" with the ENCRYPT constant set to 1 or 0 respectively. Since not all users may have access to the Turbo-C compiler, executable versions are also supplied. The file "tangle.doc" is a user manual. A description of the algorithm is given in "algthm.txt", and the author's address in "author.txt". The file "dir.cpt" is a directory of the distribution disk (without itself listed) encrypted with the password "abracadabra". You can untangle this file and check file sizes. To use the programs type "tangle src dst" or "untangle src dst" where src is source file and dst is the destination file. ----------------------<tangle.doc>--------------------- User manual for Tangle 3.2 -------------------------- This program supercedes Crypt 2.0, a name change has been adopted to avoid confusion with the Unix password encryption function (one way trapdoor). This program does not support the encryption algorithm of Crypt 2.0 however the name change should make it somewhat easier to introduce version 3 without too much confusion and gradually convert the encrypted files. The source code "tangle.c" can be used to produce two executable files which are assumed to be called "tangle" and "untangle" by the program itself. The source is in Turbo C (Trademark) and can be compiled with the tiny memory model. Since use is made of system dependent functions for obtaining the input file size for tangle, the program is not portable as is. To use tangle and untangle simply type the command name followed by the source and destination file names. You will be prompted for a password and status information is produced. The password is echoed since for the length of password intended it would be too easy to mess it up. The password should be a long sequence of characters such as a sentence with some strange words or characters in it. The maximum length is the same as the maximum block array width (100 as supplied). The password is scrolled off the screen after the line is terminated and an error checking code is displayed. The error checking code number reveals some information about the password but not enough to worry about. The purpose of the error checking code is to alert the user to an incorrectly typed password, since it is a function of the password it should always be the same when the same password is used. The algorithm used was inspired by the Rubik's cube. The core algorithm works as follows: First a fixed block size is determined which is the product of two numbers "wide" and "high". The data is processed in a two dimensional array with these dimensions one block at a time. If the last block cannot be entirely filled it is padded with junk characters. The password is used to determine a key on a column per character basis. The key value for each column (modulo the height) is the amount this column is shifted down. The part which emerges from the bottom is inserted into the top (i.e. a cyclic shift). The rows are similarly shifted but using the values 1,2,3,...,etc. This two stage process which is called a shuffle is repeated a fixed number of times. The result is a permutation of the array determined by the password alone. This algorithm produces a pure permutation of the data which has the advantage of not changing the character set but weaknesses that cannot be ignored for any serious use. A number of modifications to the algorithm produce one which in the author's opinion is secure against a chosen plaintext attack and if a large enough good password is used then immune to brute force attack for the conceivable future. The first modification is to change the nature of the algorithm so that it is no longer a pure permutation. The strategy adopted is to include a phase in each shuffle which replaces each second line by the bitwise exclusive-or of itself and the line before. This means that the history of the encryption process, which characters happen to be located above each other in each shuffle, affects the final outcome. An analogy is that of shuffling a pack of cards on which the ink is still wet as compared to a normal shuffle. With this modification the algorithm is strong against known plaintext attack. Secure that is provided the known text is not a simple pattern (e.g. all space characters). To secure against simple patterned source a simple substitution of the block is performed after the first shuffle. This eliminates simple patterns and also prevents a simple chosen plaintext attack strategy the author discovered which applies to Crypt 2.0 (Tangle's predecessor). If the input file is smaller than 450 bytes it is padded to this size. The block size is determined according to the size of the input file and the minimum size. For files smaller than 10 Kbyte there will be only one block. Each block has two dimensions - wide and high. The last block usually needs some padding which is chosen from whatever happens to be in the array already. The junk is selected randomly though the random number generator has only about 30,000 possible seeds. The password or key is a string supplied by the user, this is extended to the maximum length (100) by repetition however the repeated strings are modified by exclusive-or '152' to help extract the maximum information from the supplied string (since later only the remainder mod high will be utilised). The security of the system is highly dependent on the choice of password and particularly its length. Passwords which are too short are rejected. Naturally the password is not recorded permanently anywhere by the encryption program and version 3.0 erases the password from memory before exit. The easiest way to choose a good password is to use a whole sentence (spaces and punctuation are allowed) with some strange characters in it. It is relatively easy for a cryptanalyst to try out all short sentences with words from a standard dictionary. Brute force cracking of an encrypted file with n blocks of block size b, a password of p units chosen from a set of q values and with s shuffles per block requires O(s * b * (q^p)) operations. Obviously increasing q and/or p pays great dividends. For a pure permutation it would be possible to try all b! rearrangements of a block. This is ruled out by the minimum block size being 450. To check all possible keys with the minimum block dimensions of 32*15 (width*height) and with the minimum password length of 10 characters we need to consider 15^10 possible keys. 15^10 possibilities would take a week to check at the rate of one per microsecond. Actually the number of keys is greater than this (note how the password is extended above) so it would take longer. Using a password length of 15 characters the time would be increased to over ten thousand years. Longer passwords are supported by the program since these calculations assume an unpredictable password is used. Even if the password is chosen from a dictionary of 1000 four letter words the program would allow 6 such words with the minimum block size corresponding to 1000^6 keys which is approximately 15^15. It is clear that increasing s is not an effective way of improving security against brute force attack. The time taken by the algorithm is O(s * b * n). Since (b*n) cannot be changed s should be chosen as small as is considered safe. Note that s is a compile time constant for the Tangle program and may be changed (it is called SECURE). There is another security issue to consider. Even when a cyphertext cannot be deciphered it may reveal information about a sequence of cyphertexts. Consider the encryption of two files differing by only one character. The difference in the input files causes a cascade of differences in the output files due to the exclusive-or in the shuffle operation. This cascade is approximately 1.5 to the power of the number of shuffles when the differences are sparse hence ideally the value chosen for SECURE should be at least the logarithm base 1.5 of the block size. The block size varies from 450 to 10000. Hence a conservative choice for SECURE would be 2*LOG(100)/LOG(1.5)=23. Using the minimum block size of 450, SECURE=15 would be sufficient. This should not be taken too seriously however since if the files require more than one block then the identical input blocks (except most likely the last block due to the padding) will be encrypted identically. The value chosen for SECURE in Tangle is 10 which is a compromise. The security of the above algorithm to detection of similar sources is also compromised by the fact that so far the algorithm encrypts each bit plane (corresponding to the 8 bits in a byte) independently. This is a weakness of version 2 which has been corrected in version 3.0. A one character difference will be a one bit difference in some planes but not very likely for all planes. To fix this the exclusive-or phase of the shuffle is modified in version 3 to add 1 mod 256 to the byte. This means the bit planes are no longer independent and each entire block is mixed together. It would have been preferable to use a cyclic shift instead of adding 1 mod 256 but the C language does not have this operation. Because a carry into the 8th bit position will only occur on average one time in 128, SECURE must be chosen large enough to cause a cascade of changes in the 8th plane. If similar source produces similar output then it seems likely a chosen plaintext attack could be mounted. Hence for systems requiring this security SECURE should be chosen as large as the conservative estimates above (15 to 23, according to the block size). Since the intended domain of application to Tangle is personal computer file security, the more modest default value of SECURE=10 should suffice. Where protection against casual snooping is all that is required then SECURE=3 would be sufficient. In this case it would be a good idea to remove the redundant phases of the last shuffle. Of the three phases of a shuffle, only the first depends upon the key. This means the final two phases of the last shuffle are redundant and may be omitted. Version 3 also incorporates a change recommended by Russel Herman that using a floating point library for a single use of the square root function was unwarranted. Version 3 uses a simple linear search square root calculation and this cuts the size of the executable code by about 50% - thanks Herman! Some final operational advice. Confidential documents should be created using a ram disk or a floppy which will be destroyed. Note that deleting and even overwriting magnetic media does not prevent the data's recovery. Watch out for editors which keep temporary copies of files which they are working upon in places of their own choice. A deleted temporary file is often easily recovered and you may not even be aware of its existence. (The algorithm is the same for all versions 3.x) Version 3.2: The only changes are the removal of a superfluous include and alterations to comments in the source. --------------------------------<tangle.c>--------------- /***********************************************************************/ /* (c) Copyright 1989, 1990, Alan Finlay. Tangle, Version 3.2 . */ /* This program may be freely distributed but may not be sold or */ /* marketed in any form without the permission and written consent of */ /* the author. I retain all copyrights to this program, in either the */ /* original or modified forms, and no violation, deletion, or change of */ /* the copyright notice is allowed. I will accept no responsibility for */ /* any loss, damage, or compromised data caused directly or indirectly by */ /* this program. It is released on an "as is" basis with no warranty */ /* as to its being fit or suitable for encrypting any form of data. */ /***********************************************************************/ #include <stdio.h> #include <stdlib.h> #include <dos.h> #include <sys\stat.h> #define ENCRYPT 1 /* Choose tangle (1) or untangle (0) */ #define DEBUG 0 /* Show page after each shuffle if non zero */ #define TRUE -1 #define FALSE 0 #define LIMIT 100 /* Maximum block size is LIMIT*LIMIT */ #define SECURE 10 /* The number of block transformations */ #define MINB 450 /* Minimum block size - insecure if too small */ typedef unsigned char line[LIMIT]; char copyright[40] = "(c) copyright 1989,1990, Alan Finlay"; unpat(page,wide,high) /* Simple substitution to eliminate simple patterns */ line page[LIMIT]; /* [width,height] */ int wide,high; { int i,j,k; k = 0; for (i=0;i<wide;i++) for (j=0;j<high;j++) { k = (k+7)%256; page[i][j] = page[i][j] ^ k; } } #if ENCRYPT shuffle(page1,code,wide,high) line page1[LIMIT]; /* [width,height] */ line code; int wide,high; { int i,j,k,key,shift; line *mix1,*mix2; line *oldline,page2[LIMIT]; /* [height,width] */ for (k=0;k<SECURE;k++) { #if DEBUG show(page1,wide,high); #endif /* Shift columns */ for (i=0;i<wide;i++) { oldline = page1[i]; key = (int) code[i]; for (j=0;j<high;j++) page2[j][i] =(*oldline)[(j+key)%high]; } /* Mixup */ for (j=1;j<high;j+=2) { mix1 = page2[j-1]; mix2 = page2[j]; for (i=0;i<wide;i++) (*mix2)[i] = ((*mix2)[i]^(*mix1)[i])+1; /* Assume overflow ignored so 255+1==0 */ } /* Shift rows */ for (j=0;j<high;j++) { oldline = page2[j]; shift = (j%(wide-1))+1; for (i=0;i<wide;i++) page1[i][j] = (*oldline)[(i+shift)%wide]; } /* Eliminate any pattern (after first iteration only) */ if (k==0) unpat(page1,wide,high); } } #else unshuffle(page1,code,wide,high) line page1[LIMIT]; /* [width,height] */ line code; int wide,high; { int i,j,k,key,shift; line *mix1,*mix2; line *newline,page2[LIMIT]; /* [height,width] */ for (k=0;k<SECURE;k++) { #if DEBUG show(page1,wide,high); #endif /* Eliminate any pattern (before last iteration only) */ if (k==SECURE-1) unpat(page1,wide,high); /* Shift rows back */ for (j=0;j<high;j++) { newline = page2[j]; shift = wide-(j%(wide-1))-1; for (i=0;i<wide;i++) (*newline)[i] = page1[(i+shift)%wide][j]; } /* Reverse mixup */ for (j=1;j<high;j+=2) { mix1 = page2[j-1]; mix2 = page2[j]; for (i=0;i<wide;i++) (*mix2)[i] = ((*mix2)[i]-1)^(*mix1)[i]; /* Assume underflow is ignored so 0-1==255 */ } /* Shift columns back */ for (i=0;i<wide;i++) { newline = page1[i]; key = (int) code[i]; for (j=0;j<high;j++) (*newline)[(j+key)%high] = page2[j][i]; } } } #endif show(page,wide,high) line page[LIMIT]; int wide,high; { int i,j; puts("\n"); for (j=0;j<high;j++) { putc('\n',stdout); for (i=0;i<wide;i++) { if (page[i][j]<30) putc('*',stdout); else putc(page[i][j],stdout); } } } main (argc,argv) int argc; char *argv[]; { FILE *infile,*outfile; int wide,high,i,j,k; /* Block width and height, loop counters */ int blkn = 1; /* Block counter */ int clen; /* Password code length */ long chksum; /* Password checksum */ int ch = 0; int invers; /* Version of input file for decrypt */ int vers = 3; /* Version of this program */ line page[LIMIT],code; #if ENCRYPT int chrcnt; /* Character counter */ long fsize; /* Input file size */ int blocksize,nblocks; struct time t; /* For system time */ struct stat st; /* For input file stats */ /* Randomise the rand() function */ gettime(&t); srand(t.ti_min*400+t.ti_sec*100+t.ti_hund); /* random seed <30000 */ /* Check the input arguments */ if (argc!=3) {puts("\nUsage is: tangle src dst\n"); exit(1);} #else int blkcnt; /* Check the input arguments */ if (argc!=3) {puts("\nUsage is: untangle src dst\n"); exit(1);} #endif if ((infile = fopen(argv[1],"rb")) == NULL) { printf("\n%s",argv[1]); perror(" "); exit(1);} if ((outfile = fopen(argv[2],"wb")) == NULL) { printf("\n%s",argv[2]); perror(" "); exit(1);} #if ENCRYPT /* Get input file size */ if (stat(argv[1],&st)!=0) {perror(" "); exit(1);} fsize = st.st_size; printf("The input file size is %ld\n",fsize); /* Choose block size accordingly */ if (fsize<(LIMIT*LIMIT)) blocksize = (int) fsize; else { nblocks = (int) (fsize/(LIMIT*LIMIT)+1); blocksize = (int) (fsize/nblocks+1); } if (fsize<MINB) blocksize = MINB; /* Minimum block size enforced */ wide = 0; while (wide*wide<blocksize) wide++; /* Approx square root */ wide = wide+10; if (wide>LIMIT) wide = LIMIT; high = blocksize/wide+1; if (high>LIMIT) high = LIMIT; while (1) { blocksize = wide*high; if (fsize<(long) blocksize) break; else { /* Multiple blocks, check for last block too small */ if (((fsize-1)%blocksize)>(blocksize*3/4)) break; /* (fsize-1) is used above so perfect fit is accepted! */ high--; wide--; /* Try a smaller block */ } if (wide<50) break; } printf("The width and height are (%d,%d)\n",wide,high); printf("The last block is %ld bytes\n",((fsize-1)%blocksize)+1); fprintf(outfile,"%d,%d,%d,",vers,wide,high); #else fscanf(infile,"%d,%d,%d,",&invers,&wide,&high); if (invers!=vers) { printf("This is version %d of the encryption program.\n",vers); printf("The input file is for program version %d or invalid.\n",invers); exit(1); } #endif /* Get password */ while(1) { puts("\nPlease enter your password"); fgets(code,LIMIT,stdin); clen = strlen(code); if (clen>9) break; puts("Insecure password, try a longer one."); puts("For security do not use a name or word in any dictionary."); puts("For example use something like \"Dazed and Konfuzed\""); } for (i=0;i<25;i++) puts(" "); /* Clear the screen */ if (clen>wide) puts("Warning: tail of password ignored"); /* Extend password to possible limit, not null terminated */ for (i=clen;i<LIMIT;i++) code[i] = code[i%clen] ^ '\152'; /* Generate a checksum for the characters */ for (chksum=0,i=0;i<clen;i++) chksum += (int) code[i]*i; printf("The password checksum is %ld. Please wait ...\n",chksum % 1000); do { /* tangle or untangle a block */ #if ENCRYPT chrcnt = 0; #else if (fscanf(infile,"%d,",&blkcnt)==EOF) goto NOBLOCK; #endif for (j=0;j<high;j++) { for (i=0;i<wide;i++) { if ((ch = getc(infile)) != EOF) { page[i][j] = ch; #if ENCRYPT chrcnt++;} else if (i==0 && j==0) goto NOBLOCK; /* EOF at start of block! */ /* Pad the last block with existing junk */ else page[i][j] = page[rand()%wide][rand()%high]; #else ;} else {puts("Error: unexpected end of file"); goto NOBLOCK;} #endif } } #if ENCRYPT fprintf(outfile,"%d,",chrcnt); shuffle(page,code,wide,high); for (j=0;j<high;j++) for (i=0;i<wide;i++) putc(page[i][j],outfile); #else unshuffle(page,code,wide,high); for (j=0;j<high;j++) for (i=0;i<wide;i++) if ((j*wide+i)<blkcnt) putc(page[i][j],outfile); #endif printf("Finished block number %d\n",blkn++); } while (ch != EOF); NOBLOCK: /* Jump here to avoid writing an empty block */ for (i=0;i<LIMIT;i++) code[i] = ' '; /* Rubout the password before exit */ fclose(infile); fclose(outfile); } ------------------------------<author.txt>------------------- Alan Finlay Computer Science Dept. Monash University CLAYTON VIC 3168 Australia ------------------------- email: alanf@bruce.oz.au N.B. I wrote this program on my own machine in my own time, hence it does not belong to Monash University. --------------------------------<algthm.txt>----------------- For each block, let us call it page[x,y] where 0<=x<wide and 0<=y<high: (it is understood that each line is for all x<wide and for all y<high) ----------------------------------------------------------------------- 1) Do one shuffle: page[x,y] := page[x,(y+key[x]) mod high] if y is odd then page[x,y] := (page[x,y] XOR page[x,y-1])+1 mod 256 page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y] 2) Perform a simple substitution: page[x,y] := page[x,y] XOR ((x*high+y)*7 mod 256) 3) Do 9 shuffles: page[x,y] := page[x,(y+key[x]) mod high] if y is odd then page[x,y] := (page[x,y] XOR page[x,y-1])+1 mod 256 page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y] Each step in the process is easily reversible. ------------------------------------------------------------------------- The encrypted file format is: 1) Three decimal integers: program version number (= 3), wide, high. Each as chars terminated by a comma (i.e. C format "%d,%d,%d,"). 2) for each block: The real number of characters in the block as a comma terminated decimal integer (i.e. C format "%d,"). The block of bytes: for (j:=0..high-1), for (i:=0..wide-1) page[i,j] ------------------------------------------------------------------------- ---------------- end of Tangle 3.2 source distribution ------------------