gnu@hoptoad.uucp (John Gilmore) (08/28/90)
One thing that came up at Crypto '90 was a short paper from Ms. Helen Bergen at Queensland U. in Australia. She noticed the 'locked document' commands in Word Perfect, used by all the secretaries in her dept., and looked to see how strong it was. It turned out that the MSDOS DEBUG command and an envelope for scratch paper are enough for anyone to decode both a document AND the key used for it!

Word Perfect Corp. didn't care about her results (letter reproduced below), but I thought that some Word Perfect losers, I mean users, here on the net might want to know. You should consider WP locked documents like ROT13: fine to keep the text garbled until you type a command, useless for keeping things private.

    John Gilmore

From: <>
Date: Mon, 27 Aug 90 10:28 +1000
To: cygint!gnu

Dear John,

Here is the letter and a copy of the Latex source of my paper. It will be published in CRYPTOLOGIA in the near future.

Thanks for your interest,
Regards, Helen Bergen

****************************************************

Quote from letter received from WordPerfect Pacific:

Thankyou for the copy of your paper entitled "File Security in WordPerfect 5.0". I sent a copy of the paper to WordPerfect Corporation in the USA and recently received a reply from them.

They confirmed that people have written programs to break the password. However, WordPerfect Corporation does not have such a program and therefore has no way of breaking it. They also pointed out that very few users would know how to write such a program.

It is possible that the manual may be amended in a future edition to clarify the protection that a password gives. They recommend that anyone concerned about security may want to take higher precautions than the password protection.

Thankyou for your interest in WordPerfect

********************************

FILE SECURITY IN WORDPERFECT 5.0

H.A. Bergen    School of Computing Science
W.J. Caelli    Information Security Research Centre

Faculty of Information Technology
Queensland University of Technology
G.P.O. Box 2434, Brisbane, Q 4001, AUSTRALIA

ABSTRACT: Cryptanalysis of files encrypted with the 'locked document' option of the word processing package WordPerfect V5.0, is shown to be remarkably simple. The encryption key and the plaintext are easily recovered in a ciphertext only attack. File security is thus compromised and is not in accord with the claim by the manufacturer that: "If you forget the password, there is absolutely no way to retrieve the document".

KEYWORDS: Cryptanalysis, WordPerfect.

INTRODUCTION

WordPerfect is one of the most popular word processing packages in use today. It has a 'locked document' option which aims at protection of a WordPerfect file from unauthorised access. The manual states "You can protect or lock your documents with a password so that no one will be able to retrieve or print the file without knowing the password - not even you". The manual also claims that "If you forget the password, there is absolutely no way to retrieve the document" [1].

This option is used to 'add' a password to an existing or newly created WordPerfect file. The file is then encrypted using the password as the cryptographic key, and is stored on disk. Any subsequent retrieval or printing of the file via WordPerfect requires the entry of the correct password.

With the increasing use of distributed file systems and sharing of data, this option might appear to be an attractive means of protecting sensitive files, particularly where they may reside on a shared network server. It is easily implemented without the expense and installation of another software protection/encryption package.

The encryption algorithm used in the WordPerfect 4.2 version, however, was successfully cryptanalysed by Bennett [2]. He concluded that the encryption system was unsatisfactory for protection of sensitive documents.
The present study extends this work to an investigation of the security of the WordPerfect 5.0 encryption system on both the IBM PC and DEC VAX systems as well as WordPerfect 5.1 on the IBM PC.

WORDPERFECT FILES

WordPerfect version 5.0 was used on an IBM-PC and other compatible systems to create various files consisting of original documents and their associated ciphertext with different passwords. The DOS utility DEBUG was used to display the content of the files in hexadecimal notation.

The WordPerfect files were created on three different systems. By this we mean, three different licenced copies of WordPerfect running on different Personal Computers with different printers. An example from just one of these systems has been given in detail.

Version 4.2 format

Files created under 4.2 contain just the ASCII representation of the character text. Printer definitions and setup parameters are in separate files and are used only when the file is to be printed. For example, a file may contain zeros (ASCII code 30 hex) and new line characters (these are converted to the ASCII line feed character, 0A hex). The plaintext file in hexadecimal would be

    30 30 30 30 30 30 30 30 30 30 0A
    30 30 30 30 30 30 30 30 30 30 0A
    30 30 30 30 30 30 30 30 30 30

The corresponding ciphertext file with a key value of the ASCII letter A is

    FE FF 61 61 41 00
    73 72 75 74 77 76 79 78 7B 7A 47
    7C 7F 7E 61 60 63 62 65 64 67 5C
    69 68 6B 6A 6D 6C 6F 6E 51 50

Encrypted files contain an extra 6 bytes, shown in the first line of the above. The first 4 bytes are constant for all keys and are used by the WordPerfect program to determine whether the file is plaintext or ciphertext. The latter 2 bytes contain a checksum derived from the key, as described by Bennett [2]. For example,

    FE FF 61 61 43 00    key = C
    FE FF 61 61 71 C0    key = AA

Version 5.0 format

Files created under version 5.0 are stored in a different format.
With the default WordPerfect format, the file contains the document text appended to printer setup information. There are other options to save the file in DOS text format or in 4.2 format, and in these the printer information is omitted. For example, a document containing 32 characters of text is saved in 5.0 format as a file of approximately 600 - 1000 bytes (depending on the particular printer system) and in 4.2 or DOS format as a file of 32 bytes. The locked document option in version 5.0 allows encryption of files only in WordPerfect format, the one containing all the printer information.

* All version 5.0 files, original and encrypted forms, have the same four characters in byte positions 0 - 3:

    FF 57 50 43  (HEX)    or    W P C  (ASCII)

  These codes were unchanged for files created on three different systems, i.e., three different licenced copies of WordPerfect 5.0 on three different PCs using different printers.

* BYTES 4 - 7 are related to the offset address of the text, i.e. the start of the document text.

* BYTES 8 - 11 are constant for all files:

    01 0A 00 00

* ENCRYPTED TEXT STARTS HERE

* BYTES 12 - 15 are constant for a plaintext file:

    00 00 00 00

  For an encrypted file, however, bytes 12 and 13 in the above contain a checksum related to the key value used. This checksum appears to be the same as that used in the 4.2 version [2].

* BYTES 16 - 21 were constant for files prepared on the three different systems and contained:

    FB FF 05 00 32 00

* BYTES 22 - 31. Of these 10 bytes, 22, 23, 28 are file and system dependent, but bytes 24, 26, 29, 30, 31 are constant with value 00.

* BYTES 32 - 39 were constant for files prepared on the three different systems and contained

    42 00 00 00 02 00 56 00

* BYTES 40 - 47. Of these 8 bytes, 42, 46, 47 are file and system dependent, but bytes 40, 41, 43, 44, 45 are constant with value 00.

* The remaining bytes of the printer header information are dependent on the particular hardware and printer in use.
  These might change according to the printer setup values.

* The remaining bytes are document text. The offset address in bytes 4 - 5 gives the start of the document text. This is dependent on the size of the printer information and this can obviously change from one system to another.

* Other systems may well have different printers or setup parameters which change some of the bytes that we found to be constant. In general though, there will be a reasonable number of constant known plaintext bytes.

ANALYSIS

The encryption algorithm was found to be the same as that used in the 4.2 version [2]. The main differences between version 4.2 and 5.0 are in the file formats.

Bytes 0 - 15 of the original and encrypted files contain some useful information. The offset address in bytes 4 - 5 gives the starting point of the document text. The checksum of the key in the encrypted file is in bytes 12 - 13. This gives the key directly if the key is a single character. The encryption of the file starts at byte number 16, so all the printer information as well as the document is encrypted.

The Encryption Algorithm

* Firstly, the ciphertext is XORed with an ascending sequence of bytes based on the sequence in hexadecimal:

    02 03 04 ... 79 7A 7B ... FD FE FF 00 01 02 03 ...

  Note that the sequence repeats from 00 not 02 after reaching FF. The keylength determines the starting point of the sequence to be used, i.e.

    starting point = keylength + 1

  For example, for key = QWERTY the starting point of the ascending sequence would be at position 6 in the sequence giving a starting value of 07.

* Secondly, the resulting text is XORed with the key characters in blocks of key length, to restore the original plaintext.

This type of polyalphabetic substitution is called a Vigenere cipher. The analysis of Vigenere ciphers is well known and covered in the standard cryptography literature e.g. [3,4,5].

Plan of attack

In the 4.2 version, the only text encrypted was that contained in the actual document.
This is unknown plaintext. In version 5.0, however, the printer information as well as the document text is encrypted. We have identified bytes 16 - 21, 24 - 27, 29 - 41, 43 - 45 as being constant for a particular system (as defined earlier, a particular licenced copy of WordPerfect on a particular PC and printer), and they do not change markedly from one system to another. So we have the ideal situation of known plaintext for a reasonable number of bytes. This can greatly simplify our attack as it makes it possible to recover the actual key. Then it is trivial to recover the plaintext by using WordPerfect to retrieve the file using the recovered key as the "password". Alternatively, a program could be written to do this as the encryption/decryption algorithm is known.

We outline a strategy with the following example from one particular system:

Document text consists of three lines of ten ASCII zeros each. The size of the original file and the encrypted file is 651 bytes.

    0000000000
    0000000000
    0000000000

Plaintext file contains in hexadecimal (for a particular printer):

    BYTES
      0-15     FF 57 50 43 6B 02 00 00-01 0A 00 00 00 00 00 00
     16-31     FB FF 05 00 32 00 2D 02-00 00 07 00 11 00 00 00
     32-47     42 00 00 00 02 00 56 00-00 00 53 00 00 00 0C 00
       .       ........
       .
    619-623    30 30 30 30 30
    624-639    30 30 30 30 30 0A 30 30-30 30 30 30 30 30 30 30
    640-650    0A 30 30 30 30 30 30 30-30 30 30

Ciphertext file contains in hexadecimal:

    BYTES
      0-15     FF 57 50 43 6B 02 00 00-01 0A 00 00 6E 50 00 00
     16-31     B0 B4 42 41 7E 47 6C 46-53 53 58 59 45 5F 59 4C
     32-47     19 5B 57 51 5E 57 07 74-63 63 3C 69 64 6F 65 7C
       .       .......
       .
    619-623    19 14 1F 19 0C
    624-639    1B 1B 17 11 1C 2D 11 14-03 03 0F 09 04 0F 09 1C
    640-650    31 0B 07 01 0C 07 01 E4-F3 F3 FF

We will illustrate a known ciphertext only attack, even though we obviously know the exact plaintext in this particular example. So we assume that we have a ciphertext file produced on some other hardware system using a different licenced copy of WordPerfect.
As explained earlier, we can be confident that a substantial portion of text is common to all systems. Thus to summarise, the known plaintext we have is

    BYTES 16 - 21    known
    BYTES 22 - 31    known except for 22, 23, 28
    BYTES 32 - 39    known
    BYTES 40 - 47    known except for 42, 46, 47

* Firstly, look at bytes in positions 12 - 15 in the ciphertext file above which contain the checksum of the key. If the key is one character, it will be evident in byte number 12. For longer keys bytes 12 and 13 are probably non zero. In this example the checksum is 6E 50 which implies a key size greater than 1.

* Now we consider bytes 16 - 47. For byte number 16, we will try to deduce the key character used. To do this, choose a keylength starting with likely values, say 4 to 10 characters. Then XOR the plaintext characters with the ascending sequence (in the algorithm section) starting with position keylength which has the value keylength + 1. Then XOR that result with the associated ciphertext and the key character should result. For example,

                  keylength of 4        keylength of 8
    plaintext     FB   1111 1011        FB   1111 1011
    sequence      05   0000 0101        09   0000 1001
    xor                ---------             ---------
                       1111 1110             1111 0010
    cipher        B0   1011 0000        B0   1011 0000
    xor                ---------             ---------
                       0100 1110             0100 0010
    => key             4    E                4    2

  Thus we get the following table:

    keylength    starting sequence    possible key character
        4               05                     4E
        5               06                     4D
        6               07                     4C
        7               08                     43
        8               09                     42
        9               0A                     41
       10               0B                     40

* Now for a keylength of 4, byte 16 gives a possible key character of 4E. Bytes 20, 24, 28 ... must also have been created from the same key character, so we deduce a potential key character for these other bytes to see if it is also 4E. It turns out that the other potential key characters are not 4E.

* So we take the next possible key length, 5. Deduce the key character for bytes 21, 26 .. to see if they match the value for byte number 16 for that keylength, which is 4D. They do not.
* When a match is obtained for the first key character, deduce the key characters for the remaining positions. We show the full analysis for bytes 16 - 31 for a keylength of 8. As we stated earlier, bytes at positions 22, 23 and 28 are unknown, and we signify these as ??.

    BYTES 16 - 23
    plaintext    FB FF 05 00 32 00 ?? ??
    sequence     09 0A 0B 0C 0D 0E 0F 10
    xor          -----------------------
                 F2 F5 0E 0C 3F 0E ?? ??
    cipher       B0 B4 42 41 7E 47 6C 46
    xor          -----------------------
    => key       42 41 4C 4D 41 49 ?? ??

    BYTES 24 - 31
    plaintext    00 00 07 00 ?? 00 00 00
    sequence     11 12 13 14 15 16 17 18
    xor          -----------------------
                 11 12 14 14 ?? 16 17 18
    cipher       53 53 58 59 45 5F 59 4C
    xor          -----------------------
    => key       42 41 4C 4D ?? 49 4E 54

* The repeating sequence for the key is obvious, even with three unknown bytes at positions 22, 23 and 28, and so the key characters are:

    42 41 4C 4D 41 49 4E 54
    B  A  L  M  A  I  N  T

* Further checks on the key could be done using the known bytes from 32 - 47, if the repeating pattern of the key characters is ambiguous.

* In general, the probability of deducing the key bytes is dependent on the keylength. Some definitions relating to the key byte are useful:

  * Known: the key byte may be determined at two or more different positions which correspond to known plaintext.
  * Possible: the key byte may be determined at only one position.
  * Unknown: the key byte may not be determined as there is no overlap of this byte with known plaintext.

  In summary, for a keylength of 1-9, the key bytes are all known and thus all of the key may always be deduced. For a keylength of 10-13, 15-17, there is a small proportion of possible to known key bytes. Thus all the key may be deduced with a high probability. Keys with keylengths of 14, 18-24 contain one, two or three unknown key bytes and an increasingly high proportion of possible to known key bytes. At least five bytes of the key may always be determined.
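
The keylength search described above is mechanical enough to sketch in a few lines. The following Python is an illustration only, not the authors' program: it takes ciphertext bytes 16 - 47 and the constant plaintext bytes from the example, tries each keylength in turn, and accepts the first one for which every known position agrees on its key byte. The names ENC_START, candidate_key and recover_key are made up for this sketch, and the known-plaintext values are those of the example system; files from other printers may differ.

```python
ENC_START = 16  # per the paper, encryption starts at file byte 16

# Ciphertext bytes 16-47 of the example file.
CIPHER_16_47 = bytes.fromhex(
    "B0 B4 42 41 7E 47 6C 46 53 53 58 59 45 5F 59 4C "
    "19 5B 57 51 5E 57 07 74 63 63 3C 69 64 6F 65 7C"
)

# Known plaintext, keyed by file offset (values taken from the example;
# bytes 22, 23, 28, 42, 46 and 47 are treated as unknown).
KNOWN = {}
for start, hexbytes in [
    (16, "FB FF 05 00 32 00"),
    (24, "00 00 07 00"),
    (29, "00 00 00"),
    (32, "42 00 00 00 02 00 56 00"),
    (40, "00 00"),
    (43, "00 00 00"),
]:
    for n, value in enumerate(bytes.fromhex(hexbytes)):
        KNOWN[start + n] = value


def candidate_key(cipher, known, keylen):
    """Key bytes implied by `keylen` (None where undetermined), or None
    if two known positions disagree about the same key byte."""
    key = [None] * keylen
    for offset, plain in known.items():
        i = offset - ENC_START
        seq = (keylen + 1 + i) & 0xFF      # ascending sequence, wraps at FF
        k = cipher[i] ^ plain ^ seq
        slot = i % keylen
        if key[slot] is None:
            key[slot] = k
        elif key[slot] != k:
            return None                    # contradiction: wrong keylength
    return key


def recover_key(cipher, known, max_len=24):
    """Try keylengths 1..max_len and return the first consistent key."""
    for keylen in range(1, max_len + 1):
        key = candidate_key(cipher, known, keylen)
        if key is not None:
            return key
    return None


key = recover_key(CIPHER_16_47, KNOWN)
print("".join(chr(k) if k is not None else "?" for k in key))
```

For the example file this prints BALMAINT. With this set of known bytes a keylength of 30 or more would always look consistent, since no two known positions would then share a key byte, which is one reason the sketch stops at 24.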
* Retrieve the plaintext using WordPerfect with the key as the password. This is the easiest way to decrypt the document text.

* If no access to WordPerfect is available, then it is straightforward to recover the plaintext with a short C program which implements the decryption algorithm as described previously. This has been done successfully.

CONCLUSION

The encryption key is easily recovered in an apparent KNOWN CIPHERTEXT ONLY attack, as the system provides enough known plaintext in the printer information regardless of the document plaintext. The analysis, as shown, can literally be done on the back of a (large) envelope.

The analysis may be slightly more difficult where the physical system on which the files were prepared is completely unknown and vastly different to any system we have encountered, as this may reduce the amount of known plaintext. In these situations, statistical analysis based on the characteristic frequencies of characters in a language is used to decipher text files. This is a standard method which is straightforward although a program may have to be written.

In summary, the cryptanalysis of files encrypted with the 'locked document' option in WordPerfect version 5.0 is remarkably simple. The inclusion of portions of known plaintext in the encrypted file is a fatal flaw in the system, since it provides a mechanism of attack in which the key can be recovered by hand, and document plaintext easily retrieved. All of the key can easily be recovered for keylengths of 1-13 and 15-17, far in excess of commonly used passwords of 8 characters. A high proportion of the key can be deduced for keylengths of 14 and 18-24. The cipher used is too weak, providing little or no protection. If the attacker has knowledge of any other unencrypted file from the same system, the analysis is made even more simple. We stress that **both the key and the plaintext can be recovered**, independent of the content of the plaintext.
The worst problem is that it may give a false sense of security. For example, an attacker may decrypt a document, modify it and re-encrypt so that the originator is unaware of the alterations. We conclude that the file security is not consistent with claims made by the manufacturer and is not sufficient to protect sensitive documents from anything but the most naive attack.

References

1. WORDPERFECT CORPORATION (1989): WordPerfect for IBM Personal Computers.
2. BENNETT, J (1987): Analysis of the encryption algorithm used in the WordPerfect Word Processing Program, Cryptologia, Vol XI. No 4. pp 206-210.
3. KONHEIM, A G (1981): Cryptography, A Primer, Wiley.
4. DENNING, D E (1981): Cryptography and Data Security, Addison Wesley.
5. CARROLL, J and ROBBINS, L E (1989): Computer Cryptanalysis of Product Ciphers, Cryptologia, Vol XIII. No 4. pp 303-326.

Biographical

Helen Bergen is a Lecturer in the School of Computing Science, Faculty of Information Technology, at the Queensland University of Technology. Her research interests within the Information Security Research Centre, Faculty of Information Technology, include cryptology and the application of supercomputers.

Bill Caelli is Director of the Information Security Research Centre within the Faculty of Information Technology at the Queensland University of Technology. He is also Technical Director and Founder of ERACOM Pty. Ltd., a manufacturer of cryptographic equipment. His research interests lie in the development and application of cryptographic systems to enhance security, control and management of computer and data network systems.

-- 
John Gilmore      {sun,pacbell,uunet,pyramid}!hoptoad!gnu
The Gutenberg Bible is printed on hemp (marijuana) paper. So was the July 2, 1776 draft of the Declaration of Independence. Why can't we grow it now?
grantk@manta.NOSC.MIL (Kelly J. Grant) (08/29/90)
[...lots of interesting crypto stuff deleted...]

I disagree that WP documents are "trivial" to decode. They are possibly trivial for the 'sci.crypt' people who have experience in breaking ciphers and the like, but for people who have no training or knowledge in the subject, I think WP locked documents are perfectly safe for reports or other "private" (but not classified) documents. Of course, now that you have posted a cookbook approach to breaking these documents, they are a little less secure. I certainly wouldn't have taken the time to try to figure out the encryption before (I probably still won't).

In reality, we all know truly sensitive data should be locked by a "world class" encryption scheme, and then placed in a secure place. But what ciphers can't be broken? In the larger sense, what is a secure place? What man can build, man can destroy. The WP protection scheme keeps honest people honest, like car door locks. The hackers of this world are greatly outnumbered by people who either could not or would not attempt to break into a document that wasn't theirs in the first place.

Actually, I'm glad this was posted. This way, when someone comes into my office wailing about a WP document for which they have forgotten the password I will be able to help.

Kelly

DISCLAIMER: My opinions are mine. CSC can't have them.
-- 
Kelly Grant                (619) 225-8401
Computer Sciences Corp     ^^^^^^^^ Important: manta.UUCP won't get to me
4045 Hancock Street        "If you are given lemons.....see if you can trade for
San Diego, CA 92110         chocolate" - me

(Carl Ellison) (08/29/90)
In article <1190@manta.NOSC.MIL> (Kelly J. Grant) writes:
>I disagree that WP documents are "trivial" to decode. They are possibly
>trivial for the 'sci.crypt' people who have experience in breaking
>ciphers and the like, but for people who have no training or knowledge
>in the subject, I think WP locked documents are perfectly safe for
>reports or other "private" (but not classified) documents.

[ more of the same deleted ]

Kelly,

I can't speak for the original posters, but I was offended by WP's false claim. I don't care about the security of the encryption mechanism.

My mother has a china cabinet with an ornamental latch -- pretty and useless -- but the manufacturer didn't claim it was as good as a steel safe.

WP made that bogus claim -- and that was the unforgivable sin.
jkp@cs.HUT.FI (Jyrki Kuoppala) (08/29/90)
[ followups to and comp.os.msdos.apps ]

In article <1190@manta.NOSC.MIL>, grantk@manta (Kelly J. Grant) writes:
>I disagree that WP documents are "trivial" to decode. They are possibly
>trivial for the 'sci.crypt' people who have experience in breaking
>ciphers and the like, but for people who have no training or knowledge
>in the subject, I think WP locked documents are perfectly safe for
>reports or other "private" (but not classified) documents.

Well, I wouldn't say they're safe for anybody. Because 'unlocking' the documents in case of a forgotten password is useful, someone will probably soon write a program to 'unlock' the documents and release it for free distribution through something like comp.sources.misc. I probably would do it, if I used WP.

This is true for lots of other things in computer security, too. For example, Bridge (well, I don't know who owns the company manufacturing these beasts this week ;-) terminal servers and MAC-level bridges accept configuration commands to a magic UDP port and no access control is used. Well, the bridges themselves ask for a password to enter the configuration mode (local or global netmanager) but the 'global' netmanager is implemented by just sending UDP packets to a magic port, and the only difference between local and global netmanager is that a local manager can't send UDP commands from the bridge to other Bridge equipment (oh yes, another difference is that the global netmanager password isn't shown to the local manager). But normal Unix machines don't have the 'control', so they can issue 'global netmanager'-level commands, and the Bridges are even so friendly that they tell you all the passwords.
This 'access control' kind of resembles the beast in the Hitchhiker's Guide to the Galaxy which thinks it can't see anybody whose eyes are covered; it makes you wonder if the HGG was used as a design document ;-)

Similar things appear on many Sun workstations; people may think that it's good enough protection because not many people know about the vulnerabilities and those in the know should not tell others. However, if the problems are not discussed and solved, we are in deep trouble; often the documents don't point out the vulnerabilities (probably because of commercial reasons - it wouldn't look quite good if Bridge put in its documentation something like "By the way, this 'access control' mechanism isn't designed to really work, it's just there so we wouldn't get a bad reputation for not providing access controls.")

Just as for the WP someone will probably write (and probably many have already written) a program to open 'locked' documents, I have written some software for Bridge administration (in addition to sending those UDP packets, it can read files from a Bridge NCS/AT and function as a NCS/AT file server) because the software is useful. I am planning to announce that the software is available for anonymous ftp when I have it somewhat cleaned up. I don't know if the access control problems have been fixed in current software releases; they were there two years ago and the local representative was informed, so they might be fixed, but I wouldn't be so confident. People using Bridge equipment might ask their vendor if the problem still exists.

>Of course,
>now that you have posted a cookbook approach to breaking these documents,
>they are a little less secure.

And then again, maybe a lot more secure, since the problem is now widely known and the vendor probably will change the documentation to tell that the protection is not 'a real thing' and users wanting real privacy will have to use alternative methods.
>In reality, we all know truly sensitive data should be locked by a
>"world class" encryption scheme, and then placed in a secure place.
>But what ciphers can't be broken ? In the larger sense, what is a
>The WP protection
>scheme keeps honest people honest, like car door locks.

Yes, this is a point; in my opinion, however, the WP protection did a lot more harm than good since it was documented to be quite safe when it was not. The users were fooled into thinking that their car was locked when in fact it was not.

//Jyrki
nfs@cs.Princeton.EDU (Norbert Schlenker) (08/30/90)
In article <12163@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>One thing that came up at Crypto '90 was a short paper from Ms. Helen
>Bergen at Queensland U. in Australia. She noticed the 'locked
>document' commands in Word Perfect, used by all the secretaries in her
>dept., and looked to see how strong it was...
>
>Quote from letter received from WordPerfect Pacific:
>
>They confirmed that people have written programs to break the password.
>However, WordPerfect Corporation does not have such a program and
>therefore has no way of breaking it. They also pointed out that very
>few users would know how to write such a program.

They won't need to write it, will they? After all, very few users would know how to write WordPerfect. They just buy it shrinkwrapped.

>It is possible that the manual may be amended in a future edition to
>clarify the protection that a password gives. They recommend that
>anyone concerned about security may want to take higher precautions
>than the password protection.

Oh, jolly good! After all, there is "absolutely no way to retrieve the document" according to the WP manual.

I have a friend who works on a contract basis for a large Wall Street law firm supporting WP on a LAN. He's very interested in the program that I cobbled up in two hours from the Bergen/Caelli paper. I plan to share in the profits :-)

And now to the paper itself:

> FILE SECURITY IN WORDPERFECT 5.0
>
> H.A. Bergen School of Computing Science
> W.J. Caelli Information Security Research Centre
>...
> * BYTES 22 - 31. Of these 10 bytes, 22, 23, 28 are file and system
>dependent, but bytes 24, 26, 29, 30, 31 are constant with value 00.

Hmmm. What about bytes 25 and 27? Also, I should report that my version of WP 5.0 has a nonzero byte in byte 26.

> * BYTES 32 - 39 were constant for files prepared on the three
>different systems and contained
>
> 42 00 00 00 02 00 56 00

My WP 5.0 has different values (usually FF FF instead of 02 00) in bytes 36 and 37.
<remainder of excellent file description and encoding algorithm elided> All in all, a fine article. I probably shouldn't offer, but anyone who wants my ugly C hack for a decoder, including atrocious interactive dialogue to resolve ambiguities in the key via human guessing, should send me email. Norbert
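
For readers who want to see the shape of such a decoder, the decryption step in the Bergen/Caelli paper can be sketched as follows. This is an illustrative Python version written from the paper's description, not Norbert's C program. The function names are invented here, and reading bytes 4 - 5 as a little-endian offset is an inference that happens to match the paper's example (6B 02 gives 619, which is where the document text starts in that file).

```python
ENC_START = 16  # per the paper, encryption of a 5.0 file starts at byte 16


def text_offset(header: bytes) -> int:
    """Start of the document text, read from header bytes 4-5.

    Little-endian byte order is an assumption; it matches the paper's example.
    """
    return int.from_bytes(header[4:6], "little")


def decrypt(cipher: bytes, key: bytes, start: int = 0) -> bytes:
    """Undo the XOR scheme described in the paper.

    `cipher` is a run of encrypted bytes and `start` is the position of its
    first byte within the encrypted stream (0 = file byte 16 for a 5.0 file).
    """
    keylen = len(key)
    plain = bytearray()
    for n, c in enumerate(cipher):
        i = start + n
        seq = (keylen + 1 + i) & 0xFF   # ascending sequence, wraps FF -> 00
        plain.append(c ^ seq ^ key[i % keylen])
    return bytes(plain)


# Bytes from the paper's example file, whose key is BALMAINT.
header = bytes.fromhex("FF 57 50 43 6B 02 00 00 01 0A 00 00 6E 50 00 00")
tail = bytes.fromhex("31 0B 07 01 0C 07 01 E4 F3 F3 FF")   # file bytes 640-650

print(text_offset(header))                                # 619
print(decrypt(tail, b"BALMAINT", start=640 - ENC_START))  # b'\n0000000000'
```

The same function reproduces the paper's 4.2 example when given the bytes that follow the 6-byte header and the key A. It makes no attempt at the key checksum or at guessing ambiguous key bytes, which is where the interactive dialogue Norbert mentions would come in.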