kc@rna.UUCP (Kaare Christian) (03/26/85)
I've recently acquired Microsoft Word for the Mac and I'm starting to use it instead of MacWrite. The virtues of Word are another story - they've been discussed here and elsewhere.

One of the best features of MacWrite is a great program called Write2Troff (w2t) that appeared on the net a few months ago. W2t converts a MacWrite document into a troff document, so one can print MacWrite documents using a laser printer or photo-typesetter attached to a Unix machine. Unfortunately the Word environment is missing this key feature. Is anybody working on a similar program for Word? A grad student here at Rockefeller U. desperately needs this program. Respond immediately if you have any leads.

I've investigated the internal format of Word files. It's unlike any other word processor document file that I've investigated. Word files contain three sections: a fixed length binary header, the text, and a variable length binary trailer. The header is somewhat decipherable - for example, the length of the text is encoded in one of the first few words. The text section of a Word document file is completely clean - it simply contains carriage return delimited paragraphs. There aren't any embedded control codes. The font, point size, margins, and other format information appear to follow the text in the trailing binary record. Thus one must decode the trailing binary record in order to recover the formatting information for the text.

The binary record is mysterious. Perhaps it is encoded, or otherwise processed to make life difficult. (Why?) I saved a short file twice, using the name 'a' the first time and the name 'b' the second. The text sections were the same, but the trailing binary records were vastly different. This suggests that there is encryption, or perhaps that there is random noise hiding the formatting information, or something even more devious. Does anyone know the format of these things?

Question 3.
Word (version 2) on the PC includes a very nice program that allows you to make your own printer drivers. You can decode an existing printer driver, change what you want, and then save the new driver. It's easy to use (for a programmer) and it is very powerful. Enough hooks are provided to add a custom driver for a laser printer such as our QMS. Does anyone know the format of the Word printer driver files on the Mac? Are there any tools for building your own Mac printer drivers?

The PC version of Word allows you to output Word documents in a printer independent manner, with complete positioning information in the output. It resembles (perhaps it's a rip-off of) the output of titroff. Is there any similar facility for the Mac?

The Microsoft customer hot-line (it's not very hot - no 800 number) wasn't any help with these questions. I didn't think they would be, but it was worth a call. Are any of you netlanders able to help? Microsoft, I know you're listening?

I'd be glad to serve as a clearinghouse for any info I receive. Replies should probably go directly to me unless they are of interest to the entire civilized world.

Thanks, and happy decoding.

Kaare Christian
cmcl2!rna!kc
212-570-7672
1230 E. 63rd.
NYC, NY 10021
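
The three-part layout described in the post above (fixed length header, carriage return delimited text, variable length trailer) can be sketched roughly as follows. This is only an illustration under stated assumptions: the post does not give the header size, the position of the text-length field, or its width, so `HEADER_SIZE`, `TEXT_LEN_OFFSET`, and `TEXT_LEN_FORMAT` below are hypothetical placeholders, and `make_sample` builds a synthetic file rather than a real Word document. The `differing_offsets` helper mirrors the 'a' versus 'b' experiment by listing where two trailers disagree.

```python
import struct

# All three constants are placeholders for illustration only; the post does
# not state the real header size, field offset, or field width.
HEADER_SIZE = 128        # hypothetical fixed header length in bytes
TEXT_LEN_OFFSET = 4      # hypothetical offset of the text-length field
TEXT_LEN_FORMAT = ">I"   # assumed 32-bit big-endian (the 68000 is big-endian)


def split_word_file(data):
    """Split raw file bytes into (header, paragraphs, trailer)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is shorter than the assumed header")
    header = data[:HEADER_SIZE]
    (text_len,) = struct.unpack_from(TEXT_LEN_FORMAT, header, TEXT_LEN_OFFSET)
    text_end = HEADER_SIZE + text_len
    if text_end > len(data):
        raise ValueError("text length field points past end of file")
    text = data[HEADER_SIZE:text_end]
    trailer = data[text_end:]
    # Paragraphs are delimited by carriage returns (0x0D), per the post.
    paragraphs = text.split(b"\r")
    if paragraphs and paragraphs[-1] == b"":
        paragraphs.pop()  # drop the empty piece after a final CR
    return header, paragraphs, trailer


def differing_offsets(a, b):
    """Return the offsets at which two byte strings differ.

    Offsets past the end of the shorter string count as differences.
    """
    n = min(len(a), len(b))
    diffs = [i for i in range(n) if a[i] != b[i]]
    diffs.extend(range(n, max(len(a), len(b))))
    return diffs


def make_sample(text, trailer):
    """Build a synthetic file in the assumed layout (for experimentation)."""
    header = bytearray(HEADER_SIZE)
    struct.pack_into(TEXT_LEN_FORMAT, header, TEXT_LEN_OFFSET, len(text))
    return bytes(header) + text + trailer
```

Splitting two saves of the same document this way and passing both trailers to `differing_offsets` would show whether the differences are scattered throughout the trailer or confined to a few fields, which might help distinguish deliberate scrambling from something more mundane such as uninitialized buffer contents.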
kc@rna.UUCP (Kaare Christian) (07/03/85)
During the past few months several people have enquired about the internal format of Microsoft Word document files on the Mac. Until now there has not been a satisfactory method of decoding these files.

Yesterday I received my Word 1.05 update kit. The update fixes a few miscellaneous bugs, improves the support for the LaserPrinter, and contains a new convert utility that can convert Mac Word documents to PC Word format and vice versa. Although Word's Macintosh document format is a mystery, the document format on the PC is much more understandable. Thus this convert utility may serve as an important first step in decoding Word documents on the Macintosh: convert a Mac document to PC format, then work from the better understood PC file.

I haven't tried this yet, and I probably won't for a week or so because of the holiday.

Kaare Christian
cmcl2!rna!kc