[comp.sources.mac] Man2RTF text file conversion tool

norman@d.cs.okstate.edu (Norman P. Graham) (10/09/90)

[Man2RTF text file conversion tool]

From the response I've found in my mailbox, it seems like a lot of
people out there need this simple tool.  Well, you asked for it and here
it is.  [If I had known that it was going out to the world I would have
used a finite state machine. <blush>]

If you want to use this as an MPW tool, but you don't have a C compiler,
don't despair: Just post a note to me and I'll email the MPW tool to
you.

As is usual for quick hacks, this tool has not passed through quality
assurance.  It has worked for the few files I've passed through it, but
those files have been the only test cases.  With some luck, Man2RTF will
also work for you.
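
With so few inputs tried, it may help to build extra test lines by hand.
The sketch below is not part of Man2RTF; the helper names MakeBold and
MakeUnderline are made up for illustration. It assumes the overstrike
conventions listed in the header comment of Man2RTF.c: bold is a character,
a backspace, and the same character again, and underline is an underscore,
a backspace, and the character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of Man2RTF.c.
   Each writes an overstruck copy of 'text' into 'out' and returns the
   number of characters written, not counting the terminating '\0'.
   'out' must have room for 3 * strlen(text) + 1 characters. */

size_t MakeBold (const char *text, char *out)
{
    size_t n = 0;

    assert (text != NULL && out != NULL);
    for ( ; *text != '\0'; text++ )
    {
        out [n++] = *text;      /* the character...            */
        out [n++] = '\b';       /* ...a backspace...           */
        out [n++] = *text;      /* ...and the character again. */
    }
    out [n] = '\0';
    return n;
}

size_t MakeUnderline (const char *text, char *out)
{
    size_t n = 0;

    assert (text != NULL && out != NULL);
    for ( ; *text != '\0'; text++ )
    {
        out [n++] = '_';        /* an underscore...            */
        out [n++] = '\b';       /* ...a backspace...           */
        out [n++] = *text;      /* ...and the character.       */
    }
    out [n] = '\0';
    return n;
}
```

Writing the resulting buffers to a file with fputs() gives input lines
that should exercise the bold and underline paths of the converter.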

BEWARE: This tool is written in ANSI C. If your C compiler doesn't grok
ANSI C, then you'll have to work a little to get it to compile-- but
only a little.

Good luck with it.

--Norm

---
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	Man2RTF.c
# This archive created: Mon Oct  8 15:46:15 1990
# By:	Roger L. Long (bytebug@dhw68k.cts.com)
export PATH; PATH=/bin:$PATH
echo shar: extracting "'Man2RTF.c'" '(22013 characters)'
if test -f 'Man2RTF.c'
then
	echo shar: will not over-write existing file "'Man2RTF.c'"
else
sed 's/^X//' << \SHAR_EOF > 'Man2RTF.c'
X/*
X    Copyright 1990 by Norman Graham, Brandywine Softworks. 
X    
X    Permission to use, copy, modify, and distribute this software for
X    any purpose and without fee is hereby granted, provided that the
X    above copyright notice appear in all copies and that both the
X    copyright notice and this permission notice and warranty disclaimer
X    appear in supporting documentation, and that the name of Brandywine
X    Softworks or Norman Graham not be used in advertising or publicity
X    pertaining to distribution of the software without specific, written
X    prior permission.
X    
X    Brandywine Softworks and Norman Graham disclaim all warranties with
X    regard to this software, including all implied warranties of
X    merchantability and fitness. In no event shall Brandywine Softworks
X    or Norman Graham be liable for any special, indirect, or consequential
X    damages or any damages whatsoever resulting from loss of use, data,
X    or profits, whether in an action of contract, negligence, or other
X    tortious action, arising out of or in connection with the use or
X    performance of this software.
X*/
X
X/******************************************************************************
X    File:   Man2RTF.c
X    Author: Norman Graham
X    Date:   2 September 1990
X
X    Usage:  Man2RTF [-f|-h8|-h7] <Man.file >RTF.File
X
X    Description:
X    
X        Man2RTF is a quick hack to convert a text file to Rich Text Format.
X        It converts some of the control sequences common in Un*x files into
X        the equivalent RTF character formatting. It converts the following
X        sequences:
X        
X            "a\ba"          =>  Bold 'a'
X            "a\ba\ba"       =>  Bold 'a'
X            "a\ba\ba\ba"    =>  Bold 'a'
X            "_\ba"          =>  Underlined 'a'
X            "_\ba\ba"       =>  Underlined Bold 'a'
X            "`"             =>  Typographer's Opening Single Quote
X            "'"             =>  Typographer's Closing Single Quote
X            
X        Any other string with a '\b' in it is written to stderr as an error.
X        The idea is that Man2RTF will tell you when you need to extend it
X        to handle control sequences not currently handled. 
X    
X        Any non-control sequence is just copied to the RTF file.
X
X    
X    Options:
X    
X        -f  Format the text with 10 point Courier and format the page
X            to display 66 lines on a vertical 8.5 inch by 11 inch page.
X            Margins are Top = 0.45 inch, Bottom = 0.45 inch, and
X            Left = 1.25 inches.
X
X        -h8 Format the text with 8 point Courier and format the page
X            to display two columns of 66 lines on a horizontal 8.5 inch
X            by 11 inch page. Margins are Top = 0.21 inch, Bottom = 0.21 inch, 
X            Left = 0.15 inch, and Right = 0.15 inch with 0.15 inch between
X            columns. This format is not as useful as '-h7'.
X
X        -h7 Format the text with 7 point Courier and format the page
X            to display two columns of 66 lines on a horizontal 8.5 inch
X            by 11 inch page. Margins are Top = 0.9 inch, Bottom = 0.45 inch, 
X            Left = 0.45 inch, and Right = 0.45 inch with 0.8 inch between
X            columns. This is a nice format because it provides a top margin
X            for binding and the space between columns is large enough that
X            you need not worry about the columns running together.
X        
X        If no option is specified, Man2RTF will only generate font style
X        information; it will not generate font name, font size, or 
X        page formatting information.
X    
X    
X    Caveats:
X        
X        If you use this tool on a Un*x box with the intention of 
X        downloading the resulting file to your Mac, you need to
X        be aware that the resulting file probably contains typographer's
X        opening and closing quotes. These characters have values > 127
X        (i.e. their high bits are set), thus you'll need to make special
X        arrangements to download the file. I'd suggest using mcvert
X        (available on sumex-aim) to convert the file to a macbinary
X        text file and then do a macbinary file transfer. As an alternative,
X        you could just nullify the typographer quote code by changing
X        kOpenSingleQuote to '`' and kCloseSingleQuote to '\''.
X
X
X    MPW build commands:
X        
X        In MPW, you can build Man2RTF by executing the following commands
X        directly from this file. You'll need to edit the Link command
X        to put the tool where you want it and to repair the line continuation
X        characters that undoubtedly will be munged by transport over the
X        internet.
X    
X        C  "{Active}"
X    
X        Link -w -c 'MPS ' -t MPST       ∂
X            "{Active}".o                ∂
X            "{CLibraries}"StdClib.o     ∂
X            "{CLibraries}"CInterface.o  ∂
X            "{Libraries}"Stubs.o        ∂
X            "{CLibraries}"CRuntime.o    ∂
X            "{Libraries}"Interface.o    ∂
X            -o {MPW}Tools:LocalTools:Man2RTF
X        
X        Delete "{Active}".o
X    
X    
X    Porting:
X    
X        This program is written in ANSI C. If your compiler does not
X        support the ANSI C standard, you will need to modify this code.
X        Pay attention to the function prototypes, the new style function
X        definitions, and the string concatenation [used in calls to fprintf()
X        and puts()]. I believe most C compilers now support enum types, 
X        but if yours doesn't you'll need to do some modifications here
X        as well.
X        
X        For MPW users: watch out for munged characters in the Link
X        command (from above) and in the definitions of kOpenSingleQuote
X        and kCloseSingleQuote. These definitions contain characters
X        that are > 127 (i.e. their high bit is set).
X
X
X    Bugs:
X    
X        As with most quick hacks, the internals of this program are almost
X        completely undocumented. But it is a very simple program and 
X        experienced C programmers should have no trouble following its
X        logic.
X        
X
X    The following are some simple test strings:
X
X        Clear text _U_n_d_e_r_l_i_n_e_ _t_e_x_t Clear text
X        
X        Clear text BBoolldd  tteexxtt Clear text
X        
X        Clear text _BB_oo_ll_dd_--_UU_nn_dd_ee_rr_ll_ii_nn_ee_  _tt_ee_xx_tt Clear text
X        
X        Clear text BBBaaaddd   ttteeexxxttt Clear text
X        
X        Clear text B_Ba_ad_d _ t_te_ex_xt_t Clear text
X        
X******************************************************************************/
X
X#include <stdio.h>
X#include <string.h>     /* strlen(), strcmp(), strcpy(), strcat() */
X
X
X/* Type Definitions... */
X
Xenum TypeStyle {
X    kUndefinedStyle,
X    kPlain,
X    kBold, 
X    kExtraBold, 
X    kUltraBold, 
X    kUnderline, 
X    kBoldUnderline
X};
Xtypedef enum TypeStyle TypeStyle;
X
Xenum PageFormat {
X    kUndefinedPage, 
X    kDefaultPage, 
X    kFullPage10Point, 
X    kHalfPage8Point, 
X    kHalfPage7Point
X};
Xtypedef enum PageFormat PageFormat;
X
Xtypedef unsigned char Char;
X
X
X/* Function Prototypes... */
X
XPageFormat ParseArguments (int, Char *[]);
X
Xvoid PrintHeader (PageFormat);
X
XTypeStyle WhatStyle (Char *);
X
Xint CollectPlain         (Char *, Char *);
Xint CollectBold          (Char *, Char *);
Xint CollectExtraBold     (Char *, Char *);
Xint CollectUltraBold     (Char *, Char *);
Xint CollectUnderline     (Char *, Char *);
Xint CollectBoldUnderline (Char *, Char *);
X
X
X/* Global Constants... */
X
X#ifdef applec
Xconst Char      kOpenSingleQuote  = 'T';    /* Replace T with Option-]       */
Xconst Char      kCloseSingleQuote = 'U';    /* Replace U with Shift-Option-] */
X#else
Xconst Char      kOpenSingleQuote  = 0xD4;
Xconst Char      kCloseSingleQuote = 0xD5;
X#endif
X
X
X/* Entry Point... */
X
Xint main (int argc, Char *argv[])
X{
X    int     indexInLine;
X    int     lengthInLine;
X    int     (*collectChars)(Char *, Char *);
X    
X    Char    *styleCommand;
X    Char    inLine [1024];      /* Actually, I believe these buffers need be */
X    Char    outLine [1024];     /*  only inLine [561], outLine [103], and    */
X    Char    pureChars [1024];   /*  pureChars [81]. But hey, memory is cheap. */
X    Char    *fontCommand;
X    
X    PageFormat  fmt;
X    
X    
X    /* Parse the command line arguments */
X    fmt = ParseArguments (argc, argv);
X    if ( fmt == kUndefinedPage )
X    {
X        fprintf (stderr, "Invalid Argument\n");
X        fprintf (stderr,
X                 "Usage: %s [-f | -h8 | -h7] <Man.file >RTF.File\n",
X                 argv[0]);
X        return 1;       /* Report the usage error to the caller. */
X    }
X    
X    /* Print the appropriate header. */
X    PrintHeader (fmt);
X    
X    /* Set up the font command string */
X    switch ( fmt )
X    {
X        case kFullPage10Point :
X            fontCommand = "\\f1\\fs20";
X            break;
X        
X        case kHalfPage8Point :
X            fontCommand = "\\f1\\fs16";
X            break;
X        
X        case kHalfPage7Point :
X            fontCommand = "\\f1\\fs14";
X            break;
X        
X        case kDefaultPage :
X        case kUndefinedPage :
X        default :
X            fontCommand = "";
X            break;
X    }
X    
X    /* Convert the file */
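X    while ( fgets ((char *) inLine, (int) sizeof inLine, stdin) != NULL )
X    {
X        /* fgets() keeps the newline that gets() used to drop; remove it. */
X        lengthInLine = strlen(inLine);
X        if ( lengthInLine > 0 && inLine [lengthInLine - 1] == '\n' )
X            inLine [--lengthInLine] = '\0';
X        indexInLine = 0;
X        lengthInLine += 1;                      /* Process the '\0' also. */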
X        
X        while ( indexInLine < lengthInLine )
X        {
X            switch ( WhatStyle (&inLine[indexInLine]) )
X            {
X                case kPlain :
X                    collectChars = CollectPlain;
X                    styleCommand = "";
X                    break;
X                    
X                case kBold :
X                    collectChars = CollectBold;
X                    styleCommand = "\\b";
X                    break;
X                
X                case kExtraBold :
X                    collectChars = CollectExtraBold;
X                    styleCommand = "\\b";
X                    break;
X                
X                case kUltraBold :
X                    collectChars = CollectUltraBold;
X                    styleCommand = "\\b";
X                    break;
X
X                case kUnderline :
X                    collectChars = CollectUnderline;
X                    styleCommand = "\\ul";
X                    break;
X
X                case kBoldUnderline :
X                    collectChars = CollectBoldUnderline;
X                    styleCommand = "\\b\\ul";
X                    break;
X
X                case kUndefinedStyle :
X                default :
X                    fprintf (stderr,
X                             "\nUnknown style returned from 'WhatStyle()'"
X                             "\nIndex is %d"
X                             "\nLine is \"%s\"\n",
X                             indexInLine + 1, inLine);
X                    inLine [indexInLine + 1] = '\0';
X                    fprintf (stderr,"Error is %s^--Here\n",inLine);
X                    return -1;
X            }
X            
X            indexInLine += collectChars (&inLine[indexInLine], pureChars);
X            
X            strcpy (outLine, "{\\plain");
X            strcat (outLine, fontCommand);
X            strcat (outLine, styleCommand);
X            strcat (outLine, " ");
X            strcat (outLine, pureChars);
X            strcat (outLine, "}");
X            
X            puts (outLine);
X        }           
X    }
X    
X    /* Print trailer */
X    puts ("}");
X
X    return 0;
X}
X
X
XPageFormat ParseArguments (int argc, Char *argv[])
X{
X    switch ( argc )
X    {
X        case 1 :
X            return kDefaultPage;
X        
X        case 2 :
X            if ( strcmp (argv[1], "-f") == 0 )
X                return kFullPage10Point;
X            else if ( strcmp (argv[1], "-h8") == 0 )
X                return kHalfPage8Point;
X            else if ( strcmp (argv[1], "-h7") == 0 )
X                return kHalfPage7Point;
X            else 
X                return kUndefinedPage;
X        
X        default :
X            return kUndefinedPage;
X    }
X}
X
Xvoid PrintHeader (PageFormat fmt)
X{
X    puts ("{\\rtf0\\mac\\deff1");
X    puts ("{\\fonttbl{\\f0 \\fswiss Helvetica;}{\\f1 \\fmodern Courier;}}");
X
X    switch ( fmt )
X    {
X        case kFullPage10Point :
X            puts ("\\paperw12240\\paperh15840\\margt648\\margb648"
X                  "\\margl1800\\margr0\\widowctrl\\ftnbj\\ftnrestart"
X                  "\\ftnstart1\\pgnstart1\\deftab720\\fracwidth\\sectd"
X                  "\\linemod0\\linex0\\cols1\\colsx863");
X            break;
X        
X        case kHalfPage8Point :
X            puts ("\\paperw15840\\paperh12240\\landscape\\margt144"
X                  "\\margb432\\margl144\\margr144\\widowctrl\\ftnbj"
X                  "\\ftnrestart\\ftnstart1\\pgnstart1\\deftab720"
X                  "\\fracwidth\\sectd\\linemod0\\linex0\\cols2\\colsx144");
X            break;
X        
X        case kHalfPage7Point :
X            puts ("\\paperw15840\\paperh12240\\landscape\\margt1440"
X                  "\\margb576\\margl720\\margr720\\widowctrl\\ftnbj"
X                  "\\ftnrestart\\ftnstart1\\pgnstart1\\deftab720"
X                  "\\fracwidth\\sectd\\linemod0\\linex0\\cols2\\colsx1080");
X            break;
X        
X        case kUndefinedPage :
X        default :
X            break;
X    }
X    
X    puts ("\\pard\\plain");
X}
X
XTypeStyle WhatStyle ( Char *line )
X{
X    Char    c1, c2, c3, c4, c5, c6, c7;
X    
X    c1 = line [0];
X    c2 = line [1];
X    c3 = line [2];
X    c4 = line [3];
X    c5 = line [4];
X    c6 = line [5];
X    c7 = line [6];
X    
X    if ( c1 == c3 && c3 == c5 && c5 ==c7
X              && c2 == '\b' && c4 == '\b' && c6 == '\b'
X              && c1 != '\0' )
X        return kUltraBold;
X    
X    else if ( c1 == c3 && c3 == c5
X              && c2 == '\b' && c4 == '\b'
X              && c1 != '\0' )
X        return kExtraBold;
X    
X    else if ( c1 == '_' && c3 == c5 
X              && c2 == '\b' && c4 == '\b' 
X              && c3 != '\0' )
X        return kBoldUnderline;
X
X    else if ( c1 == '_' && c2 == '\b' && c3 != '\0' )   /* else CollectUnderline() consumes nothing */
X        return kUnderline;
X
X    else if ( c1 == c3 && c2 == '\b' && c1 != '\0' )
X        return kBold;
X    
X    else if ( c1 == '\b' || (c1 != '\0' && c2 == '\b') )
X        return kUndefinedStyle;
X
X    else 
X        return kPlain;
X}
X
Xint CollectPlain (Char *in, Char *out)
X{
X    Char    c;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        switch ( c = *in++ )
X        {
X            case '\b' :
X                *--out = '\0';
X                return (in - startPos) - 2;
X            
X            case '\f' :
X                *out++ = '\\';
X                *out++ = 'p';
X                *out++ = 'a';
X                *out++ = 'g';
X                *out++ = 'e';
X                *out++ = ' ';
X                break; 
X                
X            case '`' :
X                *out++ = kOpenSingleQuote;
X                break;
X            
X            case '\'' :
X                *out++ = kCloseSingleQuote;
X                break;
X            
X            case '\0' :
X                *out++ = '\\';
X                *out++ = 'p';
X                *out++ = 'a';
X                *out++ = 'r';
X                *out++ = ' ';
X                *out   = '\0';
X                return in - startPos;
X            
X            case '{' :
X                *out++ = '\\';
X                *out++ = '{';
X                break;
X            
X            case '}' :
X                *out++ = '\\';
X                *out++ = '}';
X                break;
X            
X            case '\\' :
X                *out++ = '\\';
X                *out++ = '\\';
X                break;
X            
X            default :
X                *out++ = c;
X                break;
X        }
X    }
X}
X
Xint CollectBold (Char *in, Char *out)
X{
X    Char    c1, c2, c3;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        c1 = in [0];
X        c2 = in [1];
X        c3 = in [2];
X        
X        if ( c1 != c3 || c2 != '\b' || c1 == '\0' )
X        {
X            *out = '\0';
X            return in - startPos;
X        }
X        else
X        {
X            in += 3;
X            switch ( c1 )
X            {
X                case '`' :
X                    *out++ = kOpenSingleQuote;
X                    break;
X                
X                case '\'' :
X                    *out++ = kCloseSingleQuote;
X                    break;
X                
X                case '{' :
X                    *out++ = '\\';
X                    *out++ = '{';
X                    break;
X                
X                case '}' :
X                    *out++ = '\\';
X                    *out++ = '}';
X                    break;
X                
X                case '\\' :
X                    *out++ = '\\';
X                    *out++ = '\\';
X                    break;
X                
X                default :
X                    *out++ = c1;
X                    break;
X            }
X        }
X    }
X}
X
Xint CollectExtraBold (Char *in, Char *out)
X{
X    Char    c1, c2, c3, c4, c5;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        c1 = in [0];
X        c2 = in [1];
X        c3 = in [2];
X        c4 = in [3];
X        c5 = in [4];
X                
X        if ( c1 != c3 || c3 != c5
X             || c2 != '\b' || c4 != '\b'
X             || c1 == '\0' )
X        {
X            *out = '\0';
X            return in - startPos;
X        }
X        else
X        {
X            in += 5;
X            switch ( c1 )
X            {
X                case '`' :
X                    *out++ = kOpenSingleQuote;
X                    break;
X                
X                case '\'' :
X                    *out++ = kCloseSingleQuote;
X                    break;
X                
X                case '{' :
X                    *out++ = '\\';
X                    *out++ = '{';
X                    break;
X                
X                case '}' :
X                    *out++ = '\\';
X                    *out++ = '}';
X                    break;
X                
X                case '\\' :
X                    *out++ = '\\';
X                    *out++ = '\\';
X                    break;
X                
X                default :
X                    *out++ = c1;
X                    break;
X            }
X        }
X    }
X}
X
Xint CollectUltraBold (Char *in, Char *out)
X{
X    Char    c1, c2, c3, c4, c5, c6, c7;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        c1 = in [0];
X        c2 = in [1];
X        c3 = in [2];
X        c4 = in [3];
X        c5 = in [4];
X        c6 = in [5];
X        c7 = in [6];
X                
X        if ( c1 != c3 || c3 != c5 || c5 != c7
X             || c2 != '\b' || c4 != '\b' || c6 != '\b'
X             || c1 == '\0' )
X        {
X            *out = '\0';
X            return in - startPos;
X        }
X        else
X        {
X            in += 7;
X            switch ( c1 )
X            {
X                case '`' :
X                    *out++ = kOpenSingleQuote;
X                    break;
X                
X                case '\'' :
X                    *out++ = kCloseSingleQuote;
X                    break;
X                
X                case '{' :
X                    *out++ = '\\';
X                    *out++ = '{';
X                    break;
X                
X                case '}' :
X                    *out++ = '\\';
X                    *out++ = '}';
X                    break;
X                
X                case '\\' :
X                    *out++ = '\\';
X                    *out++ = '\\';
X                    break;
X                
X                default :
X                    *out++ = c1;
X                    break;
X            }
X        }
X    }
X}
X
Xint CollectUnderline (Char *in, Char *out)
X{
X    Char    c1, c2, c3;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        c1 = in [0];
X        c2 = in [1];
X        c3 = in [2];
X        
X        if ( c1 != '_' || c2 != '\b' || c3 == '\0' )
X        {
X            *out = '\0';
X            return in - startPos;
X        }
X        else
X        {
X            in += 3;
X            switch ( c3 )
X            {
X                case '`' :
X                    *out++ = kOpenSingleQuote;
X                    break;
X                
X                case '\'' :
X                    *out++ = kCloseSingleQuote;
X                    break;
X                
X                case '{' :
X                    *out++ = '\\';
X                    *out++ = '{';
X                    break;
X                
X                case '}' :
X                    *out++ = '\\';
X                    *out++ = '}';
X                    break;
X                
X                case '\\' :
X                    *out++ = '\\';
X                    *out++ = '\\';
X                    break;
X                
X                default :
X                    *out++ = c3;
X                    break;
X            }
X        }
X    }
X}
X
Xint CollectBoldUnderline (Char *in, Char *out)
X{
X    Char    c1, c2, c3, c4, c5;
X    Char    *startPos;
X    
X    startPos = in;
X    
X    while ( 1 )
X    {
X        c1 = in [0];
X        c2 = in [1];
X        c3 = in [2];
X        c4 = in [3];
X        c5 = in [4];
X        
X        if ( c1 != '_' || c2 != '\b' || c4 != '\b' || c3 != c5 || c3 == '\0' )
X        {
X            *out = '\0';
X            return in - startPos;
X        }
X        else
X        {
X            in += 5;
X            switch ( c3 )
X            {
X                case '`' :
X                    *out++ = kOpenSingleQuote;
X                    break;
X                
X                case '\'' :
X                    *out++ = kCloseSingleQuote;
X                    break;
X                
X                case '{' :
X                    *out++ = '\\';
X                    *out++ = '{';
X                    break;
X                
X                case '}' :
X                    *out++ = '\\';
X                    *out++ = '}';
X                    break;
X                
X                case '\\' :
X                    *out++ = '\\';
X                    *out++ = '\\';
X                    break;
X                
X                default :
X                    *out++ = c3;
X                    break;
X            }
X        }
X    }
X}
SHAR_EOF
echo shar: 93 control characters may be missing from "'Man2RTF.c'"
if test 22013 -ne "`wc -c < 'Man2RTF.c'`"
then
	echo shar: error transmitting "'Man2RTF.c'" '(should have been 22013 characters)'
fi
fi # end of overwriting check
#	End of shell archive
exit 0
---
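
The Options section of the header comment in Man2RTF.c gives page sizes
and margins in inches and font sizes in points, while the strings in
PrintHeader() and the font commands in main() carry plain numbers. RTF
expresses page measurements in twips (1440 per inch) and the \fs font size
in half-points. The sketch below shows that arithmetic; the helper names
InchesToTwips and PointsToFs are hypothetical and do not appear in
Man2RTF.c.

```c
#include <assert.h>

/* Hypothetical helpers, not part of Man2RTF.c. */

/* RTF page measurements are in twips: 1440 twips per inch. */
long InchesToTwips (double inches)
{
    assert (inches >= 0.0);
    return (long) (inches * 1440.0 + 0.5);      /* Round to the nearest twip. */
}

/* The RTF \fs control word takes the font size in half-points. */
int PointsToFs (int points)
{
    assert (points > 0);
    return points * 2;
}
```

InchesToTwips (8.5) and InchesToTwips (11.0) give 12240 and 15840, the
\paperw and \paperh values written for -f, and PointsToFs (10) gives 20,
matching \fs20. The documented -h7 top margin of 0.9 inch works out to
1296 twips, whereas the header string uses \margt1440, so the comment and
the code may have drifted apart for the -h7 and -h8 margins.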