[comp.text.tex] MSWORD --> LaTeX Translator source code

cthomas@f.adelaide.edu.au (02/18/91)

Hi Netters,

I had so many requests for my MSWORD-->LaTeX program that I have decided
to post it on the network.

This barebones program will compile under Turbo Pascal V5.0

I must acknowledge that some of the screen formatting routines were
taken from Jamson & Nameroff (1986) Turbo Pascal Programmer's Library
Osbourne McGraw-Hill.

USE
----
COMPILE the Pascal Code and run the  .exe file.  The program will prompt
for a MSWORD print file name and an output file name for the translated file.
Translation will begin.

The program has limited error checking but should not overwrite existing
files and will inform of common DOS errors (eg file not found).

PREPARE WORD files by:
  1.  setting margins to zero and page width to say 7 inches.  Using left
      justified text also simplifies things.
  2.  Use print options to select a printer driver such as an
      Epson FX80 or LQ800
  3.  Print the document to a file eg. example.txt and use this as the
      source file for the translation program.

THE TRANSLATION PROGRAM works by interpreting Epson compatible printer
codes and translating a limited set of these to appropriate LaTeX code.
This approach is limited and cannot easily be used to interprete undelined
text; tables etc also present dificulties.  But I do have some MSWORD
macros to help format basic LaTeX tables.

The present version will cope with:

  IBM PC Hi-Ascii characters
  Special LaTeX characters such as \ % { } etc.
  Math characters such as + -
  Italicized text
  Bolded text
  Superscripts and Subscripts
  Small caps format
  Characters bolded and to be printed in Epson Large type
    will be interpreted as \section{text}

The program can be easily expanded to cope with more complex procedures.
If someone does, can they please send me a copy.

I started an interactive version of the program to allow insertion of
special commands around any bolded text since I thought bolding was
widely used to denote some sought of heading.  I never got the chance to
complete it because my department put in a 3COM PC network with all the
printing aids I needed (I still use Latex for my manuals!!).

FINALLY I take no responsibility for any problems you might have

AND

I reserve the right to the basic concept on which the program is based;
I believe this approach has not been previously reported.  If you
distribute the code, please distribute it with my name and address

Dr Connor J Thomas
Department of Microbiology and Immunology
University of Adelaide
GPO Box 498
Adelaide 5001
South Australia

Email: cthomas@f.adelaide.edu.au


CUT HERE
---------------------------------------------------------------------------
program translate_latex (input, output);

uses crt, dos, turbo3;

const
     linefeed = ^J;
     carriagereturn = ^M;
     escape = ^[;
     emptystring = ^@;
     ctrl_A = ^A;
     space = ' ';
     si = ^O;              {Condensed print command}
     so = ^N;              {Enlarged print command}
     DC2 = ^R;             {Cancel condensed print}
     DC4 = ^T;             {Cancel enlarged print}

type
     textfil = text;
     filespec = string[13];
     string79 = string[79];
     greek = set of 224..234;  {IBM PC Hi ASCII Characters}


var
     inputfile, outputfile : textfil;
     filename : string79;
     ch : char;
     testio : integer;
     success, needsource, flag_bold : boolean;

{---------------------------------------------------------------------------}
procedure clean_window (x1, y1, x2, y2: integer);

  begin
    window (x1, y1, x2, y2);
    clrscr;
    window (1, 1, 80, 25);
  end;

{---------------------------------------------------------------------------}
procedure Set_Video (attribute: integer);

  var
     blinking,                {number to add for blinking}
     bold : integer;          {number to add for bold}

  begin
    blinking := (attribute and 4) * 4; { set blinking color based on MSB}
    if (attribute and 1) = 1 then      {set reverse video}
      begin
        bold := (attribute and 2) * 7;
        TextColor (1 + blinking + bold);
        TextBackground (3);
      end
    else                               {set normal video colors}
      begin
        bold := (attribute and 2) * 5 div 2;
        TextColor (7 + blinking + bold);
        TextBackground (0);
      end;
  end;

{---------------------------------------------------------------------------}
procedure put_string (out_string: string79;
               line, col, attrib: integer);

  begin
    set_video (attrib);
    GotoXY (col, line);
    write (out_string);
    set_video (0);
  end;


{---------------------------------------------------------------------------}
procedure put_centered_string (out_string: string79;
                             line, attrib: integer);

  begin
    put_string (out_string, line, 40 - length (out_string) div 2, attrib);
  end;

{---------------------------------------------------------------------------}
procedure put_prompt (out_string: string79;
                       line, col: integer);

  begin
    GotoXY (col, line);
    Clreol;
    put_string (out_string, line, col, 3);
  end;

{---------------------------------------------------------------------------}
procedure get_string (var in_string: string79;
                  line, col, attrib,
                         str_length: integer);

  const

    bell = 7;
    back_space =8;
    carriage_return = 13;
    escape = 27;
    right_arrow = 77;

  var
    oldstr : string79;
    in_char : char;
    I : integer;

  begin
    oldstr := in_string;
    put_string (in_string, line, col, attrib);
    for I := length (in_string) to str_length - 1 do
      put_string (' ',     line, col + I, attrib);
    GotoXY (col, line);
    read (kbd, in_char);
    if ord (in_char) <> carriage_return then
      in_string := '';
    while ord (in_char) <> carriage_return do
      begin
        if ord (in_char) = back_space then
          begin
            if length (in_string) > 0 then
              begin
                in_string[0] := chr(length (in_string) - 1);
                write (chr(back_space));
                write (' ');
                write (chr(back_space));
              end;
          end
        else if ord(in_char) = escape then
          begin
            read (kbd, in_char);
            if ord (in_char) = right_arrow then
              begin
                if length (oldstr) > length (in_string) then
                  begin
                    in_string[0] := chr(length (in_string) + 1);
                    in_char := oldstr[ord(in_string[0])];
                    in_string[ord(in_string[0])] := in_char;
                    write (in_char);
                  end
              end
            else
              write (chr(bell));
          end
        else if length (in_string) < str_length then
          begin
            in_string[0] := chr(length (in_string) + 1);
            in_string[ord(in_string[0])] := in_char;
            write (in_char);
          end
        else
          write (chr(bell));
        read (kbd, in_char);
     end;
   put_string (in_string, line, col, attrib);
   for I := length (in_string) to str_length - 1 do
     put_string (' ', line, col + I, 0);
  end;

{---------------------------------------------------------------------------}
procedure get_prompted_string (var in_string: string79;
                          inattr, str_length: integer;
                                     strdesc: string79;
                           descline, desccol: integer;
                                      prompt: string79;
                               prline, prcol: integer);

{sample call:
     get_prompted_string (NAME, 1 ,30, 'Student Name: ', 10, 2,
                          'Enter students'' full name.', 24, 2);
}

  begin
    put_string (strdesc, descline, desccol, 2);
    put_prompt (prompt, prline, prcol);
    get_string (In_string, descline, desccol + length (strdesc),
                 inattr, str_length);
    put_string (strdesc, descline, desccol, 0);
  end;

{---------------------------------------------------------------------------}
procedure read_char;
   begin
      read(inputfile,ch)
   end;

{---------------------------------------------------------------------------}
procedure ask_latex_command(ch: char);

   var
      latex_command: string79;

   begin
      latex_command := '';
      clean_window (1, 13, 80, 25);
      put_string ('Help! I don''t know LaTex for ', 15, 2, 2);
      put_string (ch, 15, 31, 3);
      get_prompted_string (latex_command, 1, 50, 'Enter LaTex equivalent: ',
          17, 2, 'Enter Latex command as well as queried character', 24, 2);
      write(outputfile, latex_command);
      clean_window (1, 13, 80, 25);
      put_centered_string ('Please wait: I''m still translating ', 18, 2);
   end;

{---------------------------------------------------------------------------}
procedure super_or_sub;  {Process Super- and Subscripts}
   begin
      read_char;
      case ch of
         '0', emptystring : write (outputfile, '$^{');
         '1',      ctrl_A : write (outputfile, '$_{');
      end; {* case *}
   end;

{---------------------------------------------------------------------------}
procedure h_tab;  {This filters out printer htab codes}
   begin
      read_char;
      read_char;
      write(outputfile, space);
   end;


{---------------------------------------------------------------------------}
procedure ESC_rubbish;  {All printer codes not translated}
   begin
   if ch = 'K' then
      begin
        read_char; read_char
      end
   else
      read_char
   end;

{---------------------------------------------------------------------------}
procedure escape_char;  {Escape precedes a lot of printer codes}
   begin
      read_char;
      case ch of
                   '4': write(outputfile,'{\it '); {request italics}
                   '5': write(outputfile,'\/}');   {end italics}
                   'E': write(outputfile,'{\bf '); {select bold face}
                   'F': write(outputfile,'}');     {close braces}
                   'g': write(outputfile,'{\sc '); {request small caps}
    'p', 'C', 'J', 'K': ESC_rubbish;               {unwanted esc code}
                   'T': write(outputfile,'}$');    {request math mode}
                   'S': super_or_sub;              {request super/subscript}
                   '$': h_tab;                     {remove horizontal tab}
      end; (* case *)
   end;

{---------------------------------------------------------------------------}
procedure greek_char;
          begin
          case ord(ch) of
               224: write(outputfile, '$\alpha$');
               225: write(outputfile, '$\beta$');
               226: write(outputfile, '$\gamma$');
               227: write(outputfile, '$\pi$');
               228: write(outputfile, '$\Sigma$');
               229: write(outputfile, '$\sigma$');
               230: write(outputfile, '$\mu$');
               231: write(outputfile, '$\tau$');
               232: write(outputfile, '$\Phi$');
               233: write(outputfile, '$\theta');
               244: write(outputfile, '$\Omega');
               235: write(outputfile, '$\delta');
          end; {case}
          end;

{---------------------------------------------------------------------------}
procedure large_print;  {if text is in Large size convert to \section}
   begin                {Happens when receive a SO code from printer output}
      flag_bold := false;
      read_char;
      if ch = escape then
         begin
            read_char;
            if ch = 'E' then
               begin
                  flag_bold := true;
                  write(outputfile, '\section{')
               end
            else
               write(outputfile, '{\large ');
         end
      else
         write(outputfile, '{\large ',ch);
   end;

{Printer drivers add printer codes in a nested fashion.  For example, if you}
{use word to bold text then change to enlarged print, then the printer codes}
{are nested in that order eg. ctrl-N Esc E bold large text Esc F ctrl-T.    }

{---------------------------------------------------------------------------}
procedure title;

   begin
     put_centered_string ('MICROSOFT WORD TO LATEX CONVERSION', 3, 3);
     put_centered_string ('Connor J. Thomas, February 1991', 5, 8);
     put_centered_string ('Department of Microbiology and Immunology', 6, 8);
     put_centered_string ('University of Adelaide, GPO Box 498, Adelaide', 7, 8);
     gotoXY (1, 8);
   end;

{---------------------------------------------------------------------------}
procedure err_message;

   var
      ch : char;

   begin
      put_string ('Error in file access: ', 14, 2, 4);
      case testio of
          $1 : put_string ('Filename does not exist! ', 16, 2, 2);
          $2 : put_string ('File not found! ',16, 2, 2);
         $F0 : put_string ('Disk write error occurred! ', 16, 2, 2);
         $F1 : put_string ('Disk is full! ', 16, 2, 2);
         $FF : put_string ('File has dissappeared!  Replace disk! ',
                             16, 2, 2);
        else   put_string ('Some problem with these files has occurred! ',
                             16, 2, 2);
      end; (* case *)
      put_string ('Correct Fault and press any key to continue ', 18, 2, 3);
      put_string ('or press <q>, <Q> to quit now ', 19, 2, 3);
      read(kbd, ch);
      if ch in ['Q','q'] then halt;
      if needsource then
        clean_window (1, 6, 80, 25)
      else
        clean_window (1, 12, 80, 25);
   end;

{---------------------------------------------------------------------------}
procedure open_files;

  begin
     needsource := true;
     repeat
        filename := '';
        if needsource then
           get_prompted_string (filename, 1, 13, 'MS Word Print Filename: ',
                                10, 2, 'Enter Filename with Extension', 24, 2)

        else
           get_prompted_string (filename, 1, 13, 'Latex Filename: ',
                                12, 2, 'Enter Filename with .tex Extension', 24, 2);

        writeln;
        {$I-}
        if needsource then
           begin
              assign(inputfile,filename);
              reset(inputfile);
              testio := ioresult;
              success := (testio = 0);
           end
        else
           begin
              assign(outputfile,filename);
              reset(outputfile);
              testio := ioresult;
              success := (testio > 0);
              if success then
                begin
                  assign (outputfile, filename);
                  rewrite (outputfile);
                end;
           end;
        {$I+}
        if not success then
           begin
              err_message;
           end
        else
           if needsource then
              begin
                 needsource := false;
                 success := false;
              end;
     until success;
  end;


{---------------------------------------------------------------------------}
procedure read_data;

begin
   ch := ' ';
   while not eof(inputfile)  do
      begin
         read_char;
         if (ord(ch) in [32..127]) then
            case ch of
              '#','$','&','%': write(outputfile,'\',ch); {Special Latex chars}
                  '_','{','}': write(outputfile,'\',ch);
              '>','<','+','=': write(outputfile,'$',ch,'$'); {Treat as Math}
                         else write(outputfile,ch);
            end; (* case *)
         if (ord(ch) < 32) then
            case ch of
               carriagereturn, linefeed: write(outputfile,ch);
                                 escape: escape_char;
                                     si: write(outputfile, '{\small ');
                                     so: large_print;  { \large or \section ? }
                                    DC2: write(outputfile, '}');
                                    DC4: begin
                                            if not flag_bold then
                                               write(outputfile, '}');
                                               { Don't add brace if text was bold }
                                         end;
            end; (* case *)
         if (ord(ch) > 127) then
            if (ord(ch) in [224..235]) then
               greek_char
            else
               ask_latex_command(ch);
      end; (* while *)
end; {procedure}
{---------------------------------------------------------------------------}
procedure close_files;

begin
   close(outputfile);
   close(inputfile);
end;

{---------------------------------------------------------------------------}
{                               MAIN                                        }
{---------------------------------------------------------------------------}
begin
   clrscr;
   title;
   repeat
     open_files;
     clean_window (1, 13, 80, 25);
     put_centered_string ('Please wait: I''m busy translating', 18, 2);
     read_data;
     close_files;
     clean_window (1, 8, 80, 25);
     put_centered_string ('File has been translated!!', 10, 2);
     put_centered_string ('Press Q or q to quit now,', 12, 2);
     put_centered_string ('or press any other key to translate another file', 14, 2);
     read (kbd, ch);
     clean_window (1, 6, 80, 25);
   until ch in ['Q','q'];
   clean_window (1, 6, 80, 25);
   put_centered_string ('Goodbye!!', 12, 2);
   gotoxy (1, 24);
end.