[net.micro.pc] Development of the Navy DIF standard

ted@imsvax.UUCP (Ted Holden) (01/05/86)

               Standards are  a funny business.  Some are simply published,
          allowing manufacturers to implement  them as  they each  see fit.
          Has anyone  out there  ever tried connecting a Fortune 32:16 to a
          PC with "standard" RS232 cable and watched all the neat  fire and
          flames?  Other standards, such as 2780, get implemented in such a
          way that every other manufacturer can do it with IBM,  but no two
          of them  can do it with each other.  Neat, huh?  To my knowledge,
          the Navy DIF effort represents the ONLY instance  I've ever heard
          of in which an actual real-world ability was hammered out AMONGST
          vendors at the same time at which the standard involved was being
          developed.  

               The Navy  had several  objectives in mind in wanting the DIF
          standard in REALITY rather than  just  on  paper.    Foremost was
          guaranteeing that  whole bases  of documents  don't get lost when
          organizations switch vendors for whatever reason  or, conversely,
          preventing  organizations  from  being  locked  into  one  vendor
          forever.  Second  was  the  intriguing  idea  of  transmission of
          documents over  true multi-vendor  networks.   There was also the
          notion of sending processible  documents  around  the  world over
          communication lines  as an alternative to mailing paper printouts
          around.  It was viewed as highly desirable to have a system which
          transmitted both page images and editing codes accurately.  Some
          of what  was involved  may be  viewed from part of one of the DIF
          writing guidelines which were being passed around at the time.

          .................................................................
          .................................................................


               ~    Leading Zeroes.  The parameters in DIF codes  are to be
          regarded as  integer numbers and may contain leading zeroes.  The
          selective parameters in the 13 original codes which were obtained
          from  ANSI  X3.64  are  no  exceptions.  DIF import programs must
          allow for the possibility of leading zeroes in all parameters.
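
          A minimal Python sketch of tolerant parameter parsing follows.
          It assumes parameters arrive as a semicolon-separated string of
          decimal digits, as in ANSI X3.64 control sequences.  The name
          parse_params is hypothetical and not part of the DIF, and empty
          (defaulted) parameters are treated as errors here for brevity.

```python
def parse_params(raw):
    """Parse a parameter string such as "007;012" into integers.

    Assumes semicolon-separated decimal fields; leading zeroes are
    accepted and simply ignored by the base-10 conversion.
    """
    values = []
    for field in raw.split(";"):
        if not field.isdigit():
            raise ValueError("bad parameter: %r" % field)
        values.append(int(field, 10))
    return values
```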

               ~    Indent or TLM.    A  large  number  of  word processors
          incorporate  indenting  in  such  a  manner  that  the indent key
          carries the cursor forward to the next tab stop and sets the wrap
          point at that position.  Other word processors will have a "W" on
          the ruler line.  When this point is  reached by  tabbing, it sets
          the wrap point to that position.  Other systems, most notably the
          Datapoint and Muse word processors, have an inset  key.   This key
          sets  a  temporary  wrap  point  at  the  cursor  location.   The
          exceptions  to  the  rule  are  numerous.    The   difficulty  of
          translating  other  notions  of  indenting  directly into the tab
          location methodology  is  too  great  to  allow  that  to  be the
          standard.   DIF uses the notion of TLM (Temporary Left Margin) in
          which an  absolute column  position is  specified.   As a result,
          programmers writing  DIF filters  for word processors that indent
          to the next tab stop need to  be aware  that a  TLM command might
          require translation  into SEVERAL rather  than one indent.  How
          many depends on the number of  tab  stops  that  lie  between the
          present cursor position and the value of the TLM command.
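
          The counting rule above can be sketched in Python.  This is an
          illustration only:  indents_for_tlm and its list-of-columns
          ruler model are assumptions, and each indent is assumed to
          advance to the next tab stop strictly right of the cursor.

```python
def indents_for_tlm(cursor_col, tlm_col, tab_stops):
    """Count the indent keystrokes needed to move the wrap point to tlm_col.

    Returns None if no tab stop sits exactly at tlm_col, in which case
    the filter would have to insert one before indenting.
    """
    if tlm_col not in tab_stops:
        return None
    # One indent per tab stop lying between the cursor and the TLM column.
    return sum(1 for stop in tab_stops if cursor_col < stop <= tlm_col)
```

          With tab stops at 5, 10, 20 and 30 and the cursor at column 1, a
          TLM of 20 becomes three indents rather than one.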

               ~    Indents with no corresponding tab stops.  Most word
          processors require tab stops for indenting, whereas there is no
          guarantee that they will be there in DIF files.  Other word
          processors don't use them.  VistaWord and Muse, again, are
          examples of systems which don't use tab stops for indenting
          and which wouldn't need a tab stop set at column forty, say,
          to indent to column forty.  DIF files taken from these systems
          will reflect this.  DIF import filters for
          word processors  which require a tab stop at column N in order to
          indent to column N must insert tab stops rather than assume them.
          They  must  scan  all  DIF  files  on  import for points at which
          indenting actually occurs and add these columns to every ruler as
          tab stops.   The  DIF filter must then issue an extra perform tab
          to skip over a  tab  stop  which  was  inserted  for  purposes of
          indenting  when  a  real  tab  occurs THROUGH that column.  A DIF
          import filter for a system such as VistaWord  or Muse  which does
          not use tab stops for indenting must be able to handle indents to
          points several columns beyond  the present  cursor position.   If
          need be, the import filter must provide tabbing or spacing to the
          point of such indents. 
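
          A rough Python sketch of the pre-scan for a target that needs a
          tab stop at every indent column.  The ruler model (a list of
          tab-stop columns) and both function names are hypothetical.

```python
def add_indent_stops(rulers, indent_cols):
    """Return new rulers with every indent column added as a tab stop.

    Also returns, per ruler, the set of stops that were inserted, so
    the filter knows which stops a real tab has to skip over.
    """
    new_rulers, inserted = [], []
    for stops in rulers:
        added = set(indent_cols) - set(stops)
        new_rulers.append(sorted(set(stops) | added))
        inserted.append(added)
    return new_rulers, inserted


def tabs_needed(cursor_col, target_col, stops):
    """Perform-tab commands needed to reach target_col, counting any
    inserted stops that lie in between."""
    return sum(1 for s in stops if cursor_col < s <= target_col)
```

          If a ruler had stops at 10 and 30 and the document indents to
          column 20, a real tab from column 10 to column 30 now takes two
          perform-tab commands instead of one.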

               ~    Indents to the present  cursor  position.    Those word
          processors with  an indent  button that carries the cursor to the
          next tab stop cannot indent to  the present  cursor position even
          if a  tab stop  exists at that location.  This kind of thing will
          be produced by the DIF  filters  of  several  real  systems.   An
          example might  be a hard return followed by four blank spaces and
          then an indent to column five.  A DIF import filter must  be able
          to reposition  the file  pointer.   In this  example, the pointer
          would have to back up over the last character, be  it a  tab or a
          space, to make room for the required indent.
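
          A small Python sketch of the back-up step, under simplifying
          assumptions:  the line so far holds only one-column characters
          (tabs would need real column accounting), columns are 1-based,
          and "<INDENT>" is a hypothetical marker for the native indent.

```python
def place_indent(line_chars, indent_col):
    """Emit an indent to indent_col on a system whose indent key jumps
    to the next tab stop strictly right of the cursor."""
    out = list(line_chars)
    cursor = len(out) + 1              # 1-based column of the cursor
    if cursor == indent_col and out and out[-1] == " ":
        out.pop()                      # back up over the last space
    out.append("<INDENT>")
    return out
```

          For the example above (four spaces, then an indent to column
          five) the sketch drops the fourth space so that the indent key
          has a tab stop ahead of it to land on.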

               ~    Text and decimal tabs.  The DIF is written to
          correspond to a word processor with two kinds of tab stops, text
          and decimal, and two kinds of keys, perform text tab and perform
          decimal tab.  The intent of a DIF Perform Decimal Tab is for the
          cursor to move to the next decimal tab stop, bypassing any
          intervening text tab stops.  Conversely, a perform text tab
          command in a DIF file should be interpreted as requiring the
          cursor to move to the next text tab stop, possibly bypassing
          intervening decimal tab stops.  A DIF import filter for a given
          word processor may or may not have to insert intervening tabs of
          the proper kind to achieve this effect.
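
          One way to picture the "intervening tabs" is the Python sketch
          below.  It assumes a target whose tab key halts at every stop
          regardless of kind;  native_tabs_for and the (column, kind)
          representation are hypothetical.

```python
def native_tabs_for(kind, cursor_col, stops):
    """Count native tab keystrokes that emulate one DIF perform-tab.

    stops: list of (column, kind) pairs, kind being "text" or "decimal".
    Returns how many keystrokes reach the next stop of the wanted kind,
    or None if there is no such stop right of the cursor.
    """
    count = 0
    for col, stop_kind in sorted(stops):
        if col <= cursor_col:
            continue
        count += 1
        if stop_kind == kind:
            return count
    return None
```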

               ~    In the same vein, DIF export filters must never produce
          DIF files in which the same column is concurrently used both as a
          decimal and a text tab.  This will require pre-scanning for a DIF
          export filter for a Wang-like word processor (which uses only one
          kind of tab stop and multiple kinds of tab buttons).
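
          The pre-scan might look roughly like this in Python.  The input
          is assumed to be the (column, kind) pairs of every tab performed
          under one ruler;  conflicting_columns is a hypothetical name.

```python
def conflicting_columns(tab_uses):
    """Return the columns used as both a text and a decimal tab."""
    kinds = {}
    for col, kind in tab_uses:
        kinds.setdefault(col, set()).add(kind)
    return sorted(col for col, ks in kinds.items() if len(ks) > 1)
```

          Any column the function returns would have to be resolved (for
          instance by starting a new ruler) before the DIF file is written.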

               ~    Hanging the wrong man.  One might wonder,  "why so much
          fuss over  tabs and indents?"  The DIF is an attempt to solve the
          majority  of  media  conversion   problems  pertaining   to  word
          processing documents.   Perfection is not an overall goal, nor is
          it attainable, since the number of things which could go wrong in
          such an  effort is  essentially infinite.  Substantial effort has
          been exerted to ensure that nothing ever goes wrong in such a way
          as to  change the  meaning of a document.  Ending up on the wrong
          column in the case of tabs or indents  falls into  this category.
          In  the  case  of  tables,  mis-positioning  could result in this
          year's numbers appearing as  last  year's,  profits  appearing as
          losses, or  debits appearing as credits, etc.  Picture a memo from
          the territorial governor  to  the  prison  warden  which includes
          lists of  prisoners to be released and prisoners to be hanged, in
          tabular form, thus:

             TO BE RELEASED        TO BE HANGED

             Tom Smith             Jim Larson
             Dave Brown            Don Jones
             Rick Jackson          Terry Abrams

             Now picture  this memo  being transmitted  from the governor's
             office to the prison via DIF and one of the DIF filters losing
             track of a tab or a tab stop....

             ~  Wrapping and then tabbing.   Since  no two  word processors
          use the  same wrapping mechanism, the cursor position following a
          wrap may vary from a source  to a  target machine.   Any document
          which contains  a tab  after a  soft return  or wrap could easily
          lose its meaning on the target  system with  the tab  sending the
          cursor to the wrong place.  This situation should be flagged as a
          fatal error by DIF export filters, and the user should be told to
          rekey the document without the wrap-tab combination.
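
             A Python sketch of the check, assuming the export filter sees
             the document as a stream of "TEXT", "TAB", "SOFT_RETURN" and
             "HARD_RETURN" tokens (a hypothetical model).  It treats any
             tab between a soft return and the next hard return as a
             wrap-tab combination, which is one reading of the rule.

```python
def find_wrap_tab(tokens):
    """Return the index of the first tab that follows a soft return
    with no hard return in between, or None if there is none."""
    after_wrap = False
    for i, tok in enumerate(tokens):
        if tok == "SOFT_RETURN":
            after_wrap = True
        elif tok == "HARD_RETURN":
            after_wrap = False
        elif tok == "TAB" and after_wrap:
            return i
    return None
```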

             ~  Formatting functions.   The  rule requiring  DIF line
          formatting functions  to  be  implemented  at  the  point immediately
          following  the  next  break  function  is crucial.  The functions
          involved are those which are changed  with a  ruler change:   tab
          stops,  left  and  right  margins,  and  line spacing.  There are
          exceptions.  Some word processors  allow  one  or  more  of these
          functions  to  be  set  at  the  beginning  of the last line of a
          paragraph with the intent  of  establishing  new  values  for the
          first line  of the  NEXT paragraph.  Without the convention which
          DIF uses, these new values would affect the PRESENT paragraph on
          most target systems, producing an undesirable result.
          Consequently, some DIF export filters will need to perform an
          inversion in order to produce the desired results.

             In the  conventional sense  of word  processing, a hard return
          will immediately precede a ruler change.   The  DIF export filter
          should  switch  this  hard  return  to  follow  the DIF functions
          generated by the ruler line, enabling  the changes  to occur where
          intended.    In  the  rare  instance  in  which a ruler change is
          neither preceded nor followed  by a  hard return,  the DIF export
          filter  should  add  a  hard return immediately following the DIF
          functions generated by the ruler  change.    This  adds  an extra
          blank line  to the document on a target system, but is the lesser
          of  evils.  It  guarantees  that  the  ruler  change  is effected
          immediately on  the target  system, rather  than two lines or two
          pages down.
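
             The hard-return handling can be sketched in Python as below.
             The token names are hypothetical:  "HR" is a hard return and
             "RULER" stands for all the DIF functions generated by one
             ruler change;  anything else is treated as text.

```python
def place_ruler_changes(tokens):
    """Reorder tokens so every ruler change is followed by a hard return."""
    out = []
    for i, tok in enumerate(tokens):
        if tok != "RULER":
            out.append(tok)
            continue
        followed = i + 1 < len(tokens) and tokens[i + 1] == "HR"
        if out and out[-1] == "HR":
            out.pop()                  # move the preceding hard return...
            out += ["RULER", "HR"]     # ...so that it follows the ruler
        elif followed:
            out.append("RULER")        # already followed by a hard return
        else:
            out += ["RULER", "HR"]     # neither: add one (extra blank line)
    return out
```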

             ~  Centering function.   With  most word  processors, the only
          things which  normally get between a hard return and a center are
          the  graphic  rendition  functions:    overstrike,  emphasis, and
          underline.  A DIF export filter should, when it encounters one of
          these, set a flag and implement the function  only when  the next
          text character is encountered.
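
             A simplified Python sketch of the flag-and-defer idea.  The
             token names are hypothetical, and the on/off pairing of
             rendition functions is ignored for brevity.

```python
RENDITIONS = {"OVERSTRIKE", "EMPHASIS", "UNDERLINE"}
CONTROLS = {"HR", "CENTER"}


def defer_renditions(tokens):
    """Hold graphic rendition functions back until the next text token."""
    pending, out = [], []
    for tok in tokens:
        if tok in RENDITIONS:
            if tok not in pending:     # set the flag once
                pending.append(tok)
        elif tok in CONTROLS:
            out.append(tok)
        else:                          # a text token: emit the flags first
            out.extend(pending)
            pending = []
            out.append(tok)
    out.extend(pending)                # nothing was lost if no text followed
    return out
```

             The effect is that the center function ends up immediately
             after the hard return, with the rendition function applied
             just before the text it governs.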

             ~  Translating meaning  and intent.   Most programmers writing
          DIF filters will encounter several DIF features to  which nothing
          on  a  word  processor  exactly  corresponds.    The  idea  is to
          translate these items into something which  will look  correct on
          the  printed  page  and  will  have some editability, albeit not
          exactly in the same manner as  originally created.   For example,
          there are  no top, header, bottom, or footer margins in some word
          processors,  which,  nonetheless,  have  multi-line  headers  and
          footers.   The DIF  import filter could, in this case, translate a
          top margin of N into  (N-1)  blank  lines  above  the  text  in a
          header.

             ~  Severe  word  processor  limitations.    The  DIF  involves
          opening the doors to  a wider  world than  that of  your own word
          processor.  Lack of a minor DIF function such as a soft-hyphen is
          not serious.   These  usually  occur  infrequently  and  a simple
          substitution is  available (an  ordinary hyphen).   A limitation,
          however, such as a maximum of 100 ruler lines or 20 tab  stops on
          a ruler  line, might easily cause problems for the user of such a
          system  when  receiving  documents  from  other  word  processing
          systems via DIF.  DIF-like features should be included as a
          minimum in the word processing offerings of the majority of
          manufacturers in order to retain a competitive edge.

             ~  The DIF and the next generation of word processors.  In the
          same  vein,  it  seems  reasonable  to  suggest  that   the  next
          generation  of   word  processors   be  written  with  particular
          attention to  the DIF.   Any  feature which  has caused insoluble
          problems for DIF filter writers should be a candidate for correction in
          the native word processor. Two examples  of such  problems are as
          follows:

             a. On  most  word  processing  systems,  numbers  following  a
                decimal tab may not  proceed to  the left  any further than
                the last  previous tab  stop.  This could cause problems if
                that last tab  stop  were  one  inserted  by  a  DIF import
                program for  purposes of  indenting and  were irrelevant to
                the decimal tab stop at hand.

             b. On some word processors,  graphic rendition  functions such
                as underscore  or bold face appear on the screen as special
                characters and take up  space.   This could  force a missed
                tab on  a document  being imported via DIF from a system on
                which such functions don't take up space.

             ~  Error messages.  DIF programs should generate fatal error
          messages for roughly the following kinds of reasons:

             a. Non-recoverable  situations  such  as  a CSI sequence which
                fails to end in a recognizable manner.

             b. A situation which would  indicate that  text had  been lost
                during  communications.    For  instance,  two  emphasis-on
                 commands with no intervening  emphasis-off  would indicate
                that,  in  all  likelihood, text containing the intervening
                emphasis-off command had been lost during transmission.

             c. A  situation  which  might  cause  ambiguity  regarding the
                meaning of a document, such as a wrap-tab sequence.

             Other kinds of error situations should be flagged as non-fatal
          errors and allow processing to continue.
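
             Case (b) lends itself to a short Python sketch.  The token
             names "EMPH_ON" and "EMPH_OFF" and the function name are
             hypothetical;  a real filter would raise its fatal error at
             the index returned.

```python
def first_lost_text_index(tokens):
    """Return the index of an emphasis-on that arrives while emphasis
    is already on (a sign that text was lost), or None."""
    emphasis_on = False
    for i, tok in enumerate(tokens):
        if tok == "EMPH_ON":
            if emphasis_on:
                return i
            emphasis_on = True
        elif tok == "EMPH_OFF":
            emphasis_on = False
    return None
```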

             ~  Left margins.  Three kinds of  left margins  normally occur
          in word  processing systems:  an indent or temporary left margin;
          a printing offset which usually is specified in a print  menu and
          applies to  the entire  document; or a variable left margin which
          normally appears as a letter "L"  on a  ruler line.   It  is only
          this last item which corresponds to the notion of a variable left
          margin in the DIF.  For programmers writing DIF filters  for word
          processors with  only the  print offset  type of left margin, the
          following approach is recommended.  On import:  pre-scan the
          entire incoming DIF document for the least left margin (LLM)
          which actually occurs, subtract one from LLM, put out LLM as the
          printing offset, and subtract LLM from every left and right
          margin, every tab stop, and every indent parameter in the
          document.  Left margins which are still different from 1 after
          all this must be effected by indenting.  On export:  add the
          printing offset to every left and right margin, every tab stop,
          and every indent parameter in the DIF document being created.
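
             The arithmetic can be sketched in Python.  This assumes the
             import step shifts every column parameter left by the
             printing offset, mirroring the export step;  the function
             names and the flat lists of columns are hypothetical.

```python
def import_offsets(left_margins, columns):
    """Derive a printing offset from the least left margin (LLM).

    left_margins: every left margin value in the incoming DIF document.
    columns: every other column parameter (right margins, tab stops,
    indents).  Returns (offset, shifted_left_margins, shifted_columns).
    """
    offset = min(left_margins) - 1          # LLM minus one
    return (offset,
            [m - offset for m in left_margins],
            [c - offset for c in columns])


def export_columns(offset, columns):
    """Add the printing offset back to every column parameter."""
    return [c + offset for c in columns]
```

             With left margins of 11 and 16, the offset is 10 and the
             margins become 1 and 6;  the 6 is "still different from 1"
             and would have to be effected by indenting.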