[net.text] The Navy Document Interchange Format

ted@imsvax.UUCP (Ted Holden) (10/26/85)

     The Navy Document Interchange Format (DIF) is the first genuinely serious
effort toward freely interchanging entire word-processing documents among
different office-automation devices and PC and UNIX word-processing systems.
It can be counted on to preserve not only the proper appearance (and hence
often the meaning) of documents, but also their word-processing functions.
The DIF is therefore also the first step in the direction of literacy without
paper, since it takes little or no imagination to picture its use in
conjunction with electronic mail systems.

     Attempts thus far at document conversion have proceeded in two or three
directions, all of which are markedly inferior to the Navy DIF.  There are
contractors who write one-to-one conversion routines (e.g. Wang to Lanier) for
popular systems.  Using these people's services is expensive, inconvenient, and
very limited in potential, since it is essentially an n**2 proposition to try to
have one-to-one conversions for ALL word-processing systems, and none of these
services provides for more than a small fraction of what's actually out there.
There is the "magic black box" approach wherein the user pays $7000 to $20000
for the black box and an extra $500 or so for each set of one-to-one soft-
ware conversion routines to run on it.  Again, this approach is expensive and
limited.  And, there is IBM's DCA/DISOSS approach, which many feel is 
important since it offers the promise of being able to exchange documents 
between systems having it and IBM mainframes.
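
     To put rough numbers on the one-to-one approach: with n systems, a full
set of one-way conversion routines numbers n*(n-1), while a common intermediate
format needs only one routine into it and one out of it per system, or 2*n.
The sketch below is plain arithmetic; the function names are made up for
illustration, and the sample counts are arbitrary.

```python
def pairwise_routines(n: int) -> int:
    """One-way converters needed so every system can reach every other one."""
    return n * (n - 1)


def intermediate_routines(n: int) -> int:
    """Converters needed with a shared intermediate format: one in, one out."""
    return 2 * n


if __name__ == "__main__":
    # Sample system counts are arbitrary, chosen only to show the growth.
    for n in (5, 20, 60):
        print(n, pairwise_routines(n), intermediate_routines(n))
    # prints:
    # 5 20 10
    # 20 380 40
    # 60 3540 120
```

At 60 systems the pairwise count is 3540 routines against 120 for a shared
format, which is the whole case for an interchange format in one line.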

     Unfortunately, DCA/DISOSS has a severe limitation as an intermediate
file structure for the exchange of word-processing documents.  DCA does not
correspond to a reasonable word processor's file structure;  it corresponds to
the functionality of a 1965 Selectric typewriter.  Picture the manner in which
the name "John" is bold-faced in any
reasonable 1985 word-processor's file structure;  there is a code for "bold-
face on", followed by the name "John" and then a code for "bold-face off".  
The same is true of the Navy DIF, which functions entirely like the file
structure of a reasonable full-featured 1985 word-processor.  But, in DCA,
the sequence would go "John", followed by a code meaning "back-space 4", and
then "John" again;  exactly the way you'd do it on a typewriter.  The whole
situation just gets worse from there;  it is basically not possible to write 
totally accurate translation routines from Vendor A's file structure to DCA and
then from DCA to Vendor B's file structure.
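
     The "John" example can be sketched in a few lines.  This is a toy model,
not the real DIF or DCA encoding: the token names (BOLD_ON, BOLD_OFF, BACKSPACE)
and the list-of-tokens stream are assumptions made up for illustration.  The
point is that the bracketed form states the intent outright, while the
overstrike form has to be guessed at by pattern matching.

```python
def bold_bracketed(text):
    """Word-processor style: explicit on/off codes around the text."""
    return [("BOLD_ON",), text, ("BOLD_OFF",)]


def bold_overstrike(text):
    """Typewriter style: type it, back up over it, type it again."""
    return [text, ("BACKSPACE", len(text)), text]


def recover_bold(stream):
    """Guess at bold runs in an overstrike stream.

    Only the exact pattern  text / BACKSPACE(len(text)) / same text  is
    recognized; anything else passes through untouched.
    """
    out = []
    i = 0
    while i < len(stream):
        window = stream[i:i + 3]
        if (len(window) == 3
                and isinstance(window[0], str)
                and window[1] == ("BACKSPACE", len(window[0]))
                and window[2] == window[0]):
            out.extend(bold_bracketed(window[0]))
            i += 3
        else:
            out.append(stream[i])
            i += 1
    return out


# The clean case round-trips.
clean = recover_bold(bold_overstrike("John"))

# The same bold word typed as part of a longer run of text does not:
# the guesser sees no exact match and leaves the backspace in place.
messy_input = ["Dear John", ("BACKSPACE", 4), "John"]
messy = recover_bold(messy_input)
```

Here `clean` equals `bold_bracketed("John")`, but `messy` comes back unchanged,
backspace and all.  Every such special case needs its own rule, which is why
going through a typewriter-style format loses information.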

     Translation routines which use DCA, as well as many of the little
"convert" utilities supplied by PC-class word-processing outfits such as SSI or
Multi-Mate, tend to be one-for-one table look-up routines which simply
substitute equivalent functions between different vendors' formats, or between
a vendor's format and a standard format such as DCA or DIF.  These routines
leave about 60 or 70 percent of the work for the secretary to do;  at worst,
in the case of complicated documents with many tabs and indents (e.g. tables),
the conversions are so bad that the secretary won't even be able to figure
out what the document was supposed to look like.

    The Navy DIF was envisioned as a much higher class system than that, typ-
ically a 95 percent solution.  DIF translation routines, if written properly,
do not act simply as one-to-one table look-up routines;  they go through many
of the same kinds of gyrations you go through in translating one human language
into another.  There are huge differences in the ways in which word processors
handle functions which move the cursor, for instance;  some have multiple kinds
of tab stops, others have one kind of tab stop and multiple kinds of tab
buttons.  Table look-up type translations between different classes of systems
will mangle most half-way complex documents.  Again, DIF routines, if well
written, will end up translating meaning and intent most of the time.
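
     The tab-stop case shows the difference between look-up and translation.
In the sketch below, hypothetical system A keeps the kind of tab (left or
decimal) on the ruler and has a single TAB code, while hypothetical system B
has plain stops and separate TAB and DECTAB codes.  The code names, the ruler
layout, and the rule that the k-th tab on a line lands on the k-th stop are all
simplifying assumptions, not actual DIF or vendor codes.

```python
def lookup_translate(line):
    """One-for-one table look-up: every A code maps to a fixed B code."""
    table = {"TAB": "TAB"}
    return [table.get(tok, tok) for tok in line]


def context_translate(line, ruler):
    """Track which stop each tab lands on and emit the B code that matches.

    Assumes the k-th TAB on a line lands on the k-th stop of the ruler;
    tabs past the end of the ruler are treated as ordinary left tabs.
    """
    out = []
    stop = 0
    for tok in line:
        if tok == "TAB":
            kind = ruler[stop] if stop < len(ruler) else "left"
            out.append("DECTAB" if kind == "decimal" else "TAB")
            stop += 1
        else:
            out.append(tok)
    return out


# System A ruler: first stop is a left tab, second is a decimal tab.
ruler_a = ["left", "decimal"]
row = ["TAB", "Widgets", "TAB", "12.50"]

by_lookup = lookup_translate(row)
# ['TAB', 'Widgets', 'TAB', '12.50']     decimal alignment is lost

by_context = context_translate(row, ruler_a)
# ['TAB', 'Widgets', 'DECTAB', '12.50']  price column still lines up
```

The look-up version cannot get this right no matter how its table is filled
in, because the correct output depends on the ruler and on how many tabs have
already gone by on the line, not on the input code alone.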

     For that reason, DIF programs which are acceptable to the Navy require
a great deal of effort, and have presented a real challenge to the companies
which now have this capability.  These include Datapoint (for VistaWord),
DEC (for WPS), Data General, IBM (DisplayWrite for PC), AT&T (CrystalWriter),
NCR, Hewlett-Packard (for the 3000), Xerox (for the 8010 and 6085), CPT, and
CCI (for Office-Power, one of the better UNIX word processors).  Aside
from these, there is a group of about 40 to 60 companies which are committed
to supporting the DIF, and the project has the full clout of the U.S. 
military behind it;  it is well on its way to becoming a de facto standard
within GSA.  By all rights, it should become The de facto standard for document
interchange in the mini-micro world.

    I.M.S., of Rockville, Md., manages the Dept. of Navy Office Automation Lab
(DONOACS), which oversees the DIF project, and we supervise DIF testing among
vendors.  We wrote the Navy's testbed DIF system for the Fortune 32:16, which
is commercially available from VSC of Rosslyn, Va. (703 276-7166).  We also
have a set of DIF routines for the SSI WordPerfect package which actually
work (the SSI convert routine's Navy DIF selection is essentially just another
table look-up routine), which we sell ourselves for $100/copy.  We intended
this as a gateway from the PC world to the world of commercial OA, and we will
shortly have other PC and UNIX DIF products available.

    Questions concerning DIF may be addressed to me via UNIX mail, or you may
call IMS at 301 984-8343, and ask for Ted Holden, Gary Evans, or Dick
Jeffries.  We will be happy to answer them.