[comp.text] re : SGML and tag sets

ludo@squawk.uucp (Ludo Van Vooren) (07/28/88)
In article <61454@sun.uucp> sears@sun.uucp (Daniel Sears) writes :

>...But I do think that SGML is being treated as a panacea when it doesn't 
>deserve to be.
>
>Here is an example of what I mean:
>
>SGML provides rules for describing tag sets.  So let's create two very simple
>tag sets that we will assume are SGML conforming.  The first has two tags and
>the second has three.
>
>              TAG SET 1                     TAG SET 2
>      ==========================================================
>        <mark1> = paragraph          <mark2> = paragraph
>        <mark2> = chapter heading    <mark1> = chapter heading
>                                     <mark3> = section heading
>
>Note that while <mark1> and <mark2> have opposite meanings in the tag sets,
>it is possible to translate a document from the first tag set to the second.
>But it is not possible to translate a document from the second tag set to
>the first because there isn't an equivalent tag for <mark3>.  What SGML
>tries to guarantee is a way of describing the different tag sets, but it
>does not guarantee that the tag sets will be rich enough to hold all the
>objects that other tag sets may contain.

It does not make sense, in general, to talk about translating a document
encoded to meet the specifications of one Document Type Definition (DTD)
into a document meeting the specs of another DTD. The whole point of SGML is
missed if you don't transmit (either explicitly or implicitly) a document's
proper DTD along with it: the DTD is *part* of the document. SGML parsers
can handle a document properly only when they are given the appropriate
DTD first. 
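
To make the point concrete, here is a minimal Python sketch, not an SGML
parser. It assumes a plain dictionary can stand in for a DTD and that every
element has an explicit end tag; the function name interpret is hypothetical.
It reads the same markup under each of the two tag sets quoted above.

```python
import re

# Simplified stand-ins for the two tag sets in the quoted example.
# A real DTD declares content models and attributes, not just names.
TAG_SET_1 = {"mark1": "paragraph", "mark2": "chapter heading"}
TAG_SET_2 = {
    "mark2": "paragraph",
    "mark1": "chapter heading",
    "mark3": "section heading",
}

# Matches one non-nested element with an explicit end tag.
ELEMENT = re.compile(r"<(\w+)>(.*?)</\1>")


def interpret(text, tag_set):
    """Return (meaning, content) pairs for each element in text."""
    result = []
    for name, content in ELEMENT.findall(text):
        if name not in tag_set:
            raise ValueError(f"<{name}> is not declared in this tag set")
        result.append((tag_set[name], content))
    return result
```

The same string, "<mark1>Intro</mark1>", comes back as a paragraph under
TAG_SET_1 and as a chapter heading under TAG_SET_2, and <mark3> is rejected
under TAG_SET_1. Without the tag set, the markup alone says nothing.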

If your goal is to translate a document having a DTD different from the
one you want to use, that translation is always possible, although you
may have to drop or add information, or the result-DTD may have to be
"enriched" to accommodate a more complex document structure.

SGML's Link feature *does provide* a translation process between tagging
schemes.

The kind of Link that should be used in this situation is the
Explicit Link, which can provide powerful control of applications.
The link sets defined for an explicit link have the following form:

<!LINK lkname 
	source-element [attribute-specification]
	target-element [attribute-specification]
>

In the case of the translation from TAG SET 2 to TAG SET 1, the tag <mark3>
could be translated, for example, into <mark1 type="section heading">. The
information carried by TAG SET 2 is preserved, and the result will be valid
under TAG SET 1 provided you add an attribute definition for "type" to the
declaration of <mark1>.
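
The effect of such a translation can be sketched in a few lines of Python.
This is only an illustration of the mapping, not the SGML Link mechanism
itself: the table LINK_2_TO_1 and the function translate are hypothetical
names, and the sketch assumes tags without attributes in the source document.

```python
import re

# Hypothetical mapping from TAG SET 2 to TAG SET 1. Each source tag maps to
# a target tag plus any attributes needed to preserve the original meaning.
LINK_2_TO_1 = {
    "mark2": ("mark1", {}),                           # paragraph
    "mark1": ("mark2", {}),                           # chapter heading
    "mark3": ("mark1", {"type": "section heading"}),  # no direct equivalent
}

# Matches a start tag or an end tag that carries no attributes.
TAG = re.compile(r"<(/?)(\w+)>")


def translate(text, link):
    """Rewrite every tag in text according to the link table."""
    def repl(match):
        closing, name = match.group(1), match.group(2)
        target, attrs = link[name]  # KeyError for a tag not in the table
        if closing:
            return f"</{target}>"
        attr_text = "".join(f' {key}="{value}"' for key, value in attrs.items())
        return f"<{target}{attr_text}>"

    # One pass over the text, so swapping mark1 and mark2 is safe.
    return TAG.sub(repl, text)
```

For example, translate("<mark3>Scope</mark3>", LINK_2_TO_1) gives
<mark1 type="section heading">Scope</mark1>, so the section heading survives
as an attribute value instead of being lost.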

   [NOTE FOR SGML EXPERTS ONLY:  The result document could be validated
   in the same pass as the translation if the concurrent document
   feature is enabled (that is, when the Features clause of the SGML
   declaration contains the entry CONCUR YES). The document is then
   checked against every DTD (specific text tagging), with errors
   reported for any invalid tag.]

For more complicated examples, remember that the Link feature is completely
context-dependent. That lets you have, for example, two different
translations of the same tag, depending on the context of the tag.
(See USELINK in ISO 8879 Sect 12.3 Pg 46)
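
The idea of a context-dependent translation can be sketched as follows. This
does not reproduce USELINK syntax. The element names (chapter, section, title,
chaptitle, sectitle) are invented for the illustration, and the sketch assumes
properly nested elements with explicit end tags.

```python
import re

# Hypothetical rules keyed on (parent element, tag): the same source tag
# gets a different target depending on where it appears.
RULES = {
    ("chapter", "title"): "chaptitle",
    ("section", "title"): "sectitle",
}

# Matches a start tag or an end tag that carries no attributes.
TAG = re.compile(r"<(/?)(\w+)>")


def translate_in_context(text):
    """Rewrite tags, choosing each target from the enclosing element."""
    source_stack = []  # currently open source elements
    target_stack = []  # the names they were translated to

    def repl(match):
        closing, name = match.group(1), match.group(2)
        if closing:
            source_stack.pop()
            return f"</{target_stack.pop()}>"
        parent = source_stack[-1] if source_stack else None
        target = RULES.get((parent, name), name)  # default: keep the tag
        source_stack.append(name)
        target_stack.append(target)
        return f"<{target}>"

    return TAG.sub(repl, text)
```

Here a <title> directly inside <chapter> becomes <chaptitle>, while a <title>
directly inside <section> becomes <sectitle>; all other tags pass through
unchanged.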

In conclusion, it is important to understand that an SGML system is
supposed to work, not with *one* tagging scheme, but with *any* tagging
scheme that can be defined in the SGML metalanguage. It is the DTD
transmitted along with the document (or unambiguously referenced) that
ensures that an SGML parser can cope with the document.

------------------------------------------------------------------------
Ludo Van Vooren                               {utai,utzoo}!sq!ludo
SOBEMAP representative                            or
SOFTQUAD, Toronto, Canada                     ludo@sq.com
------------------------------------------------------------------------