emv@math.lsa.umich.edu (Edward Vielmetti) (09/28/90)
Lou Burnard asked me to pass this along:

From: Lou Burnard <lou@ws.oxford.ac.uk>

Someone on this list asked for suggestions for further reading on SGML.
There follows a brief (100 lines) Reading List on SGML, in SGML. It is
extracted from the recently published Guidelines for the Encoding and
Interchange of Machine Readable Texts of the Text Encoding Initiative.

<!DOCTYPE list.citn [
<!-- DTD fragment for citations        drafted: LB 26 May 90 -->
<!element list.citn   - - (citn|citn.struct)* >
<!entity % citn.bits "(author|editor|title|title.piece
                      |series|publisher|publ.city|publ.date
                      |imprint|citn.detail|comment)" >
<!element citn        - o (#PCDATA | %citn.bits)* >
<!element citn.struct - o (author?,editor?,title.piece*,title,
                           series?,publisher?,publ.city?,publ.date?,
                           imprint?,citn.detail*,comment?) >
<!element %citn.bits  - o (#PCDATA) >
]>
<!-- This is the short SGML Reading list referred to in section 3.1 -->
<LIST.CITN>
<CITN.STRUCT>
<AUTHOR>Barnard, David et al </author>
<TITLE.piece>SGML Based Markup for literary texts </title.piece>
<TITLE>Computers and the Humanities</title>
<PUBL.DATE>1988</publ.date>
<CITN.DETAIL>vol 22 pp 265-76</citn.detail>
<COMMENT>Pioneering attempt to represent literary structures using
   SGML.</comment>
</citn.struct>
<CITN.STRUCT>
<AUTHOR>Barron, David</author>
<TITLE.piece> Why use SGML?
<TITLE>Electronic Publishing
<PUBL.DATE>April 1989
<CITN.DETAIL> vol 2(1) pp 3-24
<COMMENT>Well-written brief overview of SGML in context of other
   developments in electronic text handling
<CITN.STRUCT>
<AUTHOR>Bryan, Martin
<TITLE>SGML: an author's guide to the Standard Generalized Markup
   Language
<PUBLISHER>Addison-Wesley
<PUBL.DATE>1988
<COMMENT>Detailed text book giving full treatment of the standard,
   but primarily from the publishing perspective.
<CITN.STRUCT>
<AUTHOR>Coombs, James H. et al
<TITLE.piece>Markup systems and the future of scholarly text processing
<TITLE> Communications of the ACM
<PUBL.DATE>November 1987
<CITN.DETAIL>Vol 30 no 11 pp 933-47
<COMMENT>Classic polemic in favour of descriptive over procedural
   markup presented from the scholarly perspective
<CITN.STRUCT>
<AUTHOR>International Organisation for Standardisation
<TITLE>ISO 8879: Information processing - Text and office systems -
   Standard Generalized Markup Language (SGML)
<PUBL.DATE>1986
<COMMENT>Annexes A and B to the Standard provide a formal but readable
   summary of its most important features.
<CITN.STRUCT>
<AUTHOR>International Organisation for Standardisation
<TITLE>ISO/TR 9573: Information processing - SGML support facilities -
   Techniques for using SGML
<PUBL.DATE>1988
<COMMENT>Tutorial discussion of main features of the standard with
   some interesting examples
<CITN.STRUCT>
<EDITOR>McCarty, Willard
<TITLE.piece>[Humanist's Markup Topic files] [computer files]
<TITLE> Humanist Electronic Discussion Group
<PUBL.DATE>March-May 1989
<CITN.DETAIL>Files MARKUP TOPIC-1 to MARKUP TOPIC-6
<COMMENT>Contain a wealth of informed and uninformed comment and
   speculation about markup in general and SGML in particular.
   Available from ListServ @ BROWNVM.EARN
<CITN.STRUCT>
<AUTHOR>van Herwijnen, Eric
<TITLE>Practical SGML
<PUBLISHER>Wolters Kluwer
<PUBL.DATE>1990 (June)
<COMMENT>General purpose introductory textbook
<CITN.STRUCT>
<AUTHOR>Warmer, J. and S. van Egmond
<TITLE.piece>The implementation of the Amsterdam SGML parser
<TITLE> Electronic Publishing
<PUBL.DATE>July 1989
<CITN.DETAIL> vol 2 (2), pp65-90
<COMMENT>Discusses some of the technical problems in implementing an
   SGML compiler using standard LL(1) parser-generator techniques
<CITN.STRUCT>
<AUTHOR>Wu, Gilbert S.K.
<TITLE>SGML theory and practice (British Library research paper 68)
<PUBL.DATE>1989
<COMMENT>Section 3 is a good 30 page summary of the most salient
   features of the standard
</list.citn>

=======================================================================
Lou Burnard
Associate Editor, Text Encoding Initiative
Oxford University Computing Service              LOU@UK.AC.OX.VAX
=======================================================================
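For readers who want to pull the reading list apart by program, here is a rough sketch in Python (a language the original post does not use). It assumes, as a simplification, that an omitted end tag is implied by the next recognised tag; a real SGML parser would work that out from the content models in the DTD. The names `parse_citations`, `FIELDS`, `RECORDS`, `KNOWN` and `SAMPLE` are invented for illustration and appear nowhere in the TEI Guidelines.

```python
import re

# Element names copied from the DTD fragment in the post. The "- o" in
# each declaration is what allows the end tags to be left out.
FIELDS = {"author", "editor", "title", "title.piece", "series",
          "publisher", "publ.city", "publ.date", "imprint",
          "citn.detail", "comment"}
RECORDS = {"citn", "citn.struct"}
KNOWN = FIELDS | RECORDS | {"list.citn"}

# Start or end tag with no attributes; declarations and comments
# ("<!...") do not match because a letter must follow "<" or "</".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>")


def parse_citations(text):
    """Return a list of citations, each a list of (field, text) pairs.

    Assumption (not full SGML): any recognised tag closes the field
    that is open, and any record or list tag closes the open record.
    """
    records = []
    current = None   # fields of the citation being read, or None
    field = None     # name of the field being read, or None
    start = 0        # offset where the open field's text begins
    for match in TAG.finditer(text):
        closing = bool(match.group(1))
        name = match.group(2).lower()   # SGML names here are case-insensitive
        if name not in KNOWN:
            continue                    # treat unknown tags as plain text
        if current is not None and field is not None:
            value = " ".join(text[start:match.start()].split())
            current.append((field, value))
        field = None
        if name in FIELDS:
            if not closing:
                field, start = name, match.end()
        else:
            if current is not None:
                records.append(current)
            current = [] if (name in RECORDS and not closing) else None
    return records


SAMPLE = """
<LIST.CITN>
<CITN.STRUCT>
<AUTHOR>Barron, David</author>
<TITLE.piece> Why use SGML?
<TITLE>Electronic Publishing
<PUBL.DATE>April 1989
<CITN.DETAIL> vol 2(1) pp 3-24
<CITN.STRUCT>
<AUTHOR>Bryan, Martin
<TITLE>SGML: an author's guide to the Standard Generalized Markup Language
<PUBLISHER>Addison-Wesley
<PUBL.DATE>1988
</list.citn>
"""

if __name__ == "__main__":
    for citation in parse_citations(SAMPLE):
        print(citation)
```

The sketch ignores the DTD entirely, so it will accept element orders that `citn.struct` forbids; it is meant only to show how little machinery the minimised markup needs for simple extraction.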
spqr@ecs.soton.ac.uk (Sebastian Rahtz) (09/28/90)
In article <EMV.90Sep27164852@josephus.math.lsa.umich.edu> emv@math.lsa.umich.edu (Edward Vielmetti) writes:
   From: Lou Burnard <lou@ws.oxford.ac.uk>
   <AUTHOR>Barron, David</author>
   <TITLE.piece> Why use SGML?
   <TITLE>Electronic Publishing
   <PUBL.DATE>April 1989
   <CITN.DETAIL> vol 2(1) pp 3-24
I have castigated Lou in private before, but I can't resist a public dig at
the `holier than thou' TEI guidelines, which fail to do something as
elementary as separating out the journal volume, number and pages, and
don't even get the journal title correct! Come back `refer', all is
forgiven.
--
Sebastian Rahtz S.Rahtz@uk.ac.soton.ecs (JANET)
Computer Science S.Rahtz@ecs.soton.ac.uk (Bitnet)
Southampton S09 5NH, UK S.Rahtz@sot-ecs.uucp (uucp)
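The complaint above is that <CITN.DETAIL> carries volume, number and pages as one undivided string. As a hedged illustration of what separating them involves, the Python sketch below splits the four journal-style detail strings that actually occur in the posted reading list. The pattern `DETAIL`, the function `split_detail` and the output keys are assumptions made for this example, not part of the TEI draft or of `refer`, and the pattern is tuned only to those four strings.

```python
import re

# Matches strings such as "vol 22 pp 265-76", "vol 2(1) pp 3-24",
# "Vol 30 no 11 ppp 933-47" and "vol 2 (2), pp65-90".
DETAIL = re.compile(
    r"vol\.?\s*(?P<volume>\d+)"                   # volume number
    r"(?:\s*\(\s*(?P<paren_issue>\d+)\s*\)"       # issue as "(1)"
    r"|\s*no\.?\s*(?P<no_issue>\d+))?"            # or issue as "no 11"
    r"[\s,]*p+\.?\s*"                             # "pp", tolerating "ppp"
    r"(?P<first>\d+)\s*-\s*(?P<last>\d+)",        # page range
    re.IGNORECASE,
)


def split_detail(detail):
    """Split a citation detail string into volume, issue and pages.

    Returns None when the string is not in the journal style, for
    example the "Files MARKUP TOPIC-1 ..." entry in the list.
    """
    match = DETAIL.search(detail)
    if match is None:
        return None
    first, last = match.group("first"), match.group("last")
    # Expand an abbreviated last page: "265-76" means 265 to 276.
    if len(last) < len(first):
        last = first[:len(first) - len(last)] + last
    issue = match.group("paren_issue") or match.group("no_issue")
    return {
        "volume": int(match.group("volume")),
        "issue": int(issue) if issue else None,
        "first_page": int(first),
        "last_page": int(last),
    }


if __name__ == "__main__":
    for text in ("vol 22 pp 265-76", "vol 2(1) pp 3-24",
                 "Vol 30 no 11 ppp 933-47", "vol 2 (2), pp65-90"):
        print(text, "->", split_detail(text))
```

That four entries already need three spellings of the issue number is a fair argument for marking the parts up separately at the source rather than recovering them afterwards.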
emv@math.lsa.umich.edu (Edward Vielmetti) (09/29/90)
In article <SPQR.90Sep28134319@manutius.ecs.soton.ac.uk> spqr@ecs.soton.ac.uk (Sebastian Rahtz) writes:
   I have castigated Lou in private before, but I can't resist a public dig at
   the `holier than thou' TEI guidelines, which fail to do something as
   elementary as separating out the journal volume, number and pages, and
   don't even get the journal title correct!
Yeah, I noticed that too -- the "DTD fragment for citations" that he
used looked pretty weak compared with what you'd need for a full-blown
standard bibliographic citation, one you'd have some hope of
typesetting from.
This came up in the context of the (as yet only proposed)
comp.bibliography discussion. There is an appropriately ugly ANSI
standard format for bibliographies, which is rich enough to encompass
all (most?) of the information various perverse journal formats need.
I think software like Pro-Cite (from PBS Inc.) uses the ANSI
bibliography format as its internal representation for bibliographic
entries.
It would seem to be sensible to ask if there is an extant SGML encoding
and DTD which follows the ANSI standard (probably Z39.something, I don't
have the exact citation). You'd hate to have to type it directly, of
course...
--Ed
Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>
whose first job at the U was parsing bibliographies with SNOBOL, ugh
marvit@hplpm.hpl.hp.com (Peter Marvit) (10/02/90)
[[ Re: standard bibliographic formats ]]

There is at least one de facto standard which libraries use for their
on-line card catalogs - MARC. Of course the acronym's expansion escapes
me.

Could MARC be a superset of an ANSI bibliographical standard? For that
matter, why not use MARC or a slight improvement thereon?

-Peter "BibIX, tib, BibTeX, endNote, refer, roll-my-own" Marvit

: Peter Marvit   Hewlett-Packard Labs in Palo Alto, CA   (415) 857-6646 :
: Internet: <marvit@hplabs.hpl.hp.com>  uucp: {any backbone}!hplabs!marvit :
wain@seac.UUCP (Wain Dobson) (10/03/90)
In article <MARVIT.90Oct2103912@hplpm.hpl.hp.com> marvit@hplpm.hpl.hp.com (Peter Marvit) writes:
>[[ Re: standard bibliographic formats ]]
>
>There is at least one de facto standard which libraries use for their
>on-line card catalogs - MARC. Of course the acronym's expansion escapes
>me.

Machine Readable Catalogue. By the way, there is LC MARC, CANMARC,
UNIMARC, INTERMARC, etc.

>Could MARC be a superset of an ANSI bibliographical standard? For that
>matter, why not use MARC or a slight improvement thereon?

You've got to be kidding. People have been working on this problem for
over 20 years (more like 2000 or better, actually). Every librarian,
wherever, has confronted this problem. One might take a quick look at
the 11th Edition of the Britannica under bibliography to get an idea of
what is being toyed with here. Basically, bibliography and the
description of a bibliographic item cannot be easily reduced to the
Chicago Style Manual, or to Turabian. At best the guidelines given by
these two items only suffice to produce "handlists." -:)

--
Wain Dobson, Vancouver, B.C.
...!{uunet,ubc-cs}!van-bc!seac!wain
robin@txsil.lonestar.org (Robin Cover) (10/14/90)
In light of the frequent requests for essential bibliography on SGML, I
will post a revised version of a brief bibliography, complete with
abstracts. Presumably Ed Vielmetti will archive this for others who
will join the discussion asking "What is SGML anyway?" I have stripped
unsightly SGML coding and categorized the entries. The longer
bibliography from which this is excerpted contains a section (developed
by Nick Duncan and David Barnard) containing SGML-style tags that go
beyond BibTeX but are based upon BibTeX (see section #7). As many have
noted (re: MARC formats), building the taxonomy & hierarchy for
bibliographic tags is hard, but within disciplines it should be
manageable. Happy reading.

Robin Cover

======================================================================
STANDARD GENERALIZED MARKUP LANGUAGE (SGML)
BRIEF BIBLIOGRAPHY

1. INTRODUCTIONS TO SGML
2. SGML MANUALS (COMMENTARY & INDICES)
3. SGML IMPLEMENTATIONS
4. STANDARDS PUBLICATIONS
5. SERIAL PUBLICATIONS DEDICATED TO SGML
6. E-MAIL FORUMS
7. FURTHER BIBLIOGRAPHY ON SGML

==========================================
1. INTRODUCTIONS TO SGML
==========================================

Barron, David. "Why Use SGML?" Electronic Publishing 2/1 (April 1989)
3-24. CODEN: EPODEU; ISSN 0894-3982.

Abstract: The Standard Generalised Markup Language (SGML) is a
recently-adopted International Standard (ISO 8879). The paper presents
some background material on markup systems, gives a brief account of
SGML, and attempts to clarify the precise nature and purpose of SGML,
which are widely misunderstood. It then goes on to explore the reasons
why SGML should (or should not) be used in preference to
older-established systems.

A summary of the article is also printed in "Why Use SGML," SGML Users'
Group Newsletter 13 (August 1989) 10.

[Burnard, Lou.] "Use of SGML Markup." Chapter 2 (pp. 9-38) in
Guidelines for the Encoding and Interchange of Machine-Readable Texts
(Text Encoding Initiative, Draft Version 1.0 - See below).

Coombs, James; Renear, Allen; DeRose, Steven. "Markup Systems and the
Future of Scholarly Text Processing." CACM 30/11 (1987) 933-947.
ISSN: 0001-0782; cf. CACM 31/7 (July 1988) 810-11.

Abstract: The authors argue that many word processing systems distract
authors from their tasks of research and composition, toward concern
with typographic and other tasks. Emphasis on "WYSIWYG", while helpful
for display, has ignored a more fundamental concern: representing
document structure. Four main types of markup are analyzed:
Punctuational (spaces, punctuation,...), presentational (layout, font
choice,...), procedural (formatting commands), and descriptive
(mnemonic labels for document elements). Only some ancient manuscripts
have no markup. Any form of markup can be formatted for display, but
descriptive markup is privileged because it reflects the underlying
structure. ISO SGML is a descriptive markup standard, but most benefits
are available even before a standard is widely accepted. A
descriptively marked-up document is not tied to formatting or printing
capabilities. It is maintainable, for the typographic realization of
any type of element can be changed in a single operation, with
guaranteed consistency. It can be understood even with <emph>no</>
markup formatting software: compare "<blockquote>" to ".sk 3 a; .in +10
-10; .ls 0; .cp 2". It is relatively portable across views,
applications and systems. Descriptive markup also minimizes cognitive
demands: the author need only recall (or recognize in a menu) a
mnemonic for the desired element, rather than also deciding how it is
currently to appear, and recalling how to obtain that appearance. Most
of this extra work is thrown away before final copy; descriptive markup
allows authors to focus on authorship.
(abstract supplied by Steve DeRose)

DeRose, Steven J.; Durand, David G.; Mylonas, Elli; Renear, Allen H.
"What is Text, Really?" Journal of Computing in Higher Education 1/2
(Winter 1990) 3-26. ISSN: 1042-1726.

Abstract: "The way in which text is represented on a computer affects
the kinds of uses to which it can be put by its creator and by
subsequent users. The electronic document model currently in use is
impoverished and restrictive. The authors agree that text is best
represented as an ordered hierarchy of content object[s] (OHCO),
because that is what text really is. This model conforms with emerging
standards such as SGML and contains within it advantages for the
writer, publisher, and researcher. The authors then describe how the
hierarchical model can allow future use and reuse of the document as a
database, hypertext or network."

Herwijnen, Eric van. Practical SGML. Dordrecht/Hingham, MA: Wolters
Kluwer Academic Publishers. 200 pages. ISBN: 0-7923-0635-X.

The book is designed as a "practical SGML survival-kit for SGML users
(especially authors) rather than developers," and itself constitutes an
experiment in SGML publishing. A painless introduction to the
essentials of SGML.

Wu, Gilbert. SGML Theory and Practice. British Library Research Paper
68. British Library Research and Development Department, 1989.
ISSN 0269-9257 [68]; ISBN 0-7123-3211-1. 93 pages.

==========================================
2. SGML Manuals (Commentary & Indices)
==========================================

Bryan, Martin. SGML: An Author's Guide to the Standard Generalized
Markup Language. Wokingham/Reading/New York: Addison-Wesley, 1988.
ISBN 0-201-17535-5 (pbk); LC CALL NO.: QA76.73.S44 B79 1988. 380 pages.

A highly detailed and useful manual explaining and illustrating
features of ISO 8879.

Goldfarb, Charles F. The SGML Handbook. Oxford: Oxford University
Press. Fall, 1990. ISBN: 0198537379.
Announced as a "monumental 560-page work" by IBM Senior Systems Analyst
and acknowledged "father of SGML." The book constitutes an annotated,
cross-referenced and indexed copy of the ISO 8879 Standard and
Amendment 1, with annotations, tutorials and reference material. See
"News. New Goldfarb Book About SGML," EPSIG News 3/1 (March 1990) 4 and
further details in (GCA's) TECHInfo (July 1990) 1.

Smith, Joan M.; Stutely, Robert S. SGML: The Users' Guide to ISO 8879.
Chichester/New York: Ellis Horwood/Halsted, 1988. ISBN 0-7458-0221-4
(Ellis Horwood) and 0-470-21126-1 (Halsted); LC CALL NO.: QA76.73.S44
S44 1988.

The book includes subject indices to ISO 8879. An overview of the book
may be found in the SGML Users' Group Newsletter 9 (August 1988) 9.

==========================================
3. SGML Implementations
==========================================

Guidelines for the Encoding and Interchange of Machine Readable Texts,
eds. C. Michael Sperberg-McQueen and Lou Burnard. TEI-P1, Version 1.0
15-July-1990. xiv + 280 pages.

This volume represents the results of work in Phase I of the
International Text Encoding Initiative, sponsored by ACH/ACL/ALLC and
several advisory associations. The publication describes and
illustrates mechanisms (some experimental) for SGML markup of many
kinds of documents, especially for humanities fields (literary and
linguistic study).

Contact the editors:
in the US, Michael Sperberg-McQueen; BITNET: <U35395@UICVM>; Computer
Center (M/C 135); University of Illinois at Chicago; Box 6998; Chicago,
IL 60680; Tel: (312) 996-2981;
in the UK, Lou Burnard; JANET: <lou@vax.ox.ac.uk>; Oxford University
Computing Service; 13 Banbury Road; Oxford OX2 6NN; Tel: (44)
865-273238.

Standard for Electronic Manuscript Preparation and Markup. ANSI/NISO
Z39.59-1988. Version 2. EPSIG/Association of American Publishers,
August, 1987.
This document was developed over several years as the "AAP Standard";
it is now designated by EPSIG/AAP as "the Electronic Manuscript
Standard" or simply as the "Standard." It is SGML-conforming, and
provides a suggested tagset for authors and publishers. The document
has been recommended for "fast track" ISO approval by working group 6
(TC 46/SC 4/WG 6). EPSIG (Electronic Publication Special Interest
Group) also publishes the newsletter EPSIG News in support of its
manuscript standard, and generally in support of SGML.

Contact: EPSIG; Ms. Betsy Kiser; c/o OCLC, Mail Code 278; 6565 Frantz
Road; Dublin, OH 43017-0702; Tel: (614) 764-6195; Fax: (614) 764-6096.

Warmer, Jos; van Egmond, Sylvia. "The Implementation of the Amsterdam
SGML Parser." Electronic Publishing: Origination, Dissemination and
Design (EPOdd) 2/2 (July 1989) 3-28. ISSN: 0894-3982.

Abstract: The Standard Generalized Markup Language (SGML) is an ISO
Standard that specifies a language for document representation. This
paper gives a short introduction to SGML and describes the (Vrije
Universiteit) Amsterdam SGML Parser and the problems we encountered in
implementing the Standard. These problems include the interpretation of
the Standard in places where it is ambiguous and the technical problems
in parsing SGML documents.

==========================================
4. Standards Publications
==========================================

ISO 8879: Information Processing -- Text and Office Systems -- Standard
Generalized Markup Language (SGML). International Organization for
Standardization. Ref. No. ISO 8879-1986 (E). Geneva/New York, 1986.

[A one-page tech note on the ISO (as a FIPS document, FIPS-PUB-152)
provides the following abstract (see "Publishing Standard Allows for
the Transfer of Documents from Author to Publisher" [NTIS Tech Note,
081914000; National Bureau of Standards, Gaithersburg, MD; May 1989]).]

Abstract: This citation summarizes a one-page announcement of
technology available for utilization.
A Federal Information Processing Standard (FIPS) recently approved by
the Secretary of Commerce should help federal agencies improve their
communications with publishing organizations. (FIPS are developed by
NIST for use by the federal government.) The new standard, called
Standard Generalized Markup Language (SGML), provides a common way for
defining markup languages so documents can be transferred from author
to publisher in a standardized format. By providing a coherent and
unambiguous syntax for describing the elements within a document, SGML
makes it easier to move unformatted textual data among different
installations and processing systems. Developed by the International
Organization for Standardization (ISO) and the American National
Standards Institute (ANSI) with assistance from NIST, the SGML standard
is already being used by the Computer-Aided Acquisition and Logistics
Support (CALS) program of the Department of Defense to develop a
military specification. NIST is providing technical support for the
CALS program. In addition, NIST has developed the first set of
conformance tests for SGML; ISO and ANSI are considering using these
tests for their own test suites.

For possible addenda and changes to 8879, see "Recommendations for a
Possible Revision of ISO 8879. ISO/IEC JTC1/SC18/WG8 N931 [Part I],"
TAG 12 (December 1989) 6-8 and "Recommendations for a Possible Revision
of ISO 8879. Part II. ISO/IEC JTC1/SC18/WG8 N931," TAG 13 (February
1990) 12-15.

ISO/TR 9573 Techniques for Using Standard Generalized Markup Language
(SGML) December 1, 1988. Ed. Anders Berglund.
A major revision underway (as of May 1990) will result in a TR with
(16) parts:

 (1) SGML Tutorial
 (2) Basic Techniques
 (3) Advanced Techniques
 (4) Using Short References for Identifying Markup
 (5) Using non-Latin Alphabets
 (6) Referencing and Synchronisation
 (7) Mathematics and Chemistry
 (8) Tables
 (9) Using SGML for Computer-to-Computer Interchange
(10) Designing Applications for Database Interfacing
(11) Application at ISO CS for International Standards and Technical
     Reports
(12) Public Entity Sets for General and Publishing Symbols
(13) Public Entity Sets for Mathematics and Science
(14) Public Entity Sets for Latin Based Alphabets
(15) Public Entity Sets for non-Latin Based Alphabets
(16) Public Entity Sets for Ideograms

(adapted from Ludo Van Vooren, "SGML Standards Committee Update:
Activities of ISO SC 18 WG8," TAG 14 (May 1990) [11-] 12). See also
Joan M. Smith in "More Liaison Statements to ISO," SGML Users' Group
Newsletter 13 (August 1989) 6-7. A description of this ISO document is
found in "Publication of Techniques for Using SGML," SGML Users' Group
Newsletter 11 (January 1989) 3-4.

Other Standards Related to SGML 8879:

ISO 639 Code for the Representation of Names of Languages.
ISO 646-1973 7-bit Coded Character Set for Information Interchange
ISO 2022-1982 Information Processing -- ISO 7-bit and 8-bit Coded
   Character Sets -- Code Extension Techniques.
ISO 2375-1974 Data Processing -- Procedure for Registration of Escape
   Sequences
ISO 6429-1983 Additional Control Functions for Character Imaging
   Devices
ISO/DIS 6937 Coded Character Sets for Text Communication
ISO/DIS 7350 Text Communication -- Registration of Graphic Character
   Subrepertoires
ISO 8613: Information Processing -- Text and Office Systems -- Office
   Document Architecture (ODA) and Interchange Formats.
ISO 8859 8-bit Single-Byte Coded Graphic Character Sets. 8 parts.
ISO 8879 SGML, Amendment No 1. 1 July 1988. 15 pages.
ISO 9069 Information Processing -- SGML Support Facilities -- SGML
   Document Interchange Format (SDIF).
ISO 9070 Information Processing -- SGML Support Facilities --
   Registration Procedures for Public Text Owner Identifiers.
   February 1, 1990. 5 pages.
ISO/DIS 9541 Information Processing -- Font and Character Information
   Interchange. 1989.
ISO/TR 9544 Information Processing -- Computer-Assisted Publishing --
   Vocabulary (15 July 1988) 43 pages.
ISO/DIS 10036 Procedure for Registration of Glyph and Glyph-Collection
   Identifiers. 1989. [Includes text of ISO/DIS 9541 on registration]
ISO/DTR 10037 Information Processing -- SGML and Text Entry Systems --
   Guidelines for SGML Syntax-Directed Editing Systems. 1989.
ISO/IEC DP 10179 Document Style Semantics and Specification Language
   (DSSSL) (ISO Project 18.15.6.1). 1988, 1989. Edited by Sharon Adler.
ISO/IEC DP 10180 Standard Page Description Language (SPDL).
ISO 10646 (Character Encoding)
TR XXXX Operational Model for Text Description and Processing Language

==========================================
5. Serial Publications Dedicated to SGML
==========================================

<TAG> The SGML Newsletter.

This dedicated SGML publication is one of several forms of support
given to SGML by the Graphic Communications Association; GCA sponsors
other publications, SGML seminars, workshops and SGML events.

Contact: Graphic Communications Association; 1730 North Lynn Street,
Suite 604; Arlington, VA 22209-2085; Tel: (703) 841-8160; Telex:
510-600-0889; Fax: (703) 841-8171.

SGML Users' Group Bulletin.
SGML Users' Group Newsletter.

Both publications are sponsored by the International SGML Users' Group,
founded in 1984 by Joan Smith.

Contact: Mr. Stephen G. Downie; SGML Users' Group, Secretary; c/o
SoftQuad Inc.; 720 Spadina Avenue; Toronto, Ontario; CANADA M5S 2T9;
Tel: 1-416-963-8337.

==========================================
6. E-mail Forums
==========================================

BITNET: TEI-L@UICVM

The electronic discussion forum for the Text Encoding Initiative
(implementing SGML for markup of texts in academic applications,
particularly the humanities). Some discussion focuses on
theoretical/practical issues of SGML. Send an interactive BITNET
message, or standard mail to listserv@uicvm with the single line (as
the first line):

    subscribe tei-l your_name

USENET/UUCP News: comp.text.sgml (Moderator: Ed Vielmetti)

This discussion forum for SGML began in Fall 1990, and has support from
a number of experts using or developing commercial applications of SGML
(e.g., SoftQuad; Open Text Systems; InfoDesign). The news forum should
be accessible from any UNIX site; see your local UNIX gurus.

==========================================
7. Further Bibliography on SGML
==========================================

Cover, Robin; Duncan, Nicholas; Barnard, David. "A Bibliography on
Structured Text." Technical Report 90-281. Queen's University,
Kingston, Ontario. June, 1990. 104 pages, 887 entries.

This is a preliminary print version of a bibliographic and information
database (compiled by Robin Cover), structured in SGML-database and
formatted with SGML ->> BibTeX utilities developed at Queen's
University by Nick Duncan and David Barnard.

Contact: Department of Computing and Information Science; Queen's
University; Kingston, Ontario, Canada K7L 3N6; Tel: (613) 545-6056.

The bibliographic database also contains sections on SGML supporting
agencies, institutions and SGML software vendors. The electronic
version will be placed on a public file server in late 1990 or in 1991.
New bibliographic references and other SGML information for this
database are welcome: please send citations (published or unpublished
materials: technical reports, working papers, internal memoranda,
articles, product announcements, product reviews) to Robin Cover via
electronic or postal mail.
=======================================================================
Robin Cover                  BITNET:   zrcc1001@smuvm1
DTS - Semitics & OT          INTERNET: robin@txsil.lonestar.org
3909 Swiss Avenue            UUCP:     convex!txsil!robin
Dallas, TX 75204             TEL:      (214) 296-1783/841-3657
=======================================================================