tut%cairo@Sun.COM (Bill "Bill" Tuthill) (07/23/88)
I'm moving a discussion of SGML started in comp.text.desktop into this newsgroup, because I think the issues are larger than a desktop. I feel SGML would be helpful if it were a de facto (rather than merely a de jure) standard. However, since most document publishing systems cannot at present exchange SGML text with each other, SGML is pointless, kind of like Esperanto. Furthermore, even if SGML were more widespread, incompatible tag sets would still pose interchange problems.

Here is an excerpt from a memo I wrote a while back. Goldfarb (an IBM employee) is the principal perpetrator of SGML. References are to an article he published in "SIGPLAN Notices," June 1981.

-----

SGML is a solution to a non-problem. Goldfarb believes that descriptive markup languages (such as SGML) are superior to procedural ones (such as IBM SCRIPT). Even though this may be true, it is a specious comparison because SCRIPT really stinks. Instead, SGML should be compared to decent procedural languages such as troff and TeX. There are good reasons why troff and TeX macro packages were invented: well-designed macros provide writers with a descriptive layer over a procedural language. When the descriptive layer isn't powerful enough, troff and TeX already have escape hatches so writers can achieve special effects. SGML apparently provides no escape hatches.

SGML is no panacea for portability. Being a metalanguage, SGML does not provide one syntax, but only a method for describing different syntaxes. On p. 68 Goldfarb states, "SGML allows variant concrete syntaxes." This is tantamount to saying it isn't really standard. It would probably be as difficult to translate between variant syntaxes as to translate between troff and Interleaf or Frame.

SGML was born obsolete. Graphics are missing from the specification, as are provisions for tables and equations. On p. 100 Goldfarb talks about WYSIWYG, but what he apparently means is typewriter input: something like -ms's .DS/.DE macros.
Furthermore, every SGML document I've ever seen is extremely ugly. It doesn't say much for a documentation standard when it can't even produce handsome documents.

SGML represents no great advance. I was a consultant at UC Berkeley when IBM SCRIPT/GML arrived, and most users said "so what? We already have troff." The specially-hired SCRIPT/GML consultant had no clients-- none. There was no evidence that SGML was superior to other batch systems. A few comparisons in Goldfarb's article make SGML seem inferior:

----- SGML -----
<p>
Text processing and word processing systems typically require
additional information to be interspersed among the natural text of
the document being processed. This added information, called
<q>markup</q>, serves two purposes:
<ol>
<li>Separating the logical elements of the document; and
<li>Specifying the processing functions to be performed on those
elements.
</ol>
This figure represents divine document intervention.
</p>
<fig id=angelfig>
<figbody>
<artwork depth=24p>
</artwork>
</figbody>
<figcap>Three Angels Dancing
</figcap>
</fig>

----- troff -----
.LP
Text processing and word processing systems typically require
additional information to be interspersed among the natural text of
the document being processed. This added information, called
\*Qmarkup\*U, serves two purposes:
.NP
Separating the logical elements of the document; and
.NP
Specifying the processing functions to be performed on those elements.
.LP
This figure represents divine document intervention.
.FN "Three Angels Dancing"
.CP angelfig 24P

SGML's jargon shows intellectual bankruptcy. As far as I can tell, the term "generic identifier" means tag, and the term "entity reference" means file. Why does Goldfarb have to resort to obfuscatory terminology, if not to hide intellectual deficiencies in the design?

SGML embraces pointless data structures. Documents seem to be stored (or conceptualized) in a hierarchical tree, right down to individual words and letters.
There is no compelling reason why words and letters cannot be strings and characters.

In the concrete syntax described, the ASCII characters < > & % ; appear to be reserved symbols, but Goldfarb offers no method for printing these characters literally. In troff at least only the \ is reserved. Note that < > & % are heavily used in UNIX documentation.

SGML requires a guru. SGML documents are supposed to be rigorous, but rigorous means inflexible. If writers want to change the least thing, they will have to consult an SGML guru. It seems that SGML gurus will have to be just as knowledgeable as a TeX or troff macro guru, or a Scribe database administrator.
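
To make the reserved-character complaint concrete, here is a minimal sketch of one possible workaround. It is not taken from Goldfarb's article. It assumes the document type declares entities named amp, lt and gt (an assumption; those names are used in the public entity sets published with ISO 8879, but nothing in the memo above confirms they are available). The helper name escape_reserved is hypothetical.

```python
# Hypothetical helper: rewrite reserved characters as entity references.
# Assumption: the document type declares entities named amp, lt and gt.
RESERVED = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_reserved(text: str) -> str:
    """Return text with &, < and > replaced by entity references."""
    # One pass over the characters, so the "&" that a replacement
    # introduces is never escaped a second time.
    return "".join(RESERVED.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_reserved("if (a < b && c > d)"))
```

Whether this counts as a satisfactory answer is a separate question: the writer still has to remember which characters are reserved, which is the inconvenience the memo is complaining about.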
sylvia@cs.vu.nl (Sylvia van Egmond) (08/02/88)
There has been some discussion on SGML lately. Some misconceptions on SGML have been corrected by a number of people, so I won't go into that.

At the Free University, we have implemented a (basic) SGML parser. A technical report, written by Sylvia van Egmond and Jos Warmer, is available for anyone interested. This report gives an introduction to what SGML is, and describes the parser. If you are interested in obtaining the report and/or the parser, please write to Sylvia or Jos at the address below.

Free University Amsterdam
Dept. of Mathematics and Computer Science
De Boelelaan 1081
1081 HV Amsterdam
The Netherlands
news@encore.UUCP (Newsboy) (11/09/88)
Desktop SGML would be made in Boston, Nov. 16-18. Can someone tell me the details, like where it is being held, who is sponsoring it, etc? -lar From: kaufman@maxzilla.Encore.COM (Lar Kaufman) Path: maxzilla!kaufman Lar Kaufman <= my opinions Fidonet: 1:322/470@508-534-1842 kaufman@multimax.arpa {bu-cs,decvax,necntc,talcott}!encore!kaufman
yuri@sq.uucp (Yuri Rubinsky) (11/10/88)
In article <4137@encore.UUCP> kaufman@maxzilla.UUCP (Lar Kaufman) writes:
>Desktop SGML would be made in Boston, Nov. 16-18. Can someone tell me
>the details, like where it is being held, who is sponsoring it, etc?

STANDARDS & THE DESKTOP is co-sponsored by the National Association of Desktop Publishers and the Graphic Communications Association (which also sponsors the TechDoc and MarkUp conferences). It is being held at the Hotel Meridien. For further information telephone Marion Ellidge or Patti Hill at the GCA: 703 841-8160.

Registration Fees:
GCA Member      $495.00
NADTP Member    $590.00
Nonmember       $685.00

Here's the text of the brochure that was published:

----------
STANDARDS & the Desktop

Wednesday November 16

Registration 7:30 am

9:00 am  No Desk Is an Island
Yuri Rubinsky, President, SOFTQUAD INC.
In a world made confusing with competing proprietary formats, we can take comfort in the work of standards-makers. The individual at the desktop, the networked individual in the working group, the individual whose microcomputer speaks to a mainframe - all need to retrieve and share old and new data easily. The Standard Generalized Markup Language offers a high-level approach that says `What follows meets a standard'; and then defines the structures needed to understand it.

9:30 am  Sometimes You Want to See What You Get, Sometimes You Don't
Sharon Adler, Product Development, IBM CORPORATION
The answers to several questions - Who creates the document? How is the information going to be used? What types of documents are produced? - will determine whether a What You See is What You Get or code-driven, automated publishing system is most appropriate for certain tasks. Either approach benefits from skillful application of text and graphic standards.

10:15 am  SGML for Text and Graphics
Pamela Gennusa, Director, Product Marketing, DATALOGICS, INC.
As background to SGML's role in the conference, a compact tutorial will explain the fundamentals of this ISO standard, with particular emphasis on its flexibility in defining structures, for expressing different system notations and for passing complex and detailed information to software processes in SGML attributes.

11:30 am  Apple's Knowledge Navigator Video Explained: SGML in the Future
Yuri Rubinsky
A popular conference item for the last year, Apple's elegant videotape and informational master of ceremonies gathering various kinds of data from a variety of other computing systems. If such a dream is to be realized, it will be built on standards which take into account the idiosyncrasies of many systems, many applications and many storage protocols.

1:30 pm  The Intelligence Embedded in Pages
Haviland Wright, President, AVALANCHE DEVELOPMENT
Our collective history is still written in pages; much of our current challenge comes from a need to convert existing text into a more usable - read, electronic - form. To do that most effectively, we must extract from pages not just from words, but also the information about the information. Text tool information is conveyed through structure as well as content. Some text structures, like tables and lists, are local. Others, like cross references, numbered sections and paragraphs, are navigational markers for documents or collections of documents. Conversion of page-oriented text to electronic form will be fully successful when an electronic text can be scanned by readers as easily as a book.

2:00 pm  The Status of the AAP Electronic Manuscript Project
Betsy Kiser, EPSIG Manager, OCLC
OCLC, the Online Computer development and promotion of the `grand-daddy of SGML applications', the standard created by the Association of American Publishers collaborating with some thirty other organizations.

2:20 pm  SGML and PageMaker: The Individual at the Desktop
Michael Tabor, Publishing Consultant, INTERACCESS INFORMATION DESIGN
What are the advantages of standards for someone working in virtual isolation? Implementing a smaller version of the ISO standard in currently available desktop publishing software will provide a major focus for this session with particular discussion of archiving, file control and version control.

2:50 pm  SGML on the Campus: The Publishing of Theses, Journals and Books
Czeslaw Jan Grycz, Design & Production Manager, UNIVERSITY OF CALIFORNIA PRESS
SGML and related coding systems have a great effect on the publishers who accept scholarly work. This session will show examples of the preparation of documentary editing projects both using SGML or ignoring its advantages, discussing implications for publishers of receiving each.

3:45 pm  The Need for Interchange in Corporate Publishing
Robert Marcum, Engineering Specialist, GENERAL DYNAMICS
Desktop publishing has provided a proliferation of hardware and software solutions for problems we didn't know we had. The corporate solution: obtain as many of each type of hardware platform and software package as can be justified; evaluate everything on line; minimize training; and pray. The major constraint? The traditional support groups (Technical Publications, Graphic Services, Art and Editorial, etc.) can no longer be used for programs with desktop computing capabilities. This presentation addresses how this environment can be made into a productive team production process for the generation of compound electronic documents via networking (file transfer), translation (file conversion to standard formats), data element filing (DBMS or file control system) and documentation integration (merging of text and graphic elements); all via a desktop automated batch processor (without human intervention).

Thursday November 17

8:30 am  Graphics and the Future
Lee Silverman, Senior Graphics Manager, Engineering Division, COMPUGRAPHIC CORPORATION
As the early cave drawings did, certainly, the future of communication holds, at its essence, graphic representation. Today, though, we are forced by new media not only to communicate locally or even nationally but rather internationally and globally. Without a single world-wide spoken or written language, graphics will be the key for transmitting many technical, political, social, scientific and even aesthetic concepts and articles of information. These graphics must be universally displayed, read, and understood. Digital standards will make this possible.

9:00 am  Everything You Need to Know about Graphics and Graphic Standards
Lee Silverman
Hu Hohn, Director, COMPUTER ARTS LEARNING CENTER, MASSACHUSETTS COLLEGE OF ART
Macroscopically, in a document of 50,000 words, only the graphic elements stand out. Taking the microscopic view, each character of each word is composed of curves and lines filled with color. The page, the graphic and the character all represent components that we would like to exchange, interpret, image electronically or print photomechanically using a set of standards which describe them accurately, and, most importantly, repeatably.

10:30 am  What Do Electronic Graphic Systems Do? What Should They Do?
David Mayer, Marketing Manager, Electronic Publishing Systems, AUTO-TROL TECHNOLOGIES
How we look at pages changes dramatically depending on whether we look at all pages as textual with some illustration thrown in, or at all pages as graphics where it's important to know specialized information (the SGML information, for instance) about all the text components within the image. Current standards offer a foundation for the latter approach.

11:15 am  The Argument for Having SGML Everywhere
Frank Gilbane, President, PUBLISHING TECHNOLOGY MANAGEMENT
The stockpile of examples is growing of instances where the structural information contained in SGML files and databases provides valuable (and sometimes unexpected) insights. Access to structure at a more fundamental level - at a layer just between the operating system and the application, for example - would revolutionize current utilities.

1:30 pm  Structuring Text for Research
Michael Sperberg-McQueen, Editor-in-Chief, Text Encoding Initiative, UNIVERSITY OF ILLINOIS AT CHICAGO
Literary texts intended for research need to be tagged more intensively than in most electronic publishing or office automation. This session discusses the origins and goals of a cooperative effort by a number of professional organizations to formulate, disseminate and promote guidelines for this encoding, with particular emphasis on the appropriateness of SGML as a structuring system.

2:00 pm  Structuring Literary Databases
Elli Mylonas, Managing Editor, PERSEUS PROJECT, HARVARD UNIVERSITY
A great body of diverse literary texts exists electronically but with little structural encoding and few appropriate SGML Document Type Definitions for them. The Perseus Project faces this issue in building an SGML hypertext database of fifth century Greek texts with linked images.

2:30 pm  Electronic Dictionary Interchange
Robert Amsler, Dictionary Encoding Initiative, BELLCORE
While several ad hoc schemes for encoding dictionary entries exist, and even larger numbers of idiosyncratic typesetting formats, the development of a text standard for the interchange of machine-readable dictionaries is seen as essential for future generations of scholars. One group has drafted a preliminary interchange standard in SGML and recently sponsored a workshop to present it and receive comments. Dr. Amsler will report on that work.

3:15 pm  An SGML Application for Music Performance, Analysis and Publishing
Alan D. Talbot, Project Manager, Music Engraving Project, NEW ENGLAND DIGITAL; Secretary, ANSI MUSIC INFORMATION PROCESSING STANDARDS COMMITTEE
An application written in SGML is generally used for markup of text. The core of SGML, however, is really a pure design tool. The ANSI X3V1.8M MIPS project, using SGML in this fashion, has developed a study that deals with interchange of timing information, with parallel tracks in a document (or performance), and with the interrelationship of complex and occasionally overlapping hierarchical structures.

3:45 pm  Structuring and Scripting the Multi-Media Document
Richard Moore, Applications Technology Group, APPLE COMPUTER
A generic and extensible interchange medium for multimedia data is critically needed by the personal computer industry world - but as a unifying mechanism for the industry's anarchistic software base. In the interactive world of desktop computers, however, interchange is not enough: we also need to deal with the on-line management of compound documents. We need grammar-based, transaction-oriented, data-management tools which can be used as utilities by applications and the rapid development of `entity' standards.

4:15 pm  The State of the Art in Information Retrieval: Background for Hypertext
Edward Fox, Associate Professor, Department of Computer Science, VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY; Editor, ACM PRESS DATABASE & ELECTRONIC PRODUCTS
Using computers to help people find useful items involves applying a variety of methods (artificial intelligence, indexing and text processing are examples) and technologies (human-computer interfaces, magnetic and optical storage devices, networks) to large collections of multimedia objects.
Research has shown the viability of natural language interest statements, benefits of interactive feedback, and the the added information about context and structure that comes from SGML, we can now build more effective systems integrating information retrieval and hypertext.

Friday November 18

8:30 am  The Shape of Hypertext Today
Janet H. Walker, Member of Research Staff, DIGITAL EQUIPMENT CORP., CAMBRIDGE RESEARCH LAB
What kinds of protocols are necessary for hypertext systems to communicate effectively? What functionality must be represented in a standard that attempts to link dissimilar functions among innovative, proprietary applications? This session begins with a historical look and outlines the internal architectures that make up the diverse hyper-realm.

9:00 am  SGML and Hypertext
Steve DeRose, Computational Linguist, SUMMER INSTITUTE OF LINGUISTICS
SGML can provide a basis for representing hypertext although the structures found in hypertext do not intuitively map into SGML as we use it today. Current projects at the Dallas Theological Seminary, the Summer Institute, and others, begin to address these concerns.

9:30 am  Applying Hypertext Technology to Standards Development, Dissemination and Implementation
Sandy Ressler, Computer Scientist, NATIONAL INSTITUTE OF STANDARDS & TECHNOLOGY (formerly NBS)
Hypertext methods are applicable to the various facets of developing and delivering information processing standards which may, for the future, be thought of as collections of `interconnected writings' incorporating graphics, audio and computer programs. Combined with the increasingly available CD-ROM optical storage technology, a new medium for delivering entire databases of standards, their related documents with software to aid in their implementations is a realistic goal.

10:00 am  Proactive Hypertext
Philip Lehman, Vice President, SCRIBE SYSTEMS
Many hypertext users are interested in applications that can be built on top of such technology, including communication with and control of non-hypertext systems. This interaction, which might be termed proactive hypertext, is taking a growing role as user requirements become more refined. This session will draw examples from technical publishing within the aerospace industries.

10:45 am  Browsing and Interchange
Louis Gomez, District Manager, Information Technology, BELLCORE
`Superbook', a prototype Document Browser, was developed to make accessible the volumes of electronic data already marked up. The session will take lessons learned in Superbook to larger problems of information interchange, with emphasis on the telecommunications industry.

11:15 am  Production Hypertext
Ian Williams, Product Manager, Corporate Solutions, OFFICE WORKSTATIONS LTD.
With both commercial software products and an active system integration business built around hypertext databases, OWL's perspective is grounded in the reality of very large, successful installations. Accordingly, the company's viewpoint on interchange is realistic, practical and oriented to production levels of capability.

1:30 pm  Functional Requirements for Hypertext Interchange Standards
Jim Norton, AUGMENTation Systems Consultant
Twenty years working with AUGMENT, the hypertext collaborative work system developed by human interface pioneer Doug Engelbart, has made Jim Norton one of the world's more experienced hypertext users. His address will specify the features and mechanisms that must be present in interchange standards for those standards to adequately represent hypertext and collaborative work functionality.

2:00 pm  The Long View
Robert Akscyn, President, KNOWLEDGE SYSTEMS INC.
With a decade of experience in hypertext systems, and from the vantage point of a commercial software developer, Robert Akscyn will complement Jim Norton's perspective on the issues involved in hypertext interchange standards.

2:45 pm  Balancing the Present and the Future: Standards and New Media
Tim Oren, Applications Technology Group, APPLE COMPUTER
Setting the stage for the working session to follow, this talk will describe specific cases where there is an urgent requirement for interchange between dissimilar new media systems, the markets which will need such interchange soon, and, in counterbalance, concerns that such a standard not compromise future innovations not yet imagined.

3:00 pm  Towards Exchange between Hypertext Systems: A Collective Working Session
Co-Chairs: Tim Oren & William W. Davis, Jr., Electronic Publishing Technical Advisor, US INTERNAL REVENUE SERVICE
_________________________________________________________________________
Yuri Rubinsky
SoftQuad Inc.
720 Spadina Avenue
Toronto M5S 2T9 Canada
(416) 963-8337
uucp: sq!yuri
internet: yuri@sq.com
jbd@dasteel.uucp (Jonathan Dasteel) (03/14/91)
Does anyone know of a context free grammar (or yacc spec) for SGML? PS: Did there used to be a comp.text.sgml? If so, what happened to it? -- -- JBD ------------------------------------------------------------------------------- Jonathan Dasteel Dasteel Software 213-394-1229 1148 Fourth Street, Suite 100 uunet!dasteel!jbd Santa Monica, CA 90403 Typesetting and graphics software -------------------------------------------------------------------------------