[alt.hypertext] Designing Online Documents

mm@praxis.co.uk (Michael Mannion) (11/15/90)

Having just scanned thru `Designing & Writing Online Documentation' by W.Horton
can anyone offer some thoughts or share their experience on the
following:

1. Given a large paper technical reference manual which is suited to
being online and which is already word-processed, is it better to start
from scratch and re-design the document into hypertext format or would
it be better to take what exists and `mould' it into hypertext format?

2. If we can make the distinction between the hypertext `design' and
its subsequent `implementation' how can we describe the design so that
it could be formally reviewed?

3. Does anyone have a feel for how long it takes to put a paper
document, book online?

4. Does anyone have any experience of hypertext design teams working
on the same document and could offer useful advice?

Thanks in advance

Mike Mannion

science@oasys.dt.navy.mil (Mark Zimmermann) (11/17/90)

I would suggest checking out GNU Emacs's ``Info'' and ``Texinfo''
systems, which allow one to build and edit technical manuals (e.g.,
the GNU Emacs manual and other software documentation), and to browse
it online in a rather neat and definitely (to my mind) hypertextual
fashion.  Take the online short course on Info to learn quickly how to
use it; from within Emacs, type C-h i (control-H for `help' followed
by `i' for Info) to get to Info.  If you're not into Emacs, find
somebody who has it up and running (it doesn't fit on a PC or Mac, but
is often seen on Suns, VAXen, etc.) and get them to do a demo.

BTW, ``Infosim'' and ``Para'' modes of Emacs go beyond the
Info/Texinfo framework and make it easier to build online browsable
cross-linked documents, and optionally print out a linear path through
them ... subscribe to the Para mailing list by sending a request to
``para-request@cs.cmu.edu'' for more details....  ^z

joe [Joe Zitt] (11/18/90)

mm@praxis.co.uk (Michael Mannion) writes:

[...a few questions about hypertext conversion...]

I've worked on a few projects involving text-hypertext conversion. 
Currently I'm working with a team (about 6 people) to take a large stack 
of man pages (394 of them) and convert them into a hypertext database to 
be read with KRS software.

Horton's book, which you mention, is perhaps one of the best guides.

One important thing, above all else: Don't skimp on planning and 
organisation. If you look carefully at the material, and figure out ahead 
of time what the chunks are, and how they are to be combined and divided, 
you can avoid a LOT of later confusion and retropatching.

If your source materials are available as text files, use Unix tools, as 
possible, to analyse the materials. For example, I've written some awk 
code that extracts the NAME and SEE ALSO information from the man pages, 
and tells me what many of the hypertext links will be, and, more 
importantly, compare SEE ALSO lists against one another, to tell us how 
the information clusters together. In this way, we can assign to each 
writer a cluster of related articles, rather than dividing the list 
automatically by title. Through a careful use of awk, grep, and sed, I 
also hope to be able to insert and check the links automatically.
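
A minimal sketch of the SEE ALSO comparison described above, written in
Python rather than awk and not taken from the project's own code.  It
assumes formatted man pages with unindented all-caps headings and
references shaped like name(1); the sample pages, the Jaccard overlap
measure, and the 0.5 threshold are illustrative assumptions.

```python
import re
from itertools import combinations

# Matches references such as "grep(1)" or "printf(3S)" in formatted man text.
REF = re.compile(r"\b([A-Za-z_][\w.-]*)\((\d[A-Za-z]*)\)")

# Invented sample pages, standing in for real formatted man pages.
PAGES = {
    "grep": "NAME\n     grep - search a file for a pattern\n\n"
            "SEE ALSO\n     ed(1), sed(1), sh(1)\n",
    "sed": "NAME\n     sed - stream editor\n\n"
           "SEE ALSO\n     ed(1), grep(1), sh(1)\n",
    "ls": "NAME\n     ls - list contents of directory\n\n"
          "SEE ALSO\n     chmod(1), find(1)\n",
}


def section(text, heading):
    """Return the body under an all-caps heading such as NAME or SEE ALSO."""
    body, inside = [], False
    for line in text.splitlines():
        if line.strip() == heading:
            inside = True
        elif inside and line and not line[0].isspace():
            break  # the next unindented line starts the next section
        elif inside:
            body.append(line.strip())
    return " ".join(part for part in body if part)


def see_also(text):
    """Set of name(section) references listed under SEE ALSO."""
    return {f"{name}({sec})" for name, sec in REF.findall(section(text, "SEE ALSO"))}


def overlap(a, b):
    """Jaccard overlap of two reference sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_pairs(pages, threshold=0.5):
    """Pairs of page titles whose SEE ALSO lists overlap at least `threshold`."""
    refs = {title: see_also(text) for title, text in pages.items()}
    return sorted(
        (x, y)
        for x, y in combinations(sorted(refs), 2)
        if overlap(refs[x], refs[y]) >= threshold
    )
```

With the invented sample, related_pairs(PAGES) pairs grep with sed and
leaves ls on its own, which is the kind of grouping one might hand to a
single writer.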

Putting hypertext together involves a lot of stuff that is too boring, 
detail oriented, and repetitive to be left for humans. Let the computer 
deal with what it is good at, and you can deal with the messy leftovers... 
like content :-).

Joe Zitt		...cs.utexas.edu!kvue!zitt!joe 		(512)450-1916

glushko@srchtec.UUCP (Bob Glushko) (11/19/90)

In article <5514@newton.praxis.co.uk> mm@praxis.co.uk (Michael Mannion) writes:
>Having just scanned thru `Designing & Writing Online Documentation' by W.Horton
>can anyone offer some thoughts or share their experience on the
>following:
>
>1. Given a large paper technical reference manual which is suited to
>being online and which is already word-processed, is it better to start
>from scratch and re-design the document into hypertext format or would
>it be better to take what exists and `mould' it into hypertext format?
> 
  It depends on a lot of factors.  You say the manual "is suited to being
online."  I assume this means that it is composed of a large number of
small topics that stand alone but which are enhanced or expanded upon
by reference to other topics.  How big these components are and the 
nature of the cross references they contain are important design issues
that determine how easily you can turn text into hypertext.
  At all costs, you want to avoid "doing it by hand." The hand-crafted,
artistic approach is always tempting the first time, but when the time 
comes to revise the manual you will hate yourself.  What you want is to
be able to extract the components using the formatting instructions that
indicate section headings or other natural partitions.  Likewise, you
want to be able to get the links because they are explicitly indicated
with some sort of structural markup.  If your word processing form doesn't
do this, then you may want to have a talk with the people who wrote the
manual and tell them how to use style sheets or formatting instructions
in ways that you can process automatically.
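
As a rough illustration of extracting components from formatting
instructions, here is a Python sketch that cuts a document into nodes at
its section-heading markup.  It assumes troff man source, where the .SH
macro starts a section; a word-processed manual would need a different
marker, and the sample text and function name are invented for the
example.

```python
import re

# ".SH" is the man macro that starts a section; the title may be quoted.
SH = re.compile(r'^\.SH\s+"?([^"\n]+?)"?\s*$')

# Invented sample source, standing in for a real man page.
SAMPLE = (
    ".TH CAT 1\n"
    ".SH NAME\n"
    "cat \\- concatenate files\n"
    '.SH "SEE ALSO"\n'
    ".BR head (1)\n"
)


def split_nodes(source):
    """Split man source into (heading, body) pairs, one per .SH macro.

    Lines before the first .SH (such as the .TH title line) are dropped.
    """
    nodes = []
    heading, body = None, []
    for line in source.splitlines():
        match = SH.match(line)
        if match:
            if heading is not None:
                nodes.append((heading, "\n".join(body).strip()))
            heading, body = match.group(1), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        nodes.append((heading, "\n".join(body).strip()))
    return nodes
```

Because the split is driven entirely by the markup, it can be rerun
unchanged on the next revision of the source.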

>2. If we can make the distinction between the hypertext `design' and
>its subsequent `implementation' how can we describe the design so that
>it could be formally reviewed?
>
  In my experience the most important thing to do is to separate all 
consideration of the logical design of the hypertext from considerations
of how it looks.  Most hypertext projects get infatuated with user 
interface aspects, and this focus drives out any serious consideration
of hypertext-design-as-database-design, which is what it really is most
of the time.  You want to ask questions like "how many links are there,"
"are they one-way or two-way," "what exactly is being linked to what (is
it a word, a filename, a bitmap)," and only afterwards worry about the
kind of icons or link markers you use.  HyperCard and other hypertext
programs encourage a bad engineering approach because you have to worry
about user interface issues (i.e., create buttons on cards) to develop
and debug the control structure of the program.
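
One way to make the logical design reviewable apart from its appearance
is to write the links down as plain records.  The Python sketch below is
only an illustration under assumed names (Link, AnchorKind,
review_summary, and dangling are all hypothetical); it records what is
linked to what and whether a link is two-way, and says nothing about
icons or markers.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class AnchorKind(Enum):
    """What the link starts from: the kinds named in the questions above."""
    WORD = "word"
    FILENAME = "filename"
    BITMAP = "bitmap"


@dataclass(frozen=True)
class Link:
    source_node: str
    anchor_kind: AnchorKind
    target_node: str
    two_way: bool = False


def review_summary(links):
    """Answer the review questions: how many links, which direction, what kind."""
    return {
        "total": len(links),
        "two_way": sum(link.two_way for link in links),
        "one_way": sum(not link.two_way for link in links),
        "by_kind": dict(Counter(link.anchor_kind.value for link in links)),
    }


def dangling(links, nodes):
    """Links whose target is not a known node; worth flagging in a review."""
    return [link for link in links if link.target_node not in nodes]
```

A table like this can be counted, diffed, and checked for dangling
targets before any card or button exists.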

>3. Does anyone have a feel for how long it takes to put a paper
>document, book online?
>
Again it depends.  Putting it on line may mean loading it into a text
database, doing some kind of automatic or semi-automatic conversion, or
doing a carefully hand-crafted hypertext application with lots of sexy
bells and whistles.  

It will take a lot longer than you think, that's for sure.  I have
worked on or consulted with a couple dozen projects of this kind and 
lots of things can go wrong:

    unrealistic expectations 
    missing people on the design and development team (especially
       when you start messing with multimedia)
    no design guidelines to follow (at least for realistic scale projects)
    installed base constraints (hypertext on a PC is a cruel hoax)
    poor quality source files
    no industrial-scale hypertext technology
    legal problems (what copyright category is an animated encyclopedia;
        you mean I can't scan in those pictures?)

Each of these can kill you; together they are almost guaranteed to do
so unless you think about them hard.

>4. Does anyone have any experience of hypertext design teams working
>on the same document and could offer useful advice?

I wrote a paper on exactly that.  It is called "using off the shelf
software to create a hypertext encyclopedia" and it compares the
use of HyperCard, HyperTies, and Guide to do the same problem. It is 
in Technical Communication in February 1990.

I have written a fair amount about hypertext conversion in other places.
In February 1990 I have a paper in Unix Review called "Visions of Grandeur"
(terrible title, but the editors picked it, not me) that talks about
how hard it is to do hypertext on a practical scale.  I talk about
problems with online manuals in Unix explicitly.

I also have a paper called "hypertext engineering" in the Proceedings
of the ACM conference on document processing from December 1988 that
talks about the design and implementation issues in converting a
printed encyclopedia to an online one. 

I teach courses on these topics at various places; I'll be doing one
at the ACM CHI conference in New Orleans in April.  I am also writing
a book about it, full of the case studies that led to my list of
ways to make your project fail.  Hang in there and good luck!

Bob Glushko
Search Technology
4725 Peachtree Corners Circle #200
Norcross, GA 30092 
(404) 441-1457

brennan@rtp.dg.com (Dave Brennan) (11/28/90)

In article <DwJss1w163w@zitt> joe [Joe Zitt] writes:
   I've worked on a few projects involving text-hypertext conversion. 
   Currently I'm working with a team (about 6 people) to take a large stack 
   of man pages (394 of them) and convert them into a hypertext database to 
   be read with KRS software.

What type of problems have you run into on this project?  I've been working
on an X11/Motif hypertext help system, which will probably eventually have
to display man pages (which I'd like to see).  I haven't put a lot of
thought into it yet and would be interested in hearing from others.

The SEE ALSO section is an obvious place to look for cross references, but
I've been wondering about how we can detect other items in text that are
good cross reference candidates.  In many cases the man page refers to
include files or structures not included in the mans which would be useful
to call up.  I'm always browsing through /usr/include when I can't find
enough information in the man pages.

What will writers be doing with the clusters of articles?  Enhancing their
hypertext suitability?  It seems that to make non-hypertext documents
"good" hypertext documents some manual intervention will almost inevitably
be involved.  (Which is what I'm finding as I try to convert a printed
manual to hypertext form.)

                                          |\
Dave Brennan                              | \____oo_     brennan@rtp.dg.com
=========================================((__|  /___>    ...rti!dg-rtp!brennan
User Interfaces, Data General                | //        daveb@rpitsmts.bitnet
Research Triangle Park, NC                   |//         Phone: (919) 248-6330

lark@tivoli.UUCP (Lar Kaufman) (11/29/90)

In article <BRENNAN.90Nov27180908@bach.rtp.dg.com> brennan@rtp.dg.com (Dave Brennan) writes:
>The SEE ALSO section is an obvious place to look for cross references, but
>I've been wondering about how we can detect other items in text that are
>good cross reference candidates.  In many cases the man page refers to

I have written filters before that collect all strings in a man page 
that are of the form: function(x) .  I ran this on _formatted_ man pages.

This collects all listings that would appear in the SEE ALSO section, and 
any other referenced utilities.  This is a good starting point for building 
cross-reference tools.  The rest I leave as an exercise...
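
A filter of this kind might look roughly like the Python sketch below.
It is not the original filter: it assumes the "x" in function(x) is a
manual section such as 1 or 3S, and it first strips the backspace
overstrikes that nroff uses for bold and underline in formatted pages.

```python
import re

# nroff renders bold as "c\bc" and underline as "_\bc"; drop the first
# character of each such pair together with the backspace.
OVERSTRIKE = re.compile(r".\x08")

# name(section), where section starts with a digit: grep(1), printf(3S).
REF = re.compile(r"\b([A-Za-z_][\w.+-]*)\((\d[A-Za-z]*)\)")


def references(formatted_text):
    """Return name(section) strings in order of first appearance."""
    clean = OVERSTRIKE.sub("", formatted_text)
    seen = []
    for name, sec in REF.findall(clean):
        ref = f"{name}({sec})"
        if ref not in seen:
            seen.append(ref)
    return seen
```

Requiring a digit inside the parentheses keeps ordinary call syntax such
as f(x) in example code from being mistaken for a page reference.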

-lar

-- 
---------                             TIVOLI Systems, Inc.
Lar Kaufman                           (voice) 512-329-2455
                                      (fax)   512-329-2755
Austin, Texas        USA              (e)  lark@tivoli.com

fabio@dm.unibo.it (Fabio Vitali) (12/02/90)

In article <BRENNAN.90Nov27180908@bach.rtp.dg.com> brennan@rtp.dg.com (Dave Brennan) writes:
>What type of problems have you run into on this project?  I've been working
>on an X11/Motif hypertext help system, which will probably eventually have
>to display man pages (which I'd like to see).  I haven't put a lot of
>thought into it yet and would be interested in hearing from others.
>[...]
>
>What will writers be doing with the clusters of articles?  Enhancing their
>hypertext suitability?  It seems that to make non-hypertext documents
>"good" hypertext documents some manual intervention will almost inevitably
>be involved.  (Which is what I'm finding as I try to convert a printed
>manual to hypertext form.)

I think the single most useful feature you can add to the system is the
possibility for the user to add his own links in a very friendly and easy
way. You could store them apart from the actual man text, and allow for more
than one file of links to be active at the same time. 
Better yet, make the system links (i.e. those that YOU provide) an external
file as well, so that people can add, delete, modify, activate and
deactivate link files exactly as they need.
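
To make the idea concrete, here is a small Python sketch of links kept
apart from the man text, with several link files active at once.  The
"source target" line format and the LinkStore and LinkFile names are
assumptions made up for the example, not an existing format.

```python
from dataclasses import dataclass, field


@dataclass
class LinkFile:
    name: str
    links: set = field(default_factory=set)  # {(source_page, target_page)}
    active: bool = True


class LinkStore:
    """Holds any number of link files; only active ones contribute links."""

    def __init__(self):
        self.files = {}

    def load(self, name, lines):
        """Parse 'source target' lines; blank lines and '#' comments are skipped."""
        parsed = set()
        for line in lines:
            line = line.strip()
            if line and not line.startswith("#"):
                source, target = line.split()
                parsed.add((source, target))
        self.files[name] = LinkFile(name, parsed)

    def set_active(self, name, active):
        """Switch a whole link file on or off without touching its contents."""
        self.files[name].active = active

    def links_from(self, page):
        """Targets reachable from `page` across all active link files."""
        targets = set()
        for link_file in self.files.values():
            if link_file.active:
                targets |= {t for s, t in link_file.links if s == page}
        return sorted(targets)
```

The system's own links and a user's private links are then just two
entries in the same store, and either can be switched off.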

Another thing that I find sometimes irritating in most hypertext systems is
the One-Card-At-A-Time limit. If you want it to be really useful, think of
displaying every man page in a window on its own, and let the user close the
one he is not using anymore. This does not solve the problem of displaying
at the same time two pages of the same man item. If you add the possibility
of splitting the page window in two independent ones, each scrollable
independently, this would be solved too. 

Just my Lit. 60 (5 U.S. cents) worth.

Fabio
-- 

Fabio Vitali                        You don't possess me, don't impress me,
Dept of Computer Science                                Just upset my mind.
Univ. of Bologna  ITALY                   Can't instruct me, or conduct me,

carroll@cs.uiuc.edu (Alan M. Carroll) (12/03/90)

We've implemented a very simple version of hyper-text browsing of Emacs Info
and the UNIX(c) man pages using Epoch (a multi-X-windowing version of
GNU-Emacs). We originally tried having a new X-window for each entry, but
that created _way_ too much screen clutter. What we use now is a user
configurable limit on the number of screens. When the limit is reached,
X-windows are "recycled". Also, multiple displays of the same text are
prevented. If you jump to a block that is already displayed, the X-window
displaying it is pushed to the top instead of a new one being opened.
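
The recycling policy can be sketched in a few lines of Python.  This is
not Epoch code: the WindowPool name is invented, windows are reduced to
integer ids, and recycling the least recently raised window is an
assumption, since the choice of victim is not stated above.

```python
from collections import OrderedDict


class WindowPool:
    """Caps the number of windows and never shows the same node twice."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.windows = OrderedDict()  # node -> window id; last entry is topmost
        self._next_id = 0

    def show(self, node):
        """Return the id of the window now displaying `node`."""
        if node in self.windows:
            # Already displayed: raise the existing window, open nothing new.
            self.windows.move_to_end(node)
            return self.windows[node]
        if len(self.windows) >= self.limit:
            # At the limit: reuse the window that was raised longest ago.
            _, window_id = self.windows.popitem(last=False)
        else:
            window_id = self._next_id
            self._next_id += 1
        self.windows[node] = window_id
        return window_id
```

Raising a window on a repeat visit also protects it from being the next
one recycled.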
-- 
Alan M. Carroll                "It's psychosomatic. You need a lobotomy,
Epoch Development Team          I'll get a saw."
CS Grad / U of Ill @ Urbana    ...{ucbvax,pur-ee,convex}!cs.uiuc.edu!carroll

blob@Apple.COM (Brian Bechtel) (12/03/90)

brennan@rtp.dg.com (Dave Brennan) writes:

>In article <DwJss1w163w@zitt> joe [Joe Zitt] writes:
>   I've worked on a few projects involving text-hypertext conversion. 
>   Currently I'm working with a team (about 6 people) to take a large stack 
>   of man pages (394 of them) and convert them into a hypertext database to 
>   be read with KRS software.

>What type of problems have you run into on this project?  I've been working
>on an X11/Motif hypertext help system, which will probably eventually have
>to display man pages (which I'd like to see).  I haven't put a lot of
>thought into it yet and would be interested in hearing from others.

You should read "Engineering Issues for Hypertext" by Robert Glushko in
the Hypertext '89 Proceedings, available from the ACM.  I (ahem) have a
paper in the ECHT'90 (European Conference on HyperText) Proceedings
discussing the implementation of Inside Macintosh as Hypertext, where I
discuss some issues we encountered.  The ECHT'90 Proceedings are
published as a book by Cambridge University Press; "Hypertext: Concepts,
Systems and Applications," edited by A. Rizk, N. Streitz, and J.
Andre.

>The SEE ALSO section is an obvious place to look for cross references, but
>I've been wondering about how we can detect other items in text that are
>good cross reference candidates.  In many cases the man page refers to
>include files or structures not included in the mans which would be useful
>to call up.  I'm always browsing through /usr/include when I can't find
>enough information in the man pages.

If information is meant to be linked in, then you need to link it in.
Document Selection is a key issue in Hypertext engineering; of course,
you could include the world (a la Xanadu) but reality usually precludes
including that much material.

>What will writers be doing with the clusters of articles?  Enhancing their
>hypertext suitability?  It seems that to make non-hypertext documents
>"good" hypertext documents some manual intervention will almost inevitably
>be involved.  (Which is what I'm finding as I try to convert a printed
>manual to hypertext form.)

So far, I haven't seen any indications that you can do automatic
hypertext conversion of printed material, especially when the printed
material was designed before the hypertext project started.

--Brian Bechtel     blob@apple.com     "My opinion, not Apple's"

adamb@hydre.enst.fr (Adam Beguelin) (12/04/90)

In article <47034@apple.Apple.COM>, blob@Apple.COM (Brian Bechtel) writes:
> brennan@rtp.dg.com (Dave Brennan) writes:
> you could include the world (a la Xanadu) but reality usually precludes

Can anyone provide me with more information on Xanadu?  All I have read
about it is what was in the recent anniversary issue of Byte.  Anyone know how
close it is to being a product or how one might become a beta-test site?

	Adam

-- 
Email:  adamb@inf.enst.fr or adamb@boulder.colorado.edu
Office: (1) 4581 7881  (From the USA prefix 011-33)
Home:   (1) 4581 7138 
Post:   Adam Beguelin // 212, rue de Tolbiac Ch. 420 // 75013 Paris France

mary@tivoli.UUCP (Mary Anthony) (12/05/90)

In article <DwJss1w163w@zitt> joe [Joe Zitt] writes:

>The SEE ALSO section is an obvious place to look for cross references, but
>I've been wondering about how we can detect other items in text that are
>good cross reference candidates.  In many cases the man page refers to
>include files or structures not included in the mans which would be useful
>to call up.  I'm always browsing through /usr/include when I can't find
>enough information in the man pages.

Is it really necessary to include a SEE ALSO section in a hypertext document?
By its very nature isn't a link a SEE ALSO reference?  Since hypertext documents
generally exist in windowed environments, the amount of document space is
limited.  Wouldn't you want to limit the amount of space taken up in the
document window by unnecessary headings?

I agree, it is a problem when references are made in a hypertext document to
include files that do not exist within the document.  However, as you
demonstrated, it is always possible for the user to view the file in another
window in his environment.  Since include files change often from release to 
release there may be good reason not to document them in detail. Wouldn't it
be nice if writers could link to an include file itself.  The link would
then pull up the appropriate (and one would hope most recent) include file
rather than a documentation node.

brennan@rtp.dg.com (Dave Brennan) (12/07/90)

In article <1990Dec1.204627.7223@dm.unibo.it> fabio@dm.unibo.it (Fabio Vitali) writes:

   I think the single most useful feature you can add to the system is the
   possibility for the user to add his own links in a very friendly and easy
   way. You could store them apart from the actual man text, and allow for more
   than one file of links to be active at the same time. 

I agree with this, and it's something I eventually want to have our system
do.  The problem is how to store links separately.  For example, file
position won't necessarily work, because when the document is updated some
items will surely move.  Got any good ideas?

   Another thing that I find sometimes irritating in most hypertext systems is
   the One-Card-At-A-Time limit. If you want it to be really useful, think of
   displaying every man page in a window on its own, and let the user close the
   one he is not using anymore.

We'll eventually be adding what I call "tear out" pages.  The system will
allow the user to create a new window which contains the node he or she is
currently viewing.  (The duplicate window would likely be for viewing only
- ie: no navigation.)

                                          |\
Dave Brennan                              | \____oo_     brennan@rtp.dg.com
=========================================((__|  /___>    ...rti!dg-rtp!brennan
User Interfaces, Data General                | //        daveb@rpitsmts.bitnet
Research Triangle Park, NC                   |//         Phone: (919) 248-6330

glushko@srchtec.UUCP (Bob Glushko) (12/07/90)

 I appreciate Brian's recommendation of my paper, but you won't find
a paper called Engineering in Hypertext in the 1989 ACM Hypertext 
conference proceedings.  The correct citation is:

Hypertext Engineering: Practical methods for creating a compact disc
  encyclopedia.  ACM Conference on Document Processing Systems, 1988.

(I have a paper in the Hypertext '89 conference on the same kinds
of practical engineering issues, but it is called "Design Issues for 
Multi-Document Hypertexts.")

See also a paper of mine with the terrible name "Visions of Grandeur" in
the UNIX REVIEW of Feb 1990 (the hypertext issue).  This is actually the
best of the articles about the topic of online help and documentation in
a Unix environment.

If you can't find these in your library, I know where to get them...

bob glushko
search technology
4725 peachtree corners circle
norcross, ga 30092
(404) 441-1457

glushko@srchtec.UUCP (Bob Glushko) (12/07/90)

In article <47034@apple.Apple.COM>, blob@Apple.COM (Brian Bechtel) writes:
> 
> If information is meant to be linked in, then you need to link it in.
> Document Selection is a key issue in Hypertext engineering; of course,
> you could include the world (a la Xanadu) but reality usually precludes
> including that much material.

The key here is that whether a document is relevant depends on the task
the user performs with it, which means that you can't select documents
just because they are there.  An example I use is a software manual and
a phone book.  I can imagine lots of benefits for having both the manual
and the phone book online, but I can't think of many that require the
two of them to be linked together.   Another example is the phone book
(yellow pages) and a map.  Each is useful to have online, but there are
nice complementarities between the map and the phone book.  Links from
the map to the phone book let me know the phone number of anyplace on 
the map and let me quickly learn about the kind of neighborhood it is
in (lots of bars, or lots of churches?), while links from the phone
book to the map let me quickly find out where someplace is.  A third
example would be a thesaurus and the phone book.  Links from the phone
book to the thesaurus would help me generate other categories to look
under (is it automobile, car, ??), but I can't imagine useful links
from the thesaurus to the phone book.  So useful links might be
asymmetric.

> >What will writers be doing with the clusters of articles?  Enhancing their
> >hypertext suitability?  It seems that to make non-hypertext documents
> >"good" hypertext documents some manual intervention will almost inevitably
> >be involved.  (Which is what I'm finding as I try to convert a printed
> >manual to hypertext form.)
> 
> So far, I haven't seen any indications that you can do automatic
> hypertext conversion of printed material, especially when the printed
> material was designed before the hypertext project started.

You might not be able to do completely automatic conversion, but you
can do some.  I try to determine (or locate) the writing guidelines 
that were followed when the documents were created; sometimes the best
thing to do is to try to find a writer.  Writers can often be educated
about the consequences of certain styles, so you can get them to change
for the next revision of the document. (E.g., tell the writers to make
sure they cite a document in a unique way, rather than by page number
or by indirect reference like "see the next section").  I use a couple
of different indexing programs to see how cross references are used.

      see figure 1...
          Figure 2 says ...
      in  FIgure 6, we ...

the idea is to understand the variation in cross reference structure
so that you can either write a program to locate them, or so that you
can convince your writers to do it differently (and more reliably) the
next time.  Progressive refinement of your methods and the source files
that you apply them to is the essence of an engineering approach to 
hypertext, which is why I call it hypertext engineering.
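
A survey of that variation does not need a full indexing program.  The
Python sketch below is a stand-in for one, with an assumed pattern that
covers only "figure", "fig", and "fig." followed by a number; it tallies
the spellings writers actually used and the figures they cited.

```python
import re
from collections import Counter

# The citing word in any capitalisation, then the figure number.
FIGURE_REF = re.compile(r"\b(fig(?:ure|\.)?)\s*(\d+)", re.IGNORECASE)


def figure_variants(text):
    """Count each spelling of the word used to cite a figure."""
    return Counter(word for word, _ in FIGURE_REF.findall(text))


def figure_targets(text):
    """Sorted list of the distinct figure numbers that are cited."""
    return sorted({int(number) for _, number in FIGURE_REF.findall(text)})
```

A long tail in figure_variants is an argument for a style rule; a short
one means a single pattern will probably find most of the references.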

bob glushko
search technology
4725 peachtree corners circle
norcross, ga 30092
(404) 441-1457

glushko@srchtec.UUCP (Bob Glushko) (12/08/90)

In article <BRENNAN.90Dec6220249@teton.rtp.dg.com>, brennan@rtp.dg.com (Dave Brennan) writes:
> 
> The problem is how to store links separately.  For example, file
> position won't necessarily work, because when the document is updated some
> items will surely move.  Got any good ideas?
> 

Don't get seduced into thinking that you have to have bit-level precision
in anchoring your links.  In a man page, for example, links are likely to
be made to a particular sub-section (EXAMPLES, BUGS,...) or to a special
word like another command name or file name.  In the first case, you can
probably safely assume that any revised man page is still going to have
all of the sub-sections in it; in the second case, the special file names
are likewise still going to be there.  So this means you can define the link
to
    "the EXAMPLES heading"
    "the first occurrence of /x/y/file.c after the EXAMPLES heading"

either of which is straightforward to define in a way that is robust
with respect to edits of the man page.
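
For instance, the two link definitions above could be resolved against
whatever the current text of the page is, along the lines of this Python
sketch.  It assumes a formatted page whose section headings sit alone on
a line starting in column one; the function names are invented.

```python
import re


def find_heading(text, heading):
    """Match object for a heading alone on its own line, or None."""
    pattern = rf"^{re.escape(heading)}[ \t]*$"
    return re.search(pattern, text, re.MULTILINE)


def resolve(text, heading, phrase=None):
    """Character offset of the anchor in the current text, or None.

    With no phrase, the anchor is the heading itself; with a phrase, it
    is the first occurrence of the phrase after that heading.
    """
    match = find_heading(text, heading)
    if match is None:
        return None
    if phrase is None:
        return match.start()
    position = text.find(phrase, match.end())
    return position if position != -1 else None
```

The stored link holds only the heading and the phrase, so the offset is
recomputed each time and survives edits that move the text around.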

The moral of the story is:  If writers don't make precise cross references
in print, why do we think they have to be able to make precise cross 
references in hypertext?  Links to a sub-section of a man page seem pretty
precise to me.  (Think back to all those term papers you wrote when you
cited articles you didn't even find in the library.  Who are you kidding,
trying to make links stick to particular words?)

bob "hypertext engineering" glushko
search technology
4725 peachtree corners circle, suite 200
norcross, ga 30092 (404) 441-1457