[misc.handicap] CBFB_ARTICLES require.txt

robertj@tekgen.bv.tek.com (Robert Jaquiss) (05/13/91)
Index Number: 15524

III Chapter 3
REQUIREMENTS ANALYSIS & SPECIFICATION

III.A Introduction

We begin our requirements analysis with a brief look at the
reading and writing environment of "an educated person" who does
not have a disability.

     An educated person opens a door and invites a friend into
     the private library.  The friend has never been in this room
     before and gazes around.  The room is not too large, but
     contains shelves of books from floor to ceiling.  A
     strategically placed, large, oak desk dominates the room.
     In addition to the desk chair, a comfortable sofa is placed
     so the lamp sheds perfect light on the environment.

     The two people take time to examine the volumes in the room.
     The library is divided into several areas.  The literature
     section contains great works that must be read by any person
     who claims to be educated.  Another section contains the
     professional titles that relate to this person's chosen
     profession.  The next section holds reference materials
     including an encyclopedia, the Oxford English Dictionary, a
     modern English dictionary, and technical reference
     materials.   The final section is entirely devoted to
     periodicals.  This section primarily contains magazines, but
     a few old newspapers can be seen.  This person's private
     library is not unlike the large university library across
     town.

     Upon examining the desk we find that a manuscript is under
     production.  Text with drawings and diagrams has been
     artfully produced with the assistance of a word processor.
     We find that a manual of style is open and has been
     consulted for a technicality.  The entire well-formatted
     manuscript is visually appealing.  The author realizes that
     if the publisher accepts their document, many visual changes
     will be made before it appears in print.

The problem, of course, is that this scene could not be acted out
by a person with a print-disability.  Our goal is to produce a
reading and writing system that emulates the picture painted
above.  We are attempting to create an inviting, comfortable,
easy-to-use system for all individuals with a print-disability.
In the same way that we described the home library above, we will
now describe the electronic library that we will specify in this
chapter.

     An educated person opens a door and invites a friend into
     the private library.  The friend has never been in this room
     before and gazes around.  The room is not too large, but
     contains a computer based system designed for this person
     with a print-disability.  There are three output devices
     attached to this computer and two input devices.  One output
     device is a normal monitor and one input device is a
     traditional keyboard.

     The person with a print-disability moves to the system and
     inputs the word "library."  Immediately three options are
     presented. The reference section can be selected, the
     periodical works can be chosen or individual titles can be
     browsed.  They select the individual titles and several
     categories are presented.  Since both these persons are
     interested in scientific work, they decide to examine
     together a college textbook.

     They select a mathematical title in the electronic library.
     All three output devices show the table of contents.  It is
     easy to get an overview of the entire book from this point.
     Chapter 3 looks interesting and with a simple movement the
     computer brings the opening portion of that chapter to the
     three output devices.  This is mathematical information and
     each of the three output devices displays the information
     from the book in three different ways.  Both people are
     examining the same material, but the representation of that
     material is different on each output device.  On the next
     electronic page, an elaborate graph appears.  The person
     with a print-disability at this point pulls up a simplified
     version of that drawing.  A textual description accompanies
     the simplified graphic, and together they convey an overview
     of the figure.  The two readers then return to the complex
     drawing and read its accompanying textual description.
     From this point they turn
     to the index to look up something.  It is easily found by
     specifying a search string in the index.  Another simple
     movement brings all output devices to the point in the text
     indicated in the index.

     As they continue to browse the electronic library, one
     thing becomes clear.  The books, reference materials and
     periodicals that are available to the sighted community have
     counterparts in the print-disabled electronic version.  What
     is striking is that at any point the print-disabled person
     can relate to the fully sighted person in their chosen
     medium through the computer representation.  That is, the
     print-disabled person may be using one input or output
     device and the same information is manipulated and presented
     in a different way simultaneously.  Writing is handled in
     much the same way.  What the print-disabled person creates
     is represented in different modalities on the attached
     output devices.  The person with a print-disability may
     input data in one way, and the person who is normally
     sighted can read the information in the customary way.

The task is to specify the reading and writing system so that an
effective data representation design can be constructed.  At some
point all requirements of the reading and writing system must be
recorded.  Presently we restrict our system requirements analysis
to the factors that will affect the data representation
requirements and design.  The electronic library described above
is a good starting point for the specification of the reading and
writing system.  The software controlling the electronic library
manipulates the data represented in the electronic book and
outputs that information in a variety of ways.  We must generally
describe those features of output that are dependent on the data
representation.  Then we must design a data representation system
that can effectively support those different methods of output.
An example may serve us here.

     The scene above describes the users going to the index of
     the book, looking up a word, and jumping to that portion of
     the book referenced in the index.  Assuming that the index
     itself is represented in ASCII characters, finding a
     particular item in the index is trivial and requires no
     specialized data representation.  The move to a particular
     location in the textbook found in the index does require
     special notation in the original data representation.  In
     fact the index was probably created by having key words
     marked in some way in the text.  If the user wants to cut
     and paste a portion of text under examination, that function
     can be conducted without regard to the data representation
     itself.  One simply copies the data from the book and moves
     it to another location in software.  There is no need to
     consider the cut and paste function when designing the data
     representation system.  These two functions show us what
     should be considered in this section.  The implementation of
     an index with references in the data requires maintaining
     some code in the data to indicate that the word is in the
     index.  The cut and paste function requires no modification
     of the data and should only be mentioned in passing.

For these reasons we restrict our requirements analysis to those
structural and functional considerations that may affect the data
representation design.  How the data is conveyed is not as
important at this time as what data is represented.  Our job is
to define the items that must be preserved and eliminate all
extraneous information.  We must convert information that is
conveyed visually to its logical equivalent.  In a book a heading
may be represented in characters that are one half inch high.
This should translate to information that this is a heading, not
that the characters are a particular size.  The visual world uses
size to indicate importance.  This will translate to the concept
of a level heading in our data representation system.
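
The translation from visual size to logical level can be sketched
in a few lines.  This is only an illustration under an assumption
of ours: that within one book, larger type always means a more
important heading.  The function name and the sizes used are
hypothetical.

```python
from typing import Dict, List


def heading_levels(sizes_in_inches: List[float]) -> Dict[float, int]:
    """Map each distinct heading size to a level; the largest size is level 1."""
    distinct = sorted(set(sizes_in_inches), reverse=True)
    return {size: level for level, size in enumerate(distinct, start=1)}


# Sizes observed on the printed page (made-up values for illustration).
levels = heading_levels([0.5, 0.25, 0.5, 0.125])
print(levels[0.5], levels[0.25], levels[0.125])
```

After this step only the level is stored; the character size itself
is discarded.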

In this chapter we intend to describe those characteristics that
affect the data representation design.  We will describe these
characteristics in terms of an "Information Hierarchy."  From this
description software structure and design can be derived.  We
will formally describe the information requirements.  From this
description the data design can be established.  User interface
components are independent and can be specified at a later time.
We will not define the "library" level structure.  The individual
"book" structure will be defined in detail.  This is the most
complex document structure.  Periodicals and reference materials
will be formally defined at a later time.

III.B Information Structure

Information is organized in a way that permits access to
structured data.  If that structure does not exist in the data
representation, that information is not easily accessed.  The
defined hierarchy determines the type of information that can be
derived from the data.

III.B.1 Information Hierarchy

The hierarchy of information is represented in a tree structure.
We recognize that a graphical representation of this tree will be
instructive for many sighted individuals.  Many advisors working
on this project, including the principal investigator, cannot
take advantage of these graphical tools.  Therefore, we will use
a simple notation system to describe the tree structure.

III.B.2 Terminology

The tree structure starts at level 0.  Each move down the tree
moves us to another level.  Each level on a path may have an
associated name.  At a particular level we may have a name used
in other sections of the tree.  "Paragraph" is a name of an
object that will occur in many branches of our tree.  The naming
of this object helps identify the structure and will aid in
writing reusable computer code.  Many possible paths from one
level to the next level may exist.  (NOTE:  I use the term "down"
to indicate a move from one level to the next level.  We start at
the "root," level 0 and move down.  The next level down has a
higher number.  I know this does not make much sense, but the
idea draws from nature.)

LEVEL 0 BOOK

The tree structure we define here is the structure of a book with
all variations that may occur.  It is not necessary that each
component exist in a particular instance of a book, but the
structure should allow for any variation.  Anything that may be
found in a book will have a logical equivalent in this structure.
We use legal outline notation for the representation of this
structure.  Each number in the chain represents a move down the
tree structure.  I will briefly describe the first few levels so
an overview can be gained.  I will then go into detail.

1 FRONT MATTER
1.1 Copyright Information
1.2 Title Page
1.3 Dedication
1.4 Preface
1.5 Preface To Previous Editions
1.6 Table of Contents
1.7 Introduction
2 BODY
2.1 Chapter
2.1.1 Level Heading
2.1.1.1 Subheading
2.1.1.1.1 Further Subheading Divisions
3 END MATTER
3.1 Appendices
3.2 Glossary
3.3 Index
3.3.1 Index headings
3.3.1.1 Index entries

The three logical divisions of a book (front matter, body, and
end matter) are level 1 items.  Each smaller structural item is
represented by an additional "." (dot) followed by a number.  The
final number has no trailing dot.  This notation is used so that
advisors can discuss each structural element in terms of where it
sits in the tree.  A simple novel may only have a few of the
divisions described above.  A college textbook may have all the
features.  We will now describe in detail each item in the tree
structure of a book.  This will take us to the lowest structural
levels, the paragraph, table or figure. Separate treatment of
each of these elements will be necessary later in this chapter.
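
The relation between the tree and the legal outline notation can be
sketched in code.  The following is a minimal illustration, not a
design: the class name Node and the function name outline are our
own assumptions, and only a few of the items listed above are
entered.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One named item in the book tree (hypothetical class, not from the text)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        """Attach a new child one level down and return it."""
        child = Node(name)
        self.children.append(child)
        return child


def outline(node: Node, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (legal outline number, name) for every item below `node`."""
    for position, child in enumerate(node.children, start=1):
        number = f"{prefix}.{position}" if prefix else str(position)
        yield number, child.name
        yield from outline(child, number)


# Level 0 is the book itself; it carries no outline number.
book = Node("BOOK")
front = book.add("FRONT MATTER")
front.add("Copyright Information")
front.add("Title Page")
body = book.add("BODY")
chapter = body.add("Chapter")
chapter.add("Level Heading").add("Subheading")
end = book.add("END MATTER")
end.add("Appendices")

for number, name in outline(book):
    print(number, name)
```

The count of numbers in each chain is the level of the item, so
"2.1.1" is a level 3 item reached by three moves down from the
root.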

III.B.3 Hierarchical Structural Description

1 FRONT MATTER

Front matter contains all information that comes before the body
of the book.  This varies from title to title.  The copyright
information is essential in the production of an electronic book.

1.1 Copyright information
     Copyright holder, copyright dates, ISBN, and other
     traditional legal information must be presented here.

1.2 Title Page
     Title specification including author information and
     publisher information should be presented here.

1.3 Dedication
     Author's dedication statement should be preserved.

1.4 Preface
     Prefaces vary greatly and require more detailed treatment.
     Sometimes level headings will be necessary and perhaps
     figures and tables.  We will treat these elements later.
     Usually a series of paragraphs is used.

1.5 Prefaces To Previous Editions
     Prefaces to previous editions are preserved and follow the
     current preface.

1.6 Table Of Contents
     The table of contents is extremely important.  This
     component of the front matter will be used as the "first
     screen" of information presented when reading the book.  The
     table of contents shows the structure of the book.  It lists
     the front matter components, the book's body structure with
     each level heading and subheadings, and all end matter
     components.  The table of contents provides hypertext links
     into the book's parts.

1.7 Introduction
     Introductions may occur in the front matter or may be the
     first chapter of the book.  This is dependent on the book
     itself.  The structure of the introduction follows the same
     hierarchy as described below in the chapter section.

2 BODY
     The body of a book is a collection of chapters.  In the
     simplest form there is only one chapter that may contain
     only one level heading.  In more complex books there may be
     any number of chapters.  A chapter continues until the next
     chapter or end matter begins or the end of the book is
     found.  In essence a chapter is the highest level heading.
     A novel may not specifically say this is a chapter.  Here
     the body of the book directly goes to the level heading or
     the text level.  Most often, the body is divided into
     numbered chapters that have a title associated with the
     chapter number.

2.1 Chapter
     Chapters are major divisions of books that are numbered and
     may have an associated title.  Chapters are the highest
     division of the body.  Chapters may have level headings and
     subheadings and sub subheadings.  The chapter is treated no
     differently than other level headings.

2.1.1 Headings
     Headings are major divisions of chapters in the same way
     that chapters are divisions of the body of the book.  They
     are numbered and fall below chapters in the hierarchy.

2.1.1.1 Subheadings
     Subheadings are further divisions of headings.  They are
     numbered in descending order following the same pattern used
     throughout the book.

2.1.1.1.1 Sub Subheadings
     The level headings can be divided as far down as necessary.
     Usually it will not be necessary to go more than four levels
     deep.  There are some applications that will require more
     than four levels of division.

2.2 Paragraph
     The paragraph is the smallest structural division of a book.
     Tables and figures are on the same logical level as a
     paragraph.  A paragraph may follow any logical division of a
     book.  The introduction in the front matter may go to the
     paragraph level.  Paragraphs have a definite beginning and
     end.   Within paragraphs all textual, mathematical, and
     scientific information may be represented.

2.3 Tables
     Tables are one of three logical methods of representing
     information.  Tables are specifically arranged and the
     information represented in each element of the table is
     similar to what can be represented in paragraphs.  Tables
     are numbered and may have titles associated with the table.

2.4 Figures
     Figures are graphical representations of information.  They
     are at the same logical level as paragraphs and tables.  The
     figures have up to four components.  A graphic file
     associated with the original graphic presented in the paper
     version of the book is named.  A textual description of this
     graphic is included at the point where the graphic occurs in
     the book.  A second, simplified graphic file is also named
     in the text and the textual description of this graphic
     precedes the original description.  The order of the
     description should be first the simplified graphic followed
     by the complex graphic.  If the graphic is so simple in its
     original form that no further simplification is needed, a
     statement in the book should indicate that no simplified
     graphic was necessary.

3 END MATTER
     End matter is the last of the three logical divisions of a
     book.  In the simplest form, no end matter is present.  In
     the complex instances of books many appendices, glossaries,
     and one or more indexes may be present.

3.1 Appendix
     An appendix is a logical division of a book that provides an
     opportunity to provide greater detail than what was
     appropriate in the chapter of a book.  Sometimes this may
     follow the same logical structure as a chapter.  In other
     cases it may follow the organization of an alphabetical
     listing of additional information.  This alphabetical list
     we shall call a list of "entries."

3.1.1 Entry
     The entry shall have a specific notation before the
     beginning of the entry. This notation indicates that it is
     meant to be located by that entry name. An entry can be
     treated as a paragraph, table, or figure.

3.2 Glossary
     Glossaries are normally dictionaries that provide more
     information than what is provided directly in the text
     portions of the body. The glossary is a series of entries.

3.2.1 Entry
     Defined above

3.3 Index
     The index is very important and needs to be treated with
     great care.  Some considerations are that the page numbers
     are preserved, that the exact location of the referenced
     item is known, that groups of index items are collected
     under headings, and that only those page numbers noted in
     the paper version are referenced.  Classroom discussions
     frequently reference page
     numbers.  The page numbers in the electronic version of the
     text should coincide with the pages in the paper copy.

3.3.1 Index Entry
     An index entry is normally a specially noted word or phrase.
     The special notation allows for direct access to that
     location in the text.  The index may be created directly by
     collecting the noted index items.  Electronic versions use
     this notation to create hypertext links.  The index string
     is noted in the text.  When creating the index, entries with
     the same first word would be sub indexed under that word.
     For example:
          @index "abstract data types linked list"
          @index "abstract data types trees"
          @index "abstract data types queue"
     These three index items would be listed under the common
     index heading ABSTRACT DATA TYPES.  The words linked list
     would have page numbers after the entry and an associated
     hypertext link could be created.  Trees and queue would be
     further entries under the same heading, after the linked
     list entry.

ABSTRACT DATA TYPES
     linked list 93, 105, 233
     trees 35, 199, 266
     queue 110, 244, 301
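
Collecting the noted index items into headings and entries can be
sketched as follows.  This is a minimal illustration under an
assumption of ours: items that share a first word are grouped, and
the heading is the longest run of leading words they all share.
That rule reproduces the example above, but the actual grouping
rule is left to the design.  The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def common_prefix(word_lists: List[List[str]]) -> List[str]:
    """Return the longest run of leading words shared by every list."""
    prefix = []
    for words in zip(*word_lists):
        if len(set(words)) != 1:
            break
        prefix.append(words[0])
    return prefix


def build_index(marks: List[Tuple[str, int]]) -> Dict[str, Dict[str, List[int]]]:
    """Group (index string, page) marks into heading -> entry -> pages."""
    groups = defaultdict(list)
    for text, page in marks:
        words = text.lower().split()
        if words:  # ignore empty index strings
            groups[words[0]].append((words, page))

    index: Dict[str, Dict[str, List[int]]] = {}
    for group in groups.values():
        shared = common_prefix([words for words, _ in group])
        entries = index.setdefault(" ".join(shared).upper(), {})
        for words, page in group:
            # What remains after the shared heading words is the entry name.
            entries.setdefault(" ".join(words[len(shared):]), []).append(page)
    return index


# The marks as they would be met while reading the book front to back.
marks = [
    ("abstract data types trees", 35),
    ("abstract data types linked list", 93),
    ("abstract data types linked list", 105),
    ("abstract data types queue", 110),
    ("abstract data types trees", 199),
    ("abstract data types linked list", 233),
    ("abstract data types queue", 244),
    ("abstract data types trees", 266),
    ("abstract data types queue", 301),
]
for heading, entries in build_index(marks).items():
    print(heading)
    for entry, pages in entries.items():
        print("    ", entry, ", ".join(str(p) for p in pages))
```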

4 Internal references
     Three types of internal references are provided: cross
     references, footnotes, and endnotes.

4.1 Cross References
     Cross referenced items are similar to index entries.  A
     cross reference allows for easy movement to other portions
     of the text.  In printed material you commonly find
     information like "see page 193."   This type of cross
     reference is created by noting a particular location in the
     book and then referring to that named location.  These cross
     references are created by the computer at time of printing.
     If pages are added or deleted the pages referenced are
     automatically changed.  A portion of text may contain a
     label that notes that location.  Anywhere else in the text
     the label name may be used in the same way that a page
     number would be.  Named labels may reference forward or
     backward in the text.  There are two types of named labels,
     place-labels and call-labels.  The place-label command marks
     a location in the text.  The call-label points to a
     place-label and provides for a hypertext link to be created.
     For example, three separate locations in a book may contain
     the following lines:

          George Bush is President of the United States.
          @place-label bush1

          President Bush ordered the liberation of Kuwait.
          @place-label bush2

          What two things is George Bush known for? See pages
          @call-label bush1 @call-label bush2

     The words "@place-label bush1" and "@place-label bush2" each
     mark a specific location in the text.  The place-labels are
     not shown in the paper version of the book and will not be
     seen in the electronic version.  The data representation
     will need to show these locations.  The @call-label command
     followed by a unique label name allows for the replacement
     of page numbers by the called labels.  In the electronic
     version pages can also be indicated and direct jumping to
     these locations can be made simple.  (NOTE: I use the "at"
     sign, "@," as an "escape character."  The data representation
     design will address the issue of escape characters fully.)
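
     The replacement of call-labels by page numbers can be
     sketched as a two-pass procedure.  This is a minimal
     illustration, not the design: it assumes the book is held
     as a list of page texts, that a page's number is its
     position in that list, and that label names are single
     words.  The function name resolve_labels is hypothetical.

```python
import re
from typing import Dict, List

PLACE = re.compile(r"\s*@place-label\s+(\w+)")
CALL = re.compile(r"@call-label\s+(\w+)")


def resolve_labels(pages: List[str]) -> List[str]:
    """Replace call-labels with page numbers and hide place-labels."""
    # Pass 1: record the page on which each place-label sits.
    located: Dict[str, int] = {}
    for number, text in enumerate(pages, start=1):
        for name in PLACE.findall(text):
            located[name] = number

    # Pass 2: drop the place-labels and substitute page numbers.
    # A call to a label that was never placed raises KeyError.
    resolved = []
    for text in pages:
        text = PLACE.sub("", text)
        text = CALL.sub(lambda match: str(located[match.group(1)]), text)
        resolved.append(text)
    return resolved


pages = [
    "George Bush is President of the United States. @place-label bush1",
    "President Bush ordered the liberation of Kuwait. @place-label bush2",
    "See pages @call-label bush1 @call-label bush2",
]
for text in resolve_labels(pages):
    print(text)
```

     Because all place-labels are located before any call is
     replaced, a call-label may refer forward as well as
     backward, and adding or deleting pages only requires
     running the procedure again.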

4.2 Footnotes
     Footnotes will be noted in text and associated with a
     number.  They will be accessible as a hypertext jump or as
     traditional footers.

4.3 Endnotes
     Endnotes are similar to footnotes, but may be read as
     traditional end-of-chapter material.

     The three remaining portions of the structural requirements
     are the paragraph, tables and figures.

III.B.3.a Paragraph

The paragraph is the smallest structural element.  Paragraphs
consist of collections of sentences, characters, mnemonic
symbols, or mathematical and scientific information.  The
grammatical meaning of a paragraph is the most common form of the
structural paragraph.  Collections of sentences will be found
most frequently at this level.  Paragraph styles must be
mentioned here.

     1. A standard paragraph has one or more sentences.
     2. Indented paragraphs are set off from other paragraphs for
     a variety of reasons.  Indentation is a technique for
     drawing attention to specific information.
     3. Paragraphs in list form.  A paragraph may consist of a
     list of items.  The items may be sentences, words, phrases
     or whatever information the author chooses.  These lists are
     sometimes numbered and other times a special character is
     placed before the entry to draw attention to each item.
     Classically this character is called a "bullet."  The bullet
     can take on almost any form.  Bullets draw visual attention
     to each entry.  Again it is a technique for drawing attention
     to an item.
     4. A paragraph may take the form of a mathematical or
     scientific formula.  The author may choose to represent the
     equation in a paragraph by itself rather than noting the
     equation within a sentence or paragraph.  Often a formula
     may be represented at the paragraph level and subsequent
     paragraphs refer to parts of the formula.  The ability to
     represent mathematical and scientific information within a
     sentence or as a paragraph itself must be preserved.

III.B.3.b Tables

Tables represent related information in a structured form.  The
structure of the table is in rows and columns.  The most common
notion of a table is a spreadsheet.  The columns have an assigned
letter starting at "A."  The numbered rows start at "1."  Any
location can be precisely referenced by a letter and number.
Normally the titled rows and columns have information below and
across logically associated with the titles.  The variety of
information in tables creates difficulties for visual
representation as well as logical representation.  Any cell may
contain a single character or many paragraphs of information.

It should be possible to read any column heading and the
information organized under the column.  Similarly, it is
necessary to read across rows.  Realize that any one cell may
contain more information than can be represented on a screen at
any one time.  Using a row-column indicator allows software to
present information in a manner selected by the reader.
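
Reading by column, by row, or by single cell can be sketched with a
small class.  This is a minimal illustration under assumptions of
ours: the class name Table is hypothetical, the table is held as a
list of rows, and column letters run only from "A" to "Z."

```python
from typing import List


class Table:
    """Cells addressed spreadsheet-style, e.g. "B2" (hypothetical class)."""

    def __init__(self, rows: List[List[str]]) -> None:
        self.rows = rows

    def cell(self, ref: str) -> str:
        """Return the cell named by a column letter and a row number."""
        column = ord(ref[0].upper()) - ord("A")
        row = int(ref[1:]) - 1
        return self.rows[row][column]

    def column(self, letter: str) -> List[str]:
        """Read down one column, heading first."""
        index = ord(letter.upper()) - ord("A")
        return [row[index] for row in self.rows]

    def row(self, number: int) -> List[str]:
        """Read across one row; rows are numbered from 1."""
        return list(self.rows[number - 1])


table = Table([
    ["Planet", "Moons"],
    ["Earth", "1"],
    ["Mars", "2"],
])
print(table.column("A"))
print(table.row(2))
print(table.cell("B3"))
```

Because every cell carries its own row and column position, the
reading software, and not the data, decides the order in which the
cells are presented.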

III.B.3.c Figures

Figures are drawings, charts, pictures and any other visual
information.  These figures have up to four components:

     1. A graphic file that is a simplified version of the graphic
     in the book.
     2. A text description of the simplified figure as it relates
     to the information contained in the book.
     3. A graphic that closely represents the original graphic in
     the book.
     4. A text description of the graphic in the book as it
     relates to the context of the book.

The text should be labeled as a description of the figure and a
named graphic computer file should be associated with the text
description.
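
The four components of a figure, and the order in which they are
read, can be sketched as a small record.  This is a minimal
illustration: the class name, the field names and the file names
are hypothetical and are not part of the specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Figure:
    """A figure with up to four components (field names are hypothetical)."""
    original_file: str
    original_description: str
    simplified_file: Optional[str] = None
    simplified_description: Optional[str] = None

    def reading_order(self) -> List[Tuple[Optional[str], str]]:
        """Return (graphic file, description) pairs, simplified version first."""
        if self.simplified_file is None:
            # The text asks for an explicit statement in this case.
            first = (None, "No simplified graphic was necessary.")
        else:
            first = (self.simplified_file, self.simplified_description or "")
        return [first, (self.original_file, self.original_description)]


graph = Figure(
    original_file="fig3-2.img",
    original_description="Full graph as printed in the book.",
    simplified_file="fig3-2s.img",
    simplified_description="Outline of the curve only.",
)
for graphic, description in graph.reading_order():
    print(graphic, "-", description)
```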

III.C Emphasized Text

Emphasized text can be seen throughout any modern textbook.
Making books "look inviting" is an important task of the
publisher.  Preserving all emphasized text complicates books
designed for the print-disabled beyond usefulness.  Balance is
needed when deciding what to preserve and what to eliminate in
emphasized text.  A brief example is needed here.

     A publisher may choose to emphasize a particular word. Let
     us select the word "book" for this example. Visually the
     entire word is in bold face print.  The "B" is printed in a
     larger size font and the style of the "ook" is slightly
     different from the first character in the word.  Spacing
     before and after is somewhat larger than other spacing. The
     publisher has certainly drawn attention to the word.

What are we to do with this emphasized word in our system?  We
wish to preserve the fact that this word is emphasized and there
must be an associated description of the emphasis type.  This
emphasis also must have a definite beginning and end.  Let us
consider a grammar textbook that uses emphasis to indicate
different parts of speech.  The subject of the sentence may be
bolded, the verb in italic print, the object may be highlighted,
and all nouns are underlined.  Information is presented by the
use of emphasized text.

First let us decide on broad categories to eliminate.
     1. Size need not be preserved.  Usually size of print
     indicates structural components.  We have dealt with this
     logically by using level headings.
     2. Font style is meaningless to the visually impaired.  If
     the presentation of a character is in "Dutch," "Hampton,"
     "Gothic" or another style, little information is conveyed.
     The text may be more visually pleasing in one font than in
     another, but I doubt whether an author made decisions of
     this nature when they created the book.

Second let us decide on the types of emphasis to preserve.  The
following is a list of the types of named emphasis we will
preserve:
     1. Bold face print
     2. Italic print (Italic "leans" right.)
     3. Slanted print (Slanted "leans" left.)
     4. Underlined text
     5. Double underlined text
     6. Highlighted text
We expect that software will associate the emphasized text with
a screen attribute:  reverse video for bold, underline for
underline, flashing for italic and so on.  Any type of emphasis
will be mapped to one of these characteristics.  If a publisher
uses more than one type of bold, all bold styles will map to the
one bold style in this electronic version of the book.
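
The mapping from publisher styles to screen attributes can be
sketched as two lookup tables.  This is a minimal illustration:
only the bold, underline and italic pairings come from the text
above.  The other pairings, the alias names and the function name
are assumptions of ours.

```python
from typing import Optional

# Preserved emphasis types and the screen attribute used for each.
SCREEN_ATTRIBUTE = {
    "bold": "reverse video",          # from the text
    "underline": "underline",         # from the text
    "italic": "flashing",             # from the text
    "slanted": "flashing",            # assumption
    "double underline": "underline",  # assumption
    "highlight": "high intensity",    # assumption
}

# Several publisher styles collapse onto one preserved type (made-up names).
ALIASES = {
    "extra bold": "bold",
    "semi bold": "bold",
}


def screen_attribute(style: str) -> Optional[str]:
    """Return the screen attribute for a style, or None if it is dropped."""
    emphasis = ALIASES.get(style.lower(), style.lower())
    return SCREEN_ATTRIBUTE.get(emphasis)


print(screen_attribute("Extra Bold"))
print(screen_attribute("italic"))
print(screen_attribute("Gothic"))
```

Size and font style have no entry in either table, so they fall
through and are not preserved.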

III.D Characters, Symbols and Punctuation

ASCII is the rock we will build our system upon.  The ASCII
character set and the extended character set together define 256
characters.  Yet there are well over a thousand characters,
mathematical and scientific symbols, and punctuation marks used
in the printing industry today.  Each character or symbol must
map to an associated ASCII character or mnemonic defined in our
system.   As a rule the printing industry is complex.  This
complexity will not serve to convey information effectively if
text is filled with mnemonics that confuse more than clarify.
For example there are at least five types of dashes used in
printing: the dash, the minus sign, a hyphen, a symbol known as
an em-dash and another called an en-dash.  Our system will map
the minus sign and the hyphen to the "-," ASCII 45.  The others
will map to a mnemonic for a dash.  Similarly, typeset books
normally have opening quotation marks and closing quotation
marks.  These will map to ASCII 34, the quotation mark.  In this
way we can make the text less cluttered.  Information is lost by
this simplification, but it is questionable how many fully
sighted people have noticed that there are at least five types of
dashes.  The specific mapping will be explicitly defined in the
design section of the next chapter.
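
The simplification described above can be sketched as a character
table.  This is a minimal illustration and not the mapping that
the design will define: the mnemonic "@dash" is a placeholder of
ours, and identifying the typeset characters by Unicode code
points is an assumption about the form of the input.

```python
DASH_MNEMONIC = "@dash"  # hypothetical mnemonic, not defined by the text

CHARACTER_MAP = {
    "\u2212": "-",            # minus sign -> ASCII 45
    "\u2010": "-",            # typeset hyphen -> ASCII 45
    "\u2013": DASH_MNEMONIC,  # en-dash -> mnemonic for a dash
    "\u2014": DASH_MNEMONIC,  # em-dash -> mnemonic for a dash
    "\u201c": '"',            # opening quotation mark -> ASCII 34
    "\u201d": '"',            # closing quotation mark -> ASCII 34
}


def simplify(text: str) -> str:
    """Replace typeset characters by their ASCII character or mnemonic."""
    return "".join(CHARACTER_MAP.get(character, character) for character in text)


print(simplify("\u201c3 \u2212 1\u201d"))
print(simplify("1990\u20131991"))
```

Characters that have no entry in the table pass through unchanged.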

Mathematical and scientific notation uses spatial relations to
define equations precisely.  We must be capable of translating
the spatial relations to a linear, text based format.  For these
purposes we borrow heavily from TeX, discussed in Chapter 2.  If
we say, "square root of three over four," we have not precisely
defined what equation we are talking about. Is three over four
inside the square root symbol or is the square root symbol over
four?  The system we define must be precise and translatable to a
correct visual representation and to a correct tactile
representation.  It seems that it will be necessary to have a
special "math and science" mode. This mode will use certain
symbols for scientific purposes.  These requirements are
necessary and form the basis for a reading and writing system
that can take us into a new level of education for the print-
disabled.
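
The ambiguity in "square root of three over four" disappears once
the grouping is written out.  As an illustration only, here are
the two readings in standard LaTeX math commands.  These are not
the notation of our system, which the design will define.

```latex
% Reading 1: three over four sits inside the square root symbol.
\sqrt{\frac{3}{4}}

% Reading 2: only three is under the square root; the root is over four.
\frac{\sqrt{3}}{4}
```

The braces make the grouping explicit in a single line of text.
The two readings are different numbers: the first equals the
square root of three over two, which is twice the second.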

III.E Conclusion

We have defined the structural components required for reading
books.  In addition to the structure we have included
requirements necessary for hypertext links.  Limitations were
placed on the complexity of emphasized text.  The requirement for
logical transformation of visual cues was constantly emphasized.
Simplification of enhancements and characters that logically
perform the same task is encouraged.  Graphic figures are treated
fully, allowing maximum use of existing vision for all disability
groups. Finally, we require that all mathematical and scientific
information be represented in a linear text based system that
allows for no ambiguity.  These features implemented in a
logical, easy to use data representation system will provide a
foundation for a reading and writing system for the print-
disabled.