robertj@tekgen.bv.tek.com (Robert Jaquiss) (05/13/91)
Index Number: 15524

III Chapter 3 REQUIREMENTS ANALYSIS & SPECIFICATION

III.A Introduction

We begin our requirements analysis with a brief look at the reading and writing environment of "an educated person" who does not have a disability.

An educated person opens a door and invites a friend into the private library. The friend has never been in this room before and gazes around. The room is not too large, but contains shelves of books from floor to ceiling. A strategically placed, large oak desk dominates the room. In addition to the desk chair, a comfortable sofa is placed so the lamp sheds perfect light on the environment.

The two people take time to examine the volumes in the room. The library is divided into several areas. The literature section contains great works that must be read by any person who claims to be educated. Another section contains the professional titles that relate to this person's chosen profession. The next section holds reference materials, including an encyclopedia, the Oxford English Dictionary, a modern English dictionary, and technical reference materials. The final section is entirely devoted to periodicals. This section primarily contains magazines, but a few old newspapers can be seen. This person's private library is not unlike the large university library across town.

Upon examining the desk we find that a manuscript is under production. Text with drawings and diagrams has been artfully produced with the assistance of a word processor. We find that a manual of style is open and has been consulted for a technicality. The entire well-formatted manuscript is visually appealing. The author realizes that if the publisher accepts the document, many visual changes will be made before it appears in print.

The problem, of course, is that this scene could not be acted out by a person with a print-disability. Our goal is to produce a reading and writing system that emulates the picture painted above.
We are attempting to create an inviting, comfortable, easy to use system for all individuals with a print-disability. In the same way that we described the home library above, we will attempt to describe the same electronic library that we will specify in this chapter. An educated person opens a door and invites a friend into the private library. The friend has never been in this room before and gazes around. The room is not too large, but contains a computer based system designed for this person with a print-disability. There are three output devices attached to this computer and two input devices. One output device is a normal monitor and one input device is a traditional keyboard. The person with a print-disability moves to the system and inputs the word "library." Immediately three options are presented. The reference section can be selected, the periodical works can be chosen or individual titles can be browsed. They select the individual titles and several categories are presented. Since both these persons are interested in scientific work, they decide to examine together a college textbook. They select a mathematical title in the electronic library. All three output devices show the table of contents. It is easy to get an overview of the entire book from this point. Chapter 3 looks interesting and with a simple movement the computer brings the opening portion of that chapter to the three output devices. This is mathematical information and each of the three output devices displays the information from the book in three different ways. Both people are examining the same material, but the representation of that material is different on each output device. On the next electronic page, an elaborate graph appears. The person with a print-disability at this point pulls up a simplified version of that drawing. Besides the simplified drawing a textual description accompanies the simplified graphic. 
The simplified graphic conveys an overview of the figure and then we return to the complex drawing and read the accompanying textual description. From this point they turn to the index to look up something. It is easily found by specifying a search string in the index. Another simple movement brings all output devices to the point in the text indicated in the index.

As they continue to browse the electronic library, one thing becomes clear. The books, reference materials and periodicals that are available to the sighted community have counterparts in the print-disabled electronic version. What is striking is that at any point the print-disabled person can relate to the fully sighted person in their chosen medium through the computer representation. That is, the print-disabled person may be using one input or output device while the same information is manipulated and presented in a different way simultaneously.

Writing is handled in much the same way. What the print-disabled person creates is represented in different modalities on the attached output devices. The person with a print-disability may input data in one way, and the person who is normally sighted can read the information in the customary way.

The task is to specify the reading and writing system so that an effective data representation design can be constructed. At some point all requirements of the reading and writing system must be recorded. Presently we restrict our system requirements analysis to the factors that will affect the data representation requirements and design.

The electronic library described above is a good starting point for the specification of the reading and writing system. The software controlling the electronic library manipulates the data represented in the electronic book and outputs that information in a variety of ways. We must generally describe those features of output that are dependent on the data representation.
Then we must design a data representation system that can effectively support those different methods of output.

An example may serve us here. The scene above describes the users going to the index of the book, looking up a word, and jumping to that portion of the book referenced in the index. Assuming that the index itself is represented in ASCII characters, finding a particular item in the index is trivial and requires no specialized data representation. The move to a particular location in the textbook found in the index does require special notation in the original data representation. In fact the index was probably created by having key words marked in some way in the text.

If the user wants to cut and paste a portion of text under examination, that function can be conducted without regard to the data representation itself. One simply copies the data from the book and moves it to another location in software. There is no need to consider the cut and paste function when designing the data representation system.

These two functions show us what should be considered in this section. The implementation of an index with references in the data requires maintaining some code in the data to indicate that the word is in the index. The cut and paste function requires no modification of the data and should only be mentioned in passing. For these reasons we restrict our requirements analysis to those structural and functional considerations that may affect the data representation design.

How the data is conveyed is not as important at this time as what data is represented. Our job is to define the items that must be preserved and eliminate all extraneous information. We must convert information that is conveyed visually to its logical equivalent. In a book a heading may be represented in characters that are one half inch high. This should translate to information that this is a heading, not that the characters are a particular size.
The visual world uses size to indicate importance. This will translate to the concept of a level heading in our data representation system.

In this chapter we intend to describe those characteristics that affect the data representation design. We will describe these characteristics from an "Information Hierarchy." From this description software structure and design can be derived. We will formally describe the information requirements. From this description the data design can be established. User interface components are independent and can be specified at a later time.

We will not define the "library" level structure. The individual "book" structure will be defined in detail. This is the most complex document structure. Periodicals and reference materials will be formally defined at a later time.

III.B Information Structure

Information is organized in a way that permits access to structured data. If that structure does not exist in the data representation, that information is not easily accessed. The defined hierarchy determines the type of information that can be derived from the data.

III.B.1 Information Hierarchy

The hierarchy of information is represented in a tree structure. We recognize that a graphical representation of this tree will be instructive for many sighted individuals. Many advisors working on this project, including the principal investigator, cannot take advantage of these graphical tools. Therefore, we will use a simple notation system to describe the tree structure.

III.B.2 Terminology

The tree structure starts at level 0. Each move down the tree moves us to another level. Each level on a path may have an associated name. At a particular level we may have a name used in other sections of the tree. "Paragraph" is a name of an object that will occur in many branches of our tree. The naming of this object helps identify the structure and will aid in writing reusable computer code. Many possible paths from one level to the next level may exist.
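The terminology above can be illustrated with a minimal sketch. The class name Node, the helper walk, and the small sample tree are hypothetical and chosen only for illustration; they are not part of the specification. The sketch shows the root at level 0, each move down adding one to the level, and the name "Paragraph" recurring on more than one branch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One named object in the information hierarchy (hypothetical class)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        """Attach a child one level further down and return it."""
        child = Node(name)
        self.children.append(child)
        return child


def walk(node: Node, level: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (level, name) pairs, root first; each move down adds 1."""
    yield level, node.name
    for child in node.children:
        yield from walk(child, level + 1)


# Illustrative tree only: the names are assumptions, not the full book structure.
book = Node("Book")                      # level 0, the root
body = book.add("Body")                  # level 1
chapter = body.add("Chapter")            # level 2
chapter.add("Paragraph")                 # the same name ...
chapter.add("Heading").add("Paragraph")  # ... reused on another branch

for level, name in walk(book):
    print("  " * level + f"{level} {name}")
```

Because every "Paragraph" node is the same kind of object wherever it occurs, one routine can handle it on any branch, which is the reuse the naming is meant to support.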
(NOTE: I use the term "down" to indicate a move from one level to the next level. We start at the "root," level 0, and move down. The next level down has a higher number. I know this does not make much sense, but the idea draws from nature.)

LEVEL 0 BOOK

The tree structure we define here is the structure of a book with all variations that may occur. It is not necessary that each component exist in a particular instance of a book, but the structure should allow for any variation. Anything that may be found in a book will have a logical equivalent in this structure.

We use legal outline notation for the representation of this structure. Each number in the chain represents a move down the tree structure. I will briefly describe the first few levels so an overview can be gained. I will then go into detail.

1 FRONT MATTER
  1.1 Copyright Information
  1.2 Title Page
  1.3 Dedication
  1.4 Preface
  1.5 Preface To Previous Editions
  1.6 Table of Contents
  1.7 Introduction
2 BODY
  2.1 Chapter
    2.1.1 Level Heading
      2.1.1.1 Subheading
        2.1.1.1.1 Further Subheading Divisions
3 END MATTER
  3.1 Appendices
  3.2 Glossary
  3.3 Index
    3.3.1 Index headings
      3.3.1.1 Index entries

The three logical divisions of a book (front matter, body, and end matter) are level 1 items. Each smaller structural item is represented by an additional dot (".") followed by a number. The final number has no trailing dot. This notation is used so advisors can discuss each structural element in the context of where it occurs.

A simple novel may only have a few of the divisions described above. A college textbook may have all the features. We will now describe in detail each item in the tree structure of a book. This will take us to the lowest structural levels: the paragraph, table or figure. Separate treatment of each of these elements will be necessary later in this chapter.

III.B.3 Hierarchical Structural Description

1 FRONT MATTER

Front matter contains all information that comes before the body of the book.
This varies from title to title. The copyright information is essential in the production of an electronic book.

1.1 Copyright information

Copyright holder, copyright dates, ISBN, and other traditional legal information must be presented here.

1.2 Title Page

Title specification, including author information and publisher information, should be presented here.

1.3 Dedication

The author's dedication statement should be preserved.

1.4 Preface

Prefaces vary greatly and require more detailed treatment. Sometimes level headings will be necessary and perhaps figures and tables. We will treat these elements later. Usually a series of paragraphs is used.

1.5 Prefaces To Previous Editions

Prefaces to previous editions are preserved and follow the current preface.

1.6 Table Of Contents

The table of contents is extremely important. This component of the front matter will be used as the "first screen" of information presented when reading the book. The table of contents shows the structure of the book. It lists the front matter components, the book's body structure with each level heading and subheading, and all end matter components. The table of contents provides hypertext links into the book's parts.

1.7 Introduction

Introductions may occur in the front matter or may be the first chapter of the book. This is dependent on the book itself. The structure of the introduction follows the same hierarchy as described below in the chapter section.

2 BODY

The body of a book is a collection of chapters. In the simplest form there is only one chapter that may contain only one level heading. In more complex books there may be any number of chapters. A chapter continues until the next chapter or end matter begins or the end of the book is found. In essence a chapter is the highest level heading. A novel may not specifically say this is a chapter. Here the body of the book directly goes to the level heading or the text level.
Most often, the body is divided into numbered chapters that have a title associated with the chapter number.

2.1 Chapter

Chapters are major divisions of books that are numbered and may have an associated title. Chapters are the highest division of the body. Chapters may have level headings and subheadings and sub subheadings. The chapter is treated no differently than other level headings.

2.1.1 Headings

Headings are major divisions of chapters in the same way that chapters are divisions of the body of the book. They are numbered and fall below chapters in the hierarchy.

2.1.1.1 Subheadings

Subheadings are further divisions of headings. They are numbered in descending order following the same pattern used throughout the book.

2.1.1.1.1 Sub Subheadings

The level headings can be divided as far down as necessary. Usually it will not be necessary to go more than four levels deep. There are some applications that will require more than four levels of division.

2.2 Paragraph

The paragraph is the smallest structural division of a book. Tables and figures are on the same logical level as a paragraph. A paragraph may follow any logical division of a book. The introduction in the front matter may go to the paragraph level. Paragraphs have a definite beginning and end. Within paragraphs all textual, mathematical, and scientific information may be represented.

2.3 Tables

Tables are one of three logical methods of representing information. Tables are specifically arranged and the information represented in each element of the table is similar to what can be represented in paragraphs. Tables are numbered and may have titles associated with the table.

2.4 Figures

Figures are graphical representations of information. They are at the same logical level as paragraphs and tables. The figures have up to four components. A graphic file associated with the original graphic presented in the paper version of the book is named.
A textual description of this graphic is included at the point where the graphic occurs in the book. A second, simplified graphic file is also named in the text, and the textual description of this graphic precedes the original description. The order of the description should be first the simplified graphic followed by the complex graphic. If the graphic is so simple in its original form that no further simplification is needed, a statement in the book should indicate that no simplified graphic was necessary.

3 END MATTER

End matter is the last of the three logical divisions of a book. In the simplest form, no end matter is present. In the complex instances of books many appendices, glossaries, and one or more indexes may be present.

3.1 Appendix

An appendix is a logical division of a book that offers an opportunity to provide greater detail than was appropriate in a chapter of the book. Sometimes this may follow the same logical structure as a chapter. In other cases it may follow the organization of an alphabetical listing of additional information. This alphabetical list we shall call a list of "entries."

3.1.1 Entry

The entry shall have a specific notation before the beginning of the entry. This notation indicates that it is meant to be located by that entry name. An entry can be treated as a paragraph, table, or figure.

3.2 Glossary

Glossaries are normally dictionaries that provide more information than what is provided directly in the text portions of the body. The glossary is a series of entries.

3.2.1 Entry

Defined above.

3.3 Index

The index is very important and needs to be treated with great care. Some considerations are that the page numbers are preserved, that the exact location of the referenced item is known, that groups of index items are collected under headings, and that only those page numbers noted in the paper version are referenced. Classroom discussions frequently reference page numbers.
The page numbers in the electronic version of the text should coincide with the pages in the paper copy.

3.3.1 Index Entry

An index entry is normally a specially noted word or phrase. The special notation allows for direct access to that location in the text. The index may be created directly by collecting the noted index items. Electronic versions use this notation to create hypertext links. The index string is noted in the text. When creating the index, entries with the same first word would be sub indexed under that word. For example:

    @index "abstract data types linked list"
    @index "abstract data types trees"
    @index "abstract data types queue"

These three index items would be listed under the common index heading ABSTRACT DATA TYPES. The words linked list would have page numbers after the entry and an associated hypertext link could be created. Trees and queue would follow as further entries under the same heading.

    ABSTRACT DATA TYPES
        linked list 93, 105, 233
        trees 35, 199, 266
        queue 110, 244, 301

4 Internal references

Three types of internal references are provided: cross references, footnotes, and endnotes.

4.1 Cross References

Cross referenced items are similar to index entries. A cross reference allows for easy movement to other portions of the text. In printed material you commonly find information like "see page 193." This type of cross reference is created by noting a particular location in the book and then referring to that named location. These cross references are created by the computer at time of printing. If pages are added or deleted, the pages referenced are automatically changed.

A portion of text may contain a label that notes that location. From then on the label name may be used anywhere in the text, in the same way that you may name a page. Named labels may reference forward or backward in the text. There are two types of named labels, place-labels and call-labels. The place-label command is a location in the text.
The call-label points to a place-label and provides for a hypertext link to be created. For example, three separate locations in a book may contain the following lines:

    George Bush is President of the United States. @place-label bush1

    President Bush ordered the liberation of Kuwait. @place-label bush2

    What two things is George Bush known for? See pages @call-label bush1 @call-label bush2

The words "@place-label bush1" and "@place-label bush2" each mark a specific location in the text. The place-labels are not shown in the paper version of the book and will not be seen in the electronic version. The data representation will need to show these locations. The @call-label command followed by a unique label name allows for the replacement of page numbers by the called labels. In the electronic version pages can also be indicated and direct jumping to these locations can be made simple.

(NOTE: I use the "at" sign, "@," as an "escape character"; the data representation design will address the issue of escape characters fully.)

4.2 Footnotes

Footnotes will be noted in text and associated with a number. They will be accessible as a hypertext jump or as traditional footers.

4.3 Endnotes

Endnotes are similar to footnotes, but may be read as traditional end of chapter material.

The three remaining portions of the structural requirements are the paragraph, tables and figures.

III.B.3.a Paragraph

The paragraph is the smallest structural element. Paragraphs consist of collections of sentences, characters, mnemonic symbols, or mathematical and scientific information. The grammatical meaning of a paragraph is the most common form of the structural paragraph. Collections of sentences will be found most frequently at this level. Paragraph styles must be mentioned here.

1. Standard paragraph has one or more sentences.

2. Indented paragraphs are set off from other paragraphs for a variety of reasons. Indentation is a technique for drawing attention to specific information.

3. Paragraphs in list form.
A paragraph may consist of a list of items. The items may be sentences, words, phrases or whatever information the author chooses. These lists are sometimes numbered and other times a special character is placed before the entry to draw attention to each item. Classically this character is called a "bullet." The bullet can take on almost any form. Bullets draw visual attention to each entry. Again it is a technique for drawing attention to an item.

4. A paragraph may take the form of a mathematical or scientific formula. The author may choose to represent the equation in a paragraph by itself rather than noting the equation within a sentence or paragraph. Often a formula may be represented at the paragraph level and subsequent paragraphs refer to parts of the formula. The ability to represent mathematical and scientific information within a sentence or as a paragraph itself must be preserved.

III.B.3.b Tables

Tables represent related information in a structured form. The structure of the table is in rows and columns. The most common notion of a table is a spreadsheet. The columns have an assigned letter starting at "A." The numbered rows start at "1." Any location can be precisely referenced by a letter and number. Normally the titled rows and columns have information below and across logically associated with the titles.

The variety of information in tables creates difficulties for visual representation as well as logical representation. Any cell may contain a single character, or many paragraphs of information may be represented. It should be possible to read any column heading and the information organized under the column. Similarly, it is necessary to read across rows. Realize that any one cell may contain more information than can be represented on a screen at any one time. Using a row and column indicator allows software to present information in a manner selected by the reader.
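The letter and number addressing described above can be sketched as follows. This is only an illustration under stated assumptions: the names column_letters, parse_ref and Table are hypothetical, the continuation past column "Z" to "AA" follows the usual spreadsheet convention rather than anything specified here, and row 1 is assumed to hold the column titles.

```python
import re
from typing import List, Tuple

# A reference is one or more letters followed by a row number starting at 1.
CELL_REF = re.compile(r"([A-Za-z]+)([1-9][0-9]*)")


def column_letters(index: int) -> str:
    """Convert a 0-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def parse_ref(ref: str) -> Tuple[int, int]:
    """Split a reference such as 'B3' into 0-based (row, column) indexes."""
    match = CELL_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    col = 0
    for ch in match.group(1).upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(match.group(2)) - 1, col - 1


class Table:
    """Rows of cells; row 1 is assumed to hold the column titles."""

    def __init__(self, rows: List[List[str]]) -> None:
        self.rows = rows

    def cell(self, ref: str) -> str:
        """Return the contents of one cell, however long they are."""
        row, col = parse_ref(ref)
        return self.rows[row][col]

    def column(self, letters: str) -> List[str]:
        """Read down a column: the heading first, then the cells under it."""
        _, col = parse_ref(letters + "1")
        return [row[col] for row in self.rows]

    def row(self, number: int) -> List[str]:
        """Read across a row, numbered from 1."""
        return list(self.rows[number - 1])


# Illustrative data only.
table = Table([["Item", "Quantity"], ["pens", "3"], ["books", "12"]])
print(table.cell("B3"))   # 12
print(table.column("B"))  # ['Quantity', '3', '12']
print(table.row(2))       # ['pens', '3']
```

Keeping the row and column position with every cell lets reading software choose whether to present a table by column, by row, or one cell at a time, which matters when a single cell holds more than a screen of text.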
III.B.3.c Figures

Figures are drawings, charts, pictures and any other visual information. These figures have up to four components.

1) a graphic file that is a simplified version of the graphic in the book.
2) a text description of the simplified figure as it relates to the information contained in the book.
3) a graphic that closely represents the original graphic in the book.
4) a text description of the graphic in the book as it relates to the context of the book.

The text should be labeled as a description of the figure and a named graphic computer file should be associated with the text description.

III.C Emphasized Text

Emphasized text can be seen throughout any modern textbook. Making books "look inviting" is an important task of the publisher. Preserving all emphasized text complicates books designed for the print-disabled beyond usefulness. Balance is needed when deciding what to preserve and what to eliminate in emphasized text.

A brief example is needed here. A publisher may choose to emphasize a particular word. Let us select the word "book" for this example. Visually the entire word is in bold face print. The "B" is printed in a larger size font and the style of the "ook" is slightly different from the first character in the word. Spacing before and after is somewhat larger than other spacing. The publisher has certainly drawn attention to the word.

What are we to do with this emphasized word in our system? We wish to preserve the fact that this word is emphasized and there must be an associated description of the emphasis type. This emphasis also must have a definite beginning and end.

Let us consider a grammar textbook that uses emphasis to indicate different parts of speech. The subject of the sentence may be bolded, the verb in italic print, the object may be highlighted, and all nouns are underlined. Information is presented by the use of emphasized text.

First let us decide on broad categories to eliminate.

1. Size need not be preserved.
Usually size of print indicates structural components. We have dealt with this logically by using level headings.

2. Font style is meaningless to the visually impaired. If the presentation of a character is in "Dutch," "Hampton," "Gothic" or another style, little information is conveyed. A book may be more visually pleasing in one font than another, but I doubt whether an author made decisions of this nature when they created the book.

Second let us decide on the types of emphasis to preserve. The following is a list of the types of named emphasis we will preserve:

1. Bold face print
2. Italic print (Italic "leans" right.)
3. Slanted print (Slanted "leans" left.)
4. Underlined text
5. Double underlined text
6. Highlighted text

We expect that software will associate the emphasized text with a screen attribute: reverse video for bold, underline for underline, flashing for italic, and so on. Any type of emphasis will be mapped to one of these characteristics. If a publisher uses more than one type of bold, all bold styles will map to the one bold style in this electronic version of the book.

III.D Characters, Symbols and Punctuation

ASCII is the rock we will build our system upon. There are 256 characters defined in the ASCII character set together with the extended character set. Yet there are well over a thousand characters, mathematical and scientific symbols, and punctuation marks used in the printing industry today. Each character or symbol must map to an associated ASCII character or mnemonic defined in our system.

As a rule the printing industry is complex. This complexity will not serve to convey information effectively if text is filled with mnemonics that confuse more than clarify. For example there are at least five types of dashes used in printing: the dash, the minus sign, a hyphen, a symbol known as an em-dash and another called an en-dash. Our system will map the minus sign and the hyphen to the "-," ASCII 45. The others will map to a mnemonic for a dash.
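The dash mapping just described can be sketched as a small translation table. Several things here are assumptions made only for illustration: the mnemonic "@dash" is hypothetical (the actual mnemonic is left to the design chapter), the Unicode code points stand in for the typesetter's glyphs, and the horizontal bar U+2015 is used as a stand-in for the generic "dash."

```python
HYPHEN_MINUS = "-"       # ASCII 45
DASH_MNEMONIC = "@dash"  # hypothetical mnemonic, not defined in this chapter

# Typeset dash characters and what each is simplified to.
DASH_MAP = {
    "\u2010": HYPHEN_MINUS,   # hyphen
    "\u2212": HYPHEN_MINUS,   # minus sign
    "\u2013": DASH_MNEMONIC,  # en-dash
    "\u2014": DASH_MNEMONIC,  # em-dash
    "\u2015": DASH_MNEMONIC,  # horizontal bar, standing in for "the dash"
}

# str.maketrans accepts single-character keys with replacement strings.
_TRANSLATION = str.maketrans(DASH_MAP)


def simplify_dashes(text: str) -> str:
    """Replace typeset dash variants with ASCII 45 or the dash mnemonic."""
    return text.translate(_TRANSLATION)


print(simplify_dashes("3 \u2212 1"))         # 3 - 1
print(simplify_dashes("well\u2010known"))    # well-known
print(simplify_dashes("pages 10\u201312"))   # pages 10@dash12
```

A single lookup table keeps the simplification in one place, so the mapping can be revised when the design chapter fixes the real mnemonics.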
Similarly, typeset books normally have opening quotation marks and closing quotation marks. These will map to ASCII 34, the quotation mark. In this way we can make the text less cluttered. Information is lost by this simplification, but it is questionable how many fully sighted people have noticed that there are at least five types of dashes. The specific mapping will be explicitly defined in the design section of the next chapter.

Mathematical and scientific notation uses spatial relations to define equations precisely. We must be capable of translating the spatial relations to a linear, text based format. For these purposes we borrow heavily from TeX, discussed in Chapter 2. If we say, "square root of three over four," we have not precisely defined what equation we are talking about. Is three over four inside the square root symbol or is the square root symbol over four? The system we define must be precise and translatable to a correct visual representation and to a correct tactile representation. It seems that it will be necessary to have a special "math and science" mode. This mode will use certain symbols for scientific purposes.

These requirements are necessary and form the basis for a reading and writing system that can take us into a new level of education for the print-disabled.

III.E Conclusion

We have defined the structural components required for reading books. In addition to the structure we have included requirements necessary for hypertext links. Limitations were placed on the complexity of emphasized text. The requirement for logical transformation of visual clues was constantly emphasized. Simplification of enhancements and characters that logically perform the same task is encouraged. Graphic figures are treated fully, allowing maximum use of existing vision for all disability groups. Finally, we require that all mathematical and scientific information be represented in a linear text based system that allows for no ambiguity.
These features, implemented in a logical, easy-to-use data representation system, will provide a foundation for a reading and writing system for the print-disabled.