[comp.sys.mac] How to write TEXT editors

oster@dewey.soe.berkeley.edu (David Phillip Oster) (09/18/87)

Now that the system software has been extended to include a 
version of TextEdit that supports multiple fonts, sizes, styles 
(and colors) all within a single text edit record, it is possible 
for programs based on TextEdit to allow users to use multiple fonts.
With new text edit, each word (and potentially each character) can
have its own font.

Almost any program that doesn't allow multiple fonts could be 
extended to allow them. Examples include:
 o Calendar (Mosaic Codes)
 o MockWrite (C.E. Software)
 o MiniWriter (Maitreya Design)
 o Acta (Maitreya Design)

and, of course, hundreds more.

This document lists the problems I had to solve to really do a TextEdit
text editor right. I am giving this hard-won information away. It cost me
many sleepless nights to develop and implement. If all developers pay
attention, then I, as a user, will be able to buy a better set of
products.  So all you developers: listen up!

This is part one. It is a list of the problems you need to solve.  Part two
is a list of proposed standards, that if everyone followed, would make
everyone's life easier.

I had to solve the following problems to actually use new text edit in
a program:

(a) Tech Note #131, listing bugs in new TextEdit
(b) Scrolling code changes
(c) Font/size/style/color menus & dialog
(d) Insertion point font/size/style/color
(e) cut/copy/paste undo
(f) search and replace
(g) alternate forms for deficient systems
(h) smart quotes
(i) Bottom of page concerns
(j) arrows and extended keyboard.
(k) smart cut and paste
(l) preserving print record and window size
(m) save and multi-tasking issues.
(n) standard close/quit box.


(a) Tech Note #131 gives a list of bugs and work arounds for new TextEdit.
I had to do all that.

(b) My scrolling code had to be changed to handle the fact that, if lines 
have different heights, the number of lines that fit on the screen is a 
function of position. The three important positions are:
(b.1) the number of lines to page forward if the user clicks in the pageDown
area of the scroll bar.
(b.2) the number of lines to page back if the user clicks in the pageUp
area.
(b.3) the number of lines that fit at the bottom of the screen, so we set the
CtlMax of the scroll bar correctly.
(CtlMax = totalNumberofLines - 1 - LinesThatFitAtBottom)
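
A minimal sketch of the three quantities in (b), in Python rather than
Toolbox code. It assumes the editor keeps a list of per-line pixel heights;
all function names are hypothetical, and ctl_max uses the formula exactly as
written above.

```python
def _count_fitting(heights, view_height):
    """Count leading entries of `heights` whose total stays within view_height."""
    used = 0
    count = 0
    for h in heights:
        if used + h > view_height:
            break
        used += h
        count += 1
    return count


def lines_page_down(line_heights, top, view_height):
    """(b.1) Whole lines that fit going down from line `top`; at least 1."""
    return max(1, _count_fitting(line_heights[top:], view_height))


def lines_page_up(line_heights, top, view_height):
    """(b.2) Whole lines that fit going up from just above `top`; at least 1."""
    return max(1, _count_fitting(reversed(line_heights[:top]), view_height))


def lines_that_fit_at_bottom(line_heights, view_height):
    """(b.3) Whole lines, counted upward from the last line, that fit."""
    return _count_fitting(reversed(line_heights), view_height)


def ctl_max(line_heights, view_height):
    """Scroll bar maximum, using the formula as given in the text."""
    fit = lines_that_fit_at_bottom(line_heights, view_height)
    return max(0, len(line_heights) - 1 - fit)
```

The point is that the page-up and page-down distances differ from each other
and from the bottom-of-document count whenever line heights vary.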

(c) you need to provide a way for the user to change the new attributes of a 
selected region. I provide font/size/style/color menus. I also provide: 

(c.1) a dialog for setting all these attributes at once.
Rather than use pop-up menus, I think it is better to use SFGetFile style
lists with a scroll bar.  They are easier for the user to cope with. This
dialog also includes an EditText field for the user to type in numbers.

(c.2) a dialog for adding named colors to the color menu (use the color 
picker, then type in a color name. The color name gets sorted into the menu
of available colors.) This dialog also contains a scrollable list of colors,
so that the user can select a color to remove from the menu.

(c.3) I correctly remembered that your font/size/style/color menus and dialog
should show checkmarks to let the user know the current state.  Remember that
your size menu should use the outline style to let the user know which sizes
in the currently selected font are actually available as bitmaps and which
will be synthesized. (Note that there is no good way to convey this
information for styles, like underline, which may now also have their own
bitmaps.)

(c.4) I wrote a custom menu definition procedure that shows each font in that
font (Chicago is in Chicago, Geneva in Geneva, and New York in New York.) My
menu definition procedure handles scroll arrows at the top and bottom, when
required, and since not all fonts are the same height, handles scrolling of
variable sized objects. I use this code in my font&size&style&color dialog
to present a scrollable list of fonts that look right.

(c.5) Remember that while the font/size/color menus set their values
directly into the selected text, the style menu toggles: If there is
no checkmark for "underline", and the user selects "underline" then
the program should underline the selected text. If there is a check,
then, when the user selects underline, it must remove all underlining
from the selected area. Note that there is an asymmetry here. If ALL
the text is underlined, it will be checked, and if ALL the text is not
underlined it will be not checked, but if SOME of the text is
underlined, it also will not be checked. The program must do the right
thing.
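
The toggle rule in (c.5) can be sketched as follows. This is an illustration
only: it assumes the selection is available as a list of style runs, each a
Python set of style names, which is not how TextEdit stores styles. The
function names are hypothetical.

```python
def is_checked(run_styles, style):
    """The menu item is checked only if EVERY run in the selection has the style."""
    return bool(run_styles) and all(style in run for run in run_styles)


def toggle_style(run_styles, style):
    """Apply the menu choice: remove the style if checked, otherwise add it everywhere."""
    if is_checked(run_styles, style):
        return [run - {style} for run in run_styles]
    return [run | {style} for run in run_styles]
```

With a partly underlined selection the item is unchecked, so choosing it
underlines everything; choosing it again removes all underlining.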

(d) new text edit does not directly support setting the
font/size/style/color state of the insertion point, when it is just a
blinking line. I had to implement my own mechanism for setting this
state.
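
The post does not say how this mechanism works, so the following is only one
plausible design, sketched in Python: keep a "pending" style that is set by
the menus while the selection is empty, used for the next typed character,
and dropped when the insertion point moves. The Editor class and its methods
are hypothetical.

```python
class Editor:
    def __init__(self, text, style):
        self.chars = [(ch, dict(style)) for ch in text]
        self.default = dict(style)
        self.caret = len(text)
        self.pending = None  # style chosen while the selection is empty

    def _inherited(self):
        """Style the next character would get with no pending style."""
        if self.caret > 0:
            return dict(self.chars[self.caret - 1][1])
        return dict(self.default)

    def set_attr(self, key, value):
        """Menu choice with an empty selection: remember it for later typing."""
        if self.pending is None:
            self.pending = self._inherited()
        self.pending[key] = value

    def move_caret(self, pos):
        """Moving the insertion point discards the pending style."""
        self.caret = max(0, min(pos, len(self.chars)))
        self.pending = None

    def type_char(self, ch):
        style = self.pending if self.pending is not None else self._inherited()
        self.chars.insert(self.caret, (ch, dict(style)))
        self.caret += 1
        self.pending = None  # later characters inherit from the one just typed
```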

(e) cut/copy/paste and most important, undo is implemented for every EditText
item in every dialog.  (including SFPutFile.)  Remember, undo is pretty 
useless without changing the text of the undo item of the edit menu to let
people know what operation will be undone: for example, the user does a font 
change.  The undo menu changes to "Undo Font Change ^Z". The user selects it,
the menu changes to "Redo Font Change ^Z". Undo is not implemented in new
text edit. I had to do it.
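
The Undo/Redo menu wording in (e) can be sketched as a single-level undo
slot. This is a Python illustration, not the author's code: the UndoSlot
class is hypothetical, and the "Can't Undo" title for the empty case is an
assumption not taken from the post.

```python
class UndoSlot:
    """Remembers the most recent operation and how to reverse or repeat it."""

    def __init__(self):
        self.name = None
        self._undo = None
        self._redo = None
        self.undone = False

    def record(self, name, undo, redo):
        """Call after each undoable operation; `undo` and `redo` are callables."""
        self.name, self._undo, self._redo = name, undo, redo
        self.undone = False

    def menu_title(self):
        """Text for the first item of the Edit menu."""
        if self.name is None:
            return "Can't Undo"  # assumed wording
        return ("Redo " if self.undone else "Undo ") + self.name

    def invoke(self):
        """The user chose the Undo/Redo item."""
        if self.name is None:
            return
        (self._redo if self.undone else self._undo)()
        self.undone = not self.undone
```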

(f) search and replace: Now that the user can have multiple fonts and styles,
fonts and styles become important for searching. I let the user paste into
search and replace EditText items, and also let them use the font/size/style
color menus and the font&size&style&color dialog while they are using the
search&replace dialog. Next to the search string are 5 checkboxes:

          Use this when I search
 _
|X| Text
 = 
| | Font
 = 
| | Size
 = 
| | Style
 = 
| | Color
 -

If the user types some 12point plain chicago text in the search EditText
item, and only has the "text" box checked, then searching behaves like in an
old application, i.e. font information is ignored.  If the user checks the
size box, then only 12point text will match the search string. 

Next to the replacement EditText item, I have checkboxes that read:

          Use this when I replace
 _
|X| Text
 = 
| | Font
 = 
| | Size
 = 
| | Style
 = 
| | Color
 -

The meaning here is similar.  This extension to the search and replace dialog
lets the user do everything he can now, and in addition, do things like:
 o Find the next underlined occurrence of the word "Louisville"
 o Find all occurrences of italic, and turn them to underline
 o Change all occurrences of chicago to geneva bold.
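
The checkbox semantics of the search half can be sketched like this. It is a
simplified Python model under stated assumptions: the document and the search
string are lists of (character, attributes) pairs, and `use` is the set of
checked boxes. The names styled, find and ATTRS are hypothetical.

```python
ATTRS = ("font", "size", "style", "color")


def styled(text, **attrs):
    """Build a list of (char, attrs) pairs that all share the same attributes."""
    return [(ch, dict(attrs)) for ch in text]


def _matches(doc_item, pat_item, use):
    (dch, dattrs), (pch, pattrs) = doc_item, pat_item
    if "text" in use and dch != pch:
        return False
    # Only the attributes whose checkbox is set take part in the comparison.
    return all(dattrs.get(a) == pattrs.get(a) for a in ATTRS if a in use)


def find(doc, pattern, use, start=0):
    """Index of the first match at or after `start`, or -1 if there is none."""
    n = len(pattern)
    if n == 0:
        return -1
    for i in range(start, len(doc) - n + 1):
        if all(_matches(doc[i + k], pattern[k], use) for k in range(n)):
            return i
    return -1
```

With only "text" checked this behaves like an old-style search; checking
"style" as well restricts matches to text in the pattern's style.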

All these checkboxes take a lot of room, so I use a custom 9point Chicago
font, and a custom control definition procedure, and I paste the font into
my application. Since there is no safe way to add just one point size of a
font under the current font manager, this font is called ".Chicago" and
therefore doesn't show up on the font menu.

Of course, all replacements are undoable (including replace all.) I also
provide a menu item: "Go To Previous" that lets the user undo a cursor motion
(a "find" is a cursor motion.)

(g) alternate forms for deficient systems: If you are running under old 
text edit, I have alternate forms of all the dialogs and menus that hide
the fact that font editing is available. If you are running without color
quickdraw, I have alternate forms of all the dialogs and menus that hide
the fact that color editing is available.  

What I should have done, and will do, is change the dialogs so that 
unavailable choices are grayed out, and there is a "help" button in every
dialog that explains how to use that dialog and, if choices are grayed out,
why they are grayed out.

(h) smart quotes, and an option to turn them off. The slanted, typesetter
style '` and '' `` quotes look better than the vertical ' and ", but the
text editor should allow people to type normally, and automatically replace
them (except when the user turns it off, for example, usenet doesn't support
them, so I can't use them typing this document.)
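
A minimal smart-quote sketch in Python, using Unicode curly quotes rather
than the Macintosh character set. The rule used here (a quote opens after
nothing, white space, an opening bracket, or another opening quote, and
closes otherwise) is an assumption and not the author's stated rule; the
function names are hypothetical.

```python
OPENERS = " \t\n([{\u201c\u2018"


def smart_quote(prev_char, typed, enabled=True):
    """Return the character to insert for a typed key, given the one before it."""
    if not enabled or typed not in ("'", '"'):
        return typed
    opening = prev_char == "" or prev_char in OPENERS
    if typed == '"':
        return "\u201c" if opening else "\u201d"
    return "\u2018" if opening else "\u2019"


def type_text(keys, enabled=True):
    """Simulate typing `keys` one character at a time into an empty document."""
    out = []
    for key in keys:
        prev = out[-1] if out else ""
        out.append(smart_quote(prev, key, enabled))
    return "".join(out)
```

Passing enabled=False gives the "turn it off" option: the typed characters
go in unchanged.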

(i) Bottom of page concerns. When you display text on the screen, it
is acceptable, and even desirable that, if the last line doesn't
entirely fit, you just show the top part that does fit.  Showing this
partial line reminds the user that more text is available off the
bottom.  The exceptions are: 

(i.1) if the user moves into this line, you need to scroll it so that
it is completely visible.  

(i.2) When printing, you must not show on the printed page lines that
extend partially off the bottom of the page.

(j) The Apple user interface guidelines dictate what should happen if the user
presses an arrow key, or some of the keys on the extended keyboard:
<Home>, <End>, <PageUp>, <PageDown>, and what should happen if the user is 
also holding down the <Shift>, <Option>, or <Command> keys at the same time.
Text Edit does not do most of this stuff, and much that it does do, it does 
wrong, so you have to do it yourself. I even had to uncover what key codes 
these keys sent! 

(j.1) To properly implement auto-scrolling in the face of these keys,
you have to keep track of whether the left or the right end of the
selection area changed most recently, and auto-scroll that part into
visibility (selection areas may be bigger than the currently visible
window.)
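
One way to keep track of the most recently changed end, sketched in Python:
store the selection as an anchor plus an active end, and scroll the active
end into view. For brevity this assumes a fixed count of visible lines,
which (b) says is not true with mixed fonts; the class and function names
are hypothetical.

```python
class Selection:
    def __init__(self, pos):
        self.anchor = pos  # the end that stays put
        self.active = pos  # the end that moved most recently

    def extend_to(self, pos):
        """Shift-arrow or shift-click: only the active end moves."""
        self.active = pos

    @property
    def start(self):
        return min(self.anchor, self.active)

    @property
    def end(self):
        return max(self.anchor, self.active)


def scroll_top_to_show(line, top, visible_lines):
    """New top line so that `line` (the active end's line) is visible."""
    if line < top:
        return line
    if line >= top + visible_lines:
        return line - visible_lines + 1
    return top
```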

(k) The Apple user interface guidelines dictate that cut and paste should be
"smart" and record whether the cut was at a beginning of a word on the left,
and the end of a word on the right, so that when you cut it should 
automatically delete extra white space if necessary, and when you paste, it 
should automatically insert white space if necessary. Some people hate this
feature and need a way to turn it off.
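
A rough sketch of smart cut and paste on a plain Python string. The exact
word-boundary rules here are assumptions (only a single ordinary space is
ever removed or added, and "word character" means alphanumeric); the
function names are hypothetical. The `enabled` flag is the off switch.

```python
def smart_cut(text, start, end, enabled=True):
    """Cut text[start:end]; return (new_text, clipping)."""
    clip = text[start:end]
    whole_words = (
        enabled
        and clip != ""
        and not clip[0].isspace()
        and not clip[-1].isspace()
        and (start == 0 or text[start - 1].isspace())
        and (end == len(text) or text[end].isspace())
    )
    if whole_words:
        if end < len(text) and text[end] == " ":
            end += 1      # take the following space along
        elif start > 0 and text[start - 1] == " ":
            start -= 1    # nothing follows: take the preceding space
    return text[:start] + text[end:], clip


def smart_paste(text, pos, clip, enabled=True):
    """Insert clip at pos, adding a space where it would touch a word."""
    insert = clip
    if enabled and clip and not clip[0].isspace() and not clip[-1].isspace():
        if pos > 0 and text[pos - 1].isalnum():
            insert = " " + insert
        if pos < len(text) and text[pos].isalnum():
            insert = insert + " "
    return text[:pos] + insert + text[pos:]
```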

(l) preserving print record and window size. It is friendly, and the
multi-finder compatibility guidelines recommend, that when the user
saves a file, the following are also saved with it:

(l.1) the window position, so that when the user opens that document again, it
will show up in the same position of the screen.

(l.1.a) remember, when you open a document using a saved window position,
that this time the user might be on a system with a smaller screen. Force that
window on-screen! Force that window to be smaller than a screen (so the user
can use the grow box.)
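
The clamping in (l.1.a) can be sketched as below, in Python. Rectangles are
(left, top, right, bottom) tuples, and `screen` is assumed to be the usable
area (for example, below the menu bar); fit_window is a hypothetical name.

```python
def fit_window(win, screen):
    """Shrink and move a saved window rectangle so it lies inside `screen`."""
    sl, st, sr, sb = screen
    left, top, right, bottom = win
    # First make the window no larger than the screen...
    width = min(right - left, sr - sl)
    height = min(bottom - top, sb - st)
    # ...then slide it so that all four edges are on-screen.
    left = min(max(left, sl), sr - width)
    top = min(max(top, st), sb - height)
    return (left, top, left + width, top + height)
```

A window that already fits is returned unchanged, so the saved position is
only overridden when it has to be.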

(l.2) as long as we are saving the window position, it would be
friendly to also save the cursor position, and what line the document
is scrolled to.  I think it is obnoxious for a document to open with
the cursor off-screen, so I always auto-scroll the document so that
the most recently active end of the cursor is visible.

(l.3) the print record (if the user printed we should try to give him
the same page setup and print quality settings next time.)

(m) Suppose the user opens a document in your program, then deletes the data
file in multi-finder, or renames it, or moves it to another folder. The Apple
user interface guideline people want me to open the file, and not close it
again until the user closes or quits.
I prefer to handle this problem by reading into memory the entire document,
including the finder info and the creation and modification times, and any
resources, then closing the file.  Let the user do horrible things to the 
copy on disk. When the user saves, if my code can't find the original file,
or if the modification date doesn't match, it puts up an sfPutFile dialog
with an explanatory dialog below it on the screen:
"Something changed this file since the last time it was saved. I suggest
you "Save As" this file with a different name." 

The advantage of this scheme is there is no limit to the number of open
documents you can work with (as opposed to the small number of simultaneously
open files) and the program doesn't have to worry about the user renaming
an open file, or changing its folder.
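
A sketch of the read-everything-then-close scheme, using Python's os module
in place of the Macintosh File Manager. It covers only the data and the
modification time (no resource fork or finder info); the Document class is
hypothetical. A False result from save() is where the program would put up
its Save As dialog with the explanation.

```python
import os


class Document:
    def __init__(self, path):
        self.path = path
        with open(path, "rb") as f:  # read it all, then close the file
            self.data = f.read()
        self.mtime_ns = os.stat(path).st_mtime_ns

    def unchanged_on_disk(self):
        """True only if the file still exists with exactly the remembered time."""
        try:
            return os.stat(self.path).st_mtime_ns == self.mtime_ns
        except FileNotFoundError:
            return False

    def save(self):
        """Write in place if safe; return False if the caller must Save As."""
        if not self.unchanged_on_disk():
            return False
        with open(self.path, "wb") as f:
            f.write(self.data)
        self.mtime_ns = os.stat(self.path).st_mtime_ns
        return True
```

The comparison is for an exact match of the modification time, not merely
older or newer.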

(Thanks to David Dunham of Maitreya Designs for telling me this idea.)

(n) Standard closing/quitting box. A while back, Apple published a 
user interface guideline note saying that programs should use a standard
Closing/Quitting dialog. With the possible use of multi-finder, it is 
difficult for the user to tell which application owns the dialog.  I put my
application's icon in the dialog to solve this problem.

In part two, I'll talk about disk formats of data structures that I would
like declared standard.


--- David Phillip Oster            --My Good News: "I'm a perfectionist."
Arpa: oster@dewey.soe.berkeley.edu --My Bad News: "I don't charge by the hour."
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu

korn@cory.Berkeley.EDU.UUCP (09/19/87)

Somewhere toward the end of a very good and informative posting, David Oster 
(oster@dewey.soe.berkeley.edu) wrote:

> (m) Suppose the user opens a document in your program, then deletes the data
> file in multi-finder, or renames it, or moves it to another folder. The Apple
> user interface guidline people want me to open the file, and not close it
> again until the user closes or quits.
> I prefer to handle this problem by reading into memory the entire document,
> including the finder info and the creation and modification times, and any
> resources, then closing the file.  Let the user do horrible things to the 
> copy on disk. When the user saves, if my code can't find the original file,
> or if the modification date doesn't match, it puts up an sfPutFile dialog
> with an explanatory dialog below it on the screen:
> "Something changed this file since the last time it was saved. I suggest
> you "Save As" this file with a different name." 
> 
> The advantage of this scheme is there is no limit to the number of open
> documents you can work with (as opposed to the small number of simultaneously
> open files) and the program doesn't have to worry about the user renaming
> an open file, or changing its folder.
> 
> (Thanks to David Dunham of Maitreya Designs for telling me this idea.)

The idea is a neat one, but it turns out there is a situation in which it's
very *dangerous*.  Imagine if you will a network.  It has many machines on
it, most of which are workstations, and some of which are servers.  Imagine
that two workstations are editing the same file on a server.  One read in
the file at 1:00pm, the other at 1:18pm.  The first one finishes its changes,
and writes it back at 1:20pm.  The other finishes at 1:31pm, and writes its
changes _on_top_of_the_first_set_.  You immediately tell me "Peter, this
can't happen, David's algorithm takes care of that."  Well, in theory it
does.  But in practice, it can fail, because the two Macs in question
(the workstations) may have a slightly different idea of what time it is...
The Mac that gets the file at 1:18pm may be 3 minutes fast, and think that
it's really 1:21pm.  When that Mac goes to write the file back, it notices
that the last save to the file happened 'before' this Mac got it, so
it 'knows' that it's ok to save on top of it, and whomp-o, you've lost data.
(And don't think this doesn't happen in networks; it happens all too often!)

A slightly different scheme that works on the same principle might increment
some number, an 'in use' number.  Each time an application edited a copy
of that file, it would increment the 'in use' number.  Each time it closed
the file, it would decrement that number.  If it saw that the number was > 0
at save time, it would give the user the dialog, and let the user decide.
This method also has its problems.  It's more complex, and it requires
more writes to the file (even if I 'save as', I still have to decrement
the 'in use' number).  And the file might be replaced with one that has
a lower (or higher) 'in use' number than it should have, wreaking havoc.

Perhaps the safest way of dealing with it all is to just set the busy
bit.  That way nobody can mess with the file behind your back (without
having to go through an override that requires they think about what they
are doing).

Peter
--
Peter "Arrgh" Korn
korn@ucbvax.Berkeley.EDU
{decvax,dual,hplabs,sdcsvax,ulysses}!ucbvax!korn

oster@dewey.soe.berkeley.edu (David Phillip Oster) (09/20/87)

In article <20860@ucbvax.BERKELEY.EDU> korn@cory.Berkeley.EDU.UUCP (Peter "Arrgh" Korn) writes:
>> (m) Suppose the user opens a document in your program, then deletes the data
>> file in multi-finder, or renames it, or moves it to another folder. Apple
>> user interface guidline people want me to open the file, and not close it
>> again until the user closes or quits.
>> I prefer to handle this problem by reading into memory the entire document,
>> including the finder info and the creation and modification times, and any
>> resources, then closing the file.  Let the user do horrible things to the 
>> copy on disk. When the user saves, if my code can't find the original file,
>> or if the modification date doesn't match, it puts up an sfPutFile dialog
>> with an explanatory dialog below it on the screen:
>> "Something changed this file since the last time it was saved. I suggest
>> you "Save As" this file with a different name." 

>> The advantage of this scheme is there is no limit to the number of open
>> documents you can work with (as opposed to the small number simultaneously
>> open files) and the program doesn't have to worry about the user renaming
>> an open file, or changing its folder.

>The idea is a neat one, but it turns out there is a situation in which it's
>very *dangerous*   Imagine if you will a network.  It has many machines on
>it, most of which are workstations, and some of which are servers.  Imagine
>that two workstations are editing the same file on a server.  One read in
>the file at 1:00pm, the other at 1:18pm.  The first one finishes it's changes,
>and writes it back at 1:20pm.  The other finishes at 1:31pm, and writes it's
>changes _on_top_of_the_first_set_.  You immediately tell me "Peter, this
>can't happen, David's algorythm takes care of that."  Well, in theory it
>does.  But in practice, it can fail, because the two Macs in question
>(the workstations) may have a slightly different idea of what time it is...
>The mac that gets the file at 1:18pm may be 3 minutes fast, and think that
>it's really 1:21pm.  When that mac goes to write the file back, it notices
>that the the last save to the file happened 'before' this mac got it, so
>it 'knows' that it's ok to save on top of it.    

No, it knows that the file has changed since it read it (it got
newer). Nice try, but it still can't happen unless the new save time
matches _exactly_. You check for an _exact_ match on the modification
time, not just a less than or a greater than. But Peter is right,
there is still the chance of a problem.

>A slightly different scheme that works on the same principle might increment
>some number, an 'in use' number.  Each time an application edited a copy
>of that file, it would increment the 'in use' number.  Each time it closed
>the file, it would decrement that number.  If it saw that the number was > 0
>at save time, it would give the user the dialog, and let the user decide.
>This method also has it's problems.  It's more complex, and it requires
>more writes to the file (even if I 'save as', I still have to decrement
>the 'in use' number).  And the file might be replaced with one that has
>a lower (or higher) 'in use' number than it should have, wreaking havoc.

Not a good scheme, because it requires that every application write
into every file it accesses, and most accesses are read accesses. This
scheme is almost right, though.

Peter, is it really true that if I write a file on a remote machine
the write happens with _my_ clock, and not the remote machine's clock?
After all, I'm just sending a message to the remote machine to do the
write. It is the one that is actually issuing the read and write
system calls that change the data on its disk. Why should it use my
clock? Since you have access to a net, could you please do the
experiment and report back. 

>Perhaps the safest way of dealing with it all is to just set the busy
>bit.  That way nobody can mess with the file behind your back (without
>having to go through an override that requires they think about what they
>are doing).

Setting the busy bit, (I think) also counts as a write, so it fails
for the same reason given above.  I believe that there is no reason to
restrict multiple reads, it is just multiple writes that are the
problem. Obviously, my model is text editors rather than databases. If
I were thinking about databases, I'd need a more sophisticated
concurrency control system, but a text editor rewrites the entire file
on each save.  A database program writes pieces of its files, (while
those files may still be being actively read by other programs.)

Let's combine all of the above into a scheme that should work. Suppose
we have a "saveCount" number in the file, that counts the number of
times the file has been saved. Using the file goes as follows:

1.) The user selects "open". The software reads the entire file, both
forks, and the finder information into memory and closes the file.

2.) Some time later, the user selects "save". The software first 
2.a) opens the file read/write.  If the open fails (because the file
is no longer there, or because it is already in use by another writer)
the software puts up its Save As dialog, with the explanatory message
"This file has been changed by another program since the last time you
saved it."
2.b) If the open succeeds, it reads up the save count to see if that
matches what the application thinks it should be. If that doesn't
match, it goes into the above SaveAs mode.
2.c) It writes and closes the file. When it wrote the file, it used an
updated saveCount.

This scheme should be robust, since it doesn't depend on any clock
being accurate across the net.  It does depend on the fact that
AppleShare allows only one process to have a file open for read/write
at a time. Therefore, if I get the "write" file descriptor, I have a
guarantee that no one else has it.  
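
Steps 1 and 2.a through 2.c can be sketched as follows, in Python. The file
layout (an 8-byte big-endian saveCount followed by the text) is invented for
the illustration, as are the names write_file, read_file and Doc. The
exclusive read/write open that the scheme relies on is assumed to come from
the file server and is not enforced by this code.

```python
import struct

HEADER = struct.Struct(">Q")  # hypothetical on-disk saveCount field


def write_file(path, count, body):
    with open(path, "wb") as f:
        f.write(HEADER.pack(count) + body)


def read_file(path):
    with open(path, "rb") as f:
        raw = f.read()
    return HEADER.unpack(raw[:HEADER.size])[0], raw[HEADER.size:]


class Doc:
    def __init__(self, path):
        # Step 1: read everything into memory, then close the file.
        self.path = path
        self.count, self.body = read_file(path)

    def save(self):
        """Return True if saved in place, False if the caller must Save As."""
        try:
            with open(self.path, "r+b") as f:       # 2.a: open read/write
                raw = f.read(HEADER.size)
                if len(raw) != HEADER.size:
                    return False
                if HEADER.unpack(raw)[0] != self.count:
                    return False                    # 2.b: someone else saved
                f.seek(0)                           # 2.c: write updated count
                f.write(HEADER.pack(self.count + 1) + self.body)
                f.truncate()
        except OSError:
            return False                            # 2.a: file gone or busy
        self.count += 1
        return True
```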

The original scheme, given in "How to write a TEXT editor, (part 1)"
is still my favorite, and this modification should only be used if it
is necessary. Peter, I hope you will report back soon whether it is
necessary or not.

--- David Phillip Oster            --My Good News: "I'm a perfectionist."
Arpa: oster@dewey.soe.berkeley.edu --My Bad News: "I don't charge by the hour."
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu

edusoft@ecf.UUCP (09/21/87)

You bring up legitimate concerns about networked computers.

We have a network here, and we have solved the problem about times
by resetting the time on each node from the central node.

Of course, this is a bit easier in a multi-tasking OS.

cck@cunixc.columbia.edu (Charlie C. Kim) (09/21/87)

On time interlocks:

The AppleShare client (at least 1.1) is forced to use a "relative" time.
What happens is that when it connects to the server it gets the server's
time of day in a special format (based relative to Jan 1, 2001).
References to time should be based upon the difference between this
and the current Macintosh time on the client.  Thus, unless the clock
stops on a Macintosh, all Macs on a network should have the same
"relative" time in reference to a particular server.

Also, time written, modified, etc. are usually written by the server.
This is not to say the client cannot or will not modify these times,
but the above should alleviate the problems.
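
The arithmetic behind the "relative" time can be sketched in a few lines of
Python. Times are plain seconds in each machine's own units; the ServerClock
class is hypothetical and does not model the actual AppleShare protocol.

```python
class ServerClock:
    """Converts this client's local clock readings to the server's time."""

    def __init__(self, server_time_at_connect, local_time_at_connect):
        # The offset is fixed at connect time and absorbs any error in the
        # client's own clock setting.
        self.offset = server_time_at_connect - local_time_at_connect

    def to_server(self, local_time):
        return local_time + self.offset
```

For example, suppose client A connects when the server reads 1000 and its
own clock reads 5000, and client B, whose clock runs 180 seconds ahead of
A's, connects when the server reads 1600 and its own clock reads 5780. At
server time 2000, A's clock reads 6000 and B's reads 6180, and both
conversions give 2000.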

I believe the preferable method to interlocking files is to use the
"byte range lock" facility outlined in Inside Macintosh Volume 4.  I
would not depend on the AppleShare deny read/deny write, etc. open
permissions.  If you want to lock the entire file, don't do what MacWrite
does though: it locks 0x7fffffff bytes. There is a method for locking
the entire file (I can't remember whether it happens if you specify 0 bytes
or 0xffffffff bytes).

Charlie C. Kim
User Services
Columbia University