[comp.sys.mac] Help wanted: 'TEXT' definition

tedj@hpcid.HP.COM (Ted Johnson) (10/03/87)

Can anyone tell me where the format for 'TEXT' files is defined/described?
It doesn't seem to be in Inside Macintosh...

	-Ted

**************************************************************************
Ted C. Johnson
Hewlett Packard, Design Technology Center
Santa Clara, CA
(408)553-3555
UUCP:  ...hplabs!hpcea!hpcid!tedj
**************************************************************************

oster@dewey.soe.berkeley.edu (David Phillip Oster) (10/06/87)

In article <3320029@hpcid.HP.COM> tedj@hpcid.HP.COM (Ted Johnson) writes:
>Can anyone tell me where the format for 'TEXT' files is defined/described?
>It doesn't seem to be in Inside Macintosh...

This is a complex issue. See my upcoming posting 'How to write a TEXT editor,
part 2'. Until I post it, the next best source is Tech Note #84, the Edit
file format. Here are the high points of that document:

1.) The data fork holds Apple extended ASCII data, with the <CR> character
representing a hard carriage return (i.e., a paragraph separator in programs
that wrap, like MacWrite, and a line separator in programming language editors.)

2.) The file's type field is TEXT (use GetFinderInfo() to see its type field.)

The following are optional. The file may have no resource fork, or, if it has
a resource fork, it may have the following two resources:

3.) the resource of type 'EFNT' id=1003, is of the form
struct {
  short size;
  Str255 fontName;
};
where fontName is as long as it needs to be. There is a terminal null byte,
if necessary, to keep the total size of the resource even. This resource
describes what font and size the file will be displayed in.

4.) the resource of type 'ETAB' id=1004 of the form:
struct {
  short width;
  short count;
};
This resource asserts that a tab has the value of 'count' spaces and that a space, for the
purposes of tab expansion, is 'width' wide. Many Mac text editors do not
support tabs as equivalences for spaces, although most programming language
editors do.
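
The two optional resources described above could be decoded along these
lines. This is only a sketch under stated assumptions: the bytes are taken to
be already in memory (how they are fetched from the resource fork is not
shown), shorts are assumed to be big-endian as on the 68000, and fontName is
assumed to be a Pascal string with a leading length byte. The names EditFont,
EditTabs, parse_efnt, parse_etab and tab_stop_width are hypothetical and do
not come from the Tech Note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory forms of the 'EFNT' and 'ETAB' resources. */
typedef struct {
    short size;        /* font size from the resource */
    char  name[256];   /* font name, converted to a C string */
} EditFont;

typedef struct {
    short width;       /* width of one space (units not stated in the post) */
    short count;       /* number of spaces one tab stands for */
} EditTabs;

/* Read a signed 16-bit value stored high byte first (assumed byte order). */
static short read_be16(const unsigned char *p)
{
    int v = (p[0] << 8) | p[1];
    if (v >= 0x8000)
        v -= 0x10000;
    return (short)v;
}

/* Decode an 'EFNT' resource. Returns 1 on success, 0 if data is too short. */
int parse_efnt(const unsigned char *data, size_t len, EditFont *out)
{
    size_t n;

    assert(data != NULL && out != NULL);
    if (len < 3)                 /* 2 bytes of size plus the length byte */
        return 0;
    n = data[2];                 /* Pascal string length, at most 255 */
    if (len < 3 + n)
        return 0;
    out->size = read_be16(data);
    memcpy(out->name, data + 3, n);
    out->name[n] = '\0';         /* any trailing pad byte is ignored */
    return 1;
}

/* Decode an 'ETAB' resource. Returns 1 on success, 0 if data is too short. */
int parse_etab(const unsigned char *data, size_t len, EditTabs *out)
{
    assert(data != NULL && out != NULL);
    if (len < 4)
        return 0;
    out->width = read_be16(data);
    out->count = read_be16(data + 2);
    return 1;
}

/* Width of one tab stop: 'count' spaces, each 'width' wide. */
long tab_stop_width(const EditTabs *t)
{
    assert(t != NULL);
    return (long)t->width * (long)t->count;
}
```

An editor that finds no resource fork, or neither resource, would presumably
fall back to its own default font and tab settings, since both are optional.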

My upcoming posting will detail a proposed extension to this standard
to support a multi-color, multi-tasking, multi-font, multi-script (i.e.,
Japanese, Arabic, and other non-Roman alphabets),
multi-finder-compatible, backward-compatible version of a text file,
that I hope other Mac programmers will adopt, so that we can preserve
the current inter-operability between editors, yet make them more in
tune with the Mac's power and more in tune with the Mac's future.

--- David Phillip Oster            --A Sun 3/60 makes a poor Macintosh II.
Arpa: oster@dewey.soe.berkeley.edu --A Macintosh II makes a poor Sun 3/60.
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu

singer@endor.harvard.edu (Richard Siegel) (10/06/87)

In article <3320029@hpcid.HP.COM> tedj@hpcid.HP.COM (Ted Johnson) writes:
>
>Can anyone tell me where the format for 'TEXT' files is defined/described?
>It doesn't seem to be in Inside Macintosh...

	TEXT files are just plain text files, like you'd create with emacs.
There's no special format to them.

		--Rich

**The opinions stated herein are my own opinions and do not necessarily
represent the policies or opinions of my employer (THINK Technologies, Inc).

* Richard M. Siegel | {decvax, ucbvax, sun}!harvard!endor!singer    *
* Customer Support  | singer@endor.harvard.edu			    *
* THINK Technologies, Inc.  (No snappy quote)                       *

jww@sdcsvax.UCSD.EDU (Joel West) (10/07/87)

A 'TEXT' file is a series of Macintosh extended ascii printable
characters, with a return (13 decimal) following each line.  It
may optionally include tab characters.

The Apple extended ASCII set defines characters 128-255; see
the Font Manager.  Other control characters (in the range 0-31)
are not used.
-- 
	Joel West  (c/o UCSD)
	Palomar Software, Inc., P.O. Box 2635, Vista, CA  92083
	{ucbvax,ihnp4}!sdcsvax!jww 	jww@sdcsvax.ucsd.edu
So. California: where the ground does the Rocking 'N Rolling for you

earleh@dartvax.UUCP (Earle R. Horton) (10/08/87)

In article <4032@sdcsvax.UCSD.EDU>, jww@sdcsvax.UCSD.EDU (Joel West) writes:
> 
> A 'TEXT' file is a series of Macintosh extended ascii printable
> characters, with a return (13 decimal) following each line.  It
> may optionally include tab characters.
> 
                                         ^^^^^^^^^^^^^^^^^^^

Not necessarily so.  A 'TEXT' file could also have carriage returns at
the end of paragraphs only.  The first generally available application
to produce Macintosh 'TEXT' files (MacWrite) will produce either format,
according to what you want.  A 'TEXT' file with one paragraph may have
either one or zero carriage returns in it.
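
A reader of 'TEXT' files that wants to cope with both conventions could
count the returns and look at how long the stretches between them are. The
sketch below does that. The threshold passed to
looks_like_paragraph_returns is an arbitrary heuristic, and the names CrStats,
scan_returns and looks_like_paragraph_returns are hypothetical, not part of
any Apple interface.

```c
#include <assert.h>
#include <stddef.h>

#define MAC_CR 13   /* the return character, 13 decimal */

typedef struct {
    size_t returns;      /* number of return characters seen */
    size_t longest_run;  /* longest stretch of bytes between returns */
} CrStats;

/* Count returns and measure the longest run of bytes between them. */
CrStats scan_returns(const unsigned char *buf, size_t len)
{
    CrStats s = {0, 0};
    size_t run = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == MAC_CR) {
            s.returns++;
            if (run > s.longest_run)
                s.longest_run = run;
            run = 0;
        } else {
            run++;
        }
    }
    if (run > s.longest_run)     /* text after the last return, if any */
        s.longest_run = run;
    return s;
}

/* Heuristic only: a run longer than max_line bytes suggests that returns
   mark paragraphs rather than lines. Returns 1 or 0. */
int looks_like_paragraph_returns(const unsigned char *buf, size_t len,
                                 size_t max_line)
{
    return scan_returns(buf, len).longest_run > max_line;
}
```

A one-paragraph file with no return at all gives returns == 0 and a
longest_run equal to the file length, which matches the "one or zero" case
described above.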

Here's my two cents worth:

A 'TEXT' file is a file which contains, in its data fork, no data
type which is larger than a byte.  Furthermore, the informational
content of a 'TEXT' file is found entirely within the data fork.
You can put anything you want in a 'TEXT' file, except for ints, longs,
floats, data structures or any extended data types (arrays of characters
are OK).

> The Apple extended ASCII set defines characters 128-255; see
> the Font Manager.  Other control characters (in the range 0-31)
> are not used.

The standard Macintosh printing character set contains four control characters.
These are: 
	Propeller
	Check-mark
	Diamond
	Apple-with-a-bite-out-of-it

In addition, the Font Manager chapter says the NUL character (0) must
have a printing representation.  However, there is no mention of
restricting the content of 'TEXT' files to printing characters in Inside
Macintosh or anywhere else that I know of.  Evidence of this is that the
Font Manager requires each font to have a "missing character" symbol.

A 'TEXT' file contains a byte stream in its data fork, and all bytes are
legal.  A *nice* 'TEXT' file, on the other hand, probably agrees with
Joel's description.  A nice 'TEXT' file doesn't contain stuff which
gives applications other than its creator a hard time when they try to
print it on the screen or printer.
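
For anyone who wants to test for niceness, here is a minimal sketch of a
check that follows Joel's description: printable characters, the extended
characters 128-255, return (13) and tab (9) are accepted, and every other
control character is rejected. Treating DEL (127) as not nice is an
assumption of the sketch, and the function names first_not_nice and
is_nice_text are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset of the first byte that is not "nice",
   or len if every byte passes. */
size_t first_not_nice(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];

        if (c == 13 || c == 9)      /* return and tab are allowed */
            continue;
        if (c < 32 || c == 127)     /* other controls; DEL by assumption */
            return i;
        /* 32-126 and 128-255 are accepted as printable */
    }
    return len;
}

/* 1 if the whole buffer is "nice", 0 otherwise. */
int is_nice_text(const unsigned char *buf, size_t len)
{
    return first_not_nice(buf, len) == len;
}
```

A file with an embedded formfeed fails the check at the offset of the
formfeed, while still being a legal byte stream in the looser sense.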

I think it's a little late to insist that all 'TEXT' files be nice
'TEXT' files, but you can try if you want.  Me, I love to embed
formfeeds in the suckers for emphasis.

-- 
*********************************************************************
*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
*********************************************************************

jww@sdcsvax.UCSD.EDU (Joel West) (10/09/87)

In article <7326@dartvax.UUCP>, earleh@dartvax.UUCP (Earle R. Horton) writes:
> I think it's a little late to insist that all 'TEXT' files be nice
> 'TEXT' files, but you can try if you want.  Me, I love to embed
> formfeeds in the suckers for emphasis.

Unlike on an IBM PC, the Mac file type describes the contents of the
file.  Therefore, if it doesn't meet Earle's requirements for a
"nice" text file, it shouldn't be 'TEXT', but should be something
else.

I agree, though, that the CR-each-line and CR-each-paragraph variants both exist,
but the latter is only for communicating word processors, not any of
the other programs that read or import text files.

tedj@hpcilzb.HP.COM (Ted Johnson) (10/16/87)

Thanks for all the responses!

	-Ted