[comp.sys.mac.programmer] TWO Byte Characters, SysEnvirons -- Help

shinberd@unioncs.UUCP (David Shinberg) (04/12/88)

I have been struggling with a seemingly simple problem for over two 
weeks now. 

The problem is how to put characters into a plain text file.  After some time
I've narrowed it down to the fact that a character occupies TWO BYTES in MPW
Pascal 2.0.  Since the program is only intended to be used in English, I have
no need for this.  It also appears that some applications, such as MPW itself,
do not recognize two-byte characters in a text file.

My development system is:
	MacII with 5 Meg RAM, 80 Meg Apple Hard Drive
	MPW 2.0
	MPW Pascal 2.0

My two questions are:

1) Is my problem with SysEnvirons?  I think so, but I am not totally sure.

2) How does TextEdit deal with these two-byte characters, and why do some
   applications seem to deal only with one-byte characters?

Any help or even just a reply would be greatly appreciated.  I apologize
if this is an old issue or something quite trivial, but I'm really new at this
and very lost.

			Thanks in Advance
			Dave Shinberg


|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
| Disclaimer: Don't blame me ... I'm too new at this                          |
|                                                                             |
| David Shinberg		       BITNET: 88_shinb@union                 |
| Box 2073                             UUCP: uunet!steinmetz!unioncs!shinberd |
| Union College                                                               |
| Schenectady NY, 12308                                                       |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

jv0l+@andrew.cmu.edu (Justin Chris Vallon) (04/15/88)

I believe that you are confused about the size of a character.  sizeof(char)
is 1 in Pascal, in text files, and when using the Mac Toolbox.

If you are computing the size of a char by looking at the space it occupies in
a record or on the stack frame, then you need some background on the 680x0
family:

The 68000 uP has a 16-bit data bus: the upper half carries the even-addressed
bytes in memory, and the lower half carries the odd-addressed bytes.  When the
68000 reads integers (2 bytes) and long integers (4 bytes) from memory, they
must be word aligned, so that each 16-bit word can be read in one bus cycle.
This means that the address of an int/long must always be EVEN.  If an attempt
is made to read an int/long from an ODD memory address, you get the (probably
familiar) System Error ID=02.

Now, for the 2-byte char size.  Let's say that you have declared a Pascal
record with an int, a char, a long and a boolean:

DataType = Record
             anInt: integer;
             aChar: char;
             aLong: longInt;
             aBool: boolean;
           End;

Computing offsets into the record just using data sizes (this is not correct,
you'll see why later):

offset for anInt = 0;  sizeof(integer) = 2
           aChar = 2;  sizeof(char) = 1
           aLong = 3;  sizeof(longInt) = 4
           aBool = 7;  sizeof(boolean) = 1
length of record = 8 bytes

However, notice that aLong (a longInt) is not on a word boundary.  To fix this,
the compiler word-aligns any variable longer than 1 byte.  So, the modified
offsets are:

offset for anInt = 0;  sizeof(integer) = 2
           aChar = 2;  sizeof(char) = 1
           aLong = 3->4;  sizeof(longInt) = 4
           aBool = 8;  sizeof(boolean) = 1
length of record = 9 bytes

Now, if you were to compute the size allocated to aChar, it would be
offset(aLong) - offset(aChar) = 4-2 = 2!  However, only 1 byte is used :-(
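
For anyone who wants to play with the numbers, here is a rough sketch of the
offset arithmetic above.  It is Python rather than Pascal, and it only models
the rule as I described it (anything longer than 1 byte starts on an even
offset); it is not what MPW Pascal actually runs, and the function name
layout is made up for the illustration.

```python
# Sketch of the alignment rule described above (not MPW Pascal's actual code).
# Assumption: fields longer than 1 byte must start at an even offset;
# 1-byte fields simply take the next free byte.

def layout(fields):
    """Return ({name: offset}, total_length) for a list of (name, size) pairs."""
    offsets = {}
    offset = 0
    for name, size in fields:
        if size > 1 and offset % 2 == 1:
            offset += 1          # skip one pad byte to reach a word boundary
        offsets[name] = offset
        offset += size
    return offsets, offset

# The DataType record from this post, with sizes in bytes.
DATA_TYPE = [("anInt", 2), ("aChar", 1), ("aLong", 4), ("aBool", 1)]

offsets, length = layout(DATA_TYPE)
for name, _ in DATA_TYPE:
    print(name, offsets[name])
print("length", length)
print("space between aChar and aLong:", offsets["aLong"] - offsets["aChar"])
```

The last line is the apparent 2-byte char: the distance from aChar to the next
field, not the size of the char itself.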

The same argument holds true for pushing variables onto the stack.  Since an
interrupt may occur at any time, and addresses (longs) are stored on the stack,
the stack pointer must always be EVEN.  So, pushing a 1-byte character onto
the stack effectively uses 2-bytes of stack space to make the SP word-aligned.
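
The stack case can be sketched the same way.  Again this is just a Python
model of the rule, assuming the stack grows downward and every push is rounded
up to an even number of bytes; the helper push and the starting address are
hypothetical.

```python
# Sketch of the stack rule described above: the stack pointer stays even,
# so an odd-sized push is rounded up to the next even size.

def push(sp, size):
    """Return the new stack pointer after pushing `size` bytes."""
    slot = size + (size % 2)     # round odd sizes up to an even byte count
    return sp - slot             # the stack grows toward lower addresses

sp = 0x1000                              # hypothetical, even starting address
sp_after_char = push(sp, 1)              # push a 1-byte char
sp_after_long = push(sp_after_char, 4)   # then push a 4-byte address

print(sp - sp_after_char)        # stack bytes consumed by the char
print(sp_after_long % 2)         # 0 means the stack pointer is still even
```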

Hope this answers your question.  If it doesn't, then I'm lost, too :-)

-Justin