[comp.arch] bi-endian environments

cprice@mips.com (Charlie Price) (05/10/91)

I don't especially want to get into this discussion in deep technical
detail (partly because I probably don't *understand* deep detail),
but I don't want to let these conclusions stand without qualification.

In article <168@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <3225@spim.mips.COM> cprice@mips.com (Charlie Price) writes:

... how the R4000 implements endian-ness...

>Then, shared memory between processes of different endian is impossible.
>
>This means that access to a file through memory mapping mechanism is
>impossible, if the file is accessed from processes of different endian.

You cannot share *arbitrary* binary data transparently through
either shared memory or the filesystem.

You *can* arrange to share ONE binary format.
I think the only reasonable one to pick is characters (i.e. bytes).
For data that processes share actively and access frequently,
I suspect that you won't like the performance of mixed endian access,
but you can certainly make it *work*.

Your observation about simultaneous access to a mapped file is
essentially the same problem as shared data through shared segments.

For a non-native process using a mapped file, you need to do the same
transform that you would for any unmapped I/O --
When a page is accessed, and mapped to the process,
the OS must swap the bytes around within the page so that
character-stream order is preserved.
This in-memory page is now in a special state and it has to be
transformed back to the native character-stream order before it
can be written back to disk.
If a memory-mapped file page or shared memory segment page is accessed
by processes that have different endian-ness, the page has to be put into
the proper character-stream order before the access can be granted.
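
A minimal sketch of the per-page transform described above, assuming the
opposite-endian mode effectively reverses byte addresses within each
32-bit word. The word size and the name swap_words_in_page are
illustrative assumptions, not taken from any R4000 documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the bytes inside every 32-bit word of a
 * page. Assumes the page length is a multiple of 4. */
static void swap_words_in_page(uint8_t *page, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint8_t t0 = page[i];
        uint8_t t1 = page[i + 1];
        page[i]     = page[i + 3];
        page[i + 1] = page[i + 2];
        page[i + 2] = t1;
        page[i + 3] = t0;
    }
}
```

The swap is its own inverse, so under these assumptions the same routine
would also put the page back into native order before write-back.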

This would be very expensive for highly-shared accesses,
so I suspect people won't do it very much.
Fortunately, most file access, mapped or otherwise, doesn't involve
a high degree of simultaneous access by several processes.
-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave.  MS 1-03 / Sunnyvale, CA   94088-3650

stephen@estragon.uchicago.edu (Stephen P Spackman) (05/10/91)

In article <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:
|You cannot share *arbitrary* binary data transparently through
|either shared memory or the filesystem.

It is in principle possible (read: given a next generation operating
system with its own systems language it will work just fine) to come
VERY close to this, by relying on marshalling code generated from the
type declarations (see my thesis if I ever finish writing it... but
it's a common theme in the newer garbage collection literature).

|You *can* arrange to share ONE binary format.
|I think the only reasonable one to pick is characters (i.e. bytes).

This is a very unfortunate comment. Around HERE, at least, a
"character" is a 16-bit Unichar. Don't let Unix lock in your brain!
:-).

[...description of how you can maintain virtual byte order by
intercepting page mapping...]

|This in-memory page is now in a special state and it has to be
|transformed back to the native character-stream order before it
|can be written back to disk.

You can also sometimes compress at the same time and get back the
performance you lost. (Damn. This comes from the same chap as the
cardmarking generational things. Nice fellow. Here in Chicago
somewhere. Forget his name completely. Damn).

|If a memory-mapped file page or shared memory segment page is accessed
|by processes that have different endian-ness, the page has to be put into
|the proper character-stream order before the access can be granted.

But note that you DON'T HAVE to transform before write-back, not if
the page can be marked with its PRESENT format! :-)
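
One way to picture the tagging idea, as a sketch only: keep a format tag
with each cached page and convert lazily, only when a reader wants the
other order. The struct, the enum and page_ensure_order are hypothetical
names, and the 32-bit word swap is an assumed transform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_NATIVE, ORDER_SWAPPED };   /* hypothetical tags */

struct tagged_page {
    uint8_t        *data;
    size_t          len;     /* assumed to be a multiple of 4 */
    enum byte_order order;   /* the page's present format */
};

/* Convert the page only if its tag differs from what the caller wants.
 * Returns 1 if a swap was performed, 0 if the page was already right. */
static int page_ensure_order(struct tagged_page *pg, enum byte_order wanted)
{
    if (pg->order == wanted)
        return 0;
    assert(pg->len % 4 == 0);
    for (size_t i = 0; i < pg->len; i += 4) {
        uint8_t *w = pg->data + i;
        uint8_t t;
        t = w[0]; w[0] = w[3]; w[3] = t;
        t = w[1]; w[1] = w[2]; w[2] = t;
    }
    pg->order = wanted;
    return 1;
}
```

Skipping the transform on write-back would additionally require storing
the tag alongside the page on disk, which this sketch does not model.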

|This would be very expensive for highly-shared accesses,
|so I suspect people won't do it very much.

Not as bad as you might think. Consider dynamically recompiling some
of the IO code and sticking it into the user's address space....

|Fortunately, most file access, mapped or otherwise, doesn't involve
|a high degree of simultaneous access by several processes.

Sometimes. But this isn't always going to be true in an OO world....

It's time to get serious about the place where hardware and software
meet....
----------------------------------------------------------------------
stephen p spackman         Center for Information and Language Studies
systems analyst                                  University of Chicago
----------------------------------------------------------------------

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/10/91)

In article <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:

>You cannot share *arbitrary* binary data transparently through
>either shared memory or the filesystem.

So, I have been explicitly talking about sharing of a byte stream from
the beginning.

>If a memory-mapped file page or shared memory segment page is accessed
>by processes that have different endian-ness, the page has to be put into
>the proper character-stream order before the access can be granted.
>
>This would be very expensive for highly-shared accesses,
>so I suspect people won't do it very much.

Yes, this is THE problem. And you have a slooow workaround.

>Fortunately, most file access, mapped or otherwise, doesn't involve
>a high degree of simultaneous access by several processes.

Well, for example, without NIS, lookups in a mapped /etc/passwd will be
considerably slowed down.

						Masataka Ohta

billm@oakhill.sps.mot.com (Bill Moyer) (05/11/91)

In article <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:
>I don't especially want to get into this discussion in deep technical
>detail (partly because I probably don't *understand* deep detail),
>but I don't want to let these conclusions stand without qualification.

I know I'm asking for it, but........

>
>In article <168@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>>In article <3225@spim.mips.COM> cprice@mips.com (Charlie Price) writes:
>
>... how the R4000 implements endian-ness...
>
>>Then, shared memory between processes of different endian is impossible.
>>
>>This means that access to a file through memory mapping mechanism is
>>impossible, if the file is accessed from processes of different endian.
>
>You cannot share *arbitrary* binary data transparently through
>either shared memory or the filesystem.
>
>You *can* arrange to share ONE binary format.
>I think the only reasonable one to pick is characters (i.e. bytes).
>For data that processes share actively and access frequently,
>I suspect that you won't like the performance of mixed endian access,
>but you can certainly make it *work*.
>For a non-native process using a mapped file, you need to do the same
>transform that you would for any unmapped I/O --
>When a page is accessed, and mapped to the process,
>the OS must swap the bytes around within the page so that
>character-stream order is preserved.

I don't quite understand this. Endianness applies to objects, not pages.
If I update a data structure consisting of two short ints (16-bit)
and an int (32-bit) in little endian mode, and then reverse all of the bytes
of the structure for access by a big-endian process, the two shorts will
be swapped, which is not the desired effect. I don't think the OS has
enough knowledge of the size and placement of elements in an arbitrary
page of data to perform the transformation. What am I missing?
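
The concern can be reproduced in a few lines. This sketch assumes the OS
transform is a blind byte reversal within each 32-bit word (one reading
of the scheme quoted above); every function name here is made up for
illustration. Under that assumption the 32-bit field survives, but the
two 16-bit fields trade places.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Explicit byte-order helpers, so the demo does not depend on the host. */
static void store_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void store_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The assumed OS transform: reverse bytes within each 32-bit word. */
static void reverse_each_word(uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint8_t t;
        t = buf[i];     buf[i]     = buf[i + 3]; buf[i + 3] = t;
        t = buf[i + 1]; buf[i + 1] = buf[i + 2]; buf[i + 2] = t;
    }
}

/* Write {a, b, c} as a little-endian process would, apply the blind
 * swap, then read the fields back as a big-endian process would. */
static void blind_swap_roundtrip(uint16_t a, uint16_t b, uint32_t c,
                                 uint16_t *a_out, uint16_t *b_out,
                                 uint32_t *c_out)
{
    uint8_t raw[8];
    store_le16(raw + 0, a);
    store_le16(raw + 2, b);
    store_le32(raw + 4, c);
    reverse_each_word(raw, sizeof raw);
    *a_out = load_be16(raw + 0);   /* holds b's value, not a's */
    *b_out = load_be16(raw + 2);   /* holds a's value, not b's */
    *c_out = load_be32(raw + 4);   /* c comes through intact */
}
```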

It seems to me that an application could run in either endian mode
if the processor is capable of performing the byte reordering based on
the size of the object being accessed. This would imply that not swapping
bytes within a 32-bit word on 32-bit accesses is not the desired action,
since once the processor data bus is wired to a given set of byte-lanes
to memory, access to a 32-bit data type in the opposite endianness, followed
by a byte reversal ends up producing the wrong result. In addition,
processors with different size data buses cannot communicate correctly
because they will make different assumptions about where the byte
located at address X is accessed. An 8-bit processor which desires
to read a 32-bit element by performing 4 single byte accesses, and accumulating
the data will not end up with what the 32-bit processor wrote in the
opposite endian configuration, even if the software attempts to account for the
order of storage in memory.

Thoughts ?
>
>-- 
>Charlie Price    cprice@mips.mips.com        (408) 720-1700
>MIPS Computer Systems / 928 Arques Ave.  MS 1-03 / Sunnyvale, CA   94088-3650


Bill Moyer                        insert disclaimer here --> <>
MC88110 Design
Motorola Microprocessor Products Group

zalman@mips.com (Zalman Stern) (05/11/91)

In article <1991May10.180241.540@oakhill.sps.mot.com> billm@speedy.UUCP (Bill Moyer) writes:
>In article <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:
[...]
>>You *can* arrange to share ONE binary format.
>>I think the only reasonable one to pick is characters (i.e. bytes).
>>For data that processes share actively and access frequently,
>>I suspect that you won't like the performance of mixed endian access,
>>but you can certainly make it *work*.
>>For a non-native process using a mapped file, you need to do the same
>>transform that you would for any unmapped I/O --
>>When a page is accessed, and mapped to the process,
>>the OS must swap the bytes around within the page so that
>>character-stream order is preserved.
>
>I don't quite understand this. Endianness applies to objects, not pages.
>If I update a data structure consisting of two short ints (16-bit)
>and an int (32-bit) in little endian mode, and then reverse all of the bytes
>of the structure for access by a big-endian process, the two shorts will
>be swapped, which is not the desired effect. I don't think the OS has
>enough knowledge of the size and placement of elements in an arbitrary
>page of data to perform the transformation. What am I missing?

You aren't missing anything. The OS cannot make things transparent.
However, it must provide correct access to byte streams, which has always
worked before. Since current bi-endian chips(*) muck with byte (and short,
and maybe word) addresses (as opposed to swapping multi-byte data when
necessary) you have to swap the buffers to get byte stream access to work
right.
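
A toy model of why the buffer swap is needed, assuming the chip flips
the low two address bits on byte accesses in the non-native mode. That
XOR-by-3 rule, the 32-bit unit and both function names are assumptions
made for the sketch; real parts may munge within 64-bit units instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a byte load. In "reversed" mode the low two address bits are
 * flipped (addr ^ 3): this is the ASSUMED address munging. */
static uint8_t load_byte(const uint8_t *mem, size_t addr, int reversed)
{
    return mem[reversed ? (addr ^ 3u) : addr];
}

/* Reverse bytes within each 32-bit word; len must be a multiple of 4. */
static void swap_buffer_words(uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint8_t t;
        t = buf[i];     buf[i]     = buf[i + 3]; buf[i + 3] = t;
        t = buf[i + 1]; buf[i + 1] = buf[i + 2]; buf[i + 2] = t;
    }
}
```

In this model a reversed-mode process reading an unswapped buffer
holding "ABCDEFGH" sees "DCBAHGFE"; after swap_buffer_words it sees the
original byte stream again.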

>
>It seems to me that an application could run in either endian mode
>if the processor is capable of performing the byte reordering based on
>the size of the object being accessed. This would imply that not swapping
>bytes within a 32-bit word on 32-bit accesses is not the desired action,
>since once the processor data bus is wired to a given set of byte-lanes
>to memory, access to a 32-bit data type in the opposite endianness, followed
>by a byte reversal ends up producing the wrong result.

Removing the double negative, I think you are trying to say that providing
byte swapping on the data bus works... I agree. But it's a big hit to the
hardware and it's not worth doing. Note, it still doesn't buy byte order
transparency. If a big endian process does a store word and a little endian
process does a load word it will get a byte reversed value. (In a previous
post I said that on an R3000 they would get the same value, I'm not sure
that is true because I think things are swapped with respect to 64 bit
words so FP loads/stores make sense. On the R3000 they may actually get
completely unrelated values...)

>In addition,
>processors with different size data buses cannot communicate correctly
>because they will make different assumptions about where the byte
>located at address X is accessed. An 8-bit processor which desires
>to read a 32-bit element by performing 4 single byte accesses, and accumulating
>the data will not end up with what the 32-bit processor wrote in the
>opposite endian configuration, even if the software attempts to account for the
>order of storage in memory.

This makes no sense to me.

>Bill Moyer                        insert disclaimer here --> <>
>MC88110 Design
>Motorola Microprocessor Products Group

(*) If anyone knows of a chip that actually does transparent byte swapping
on the byte lanes of the data bus, please let me know.
-- 
Zalman Stern, MIPS Computer Systems, 928 E. Arques 1-03, Sunnyvale, CA 94088
zalman@mips.com OR {ames,decwrl,prls,pyramid}!mips!zalman     (408) 524 8395
  "Never rub another man's rhubarb" -- the Joker via Pop Will Eat Itself

jlee@sobeco.com (j.lee) (05/15/91)

In <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:

>You *can* arrange to share ONE binary format.
>I think the only reasonable one to pick is characters (i.e. bytes).

This is what "network byte order" is all about.  If I write(2) or
mmap(2) a sequence of *bytes* on my system to disk/tape/tty/socket
and you can't read(2) or mmap(2) them and get the same sequence of
*bytes*, I would argue that someone's device driver is *bust*.  On
the other hand, if we both try to interpret those same bytes as a
sequence of two-byte or four-byte (or even eight-byte) integers,
the values we see can depend on our host byte-ordering (endian-ness).
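
As a small illustration of that distinction (a sketch; the helper names
are invented here), the same four bytes yield different integers
depending on which ordering the reader assumes. Spelling the order out
with shifts gives the same answer on any host.

```c
#include <stdint.h>

/* Interpret four bytes as a 32-bit integer, most significant byte first
 * (the convention usually called network byte order). */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Interpret the same four bytes least significant byte first. */
static uint32_t read_le32(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

For the byte sequence 0x12 0x34 0x56 0x78, read_be32 gives 0x12345678
and read_le32 gives 0x78563412; the bytes themselves are identical in
both cases.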

>When a page is accessed, and mapped to the process,
>the OS must swap the bytes around within the page so that
>character-stream order is preserved.
>This in-memory page is now in a special state and it has to be
>transformed back to the native character-stream order before it
>can be written back to disk.
>If a memory-mapped file page or shared memory segment page is accessed
>by processes that have different endian-ness, the page has to be put into
>the proper character-stream order before the access can be granted.

Character (byte) stream order is *normally* preserved; in order
to do the "conversion" you must shuffle the byte ordering around
-- either when you map the page of data, or when you read a value
from that page.  However, a single, fixed, map-time transformation
for *all* pages based only on "processor endian-ness" will not work;
the transformation required for each page depends on the size and
type (in the case of FP values) of the data being loaded/stored in
that page.  There is a paper by M. Stumm and S. Zhou of the University
of Toronto on this in relation to a Distributed Shared Memory system
for Heterogeneous Processors.  They discuss a DSM system which
shares data between Vaxen and Sun3s -- machines with different
endian-ness, FP encoding, and page-size!  I'm afraid that the exact
reference escapes me at the moment.  However the point is that data
sharing between heterogeneous machines *can* be done, but requires
knowledge of the structure and encoding of the data being exchanged.
(The X11 protocol is a case in point.)

I'm not sure whether or not we've tried to say the same thing, but
if so, I hope that what I have said makes the point clearer.

Jeff Lee -- jlee@sobeco.com || jonah@cs.toronto.edu

jgd@convex.csd.uwm.edu (John G Dobnick) (05/16/91)

From article <1991May15.004848.11929@sobeco.com>, by jlee@sobeco.com (j.lee):
> In <3308@spim.mips.COM> cprice@mips.com (Charlie Price) writes:
> 
>>You *can* arrange to share ONE binary format.
>>I think the only reasonable one to pick is characters (i.e. bytes).
> 
> This is what "network byte order" is all about.  If I write(2) or
> mmap(2) a sequence of *bytes* on my system to disk/tape/tty/socket
> and you can't read(2) or mmap(2) them and get the same sequence of

Silly question time:  What is a "byte"?

There seems to be an assumption of 8-bitted-ness here.  We (until
recently) had a machine running Unix that used 9-bit bytes.  How
does this map into your assumptions?

Curiously,
-- 
John G Dobnick  (JGD2)
Computing Services Division @ University of Wisconsin - Milwaukee
INTERNET: jgd@uwm.edu                      ATTnet: (414) 229-5727
UUCP: uunet!uwm!jgd

"Knowing how things work is the basis for appreciation,
and is thus a source of civilized delight."  -- William Safire

henry@zoo.toronto.edu (Henry Spencer) (05/16/91)

In article <12168@uwm.edu> jgd@convex.csd.uwm.edu writes:
>> This is what "network byte order" is all about...
>
>Silly question time:  What is a "byte"?
>There seems to be an assumption of 8-bitted-ness here.  We (until
>recently) had a machine running Unix that used 9-bit bytes.  How
>does this map into your assumptions?

Poorly, like the 36-bit machines and other oddities.  The fact is that
almost all network protocols are specified as sequences of 8-bit bytes.
(The word "octet" is often used as a more neutral term.)  If your machine
doesn't like 8-bit bytes, that is its problem. :-)  The usual approach
is to pick one of several revolting kludges.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry