[comp.arch] Computer Architecture methodology

barry@PRC.Unisys.COM (Barry Traylor) (06/27/90)

In article <8533@canterbury.ac.nz> PHYS169@canterbury.ac.nz (Mark Aitchison, U of Canty; Physics) writes:
>A little question: do the hardware architecture designers primarily strive to
>make conventional programs and operating systems run well (fast and lean, etc)
>or to make hardware race as fast as possible - independent of ideas of what
>software will run on it (i.e. say "It's up to the software guys to make the
>best of my hardware")?
>
>I know lots of people will have ideas on what they *ought* to do, and looking
>back with the aid of folklore, we can see what happened on occasions in the
>past, but I am hoping some present-day architecture people (not just building
>block assemblers) will be able to reply with their design philosophy.
>
>Ta muchly,
>Mark Aitchison.

The B6700 and its successors have all been designed as integrated
hardware/software systems.  There is really nothing in the instruction set
that is not in fairly common use by the compilers.  Admittedly, certain
instructions exist so that the operating system can control the machine,
but as far as I know, there are no instructions in there that are there
just because "the hardware folks thought it would make the machine scream."
The reality of integration came about and has been maintained in the
absence of an assembly language.  The architecture of the machine is
largely driven by a combination of requirements from Cobol, Algol and
Fortran.

In recent history, we have been working on ways to make commonly used
instructions run faster, and there have been some attempts to bend the
software to work with some ill-conceived hardware instructions, now
obsolescent.

In the architecture group, there are about 4 hardware engineers, 3 operating
system programmers and 4 to 6 compiler programmers.  Transient personnel
typically bring the group up to 20 at any given time.  Occasionally
hardware wants to make changes.  Occasionally software wants to make
changes.  All changes typically require a full consensus of the
architecture group before they are implemented.

Barry Traylor
barry@prc.unisys.com
Unisys Large A Series Engineering (operating systems programming)

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/28/90)

In article <14279@burdvax.PRC.Unisys.COM>, barry@PRC.Unisys.COM (Barry Traylor) writes:
: The B6700 and its successors have all been designed as integrated
: hardware/software systems.  ... The architecture of the machine is
: largely driven by a combination of requirements from Cobol, Algol and
: Fortran.

: In recent history, we have been working on ways to make commonly used
: instructions run faster, and there have been some attempts to bend the
: software to work with some ill-conceived hardware instructions, now
: obsolescent.

I would love to hear about which of the hardware instructions are now
regarded as obsolescent and ill conceived.  Are LLLU and SRCH on the way
out?  What finally happened about the 20-bit physical addresses?  What
problems does Ada have with the A series?  What's the true story on the
addition of {HEX|BCD|ASCII|EBCDIC} POINTERs and the later removal of
BCD POINTER?  Has vectormode been extended?  

-- 
"private morality" is an oxymoron, like "peaceful war".

raj@Apple.COM (Raj Sharma) (07/03/90)

In article <8533@canterbury.ac.nz> PHYS169@canterbury.ac.nz (Mark Aitchison, U of Canty; Physics) writes:
>A little question: do the hardware architecture designers primarily strive to
>make conventional programs and operating systems run well (fast and lean, etc)
>or to make hardware race as fast as possible - independent of ideas of what
>software will run on it (i.e. say "It's up to the software guys to make the
>best of my hardware")?
>
>I know lots of people will have ideas on what they *ought* to do, and looking
>back with the aid of folklore, we can see what happened on occasions in the
>past, but I am hoping some present-day architecture people (not just building
>block assemblers) will be able to reply with their design philosophy.
>
>Ta muchly,
>Mark Aitchison.

Mark, you have hit upon the most important factor at the root of all sins
committed in the design of computers.  Most pundits of computer architecture
begin their education in the subject by first learning how to measure and
evaluate performance.  Their instinct therefore whispers to them to design
architectures that execute specific applications faster and with higher code
density.  Then these folks leave their abode of learning to venture into the
commercial world to make their millions, they meet the folks from marketing,
and designing computer systems is no longer simple.  First, the architecture
is compromised to accommodate a larger variety of applications.  Second, most
of those applications are silly programs of no practical value.  Finally, the
architecture performs poorly on real applications, to say nothing of the pain
level of the system designer, who looks like a fool when (say) Unix runs
slowly but the Towers of Hanoi runs fast.  This is when the designer falls
back on EE101 and decides to speed up the clock, knowing full well that this
brute force will work on everything from the silliest to the smartest
programs.


Raj Sharma
"My employer and my mind never meet, so let's leave him alone"
"Remember, we are all merely a phenomena from the juxtaposition of 
energy and time"

barry@tredysvr.Tredydev.Unisys.COM (Barry Traylor) (07/08/90)

In article <3329@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>
>I would love to hear about which of the hardware instructions are now
>regarded as obsolescent and ill conceived.  Are LLLU and SRCH on the way
>out?  What finally happened about the 20-bit physical addresses?  What
>problems does Ada have with the A series?  What's the true story on the
>addition of {HEX|BCD|ASCII|EBCDIC} POINTERs and the later removal of
>BCD POINTER?  Has vectormode been extended?  
>

LLLU (linked list lookup, pronounced lulu), has been deimplemented for all
recent and future machines;  it was an operating systems instruction that
fell out of use.  SRCH (Masked Search) and BMS (bounded masked search, a
new op) are alive, healthy and heavily used in the OS.
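
(For those who have never met these ops: roughly speaking, a masked search
scans a block of words for the first one that matches a pattern under a
mask.  The C sketch below shows the flavor only; it is not the actual
SRCH/BMS semantics, which are richer, and the names are mine.)

    /* Software equivalent of a bounded masked search: return the index
       of the first word in words[0..n-1] that matches pattern under
       mask, or -1 if none does.  One instruction replaces the loop. */
    long masked_search(const unsigned long *words, long n,
                       unsigned long mask, unsigned long pattern)
    {
        long i;

        for (i = 0; i < n; i++)
            if ((words[i] & mask) == pattern)
                return i;
        return -1;
    }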

The physical addressing limit is 32 bits (4Gwords or 24Gbytes), extendable
in the future to 36 bits, if necessary.  Addressing is done via a global
segment table now.

There currently is no Ada compiler for A Series, although we believe that
Ada would fit well.

BCL (rather than BCD) was largely eliminated 10 years ago.  The expense of
the hardware support for it and the lack of a standard for 6 bit character
sets did it in.  The machine supports a HEX framesize as well as an 8 bit
framesize that can represent either ASCII or EBCDIC.

Vector Mode died shortly after it was discovered that it ran more slowly on
pipelined machines than non-vector mode.  I do not believe that we have
made a machine that supports vector mode for more than 10 years.

Barry Traylor
barry@prc.unisys.com (or wherever this message came from)
Unisys Large A Series Engineering, Tredyffrin Twp, Pa.

cik@l.cc.purdue.edu (Herman Rubin) (07/08/90)

In article <844@tredysvr.Tredydev.Unisys.COM>, barry@tredysvr.Tredydev.Unisys.COM (Barry Traylor) writes:
> In article <3329@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> >
> >I would love to hear about which of the hardware instructions are now
> >regarded as obsolescent and ill conceived.  Are LLLU and SRCH on the way
> >out?

			............................

> LLLU (linked list lookup, pronounced lulu), has been deimplemented for all
> recent and future machines;  it was an operating systems instruction that
> fell out of use.  SRCH (Masked Search) and BMS (bounded masked search, a
> new op) are alive, healthy and heavily used in the OS.

			...........................

Users can make use of weird instructions.  Instead of insisting that they 
use the ill-conceived limitations of HLLs, encourage them to use the power
of these instructions that the HLL producers do not know how to use.  You
will find that there are good application uses of most of them.

Mr. Traylor goes on to say that some instructions were eliminated because
the same work could be done faster without them under certain conditions.
It was not clear whether these conditions were universal, or depended on
the type of program.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

guy@auspex.auspex.com (Guy Harris) (07/09/90)

>Users can make use of weird instructions.

If users can make use of any arbitrary "weird instruction", what
prevents the desired instruction set of a machine from being unbounded
above?  Eventually you have to choose *some* that you won't implement,
as a machine that implements every possible operation as a single
instruction is clearly unimplementable (few infinite items are
constructible in the real world :-)).

>Instead of insisting that they use the ill-conceived limitations of HLLs,

Are you truly certain that the only reason the instruction in
question "fell out of use" is that Unisys was "insisting that
[users] use the ill-conceived limitations of HLLs"?  I suspect that
ESPOL (or whatever the OS implementation language is called) will let
you get at just about any of those "weird instructions".  Perhaps no
user could come up with a *good* use for the instruction in question,
where "good" means "good enough to justify its inclusion in the
instruction set, to the exclusion of some other instruction that would
provide a greater performance improvement, or that would be more widely
usable."

>You will find that there are good application uses of most of them.

Fine.  Show me a good application use of an instruction that, say, ORs
together the 15th bit of the 7th word following the word pointed to by
the operand and the 17th bit of the 9th word following that word and, if
the result is zero, rewinds the 12th tape drive on the machine - unless
the machine has no tape drives, in which case it prints the letter "Q"
on the console.  (No fair defining the application as being that very
operation!)

If you can't, then perhaps there are *some* "weird instructions" that
have no good application uses, and therefore, you do need some way of
choosing which "weird instructions" should go in anyway and which should
be deleted.  Perhaps a practical suggestion of exactly such a way might
have more effect on the designers of hardware and HLLs than a series of
complaints that they're just not doing things right with no real
suggestions as to how they might do them better?

cik@l.cc.purdue.edu (Herman Rubin) (07/09/90)

In article <3627@auspex.auspex.com>, guy@auspex.auspex.com (Guy Harris) writes:
> >Users can make use of weird instructions.
> 
> If users can make use of any arbitrary "weird instruction", what
> prevents the desired instruction set of a machine from being unbounded
> above?  Eventually you have to choose *some* that you won't implement,
> as a machine that implements every possible operation as a single
> instruction is clearly unimplementable (few infinite items are
> constructable in the real world :-)).

			.............................

> >You will find that there are good application uses of most of them.

		[Example of weird instruction omitted.]

> If you can't, then perhaps there are *some* "weird instructions" that
> have no good application uses, and therefore, you do need some way of
> choosing which "weird instructions" should go in anyway and which should
> be deleted.  Perhaps a practical suggestion of exactly such a way might
> have more effect on the designers of hardware and HLLs than a series of
> complaints that they're just not doing things right with no real
> suggestions as to how they might do them better?

Admittedly there are weird instructions which are at least extremely difficult
to justify.  In fact, I would even argue that some of the "instructions" on
floating point chips, such as the transcendental functions, are nothing more
than programs encoded in microcode.  But there are lots of reasonable hardware
instructions which have either disappeared or were rarely implemented.

Here are examples of operations which are simple in hardware, much more
expensive in software, and for which I know of "reasonable" applications;
a C sketch of software fallbacks for a few of them follows the list.

	Multiplication of integers with both most and least significant parts
	of the product available

	Division with quotient and remainder simultaneously

	Division of floating point numbers, with integer quotient and
	floating point remainder

	In the above two operations, allowing the choice of which quotient
	and remainder, depending on the signs of the arguments.

	Obtaining the spacing between the ones in a bit sequence.  In the
	algorithms I would produce, this can be a major operation.

	The use of overflow and carry tests.

	Fixed point arithmetic.

	Multiplication of a floating point number by a power of two, not
	using the multiply unit

	Better conversion between integer and floating point.  There is one
	major computer company which produces huge numbers of mainframes which
	do not have any hardware conversion capabilities whatever.  Not only
that, there was plenty of room for additional instructions when the
	line was started, and the company's own scientific people complained
	about the problem.
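
To make the cost concrete, here is a minimal C sketch of software
fallbacks for four of the operations above.  It assumes unsigned long is
32 bits; the function names are mine, chosen for illustration only.

    #include <math.h>

    /* 1. Full product of two 32-bit integers, built from 16-bit halves.
          A multiply instruction delivering both halves of the product
          would replace all of this. */
    void mul_full(unsigned long a, unsigned long b,
                  unsigned long *hi, unsigned long *lo)
    {
        unsigned long al = a & 0xFFFFUL, ah = a >> 16;
        unsigned long bl = b & 0xFFFFUL, bh = b >> 16;
        unsigned long p0 = al * bl, p1 = al * bh;
        unsigned long p2 = ah * bl, p3 = ah * bh;
        unsigned long mid = (p0 >> 16) + (p1 & 0xFFFFUL) + (p2 & 0xFFFFUL);

        *lo = (p0 & 0xFFFFUL) | ((mid & 0xFFFFUL) << 16);
        *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
    }

    /* 2. Floating point division with integral quotient and floating
          remainder; one instruction could deliver both, with the
          rounding of the quotient selectable by the signs. */
    double fdiv_rem(double x, double y, double *quot)
    {
        *quot = floor(x / y);
        return x - *quot * y;
    }

    /* 3. Spacing between successive one bits, scanned from the low end.
          Without a find-first-one instruction this costs a loop
          iteration per bit examined. */
    int gap_to_next_one(unsigned long *w)
    {
        int gap;

        if (*w == 0)
            return -1;                  /* no ones left */
        for (gap = 0; (*w & 1UL) == 0; gap++)
            *w >>= 1;
        *w >>= 1;                       /* consume the one just found */
        return gap;
    }

    /* 4. Multiplying a double by a power of two without touching the
          multiply unit: ldexp() adjusts the exponent field directly. */
    double scale_by_2_to_k(double x, int k)
    {
        return ldexp(x, k);
    }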
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

guy@auspex.auspex.com (Guy Harris) (07/09/90)

>Admittedly there are weird instructions which are at least extremely difficult
>to justify.

Yup; I think the one I concocted is not just "extremely difficult" to
justify, but *impossible* to justify - at least I certainly tried to
make it so....

>In fact, I would even argue that some of the "instructions" on floating
>point chips, such as the transcendental functions, are nothing more
>than programs encoded in microcode.

I'd agree there; Motorola may agree as well: at least according to the
Microprocessor Report article on the 68040, they ripped those
instructions out of the on-chip floating point unit and did them in
software - the article says that "by eliminating microcoded algorithms
needed by transcendental functions, the 040's designers were able to
allot more transistors to the computational logic," and then ascribes
much of the speedup of the '040's floating point unit to doing so.

>But there are lots of reasonable hardware instructions which have either
>disappeared or were rarely implemented.

OK, *but* was the one that Unisys removed from the A-series one of them
- and, more importantly, can you tell whether it is without a
description of the instruction?  Complaining as soon as you hear of
somebody removing an instruction from an instruction set, as you
appeared to do in your reply, doesn't seem very sensible; it may well be
that they could have devoted the transistors used to implement that
instruction to implementing one of the ones you wanted, in which case,
had they actually done that, you should be *congratulating* them on
removing it.... 

Giving specific examples of instructions you want in hardware, as you
did in this article, is more useful than just lashing out when you hear
of somebody removing an instruction from an instruction set, as you did
in the posting to which I followed up.

beal@paladin.Owego.NY.US (Alan Beal) (07/11/90)

I used to work on B7800s and A-series machines and I was wondering how
they are doing in the market place and what changes have been made in
the last 3 years.  The most powerful system I recall was an A-17 using
MCP/AS.  Is there now a more powerful machine?  Have the B1000 users
moved on to the A-3s and A-5s or to another architecture entirely?
The 32 bit scheme seemed to be a workaround for the 20 bit hardware
addressing limitations.  Is there any effort to increase the word
length from 48 bits or change the descriptor formats?  I read that
the global table allows a program to address 2**20 objects, each
up to 2**32 words.  Is that true?  What portion of this is implemented
in hardware and which in software?  I also read that the global
table decreased the amount of stack searching for copy descriptors.
How much does this offset the penalty of the extra indirection using
the global table?  Any other new hardware features on the A-series
machines?  Are they still selling well?

I really liked the operating system and the use of compilers instead
of assembly language.  However, the Algol compiler did not allow any
complex data types other than arrays.  Will this ever change?  My
last experience was on MCP 3.6 and I am wondering how the MCP
has changed since then.  How well is the semantic database software
being accepted?  Any new developments in distributed computing?
Now that I have been away from the A-series machines, I find myself
missing them more and more every day.  What I wouldn't give again for a system
that displayed sensible error messages, told me exactly on which line
my program failed, provided easy multitasking and interprocess
communication facilities, and had an elegant job control language (WFL).
Oh, for the good old days.  Before I leave, does Unisys still provide
the source code for the MCP and other software?  Are TCP/IP and
other non-Unisys protocols supported?

Thanks.

-- 
Alan Beal
Internet: beal@paladin.Owego.NY.US
USENET:   {uunet,scifi}!paladin!beal

beal@paladin.Owego.NY.US (Alan Beal) (07/18/90)

Following is the response from Barry Traylor to my questions in an earlier
posting on Unisys A-series machines.  I hope you find it interesting.

Alan Beal
------------------------------------------------------------------------

In article <550@paladin.Owego.NY.US> you write:
>I used to work on B7800s and A-series machines and I was wondering how
>they are doing in the market place and what changes have been made in
>the last 3 years.  

The big iron has been doing OK.  The small iron has been growing at about a
20%/year rate (our advertising says that, but our internal numbers confirm
it).  The stuff still is not "amazingly popular", and they are *mainframes*
regardless of their size (block mode datacom, etc).

>The most powerful system I recall was an A-17 using
>MCP/AS.  

The A17 just came out a couple of years ago (I think).  I was one of the
software project leaders for that box, largely responsible for the hardware
process switching software and interface specifications.  I like the A17.

>Is there now a more powerful machine?  

Yes, but not yet, if you know what I mean.

>Have the B1000 users
>moved on to the A-3s and A-5s or to another architecture entirely?

Don't know.

>The 32 bit scheme seemed to be a work around the 20 bit hardware
>addressing limitations.  Is there any effort to increase the word
>length from 48 bits or change the descriptor formats?  

I was also a project leader for ASD memory management before I moved east.
The ASD stuff has been much more successful than we would have ever hoped.
There are no immediate plans to increase the data portion of the word,
although the tag portion was increased to 4 bits when the A17 was
introduced.

>I read where
>the global table allows a program to address 2**20 objects, each
>up to 2**32 words.  Is that true?  What portion of this is implemented
>in hardware and which in software?  I also read that the global
>table decreased the amount of stack searching for copy descriptors.
>How much does this offset the penalty of the extra indirection using
>the global table?  

Lotsa good questions.  The current limit (to be increased soon) is 2**20
objects, each of which may be 2**20 words long.  If you go to paged arrays,
the first number decreases (although memory management gets easier), but
the second does not increase (this will also be changing soon).  There is a
significant amount of software to support the new memory scheme, but the
basis is a change in the architecture that makes the old address field a
global segment index, so yes, there are also hardware changes.  The global
table pretty dramatically decreased the stack searching requirements, but
did not eliminate them (new architectural changes will eliminate nearly all
requirements for stack searching).  While the lack of stack searching does
somewhat counteract the cost of the indirection, the big gain is in the
lack of memory partitioning and the subsequent thrashing.
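
To make the indirection concrete: where a descriptor once held an
address, it now holds an index into one system-wide table.  The sketch
below shows the idea in C; the 2**20 limits come from the numbers above,
but the field names and layout are invented for illustration and are not
the real A Series formats.

    #define GST_SIZE (1UL << 20)        /* up to 2**20 objects           */

    struct gst_entry {
        unsigned long base;             /* current physical word address */
        unsigned long length;           /* up to 2**20 words             */
    };

    extern struct gst_entry gst[GST_SIZE];  /* one global segment table  */
    extern void fault_bounds(void);         /* hypothetical fault handler */

    unsigned long translate(unsigned long seg_index, unsigned long offset)
    {
        struct gst_entry *e = &gst[seg_index];

        if (offset >= e->length)
            fault_bounds();
        return e->base + offset;        /* the one extra indirection     */
    }

The payoff is that every copy descriptor for an object carries the same
segment index, so relocating the object means updating one table entry
instead of searching the stacks for copies.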

>Any other new hardware features on the A-series
>machines?  Are they still selling well?

There is a new level of e-mode that is being used in a soon to be announced
machine.

>
>I really liked the operating system and the use of compilers instead
>of assembly language.  However, the Algol compiler did not allow any
>complex data types other than arrays.  Will this ever change?  

Ain't Algol a bitch!  This stuff seems to be continuously going around in
circles in the development groups.  Apparently there is an implementation
of records for Algol that is not generally usable except by SIM.

>My
>last expereience was on MCP 3.6 and I am wondering how the MCP
>has changed since then.  

Being in the middle of it, it's hard for me to say.  The A17 is supported
as well as the Micro-A (an A Series processor in a PC box).  Lotsa new
peripherals are supported.

>How well is the semantic database software
>being accepted?  

I don't know.

>Any new developments in distributed computing?

There is BNAV2.  I don't know if that is a new development.  It does have
some interesting features, however.

>Before I leave, does Unisys still provide
>the source code for the MCP and other software?  

Yes, but for an extra fee now.

>Are TCP/IP and
>other non-Unisys protocols supported?

Yes, but other than TCP/IP, I'm not sure.

Barry Traylor
Unisys Large A Series Engineering
barry@tredydev.unisys.com
barry@prc.unisys.com (next door)