[comp.lang.c++] Zortech "limitation"

nelson_p@apollo.HP.COM (Peter Nelson) (02/11/90)

  I've been using the Zortech C++, Version 2.0 compiler
  to write a CA (Cellular Automata) tool, which I've 
  discussed elsewhere on Usenet, for my PC.  I'm not
  currently taking advantage of its C++ features -- I
  thought I would do that later as an educational exercise--
  for now I'm using vanilla C. 

  But I DO need to handle large, 2-dimensional arrays
  and Zortech seems to have a problem with that.

  The problem is that my total global memory will exceed 
  64K, although I am grudgingly willing to settle for having
  no *one* array exceed that size if that would help here.

  Those who are unfortunate enough to be familiar with 80x86
  architecture are well aware of the weird, segmented addressing 
  scheme that those computers use and how this has forced compiler
  makers to create bizarre "memory models" to handle it.  Zortech
  offers two memory models (Large and Compact) which allow the 
  program to access global or static memory which may exceed 64K.
  (They don't offer a "Huge" memory model, as Microsoft does).

  But they do not apparently offer any way to access memory > 64K
  AS 2-DIMENSIONAL ARRAYS, which is the logical data structure 
  for cellular automata.   All I apparently can do with their 
  product is malloc a chunk of space and access it via pointers.
  If I wanted to do pointer arithmetic all over the place I would
  use Assembler!   Zortech C/C++ is allegedly a high-level language
  but their manual describes this as a "limitation" of their product.
  I would call it a bug. 

  I did call their Arlington office and spoke to several people
  who were unable to provide a workaround but they did suggest that
  the legendary Walter Bright, who has been known to appear in 
  the netherworld of Usenet, might have some ideas about this. 

  If anyone has some ideas on this I would appreciate it if they
  would send me email, since for the last week or so we have not been 
  getting new Usenet postings at this site. 

                                               ---Peter


  PS-  The Arlington office mentioned a Zortech BBS in Washington
       state which I dialed and got a carrier, but for some reason 
       I couldn't talk to it.  My modem *thought* it made a connection
       but nothing I typed got echoed, I didn't get any characters from
       it, and although it never hung up on me, eventually I got bored
       and hung up on it.   Comments, anyone?
                                     
nelson_p@apollo.HP.COM (Peter Nelson) (02/12/90)

                                       

   I received a number of responses to my query about how to
   allocate large, global 2-dimensional arrays in Zortech C++,
   where no individual array would be > 64K but the total
   space involved for all of them would exceed 64K.  One
   of my requirements is that I can access them as 2-dimensional
   arrays and not have to access them via pointer arithmetic.

   Most of them involved extensive use of malloc and pointers.

   The whole idea of using a high-level language in the first
   place is to make the architectural idiosyncrasies of the 
   machine INVISIBLE to the user and allow the user to express
   the program in terms of his problem.  In my case that means
   2 dimensional arrays.   

   There is no **technical reason** why the Zortech compiler
   can't allow you to allocate arbitrarily large amounts of
   global memory, up to the limitations of available RAM, and
   simply manipulate the DS register to access it.  Sure, there's
   more overhead, but the manual already warns the user that the
   use of the C or L memory models is less efficient, so we
   must presume that the user is willing to pay that price.  

   On a virtual memory machine I know that if I allocate 
   an array that exceeds the size of my available physical 
   RAM I'm going to pay a huge performance penalty due to
   paging.   But if I'm willing to live with that then I 
   don't have to write my program any differently.    

   80x86 machines have their segmented architecture but virtually
   ALL computers have lots of addressing modes and different length
   address or offset registers, with different numbers of ticks
   required to execute them.  But our C compilers here at work don't
   ask me what addressing modes or register sizes to use.  Hell, 
   I don't have to even worry about rewriting my program when
   I go from, say, a 68030 platform to an Apollo DN10000 (RISC).
   If I *want* to tune the program I have a large selection of 
   options, but I'm not forced to use them.  

   With the Zortech product, as well as some other DOS products,
   I apparently have to return to those "golden days of yesteryear"
   and be a hobbyist hacker again.  And I thought I'd paid those 
   dues and graduated to software adulthood already.

                                                ---Peter

bright@Data-IO.COM (Walter Bright) (02/13/90)

Newsgroups: comp.lang.c++,comp.lang.c
Subject: Re: Zortech "limitation"
References: <48910321.20b6d@apollo.HP.COM>
Reply-To: bright@Data-IO.COM (Walter Bright)
Distribution: usa
Organization: Data I/O Corporation; Redmond, WA

In article <48910321.20b6d@apollo.HP.COM> nelson_p@apollo.HP.COM (Peter Nelson) writes:
<  But I DO need to handle large, 2-dimensional arrays
<  and Zortech seems to have a problem with that.
<  The problem is that my total global memory will exceed 
<  64K, although I am grudgingly willing to settle for having
<  no *one* array exceed that size if that would help here.

Instead of having:
	int array[N][M];	/* array of N columns		*/
try:
	int (*array[N])[M];	/* array of N pointers to columns */

	/* Create the columns	*/
	for (i = 0; i < N; i++)
		array[i] = (int (*)[M])malloc(sizeof(int [M]));

	#define array_access(i,j)	((*array[i])[j])

(I wrote this off the cuff, so it may have a syntax error, but I've done
this before and it works fine. In fact, it is *faster* than doing huge
arithmetic.)

<  If I wanted to do pointer arithmetic all over the place I would
<  use Assembler!   Zortech C/C++ is allegedly a high-level language
<  but their manual describes this as a "limitation" of their product.
<  I would call it a bug. 

There's no assembler in the example above. One man's bug is another's feature.
C is not a high-level language, it's a "portable assembler", thus the
limitations and capabilities of the underlying instruction set are
reflected in the source code. I occasionally get "bug" reports that the
compiler does not make the PC look like a VAX.

<  PS-  The Arlington office mentioned a Zortech BBS in Washington
<       state which I dialed and got a carrier, but for some reason 
<       I couldn't talk to it.

The phone number is (206) 822-6907. The BBS works fine, and has been for
years. It gets heavy use. It uses a Hayes 2400 baud external modem. I
suggest you try again.

mark@Jhereg.Minnetech.MN.ORG (Mark H. Colburn) (02/13/90)

In article <48910321.20b6d@apollo.HP.COM> nelson_p@apollo.HP.COM (Peter Nelson) writes:
>  But they do not apparently offer any way to access memory > 64K
>  AS 2-DIMENSIONAL ARRAYS, which is the logical data structure 
>  for cellular automata.   All I apparently can do with their 
>  product is malloc a chunk of space and access it via pointers.
>  If I wanted to do pointer arithmetic all over the place I would
>  use Assembler!   Zortech C/C++ is allegedly a high-level language
>  but their manual describes this as a "limitation" of their product.
>  I would call it a bug. 

Well, it is not really a bug.  You can malloc the huge hunk of memory
and then treat it as a two dimensional array of whatever form you
like (or three, or four dimensional as well).  The compiler will do
the pointer arithmetic for you.  It would have to do so regardless of
how you declared the array.

The following section of code should do what you want:

	
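	#include <stdio.h>
	#include <stdlib.h>	/* malloc prototype here */

	#define WIDTH	200	/* example dimensions; WIDTH * HEIGHT */
	#define HEIGHT	100	/* must fit in a single malloc block  */

	typedef unsigned char 	my_array_type;

	/* pointer to rows of WIDTH elements, so my_array[i][j] works */
	my_array_type 	      (*my_array)[WIDTH];
	int			i;
	int			j;


	int
	main()
	{

	    if ((my_array = malloc((size_t)WIDTH * HEIGHT * sizeof(my_array_type))) == NULL) {
		printf("Malloc error\n");
		exit(1);
	    }
	    for (i = 0; i < HEIGHT; i++) {
		for (j = 0; j < WIDTH; j++) {
		    my_array[i][j] = (my_array_type)(i*j);
		    printf("%6d ", my_array[i][j]);
		}
		printf("\n");
	    }
	    free(my_array);
	    return 0;
	}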



The initialization section is contrived, and is there simply to show
that you can use the my_array pointer as if it were declared as:

	my_array_type	my_array[HEIGHT][WIDTH];

With no problems.  The code example assumes ANSI C, otherwise the
malloc call should be cast correctly, but since Zortech is ANSI
Compliant, you shouldn't have any problems.

The my_array_type is defined just to show that you could use virtually
any type for the array that you want, including structures, etc.




-- 
Mark H. Colburn                       mark@Minnetech.MN.ORG
Open Systems Architects, Inc.

williams@umaxc.weeg.uiowa.edu (Kent Williams) (02/13/90)

C'mon, dude.  We all know the Intel Processors are brain-damaged.  This is
pretty old news.  Besides, what's the big deal about calling malloc to build
a >64K two-dimensional array?  On the PC, the code generated is going to
be better than if Walter did you the 'favor' of giving you huge arrays.
So you just do something like:

		something **huge_array = (something **)malloc(N*sizeof(something *));
		for(int i = 0; i < N; i++)
				huge_array[i] = (something *)malloc(sizeof(something)*M);

And the normal array subscripting semantics apply, TRANSPARENTLY, AS THOUGH
THIS WERE A LANGUAGE FEATURE PUT THERE TO MAKE THINGS EASY FOR THE PROGRAMMER!
How about that?

In this case the code generated is something like:

		something& access(i,j) {
				basereg = *(huge_array + i);
				indexreg = j * sizeof(something);
				return *(basereg + indexreg);
		}

Whereas if you had huge arrays it's something like:

		something& access(i,j) {
				basereg = hideous_segment_calculation(huge_array,i);
				indexreg = hideous_segment_calculation(basereg,j);
		}

Which would you rather have under the hood?

--
                               
Kent Williams                  "What's an Address Bus?  How do Icons work?" 
williams@umaxc.weeg.uiowa.edu  -- Advertisement for Time-Life Books 

jimad@microsoft.UUCP (JAMES ADCOCK) (02/16/90)

In article <619@ns-mx.uiowa.edu> williams@umaxc.weeg.uiowa.edu.UUCP (Kent Williams) writes:
XC'mon, dude.  We all know the Intel Processors are brain-damaged.  This is
Xpretty old news.  Besides, what's the big deal about calling malloc to build
Xa >64K two-dimensional array?  On the PC, the code generated is going to
Xbe better than if Walter did you the 'favor' of giving you huge arrays.
XSo you just do something like:

Looking towards the future, IMHO, make sure you get at least a 386sx, so
that you will be able to run flat model.  To do any kind of serious OOP,
you need at least large model, but large model OOP leads to pretty hideous
machine code.  Flat model OOP seems to be the way to go, and should help
portability a lot, as long as people consider byte ordering.

PS: What micro-processor design *isn't* brain-damaged?

[standard disclaimer]

nelson_p@apollo.HP.COM (Peter Nelson) (02/16/90)

 bright@Data-IO.COM (Walter Bright) posts...

><  If I wanted to do pointer arithmetic all over the place I would
><  use Assembler!   Zortech C/C++ is allegedly a high-level language
><  but their manual describes this as a "limitation" of their product.
><  I would call it a bug. 
>
>There's no assembler in the example above. One man's bug is another's feature.
>C is not a high-level language, it's a "portable assembler", thus the
>limitations and capabilities of the underlying instruction set are
>reflected in the source code. I occasionally get "bug" reports that the
>compiler does not make the PC look like a VAX.

   I don't think it's the job of the compiler to make one brand of
   computer look like another, whether it's a case of a PC looking like
   a VAX or a Mac looking like a Sun.   Rather, it is to make all
   computers look like the same virtual machine, in this case a
   "C" computer.  

   The key word in Mr. Bright's above paragraph is "portable". 
   Languages achieve portability by allowing the programmer to 
   write his code in a way which is not dependent on the architectural
   whims or affectations of a particular hardware vendor.  In order
   to achieve this it is necessary to obscure the underlying features of
   the target machine's architecture.   In principle I ought to be
   able to take a program which I wrote on my HP/Apollo DN10000
   and port it to my PC or my friend's Mac or my wife's DG Aviion just
   by recompiling it.   This is NOT achieved by making the Aviion, PC,
   or Mac look like a DN10000.  

   In practice, of course, mapping C to a particular architecture may
   be much harder in some cases than others.   The 80x86 certainly 
   offers some challenges to the compiler writer and the programmer
   may sacrifice some performance to have the architecture hidden from
   him, but that should be his choice.   In today's heterogeneous
   environments, portability is becoming increasingly important and
   I may prefer to take a performance hit if it means not having to 
   rewrite my code when I go from an Apollo to a PC. 

                                               ---Peter
                                                                

   
   PS-  A philosophical point:  The average computer has become vastly
        more powerful in recent years.  An XT might be a 0.5 MIPS 
        machine, but 386's are running 5-7 MIPS and '486's are claiming
        10-15 MIPS.   Motorola is claiming 20 MIPS for their 25 MHz
        68040 and say they will have a 50MHz part out in a year.

        In such an environment productivity, ease of maintenance, 
        readability of code and portability of code may outweigh
        the small loss of efficiency suffered by having the compiler
        generate some DS-register manipulations behind the scenes,
        however much a hack this might seem to the compiler-writer.

        Memory and CPU cycles are a LOT cheaper than they used to be 
        and our priorities should reflect this.

bright@Data-IO.COM (Walter Bright) (02/17/90)

In article <48aa63d6.20b6d@apollo.HP.COM> nelson_p@apollo.HP.COM (Peter Nelson) writes:
<   Languages achieve portability by allowing the programmer to 
<   write his code in a way which is not dependent on the architectural
<   whims or affectations of a particular hardware vendor.

	To achieve portability you must code to the common denominator between
	all the platforms. Note that I don't know of any reasonable method
	to implement on the 8086 things like:
		func()
		{	char array[70000];
			...
		}
	The 8086 hardware simply doesn't support it.

<   In principle I ought to be
<   able to take a program which I wrote on my HP/Apollo DN10000
<   and port it to my PC or my friend's Mac or my wife's DG Aviion just
<   by recompiling it.   This is NOT achieved by making the Aviion, PC,
<   or Mac look like a DN10000.  

	Unless you adhere to portable constructs, making the PC look
	like a DN10000 is the only way.

	Zortech's philosophy is to allow the developer to create a PC
	product with maximum speed/size efficiency. In order to do this,
	the behavior of the compiler matches the behavior of the
	underlying machine. Fighting the compiler and the PC environment
	is counterproductive to producing a PC application.

	Obviously, you disagree with this point of view. I suspect that
	with your philosophy, Smalltalk is a better language for you than
	C++.

jimad@microsoft.UUCP (JAMES ADCOCK) (02/27/90)

>	To achieve portability you must code to the common denominator between
>	all the platforms. Note that I don't know of any reasonable method
>	to implement on the 8086 things like:
>		func()
>		{	char array[70000];
>			...
>		}
>	The 8086 hardware simply doesn't support it.

[standard disclaimer on]

I disagree with this last statement.  8086 hardware *does* support such
a construct.  Rather, it is 8086 compilers that typically don't support
putting a 70K allocation on the stack.  Many 8086 compilers *do* support
70K+ sized objects allocated statically, or from heap.  [I would claim 
putting a 70K allocation on stack is not a good thing to do in any case,
on any machine.  Many [most?] machines/compilers/linkers will have problems
with such an approach, since default stack allocations are much smaller 
than this.]  Even on 32-bit machines it is not too uncommon to run into
16-bit restrictions from their compilers -- such as maybe allowing huge
arrays, but restricting structures to less than 64K.  The *optionally*
segmented architecture of the 80x86 family allows compiler writers many
choices trading off memory models versus code size/complexity.  The most
important memory models are [informally]:

memory model	code address	data address
		(bits)		(bits)

small		16		16
large		32		16+16
huge		32		16+32
flat		32		32

where 16+32 means using a 32-bit offset from a 16-bit base address.
The 16-bit base address is multiplied by some number, typically 8 or
4096, to allow the 16-bit base address to span a range of bytes
greater than 2^16.

An interesting characteristic of the 80x86 family is that the chips can
be put into a mode where they *either* efficiently run code using 8/16
bit data sizes *or* they can be put in a mode where they efficiently
run code using 8/32 bit data sizes.  In 8/16 bit mode 32bit sized data
requires an extra byte in the machine code.  In 8/32 bit mode 16bit sized
data requires an extra byte in the machine code.  Thus, IMHO, *either* small
or flat models generate particularly pleasing machine code.  You need at
least a 386sx to run flat model, but predecessors are all capable of
running small, large, and huge.

IMHO, small model is adequate for learning C++ code.  IMHO, flat model
is the way to go for serious OOP projects.  Flat model is essentially
identical to the memory model used by most 32-bit Un*x machines,  which
should help portability.  Put 70K+ sized objects on a 386 stack if you
insist -- the architecture supports good automatic growing of the stack.
But you'll probably have problems with lots of other machines.

Like I said before, IMHO, if you're buying a new PC, get at least a 386sx.
[But then again, if doing C++ development, you probably want the fastest
 machine you can get your hands on :-]

[standard disclaimer off]