[comp.lang.pascal] Larger arrays.

amull@Morgan.COM (Andrew P. Mullhaupt) (09/01/89)

Let me clarify my previous posting complaining about the implementation choices
of the Pascal compilers of the world.

It's not the hardware. Microsoft FORTRAN has a completely seamless implementation
of what most people would call 'huge' arrays. It runs just fine (no tremendous
performance penalty for many things) on little old (regular) PC machines. If OS/2
and a hard drive are present, you can allocate an array as big as your disk
(or the free space on it).

Those of you who have written databases, for example, do not need to be told how
convenient this can prove. You are also not likely to miss the significant
advantages of Pascal (language by design, not language by accretion) over FORTRAN
for many applications.

It's that the implementation of Pascal has been opportunistic, believing (as
was once true) that a 64K limitation on array size was a reasonable hostage to
give for the added efficiency of 16-bit processing. It was once even reasonable
to limit program size to 64K, and not to provide 32-bit integers. These vestigial limits
have seen their day, and I have been sitting on the sidelines, eagerly awaiting
the new versions of Pascal, and I have often been saddened to discover that what
the packaging says ("no 64K limit!") applies only to total code size, or who knows
what. It's time this bargain with the devil was done away with.
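
For instance, a declaration like this one, which needs 128K, is simply rejected
by today's 16-bit Pascals; Turbo Pascal stops with a "Structure too large" error:

  var
    Big: array[0..32767] of LongInt;   { 4 * 32768 = 128K bytes }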

Pascal has the syntax for model-independent handling of different-size data
objects. The often-disused standard procedures "pack" and "unpack" are available
for storing and implicitly accessing bytes or 16-bit words. The overhead they
cause need not be a great penalty - because in loops, for example, there will
be half, or one quarter as many termination checks. There is a Pascal sensible
way to bypass the segmentation limits of the operating system and hardware.

(I say operating system, because some people actually like the way segmentation
can provide protection of code and data. I don't really buy this, since Pascal
has range checking in the first place (unless you turn it off, like everyone does).
As long as segments are seen to protect memory, and not just to pass a hardware
compromise on through to the user, I don't really have a problem with them. There
are people who want OS/2 to have segments, and I can live with that. Just as long
as I can have one as big as I want, and I want that as big as, or bigger than, all
available memory.)
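
To make the pack/unpack point concrete, here is a sketch in ISO Standard Pascal.
(A caveat: most PC compilers, Turbo Pascal included, leave pack and unpack out
entirely, which is a good part of why they are disused; the packed layout itself
is implementation-defined.)

  program PackDemo(output);
  type
    Loose = array[1..512] of 0..255;         { one storage unit per element }
    Tight = packed array[1..512] of 0..255;  { typically one byte each }
  var
    A: Loose;
    Z: Tight;
    I: Integer;
  begin
    for I := 1 to 512 do
      A[I] := I mod 256;
    pack(A, 1, Z);    { Z := A[1..512], stored packed }
    unpack(Z, A, 1)   { A[1..512] := Z, expanded again }
  end.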

To summarize:

There is no reason why, in a climate where BOTH Borland and Microsoft can
implement "Objects" (a major change in the language, and one that most Pascal
programmers will need to LEARN before they can use it correctly and effectively
in a readable Pascal style), either or both of these worthy houses cannot set
aside the limitations of the past, whose onus grows with each passing day. Pascal
programmers, by and large, know exactly what to do with static arrays, and when
to use other
data structures. The first step in algorithm design is the choice of an appropriate
data structure. Making a decent implementation of big arrays is not a problem
beyond solution (Microsoft has proved this with FORTRAN). Let those of us who
have suffered this mounting disappointment make some noise so Microsoft and Borland
will realize that we care. 

P. S. I really think that we don't have these in Pascal for two reasons:
1.     They don't think enough of us care to make it worthwhile, that is:
   They are afraid that the first one to put in sensible arrays will lose out on
   the specsmanship (benchmarketing) issue: An unrealistic benchmark, like finding
   the first thousand primes a hundred times (a sieve of the sort sketched after
   this list), might very well point out the cost of a correct implementation of
   arrays. In the Pascal market, it is a judgement
   call which side (much bigger/slightly faster) of the issue the market share
   falls in. 

2.     They are worried that some people will be unhappy with them if existing
   source code (which may have already been carefully tweaked to go fast) slows
   down when recompiled under a new version, or worse yet doesn't work due to
   overactive bit-picking efficiency tricks programmers may have unwisely used.
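
The sieve I have in mind for point 1 is the classic benchmark, give or take the
exact count of primes (the textbook version below finds 1899 per pass). Its flag
array fits easily inside 64K, so a compiler that paid for big-array address
arithmetic everywhere would look worse on it while gaining nothing:

  program SieveBench;
  const
    Size = 8190;
    Reps = 100;
  var
    Flags: array[0..Size] of Boolean;
    I, Prime, K, Count, Rep: Integer;
  begin
    for Rep := 1 to Reps do
    begin
      Count := 0;
      for I := 0 to Size do
        Flags[I] := True;
      for I := 0 to Size do
        if Flags[I] then
        begin
          Prime := I + I + 3;      { flag I stands for the odd number 2*I + 3 }
          K := I + Prime;
          while K <= Size do
          begin
            Flags[K] := False;     { strike out the multiples of Prime }
            K := K + Prime
          end;
          Count := Count + 1
        end
    end;
    WriteLn(Count, ' primes found on each of ', Reps, ' passes')
  end.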


Finally: I would settle for a compiler directive that switched between the little
and big styles, and the restriction that an entire program (all its units, etc.)
be compiled with one setting consistently throughout. This presents problems for
objects with virtual methods, I guess, but after all, objects are not Pascal.
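
Something like this is all I'm asking for; the directive name here is invented,
purely for illustration:

  {$HUGEARRAYS ON}    { invented directive: the 'big' style in force }
  var
    Table: array[0..131071] of Byte;   { 128K; a compile error in 'little' style }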

And while I'm at it: There is an interesting discrepancy in how huge pointers are
normalized (more or less often) between Microsoft C and Turbo C. This is the
kind of issue that C deserves, and Pascal should avoid like the plague. Let's
keep Pascal beautiful.
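
For anyone who hasn't met the issue: a huge pointer is "normalized" when its
offset is reduced to 0..15, with the rest of the address carried in the segment;
the two compilers differ in how often they redo this after pointer arithmetic.
Purely as illustration, in Pascal terms the operation is:

  procedure Normalize(var Seg, Ofs: Word);
  var
    Linear: LongInt;
  begin
    Linear := LongInt(Seg) * 16 + Ofs;  { the 20-bit physical address }
    Seg := Word(Linear div 16);         { everything above the low four bits }
    Ofs := Word(Linear mod 16)          { offset ends up in 0..15 }
  end;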

Andrew Mullhaupt

filbo@gorn.santa-cruz.ca.us (Bela Lubkin) (09/02/89)

In article <362@e-street.Morgan.COM> Andrew P. Mullhaupt writes:
[Long, reasonable argument for "huge" arrays in PC Pascals deleted]
>1.     They don't think enough of us care to make it worthwhile, that is:
>   They are afraid that the first one to put in sensible arrays will lose
>   out on the specsmanship (benchmarketing) issue: An unrealistic benchmark,
>   like finding the first thousand primes a hundred times, might very well
>   point out the cost of correct implementation of arrays. In the Pascal
>   market, it is a judgement call which side (much bigger/slightly faster)
>   of the issue the market share falls in. 
>
>2.     They are worried that some people will be unhappy with them if
>   existing source code (which may have already been carefully tweaked to
>   go fast) slows down when recompiled under a new version, or worse yet
>   doesn't work due to overactive bit-picking efficiency tricks
>   programmers may have unwisely used.
>
>Finally: I would settle for a compiler directive that switched between
>the little and big styles, and the restriction that an entire program
>(all its units, etc.) be compiled with one setting consistently throughout.

I don't know about Microsoft, but I'm fairly familiar with the "philosophy"
behind Turbo Pascal.  (I quote "philosophy" because I don't think there >is<
a formally thought-out philosophy behind it.)  First, I'd be fairly surprised
if Borland came out with huge-array support, except on '386 or other
processors with large linear address space.  But if they did, I'd be quite
astonished if it worked the way you suggest.  I would expect the compiler to
automatically determine, at compile time, whether an array declaration was
less or greater than 64K, and generate the appropriate code for references to
>that array< based on its size.  Since Turbo doesn't have any sort of dynamic
arrays (ignoring trickery with the heap, which as always would be the
programmer's responsibility), there's no question at compile time about the size
of any particular variable.  Turbo already knows how to deal with different
types of variables under the same syntax (witness:

  Var
    F: File Of Integer;
    T: Text;
    I: Integer;
  ...
    Write(F,I);   { typed file: writes I in its binary form }
    Write(T,I);   { text file: converts I to decimal characters }
  ...

Entirely different code is generated).
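
I'd expect the hypothetical huge-array case to read the same way (these
declarations are invented for illustration):

  Var
    Small: Array[1..100] Of Integer;    { fits in one segment }
    Huge: Array[1..100000] Of Byte;     { would span segments }
  ...
    Small[J] := 0;  { ordinary 16-bit indexed addressing }
    Huge[K] := 0;   { segment-spanning address arithmetic }
  ...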

The only disadvantage to this approach is that support routines for
calculating both kinds of array offset would be pulled into a program that
used both types of array.  But this code would be very small, and would be incurred
>only< when both were used, so I see no problem.  >Bela<


Bela Lubkin     * *   filbo@gorn.santa-cruz.ca.us   CIS: 73047,1112
     @        * *     ...ucbvax!ucscc!gorn!filbo    ^^^  REALLY slow [months]
R Pentomino     *     Filbo @ Pyrzqxgl (408) 476-4633 & XBBS (408) 476-4945