[comp.lang.modula2] Sets.

MSRS002@ECNCDC.BITNET.UUCP (10/01/87)

How big should a set be?

Wirth created the idea of a bitset because it would be easy to implement
a very efficient way to do sets the size of the computer's word.  This
means sets are likely to be 8, 16 or 32 bits, possibly other sizes,
depending on the implementation.

Bruce MacLennan in his book, Principles of Programming Languages presents
an idea which I quite agree with.  "The only reasonable numbers are zero,
one, and infinity."

Since zero and one are not very usefull numbers when you're dealing with
sets ( they're rather special case sets ), I would propose a that the limit
be infinity.  The actual limit would be dependant on the available memory
for storing the set, but it could be a very large number.  Since the type
of each set is known, the compiler could determine how much memory is
required to store the set.  Now, we have the ability to say something
like SET OF CARDINAL, since most implementations have cardinal values of
0 .. 65535.  If you were to store one member per bit, the entire set
could be represented in 8k {of memory, assuming 8 bit bytes.  This is a
reasonable size block of memory to consider even on older 8 bit machines.

                                                     Tom Ruby
                                                  MSRS002@ECNCDC

sjl@myrias.UUCP (Stuart Lomas) (10/03/87)

> How big should a set be?
> 
> Wirth created the idea of a bitset because it would be easy to implement
> a very efficient way to do sets the size of the computer's word.  This
> means sets are likely to be 8, 16 or 32 bits, possibly other sizes,
> depending on the implementation.
> 
> Bruce MacLennan in his book, Principles of Programming Languages presents
> an idea which I quite agree with.  "The only reasonable numbers are zero,
> one, and infinity."
> 
> Since zero and one are not very usefull numbers when you're dealing with
> sets ( they're rather special case sets ), I would propose a that the limit
> be infinity.  The actual limit would be dependant on the available memory
> for storing the set, but it could be a very large number.  Since the type
> of each set is known, the compiler could determine how much memory is
> required to store the set.  Now, we have the ability to say something
> like SET OF CARDINAL, since most implementations have cardinal values of
> 0 .. 65535.  If you were to store one member per bit, the entire set
> could be represented in 8k {of memory, assuming 8 bit bytes.  This is a
> reasonable size block of memory to consider even on older 8 bit machines.
> 
>                                                      Tom Ruby
>                                                   MSRS002@ECNCDC

Hear hear! The only reason for limiting set sizes is to reduce (by a tiny
amount) the work required to implement the compiler. Long bit vectors are
not difficult to implement, and can be very useful. The presence of long
bit vectors in no way reduces the efficiency of short bit vectors, so it
is fallacious to claim that set sizes must be limited for efficiency
reasons. The element type of a set should be able to be ANY SCALAR TYPE
including LONGINT and LONGCARD (and, of course, CHAR).

That being said, I must give some credit to the opposition. I think the
proposal here is that these long sets be implemented as bit vectors. It
certainly is a major problem for the compiler implementer if they are to
be anything else. Long bit vectors are NOT an efficient way to represent
large sparse sets, but they are a nice way to handle large dense sets like
bitmaps. Any user who has an application for large sparse sets should
choose some other representation (such as binary trees) and write a
module to handle them.

Now for the crux of the matter. I believe that it should be explicit in
any eventual Modula-2 standard that "sets" are indeed "bit vectors". It
is simply not possible to choose an implementation for large sets that
will handle every application well, so the language-supported implementation
(if there is to be one at all) should be the simplest available.

Stuart Lomas
Myrias Research Corporation
Edmonton, Alberta, Canada
{ihnp4,ubc-vision,rutgers}!alberta!myrias!sjl

heiser@ethz.UUCP (Gernot Heiser) (10/14/87)

In article <INFO-M2%87093023532046@UCF1VM> Info-Modula2 Distribution List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes:
>Wirth created the idea of a bitset because it would be easy to implement
>a very efficient way to do sets the size of the computer's word.  This
>means sets are likely to be 8, 16 or 32 bits, possibly other sizes,
>depending on the implementation.

I rather believe that this was done to support bit-fiddeling in Modula-2, since
BITSETs aren't really good for anything else. None of Wirth's compilers support
sets bigger than BITSETs. In this sense, a Modula-2 set is not an abstract data
type but an image of a machine word. The question is  whether they ought  to be
called "sets".
-- 
Gernot Heiser                      Phone:       +41 1/256 23 48
Institute for Integrated Systems   CSNET/ARPA:  heiser%ifi.ethz.ch@relay.cs.net
ETH Zuerich                        EARN/BITNET: GRIDFILE@CZHETH5A
CH-8092 Zuerich, Switzerland       EUNET/UUCP:  {uunet,...}!mcvax!ethz!heiser

MSRS002@ECNCDC.BITNET ("THE DOCTOR.") (10/27/87)

Hmmm, perhaps the BitSet should be defined to be a word size, and uses
exclusively for bit twiddeling and implementation specific things.  Then,
I think we would rather need a more abstract set construct.  I still think
the compiler should allocate as many words as necessary to hold the range
which should be expected in the set ( boy, that made a lot of sense ), and
the limit should be infinity, or at least a rather large approximation of
infinity.  The maximum cardinal value supported by the implementation might
be a reasonable approximation to infinity.  If a cardinal is 16 bits, then
such a set, SET OF CARDINAL would only take up 8k bytes, which isn't too
extreme, not even for my old 8 bit system.

I hope the forthcomming standard says something reasonable about sets.

                                                    Tom Ruby
                                                 MSRS002@ECNCDC

COK2@mts.durham.ac.UK (Barry_Cornelius) (10/27/87)

Tom Ruby (alias "THE DOCTOR." <MSRS002@ECNCDC.EARN) says:
> I hope the forthcomming standard says something reasonable about sets.
I would like to state the situation re sets as far as IST/5/13 (the
Working Group of the British Standards Institution) is concerned:
   1.   BITSET will be associated with a word.
   2.   The actual representation of other sets will be left to
        the implementation.   I guess the representation of a set-type
        may vary from one set-type to another, depending on the number
        of values in the set-type.
   3.   It will be mandatory to provide SET OF CHAR.
   4.   Even if an implementation provides set-types with up to 128
        elements, it should still be able to compile code involving:
           TYPE yearset = SET OF [1900.2000];
        [This is different from Pascal.]
The above are my interpretations of IST/5/13's current position.
I may be wrong.
==
Barry Cornelius
==
Address:
   Computer Science Group, School of Engineering and Applied Science,
   University of Durham, Durham, DH1 3LE, England
Telephone:
   My office: Durham (091 or +44 91) 374 2638
   Secretary: Durham (091 or +44 91) 374 2630
Electronic Mail Addresses:
   JANET:       Barry_Cornelius@uk.ac.dur.mts
   ARPANET:     Barry_Cornelius%mts.dur.ac.uk@cs.ucl.ac.uk
                mcvax!mts.dur.ac.uk!Barry_Cornelius@seismo.css.gov
                Barry_Cornelius%mts.dur.ac.uk@seismo.css.gov
   UUCP:        ...mcvax!ukc!mts.dur.ac.uk!Barry_Cornelius
   BITNET/EARN: Barry_Cornelius%DUR.MTS@AC.UK

hal@pur-phy (Hal Chambers) (10/27/87)

In article <INFO-M2%87102620254979@UCF1VM> Info-Modula2 Distribution List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes:
>Hmmm, perhaps the BitSet should be defined to be a word size, and uses
>exclusively for bit twiddeling and implementation specific things.  Then,
>I think we would rather need a more abstract set construct.
...
>                                                    Tom Ruby

This is what I have been saying ever since the day I first encountered
Modula-2.  Since BITSET is explicitly machine dependent it should be
part of the pseudo-module SYSTEM and NOT the default type for sets.

There seems to be almost universal agreement on the desirability of an
abstract data type SET.  The majority of these also seem to agree that
there should be no arbitrary restriction on the size other than MAXCARD.

Hal Chambers

schaub@sugar.UUCP (10/31/87)

In article <921@pur-phy>, hal@pur-phy (Hal Chambers) writes:
> This is what I have been saying ever since the day I first encountered
> Modula-2.  Since BITSET is explicitly machine dependent it should be
> part of the pseudo-module SYSTEM and NOT the default type for sets.
> Hal Chambers

That's exactly why we decided to put it into SYSTEM in M2Amiga.
-- 
        //
       //  Markus Schaub
   \\ //   uunet!nuchat!sugar!schaub
    \X/    (713) 523 8422