[comp.lang.modula3] Typefinger printing in SRC M3

wyant@saber.com (02/04/91)

Can someone give a $.50 overview of the type fingerprinting
algorithms used in SRC M3 ? My surface impression from scanning
the sources is that works as follows. Each type (scalar or constructed)
has an operation to return a textual type name. When fingerprinting a
constructed type (esp. an object/record) a string is constructed containing
the recursive enumeration of all scalar type names used in that constructed
type. This string is then treated as input to the "poly" function to compute
a 64-bit fingerprint.

Is this reasonably accurate or am I way out in left field >>

Thanks in advance :-)

Geoff Wyant
wyant@saber.com

hudson@yough.ucc.umass.edu (Rick Hudson) (02/11/91)

  My surface impression from scanning the sources is that works as
  follows. Each type (scalar or constructed) has an operation to return
  a textual type name. When fingerprinting a constructed type (esp. an
  object/record) a string is constructed containing the recursive
  enumeration of all scalar type names used in that constructed type.
  This string is then treated as input to the "poly" function to compute
  a 64-bit fingerprint.  
  Is this reasonably accurate or am I way out in left field >>


This was my impression of how SRC M3 works also. It is a reasonable
probabilistic approach to the problem. A non-probabilistic approach
would be to use the fingerprint only for determining if types were
*not* equivalent and actually compare structures before deciding
if they were equivalent.
- Rick

muller@src.dec.com (Eric Muller) (02/15/91)

In article <9102111538.AA24105@yough.ucc.umass.edu>, hudson@yough.ucc.umass.edu (Rick Hudson) writes:
>   My surface impression from scanning the sources is that works as
>   follows. Each type (scalar or constructed) has an operation to return
>   a textual type name. When fingerprinting a constructed type (esp. an
>   object/record) a string is constructed containing the recursive
>   enumeration of all scalar type names used in that constructed type.
>   This string is then treated as input to the "poly" function to compute
>   a 64-bit fingerprint.  
>   Is this reasonably accurate or am I way out in left field >>

This is accurate. In addition, you have to consider the case of opaque
types. The final phase of linking (practically, when the program
starts, but conceptually at link time) knows everything about the types and
computes a "full" fingerprint, based on the fingerprints of the
partial revelations.

For example, 

INTERFACE I;
TYPE T <: ROOT;
END I.

MODULE M;
TYPE U = T OBJECT a: INTEGER; END;
END M.

When compiling M, we generate a partial fingerprint for "OBJECT a:
INTEGER". A link time, this partial fingerprint is combined with the
fingerprint of type to the right of the "REVEAL T =" that must be
somewhere in the program.

As for the probability of collisions, it is extremely low. You can ask
Andrei Broder (broder@src.dec.com) if you want to know the fine points.

-- 
Eric.