[net.database] speed of ORACLE, large INFORMIX DBMS's

bradbury@oracle.UUCP (Robert Bradbury) (07/21/86)

In article <235@hdsvx1.UUCP>, hoffman@hdsvx1.UUCP (Richard Hoffman) writes:
>  
> > Sorry Greg, but I don't think this is incorrect.
> > To my knowledge our advertising only claims to compress the indexes.  We do
> > not compress the data itself (nor do any of the other RDBMS to my knowledge)
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Ingres does offer primitive data compression for text strings (they suppress
> trailing blanks and nulls using run-length compaction) in the base tables
> themselves, not just the indexes.

We have a misunderstanding of terms here.  By "compression" I meant squeezing
the data into a smaller format (like the compress or pack utilities on UNIX).
I consider Oracle's treatment of indexes using forward and backward compression
to be actual data compression as it has to be "un-compressed" to be useful.
(The actual un-compression isn't that expensive and you can create uncompressed
indexes if space is cheap and time is expensive).
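
A minimal sketch of the forward (leading-prefix) half of that idea, in
Python.  This is not Oracle's on-disk format, which the post does not
describe; the function names and the (shared length, suffix) layout are
assumptions made for illustration.  Each key in a sorted run stores only the
part that differs from the key before it, which is why an entry has to be
"un-compressed" before it can be compared.

```python
def front_encode(sorted_keys):
    """Store each key as (length of prefix shared with previous key, suffix)."""
    encoded = []
    previous = ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(previous), len(key))
        while shared < limit and previous[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def front_decode(encoded):
    """Rebuild the keys; each entry depends on the one decoded before it."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


keys = ["SMITH", "SMITHERS", "SMITHSON", "SMYTHE"]
packed = front_encode(keys)
# packed == [(0, "SMITH"), (5, "ERS"), (5, "SON"), (2, "YTHE")]
```

Decoding is a single pass over the block, which fits the remark that the
un-compression itself is not that expensive.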

The idea of storing trailing blanks (or padding fields to their maximum width)
makes so little sense to me that I'm surprised that anyone would do it.
Oracle certainly doesn't.  We do allow you to bind data to a data-type which 
will not strip trailing blanks on insertion but I doubt it is commonly used.

As for using run-length compaction (a la Ingres), do sequences of 3 or more
identical characters occur often enough to justify the expense of scanning?
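
To make the cost question concrete, here is a small run-length sketch in
Python.  It is not the Ingres scheme, whose details are not given in this
thread; the escape marker, the three-character run encoding, and the
function names are all assumptions.  Every character must be scanned, and in
this particular layout a run of exactly 3 only breaks even, so only longer
runs save space.

```python
ESCAPE = "\x00"  # hypothetical marker, assumed never to occur in the data


def rle_encode(text, min_run=3):
    """Replace runs of min_run or more identical characters with
    ESCAPE + chr(run length) + character; copy shorter runs unchanged."""
    out = []
    i = 0
    while i < len(text):
        j = i
        # Extend the run, capped at 255 so the length fits in one character.
        while j < len(text) and text[j] == text[i] and j - i < 255:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(ESCAPE + chr(run) + text[i])
        else:
            out.append(text[i] * run)
        i = j
    return "".join(out)


def rle_decode(packed):
    """Invert rle_encode."""
    out = []
    i = 0
    while i < len(packed):
        if packed[i] == ESCAPE:
            out.append(packed[i + 2] * ord(packed[i + 1]))
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return "".join(out)


field = "ABC" + " " * 10 + "XX"   # 15 characters
packed_field = rle_encode(field)  # 8 characters: "ABC", one encoded run, "XX"
```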

What I meant by the (nor do any ...) comment was that I suspected no one
actually encoded the data (7-bit ascii massaged to fill 8 bit bytes for
example) owing to the problems of scanning and ordering the data.
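
The 7-bit example can be sketched as follows (Python; the function names are
hypothetical and the bit layout, most significant bit first, is an
assumption).  Eight characters fit in seven bytes, but characters no longer
start on byte boundaries, so a byte-wise scan for a substring generally
fails.

```python
def pack7(text):
    """Pack 7-bit ASCII codes back to back, padding the last byte with zeros."""
    bits = 0
    nbits = 0
    out = bytearray()
    for code in text.encode("ascii"):
        bits = (bits << 7) | code
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
        bits &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(data, count):
    """Recover count characters; the count must be stored separately because
    the zero padding is otherwise ambiguous."""
    bits = 0
    nbits = 0
    chars = []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((bits >> nbits) & 0x7F))
        bits &= (1 << nbits) - 1
    return "".join(chars)


packed_word = pack7("DATABASE")                   # 8 characters in 7 bytes
found = pack7("BASE") in packed_word              # False: "BASE" starts at bit 28
```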

Other "space saving" (data compression?) features found in Oracle include:
 - Null fields require no storage at all.
 - Numbers are variable length (0 requires 1 byte, 1-99 requires 2 bytes,
   100-9999 3 bytes, 10000-999999 4 bytes, etc.).  It depends on your
   application whether this saves you a little or costs you a little.
   It is done primarily to allow our numeric range to be independent of
   (and exceed) the capacity of a machine's floating-point unit and to allow
   transparent exporting and importing of data across machines with
   different architectures.
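
One layout that reproduces the sizes quoted above is a length byte followed
by one byte per base-100 digit.  The Python sketch below is only that: an
assumed layout for non-negative integers, not the actual Oracle number
format, which the post does not spell out.  The function names are
hypothetical.

```python
def encode_number(n):
    """Encode a non-negative integer as a count byte plus base-100 digits,
    most significant digit first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while n > 0:
        n, digit = divmod(n, 100)
        digits.append(digit)
    digits.reverse()
    return bytes([len(digits)] + digits)


def decode_number(data):
    """Invert encode_number."""
    count = data[0]
    value = 0
    for digit in data[1:1 + count]:
        value = value * 100 + digit
    return value


sizes = {n: len(encode_number(n)) for n in (0, 1, 99, 100, 9999, 10000, 999999)}
# sizes == {0: 1, 1: 2, 99: 2, 100: 3, 9999: 3, 10000: 4, 999999: 4}
```

Because the encoding is defined byte by byte, it does not depend on the
host's word size, byte order, or floating-point format, and in this layout
comparing two encodings byte-wise gives the same order as comparing the
numbers.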


-- 
Robert Bradbury
Oracle Corporation
(206) 364-1442                            {ihnp4!muuxl,hplabs}!oracle!bradbury

ark@ut-sally.UUCP (Arthur M. Keller) (07/31/86)

In article <456@oracle.UUCP> bradbury@oracle.UUCP (Robert Bradbury) writes:
>What I meant by the (nor do any ...) comment was that I suspected no one
>actually encoded the data (7-bit ascii massaged to fill 8 bit bytes for
>example) owing to the problems of scanning and ordering the data.

The problems of indexing, etc., that apply to data compression (such
as Huffman coding) also apply to encryption.  If you are going to
encrypt the data anyway, using data compression techniques before
encryption incurs little cost over encryption alone, saves space, and
may make unwanted decryption by others more difficult.
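
A rough sketch of the compress-then-encrypt ordering, in Python.  The
cipher here is a toy XOR keystream built from SHA-256 purely so the example
is self-contained; it is not secure and is not something any product in
this thread is known to use.  The function names, the key, and the sample
record are assumptions.

```python
import hashlib
import zlib


def toy_keystream_xor(data, key):
    """XOR data with a SHA-256 counter keystream.  Illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def store(plaintext, key):
    """Compress first, then encrypt the compressed bytes."""
    return toy_keystream_xor(zlib.compress(plaintext), key)


def load(stored, key):
    """Decrypt, then decompress."""
    return zlib.decompress(toy_keystream_xor(stored, key))


record = b"SMITH     CLERK     " * 50   # 1000 bytes of padded, repetitive data
key = b"hypothetical key"
stored = store(record, key)
```

The order matters: encrypted output looks random and compresses poorly, so
the space saving is available only if compression runs first.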

Arthur

-- 
------------------------------------------------------------------------------
Arpanet: ARK@SALLY.UTEXAS.EDU
UUCP:    {gatech,harvard,ihnp4,pyramid,seismo}!ut-sally!ark