[comp.databases] Sybase bulk copy strangeness

cheeks@cobalt.uucp (Mark Costlow) (02/20/91)

Hi.  I'm seeing some strange behaviour with Sybase's bulk copy mechanism.
I'm wondering if anyone in netland has seen something similar or has any
suggestions.

	I have a table which is very large (by my standards anyway).  It
contains two "text" fields.  The number of records is moderate, about 200k,
but the data size of the table is about 2 GB.  There are no indexes on the
table.  I've been using bulk copy (bcp) to load the data.  When I first
started loading data, bcp reported that it was loading about 11 rows per
second.  As more and more data was loaded, that number would decrease --
after 1 GB was loaded the number was about 4 rows per second.  That would
be fine, except that somewhere around 1.3 GB of data, the number
suddenly dropped to about 1.  Now it's falling slowly again, and is under
.8 rows per second.

	In my current situation this sucks, but I don't expect to solve the
problem in time to do any good for now, but I would like to know if there's
something I can do to speed things along next time I have to do this.  Is
there some better way of organizing the table?  Would splitting the two
large text fields into two tables help?

	The disk configuration of my database is that it has 4 SCSI disks,
1 660 MB and 3 1GB (all Fujitsu), spread across 2 controllers, running on a
Sun 4/470 which doesn't run anything else.

If anyone has any insight into this sort of thing, I'd appreciate the
help.  (I'll talk to sybase support when I get time).  Thanks.

cheeks@edsr.eds.com    or     ...uunet!edsr!cheeks

nobody@blia.sharebase.com (Nobody at all) (02/22/91)

You failed to mention if there are any secondary indices on the
table.  Many systems allow creation of secondary indices after
the fact, which makes the initial bulk load run significantly
faster.  There is a large overhead associated with updating
secondary indices during a massive load.  When done as an
adjunct task after the load, the internal sort and/or block
splits for the index are avoided, and the cost of the
bulk load + secondary create and load < bulk load with secondary indices.

Observations from QBE.


J.O.Beard   jeffb@sharebase.com