[comp.databases] Corrupted database in Oracle

eik@os.is (Einar Kjartansson) (03/04/89)

We have been using Oracle on an HP 9000/840 computer for about 3 months.
During this period the database has become corrupted twice,
both times resulting in total loss of data. The Oracle support people
in Denmark have not come up with any useful information about this.
One of the first symptoms is that the exp utility, which is used to
export data from the database (e.g. for backups), stops working.
Here is a sample output:
	 
	Connected to: ORACLE V5.1.22.2 - Production	
 	 
 	Enter array fetch buffer size(default is 4096)> 
 	Export file: expdat.dmp > 
 	E(ntire Database), U(sers), or T(ables): U > E 
 	Export Grants (Y/N): N > Y 
 	Export the rows (Y/N): Y >	
 	Compress extents (Y/N): Y >	
 	Exporting the entire data base. 
 	. Exporting user definitions. 
 	. Exporting all space definitions. 
 	. Exporting all clusters. 
 	Oracle Error: ORA-0038:	ksbgrb: incorrect table RBA in row block 
 	 
 	EXPORT terminated due to error 

 Both times that this happened we ended up rebuilding the database from 
 backups, resulting in the loss of several days of work. 

 This raises some questions: 

 Do things like this happen to others? 
	on hp9000/3?? or hp9000/8?? ?
 Why is there no way to patch up a corrupted database? 
 Is our experience with Oracle support typical?


-- 
  Einar Kjartansson                           | eik@os.is
  Orkustofnun (National Energy Authority)     | eik@geysir.uucp
  Grensasvegi 9, IS-108 Reykjavik, Iceland    | mcvax!hafro!geysir!eik
  Phone: 354-1-83600    Fax: 354-1-688896    Home: 354-1-16407

pavlov@hscfvax.harvard.edu (G.Pavlov) (03/05/89)

In article <115@geysir.os.is>, eik@os.is (Einar Kjartansson) writes:
> 
> We have been using Oracle on an HP 9000/840 computer for about 3 months.
> During this period the database has become corrupted twice,
> both times resulting in total loss of data....
     [extract from dump attempt]
> Connected to: ORACLE V5.1.22.2 - Production	
>  	 
> Exporting the entire data base...... 
>
> Oracle Error: ORA-0038: ksbgrb: incorrect table RBA in row block 
>  	 
> EXPORT terminated due to error 
> 
> Both times that this happened we ended up rebuilding the database from 
> backups, resulting in the loss of several days of work. 
> 
  I fail to understand how anyone can seriously consider purchasing a DBMS
  that insists on "improving" performance by bypassing a given system's file
  management facility.  It's a cheap method for the vendor, but as the poster
  above has discovered, it may be very expensive to the user.

  The problem may be user error, system glitch, or dbms error.  Regardless of
  which, the user is now faced with a big black box (a large extent of disk
  space known only as a huge "file" to the system) which is difficult to
  penetrate with the usual system debugging tools and has to be overwritten
  in toto from backup.

  What did this user gain, and if (s)he gained anything, was it really worth it?

   greg pavlov, fstrf, amherst, ny

jas@ernie.Berkeley.EDU (Jim Shankland) (03/07/89)

In article <735@hscfvax.harvard.edu> pavlov@hscfvax.harvard.edu (G.Pavlov) writes:
>In article <115@geysir.os.is>, eik@os.is (Einar Kjartansson) writes:
[about how Oracle has repeatedly soiled its filesystem on an HP 9000/840,
causing loss of several days' work each time ...]

>  I fail to understand how anyone can seriously consider purchasing a DBMS
>  that insists on "improving" performance by bypassing a given system's file
>  management facility.  It's a cheap method for the vendor, but as the poster
>  above has discovered, it may be very expensive to the user....  The user is
>  now faced with a big black box (a large extent of disk
>  space known only as a huge "file" to the system) which is difficult to
>  penetrate with the usual system debugging tools and has to be overwritten
>  in toto from backup.

There are some persuasive reasons for rolling your own filesystem when
you're implementing a DBMS, especially on UNIX, whose filesystem serves
DBMS implementors poorly.  The problem is not just speed, but reliability.
Ideally, you'd like the UNIX vendors to provide reasonable file system
services; but if they don't, you're stuck with either using the miserable
services they provide, or rolling your own.
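
To make the reliability point concrete: a DBMS cannot treat a successful
write() as a commit, because the data may still be sitting in the kernel's
buffer cache when the machine goes down.  Here is a rough sketch (nobody's
actual product code) of the sort of thing a server has to do at commit
time, assuming fsync(2) is available; the file name and record format are
made up.

/* Sketch only: forcing a commit record to disk before acknowledging
 * the transaction.  Assumes fsync(2) (4.2BSD and later); the log file
 * name and record contents are hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
log_commit(int logfd, const char *rec, size_t len)
{
    /* write() only copies the record into the buffer cache; a crash
     * before the delayed write reaches the disk would lose it. */
    if (write(logfd, rec, len) != (ssize_t)len)
        return -1;

    /* Force the buffered log blocks out before telling the caller
     * that the transaction is durable. */
    if (fsync(logfd) < 0)
        return -1;
    return 0;
}

int
main(void)
{
    int fd = open("commit.log", O_WRONLY | O_CREAT | O_APPEND, 0600);
    const char rec[] = "COMMIT txn 42\n";        /* hypothetical record */

    if (fd < 0) {
        perror("open");
        exit(1);
    }
    if (log_commit(fd, rec, sizeof(rec) - 1) < 0)
        perror("log_commit");
    close(fd);
    return 0;
}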

Of course, if you do roll your own, you need to do the whole job.  That
means being damned sure it's robust and correct, *and* providing
filesystem debugging, repair, and analysis tools.  Doing the job
right is by no means "a cheap method for the vendor."
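
For what it's worth, the analysis tools needn't be fancy.  Even a simple
offline page checker would have let the original poster see how far the
damage went.  A minimal sketch, assuming a made-up format of fixed 2K
pages with a 32-bit additive checksum in the last four bytes; this is not
Oracle's real block layout, just an illustration of the kind of tool a
vendor ought to ship.

/* Hypothetical offline page checker.  The page size and checksum
 * placement are invented for illustration. */
#include <stdio.h>

#define PAGESIZE 2048

static unsigned long
page_sum(const unsigned char *p, unsigned int n)
{
    unsigned long s = 0;
    unsigned int i;

    for (i = 0; i < n; i++)
        s += p[i];
    return s & 0xffffffffUL;
}

int
main(int argc, char **argv)
{
    unsigned char page[PAGESIZE];
    unsigned long pageno = 0, bad = 0;
    FILE *fp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s dbfile\n", argv[0]);
        return 1;
    }
    if ((fp = fopen(argv[1], "rb")) == NULL) {
        perror(argv[1]);
        return 1;
    }
    while (fread(page, 1, PAGESIZE, fp) == PAGESIZE) {
        /* Stored checksum: last four bytes of the page, big-endian. */
        unsigned long stored =
            ((unsigned long)page[PAGESIZE - 4] << 24) |
            ((unsigned long)page[PAGESIZE - 3] << 16) |
            ((unsigned long)page[PAGESIZE - 2] << 8)  |
             (unsigned long)page[PAGESIZE - 1];

        if (page_sum(page, PAGESIZE - 4) != stored) {
            printf("page %lu: bad checksum\n", pageno);
            bad++;
        }
        pageno++;
    }
    fclose(fp);
    printf("%lu pages scanned, %lu bad\n", pageno, bad);
    return bad ? 2 : 0;
}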

Jim Shankland
jas@ernie.berkeley.edu

"There is no help, for all these things are so,
 And all the world is bitter as a tear."

pavlov@hscfvax.harvard.edu (G.Pavlov) (03/11/89)

In article <3242@sybase.sybase.com>, tim@phobos.sybase.com (Tim Wood) writes:
> 
  Re why vendors implement their own file systems for dbms:

> Because most O/S file systems are not optimal for database operation, 
> especially not for on-line transaction processing.
> 
> It's expensive for the user to use a software package that incurs substantial
> overhead with every database access.  

  I don't have any experience with OLTP, so I won't argue on that score.  

  As I am sure you know much better than I do, there are a very large number of
  factors, many unique to an application, that affect the performance of a dbms.
  I have seen "slow" dbms's that utilize their own file management and "fast"
  ones that don't.

  While I won't argue that there is an overhead penalty, I haven't seen anything
  that supports the statement that it is "substantial".  

  I know that Sybase is a good solid product with excellent performance.  But I
  have been subjected to the "we use our own file system so therefore we are 
  better/faster than Brand X who doesn't" hype from second-rate dbms's often
  enough that I seriously doubt that it makes a significant difference.  On the
  contrary, it leads me to suspect that there are serious shortcomings (e.g.,
  second-rate optimizer) elsewhere in such a product.

> The point is not to rely on O/S
> files because the DBMS recovery features are lacking, the point is
> to install a DBMS with complete and robust recovery features.  Then,
> the optimization features do not increase the user's risk.
> 
  greg pavlov, fstrf, amherst, ny

billc@rtech.UUCP (Bill Coffin) (03/13/89)

>From article <531@maxim.ERBE.SE>, by prc@maxim.ERBE.SE (Robert Claeson):
> .... Under System V, Release 3.1 at least, there's a SYNC flag
> for open() and fcntl() that sets synchronous write. 

But that's only sysV.  BSD, I believe, still doesn't have this.
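
For anyone who hasn't run into it, a sketch of what that interface looks
like; O_SYNC here is assumed to be the SVR3.1 flag the quote refers to,
and the exact name and semantics may vary by vendor.

/* Sketch of synchronous writes under System V.  O_SYNC availability
 * and behavior vary; the file name is made up. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    /* Open with synchronous writes from the start ... */
    int fd = open("datafile", O_WRONLY | O_CREAT | O_SYNC, 0600);
    int flags;

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* ... or turn it on later for an already-open descriptor. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags >= 0)
        (void) fcntl(fd, F_SETFL, flags | O_SYNC);

    /* Each write() now returns only after the data has reached the disk. */
    if (write(fd, "page image\n", 11) != 11)
        perror("write");
    close(fd);
    return 0;
}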

Another good reason to use raw disk access:

You can be clever about physically clustering pages.  This can produce 
substantial performance improvements, since you can suck in an index 
or sequentially scan a table much faster than on vanilla UNIX, where 
pages can be scattered around any old place on the disk, causing longer
seek times.  (Britton-Lee got a lot of mileage out of this ability).
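
A crude sketch of what the clustering buys (the device name, page size,
and extent layout are all made up, and real raw I/O may also need
properly aligned buffers): if an index extent is laid out contiguously
on a raw partition, one seek and one large read replace a pile of
scattered block reads.

/* Sketch: reading a contiguous run of pages from a raw partition in a
 * single large request.  Everything here is hypothetical. */
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define PAGESIZE 2048
#define NPAGES   64                 /* one extent's worth of index pages */

int
main(void)
{
    char *buf = malloc((size_t)PAGESIZE * NPAGES);
    int fd = open("/dev/rdsk/dbvol", O_RDONLY);      /* made-up raw device */
    long extent_offset = 128L * PAGESIZE;            /* made-up extent start */

    if (buf == NULL || fd < 0) {
        perror("setup");
        return 1;
    }

    /* Because the DBMS laid the extent out contiguously, one seek and
     * one big read stand in for NPAGES scattered filesystem reads. */
    if (lseek(fd, (off_t)extent_offset, SEEK_SET) == (off_t)-1 ||
        read(fd, buf, (size_t)PAGESIZE * NPAGES) < 0)
        perror("read extent");

    close(fd);
    free(buf);
    return 0;
}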

You can also customize buffering to be more appropriate for your needs.

billc@rtech.uucp ( or, if you must, {sun,pyramid,mtxinu,amdahl}!rtech!billc )
<<standard disclaimer>> <<gratuitous witticism>> <<your ad here>>

prc@maxim.ERBE.SE (Robert Claeson) (03/16/89)

In article <2718@rtech.rtech.com>, billc@rtech.UUCP (Bill Coffin) writes:
> >From article <531@maxim.ERBE.SE>, by prc@maxim.ERBE.SE (Robert Claeson):
> > .... Under System V, Release 3.1 at least, there's a SYNC flag
> > for open() and fcntl() that sets synchronous write. 

> But that's only sysV.  BSD, I believe, still doesn't have this.

Most large databases run under AT&Tix, I think, even though the Berklix
fast file system is better suited for this kind of application (were it
not for the lack of synchronous writes).

> You can be clever about physically clustering pages.  This can produce 
> substantial performance improvements, since you can suck in an index 
> or sequentially scan a table much faster than on vanilla UNIX, where 
> pages can be scattered around any old place on the disk, causing longer
> seek times.  (Britton-Lee got a lot of mileage out of this ability).

I believe that the Berklix file system tries to keep the disk blocks
somewhat in sequence and the free list sorted. As always, I may be wrong.

Let's turn the page and see what AT&Tix 4.0 with the Berklix file system
and the promised transaction processing enhancements can do for databases
using the file system.
-- 
Robert Claeson, ERBE DATA AB, P.O. Box 77, S-175 22 Jarfalla, Sweden
Tel: +46 (0)758-202 50  Fax: +46 (0)758-197 20
EUnet:   rclaeson@ERBE.SE               uucp:   {uunet,enea}!erbe.se!rclaeson
ARPAnet: rclaeson%ERBE.SE@uunet.UU.NET  BITNET: rclaeson@ERBE.SE