glenn@stsim.UUCP (Glenn Ford) (08/17/89)
I am writing a small project (not related to work; it's personal) that has to
do searches on a sorted database. I search on a keyword, which is a string
of N length and is unique among all keys. Right now I do a binary
search on the database, but this tends to be slow when run off a floppy
drive (yes, it has to run on a floppy drive in SOME cases). My question
is: is there a faster, yet fairly easy to implement, routine that can
search my database? Faster than binary search? I don't want B-Trees. My
database is already built.

Glenn Ford
..uunet!ocsmd!stsim!glenn
chris@mimsy.UUCP (Chris Torek) (08/18/89)
In article <184@stsim.UUCP> glenn@stsim.UUCP (Glenn Ford) writes:
>... searches on a sorted database. ... Is there a faster, yet fairly easy
>to implement, routine that can search my database? Faster than binary
>search? I don't want B-Trees. My database is already built.

An interpolative search will run in O(log log n) time rather than O(log n)
in the `average' case. Coding these is not too terribly tricky (one assumes
that if a[0] is 0 and a[10] is 100, then a[1] is 10, a[2] is 20, etc., until
it is otherwise demonstrated). But what you really want to do is minimise
read operations (since you are reading off slow media), which suggests that
the algorithm be tweaked slightly: read big blocks (if they are contiguous
media-wise) and look at the edges of each block before switching blocks.
You will do better, though, if you rebuild the database.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu   Path: uunet!mimsy!chris
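A minimal sketch of such an interpolative search, in C, assuming the string
keys have already been reduced to numbers (say, by packing their leading
bytes into an unsigned long) and that the key array fits in memory:

/*
 * Interpolation search over a sorted array of numeric keys.
 * Returns the index of `key' in a[0..n-1], or -1 if it is absent.
 */
long
interp_search(const unsigned long *a, long n, unsigned long key)
{
    long lo = 0, hi = n - 1;

    while (lo <= hi && key >= a[lo] && key <= a[hi]) {
        long mid;

        if (a[hi] == a[lo])     /* guard against divide by zero */
            mid = lo;
        else                    /* guess the position by linear interpolation */
            mid = lo + (long)((double)(key - a[lo]) /
                (double)(a[hi] - a[lo]) * (hi - lo));

        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}

If the keys stay on the floppy, the same idea works with seeks instead of
array indexing, but each probe then costs a read, which is why Chris
suggests reading big blocks and checking their edges before moving on.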
wjr@ftp.COM (Bill Rust) (08/18/89)
In article <184@stsim.UUCP> glenn@stsim.UUCP (Glenn Ford) writes:
>I am writing a small project (not related to work; it's personal) that has to
>do searches on a sorted database. I search on a keyword, which is a string
>of N length and is unique among all keys. Right now I do a binary
>search on the database, but this tends to be slow when run off a floppy
>drive (yes, it has to run on a floppy drive in SOME cases). My question
>is: is there a faster, yet fairly easy to implement, routine that can
>search my database? Faster than binary search? I don't want B-Trees. My
>database is already built.

First, I am not sure why you don't want B-Trees (or some variant), since
they are pretty easy to implement and quite fast. You don't mention how big
the database is or what percentage of each record the key takes up. If the
key is a small part of the record, making a separate key file that gets read
into your application in one piece will really speed up access to the
database, since the keys are then searched at RAM speed, not disk speed.
Also, if the database is a fixed size, a hashing function (directly
converting the key to a record position) may be most appropriate. If you
want a really good discussion of all this, consult Knuth. Just remember that
you want to minimize the number of read operations to disk.

One possible longshot, since you are working with floppies, is to just read
in the whole database. On a PC, you can read in upwards of 500K of data with
a small program, and, at that point, your searching technique matters a lot
less.

Bill Rust (wjr@ftp.com)
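A rough sketch of the separate-key-file idea in C. The layout is purely an
assumption for illustration: the key file is an array of fixed-size entries,
each holding a KEYLEN-byte NUL-padded key and the matching record's byte
offset in the data file, sorted by key.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define KEYLEN 16                     /* assumed fixed key length */

struct keyent {
    char key[KEYLEN];                 /* NUL-padded key */
    long offset;                      /* byte offset of the record in the data file */
};

/* bsearch() comparison: first argument is the search key, second an entry. */
static int
cmpkey(const void *k, const void *e)
{
    return strncmp((const char *)k, ((const struct keyent *)e)->key, KEYLEN);
}

/* Read the whole key file into memory in one gulp; sets *np to the entry count. */
struct keyent *
load_keys(const char *fname, long *np)
{
    FILE *fp = fopen(fname, "rb");
    struct keyent *tab;
    long n;

    if (fp == NULL)
        return NULL;
    fseek(fp, 0L, SEEK_END);
    n = ftell(fp) / (long)sizeof(struct keyent);
    rewind(fp);
    tab = malloc((size_t)n * sizeof(struct keyent));
    if (tab != NULL && fread(tab, sizeof(struct keyent), (size_t)n, fp) != (size_t)n) {
        free(tab);
        tab = NULL;
    }
    fclose(fp);
    *np = n;
    return tab;
}

/* Look a key up in the in-memory table; returns its record offset or -1L. */
long
find_offset(const struct keyent *tab, long n, const char *key)
{
    const struct keyent *hit =
        bsearch(key, tab, (size_t)n, sizeof(struct keyent), cmpkey);

    return hit != NULL ? hit->offset : -1L;
}

The data file itself is then touched only once per lookup: fseek() to the
returned offset and read one record.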
cliffs@sunrock.east.sun.com (Clifford C. Skolnick) (08/19/89)
In article <184@stsim.UUCP> glenn@stsim.UUCP (Glenn Ford) writes:
>I am writing a small project (not related to work; it's personal) that has to
>do searches on a sorted database. I search on a keyword, which is a string
>of N length and is unique among all keys. Right now I do a binary
>search on the database, but this tends to be slow when run off a floppy
>drive (yes, it has to run on a floppy drive in SOME cases). My question
>is: is there a faster, yet fairly easy to implement, routine that can
>search my database? Faster than binary search? I don't want B-Trees. My
>database is already built.
>
>Glenn Ford
>..uunet!ocsmd!stsim!glenn
Perhaps you could make a better guess about where to start your binary
search. Say, look at the first character of the key and make a guess about
where to start in the file. You can also branch on much less, less, equal,
greater, and much greater. This will be a bit faster than a plain binary
search. You say you are using a binary search now, so not much more logic
is needed.
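A rough sketch of that first-character guess, assuming keys start with a
lowercase letter and are spread fairly evenly over the alphabet; if the key
turns out to lie outside the guessed band, the caller falls back to the full
range:

/*
 * Pick a starting window [*lo, *hi] for the binary search from the
 * first character of the key, out of nrecords sorted records.
 */
void
first_char_window(char c, long nrecords, long *lo, long *hi)
{
    int slot = c - 'a';                       /* 0..25 for 'a'..'z' */

    if (slot < 0)
        slot = 0;
    if (slot > 25)
        slot = 25;

    *lo = nrecords * (long)slot / 26;         /* start of the guessed band */
    *hi = nrecords * (long)(slot + 1) / 26 - 1;   /* end of the guessed band */
}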
Cliff
glenn@ocsmd.ocs.com (glenn ford) (08/23/89)
In article <706@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes:
>with floppies, is to just read in the whole database. On a PC, you can
>read in upwards of 500K of data with a small program, and, at that
>point, your searching technique matters a lot less.

This is what I ended up doing. I just read the whole database into memory
and then search from there. My database is fairly small, but when forced to
run off a floppy drive it was too slow. Thanks to all for the replies!

Glenn Ford
..uunet!ocsmd!stsim!glenn
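For reference, a minimal sketch of that whole-file approach in C, assuming
(for illustration only) fixed-length records of RECLEN bytes that each begin
with a NUL-padded KEYLEN-byte key:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RECLEN 64                     /* assumed fixed record length */
#define KEYLEN 16                     /* assumed fixed key length    */

/* bsearch() comparison: search key versus the key at the front of a record. */
static int
cmprec(const void *key, const void *rec)
{
    return strncmp((const char *)key, (const char *)rec, KEYLEN);
}

int
main(int argc, char **argv)
{
    FILE *fp;
    char *db, *hit;
    long size, nrec;

    if (argc != 3) {
        fprintf(stderr, "usage: %s dbfile key\n", argv[0]);
        return 1;
    }
    if ((fp = fopen(argv[1], "rb")) == NULL)
        return 1;
    fseek(fp, 0L, SEEK_END);          /* find the file size ... */
    size = ftell(fp);
    rewind(fp);
    db = malloc((size_t)size);        /* ... and slurp the whole thing in */
    if (db == NULL || fread(db, 1, (size_t)size, fp) != (size_t)size)
        return 1;
    fclose(fp);

    nrec = size / RECLEN;
    hit = bsearch(argv[2], db, (size_t)nrec, RECLEN, cmprec);
    if (hit != NULL)
        printf("found at record %ld\n", (long)(hit - db) / RECLEN);
    else
        printf("not found\n");
    free(db);
    return 0;
}

Everything after the single bulk read happens at RAM speed, so even a plain
binary search is plenty fast.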