[comp.lang.pascal] Quick Directories/Better Sorts

c164-bd@cinna.uucp (John D. Mitchell) (11/20/90)

In article <9011180716.AA29001@gnu.AI.MIT.EDU> Adams@J (jma@gnu.ai.mit.edu) writes:
>Is there a way to do the reproduce FindFirst() + FindNext() commands
>using Inline statements or something else to make them FASTER? I'm writing

The reason it's so slow is that TP's FindFirst()/Next() is that it
switches the Disk Transfer Area (DTA) to the one you give it in the
FindFirst() call each time and then restores it after each call.  It
might even do some other housekeeping stuff.  You can get around this
nasty overhead be setting up your own dta and directly use the
DOS Findfirst/next functions.  I have written a generic directory/file
traversal package and then used that for a bunch of little disk
utilities.  The speed/responsiveness of the utilities has been
plenty quick enough for me (and I am into speed :-)) , but I have seen other 
people's utilities that are faster.  There are ways that you could
directly access the disk tables (using 'undocumented' DOS calls).  I haven't 
ever bothered to figure that stuff out.

>
>II. 
>
>I've got a record structure stored on disk, (let's say 5000) records. I'd like
>to be able to get those 5000 records into MEMORY. Right now, I keep running
>out of stack space. (structure too large).. If possible, can I access 
>other memory in pascal (to store my data) and once I run out of memory,
>can I sort on disk at highspeed... One possible Idea I had was to use an 
>Index in memory (hell, 5000-10000 Integers in pascal shouldn't be too memory-
>intensive...) and then sort the "Index" array in memory... but I can't get

One fairly easy to implement solution would be to use a HeapSort with
the in memory heap some size closest to the magical 64K boundary.
You can find a good description of this in Algorithms by Robert Sedgewick.
Be sure to get the second edition.  I'd probably do it this way.


Good luck,
	John D. Mitchell
	johnm@cory.Berkeley.EDU

Adams@J (jma@gnu.ai.mit.edu) (11/20/90)

johnm@cory.BERKLEY.EDU:

  You described of a "Heap sort" using the heap memory (I'd assume I would
have to set the heap up pretty high ( {$M} ).. I know in C you can set
a variable so that it makes itself in the heap storage,, can this be
done in pascal???

	Also: How can a DTA be set up (variables.. offsets, details mon!)
	and how do I read it, is it just a sector read??

thanks again!
-+-+-+-                                                                 -+-+-+-
John Adams                                       INTERNET -> jma@gnu.ai.mit.edu

 "What you get is all real; I can't put on an act, it takes brains to do
  that anyway..." -- XTC
-=-=-=-                                                                 -=-=-=-

c164-bd@laertes.uucp (John D. Mitchell) (11/21/90)

In article <9011200557.AA07075@gnu.AI.MIT.EDU> Adams@J (jma@gnu.ai.mit.edu) writes:
>johnm@cory.BERKLEY.EDU:
>
>  You described of a "Heap sort" using the heap memory (I'd assume I would
>have to set the heap up pretty high ( {$M} ).. I know in C you can set
>a variable so that it makes itself in the heap storage,, can this be
>done in pascal???

First, remember that a Heap (ala Heap Sort) is a Data structure while a
heap (in the memory management of TP) is a run-time available, dynamic
memory pool that is managed by TP (which uses DOS).  In other words, the
Heap data structure has nothing (inherently) to do with the heap of memory.

In using a Heap Sort to sort N records (where the N records are probably way bigger
than available memory (forget for the moment virtual memory)) in TP under DOS,
you should probably set the {$M} stuff to whatever heap memory the rest of your
application needs plus another 64Kb for the Heap (data structure).  That's just
a design decision.  You could just as well allocated an array of records statically
(i.e. at compile time) that would fit into 64Kb.

In TP you use the new()/delete() or the getmem()/freemem() pairs of functions
to allocate/free the heap (memory).  (Note:  don't mix the pairs!).

Go and get (from wherever) a good data structures book and you should find
a discussion about the Heap data structure and it's use in Heap sorts.
There are good ones by A. Tannenbaum (sp?), N. Kruse, and R. Sedgewick to
name a few.

>	Also: How can a DTA be set up (variables.. offsets, details mon!)
>	and how do I read it, is it just a sector read??

The DTA is just a struct (record) that is used by the FindFirst()/Next()
functions (be it TP or DOS).  It only exist in memory.  I'm not a home so
I can't give details now, but it is described in just about any DOS function
reference and in the TP manuals.  Basically you set the filename field
to the file that you're looking for (normal DOS wildcards allowed), set
the file attribute to the appropriate bit pattern for the types of files
that you want (i.e. archive, system, directory, etc.) and FindFirst() does
the rest.  It's all in the manuals.  A great discussion of directory
traversal stuff (along with code) was in Turbo Technix a couple of years
ago.  I have a generic package of routines to traverse directories and
find files written in TurboC.  I could probably make it available.

Good luck,
	John D. Mitchell
	johnm@cory.Berkeley.EDU

boerner@ut-emx.uucp (Brendan B. Boerner) (11/25/90)

In article <19739@oolong.la.locus.com> jfr@locus.com (Jon Rosen) writes:
[...]
>after your program is running.   Of course, you are still limited
>to the 640K size unless you use Turbo 5.5 with VROOMM (ViRtual
>Object-Oriented Memory Manager) which will allow you to use extended
>memory or disk.  Otherwise, you have to do the actual disk stuff your
>self and it can get pretty messy.  Good idea: Pick up a copy of the
[...]

Based on what I've read in the TC++ manual, VROOM is really just
a slick overlay manager similar to what is found on Turbo Pascal.
You mark certain .objs as swappable and then you don't worry about
it.  The overlay manager detects when a function which is not in
memory is called and swaps it in.  It doesn't really give you
virtual memory.  After all the hoopla Borland gave about VROOM, I
was very disappointed when I was reading the manuals and the
description sounded like something out of my TP 5.0 manual.  'Course,
I'd like to be wrong on this point.

Brendan
-- 
Brendan B. Boerner		Phone: 512/471-3241
Microcomputer Technologies	The University of Texas @ Austin
Internet: boerner@emx.cc.utexas.edu     UUCP: ...!cs.utexas.edu!ut-emx!boerner
BITNET:   CCGB001@UTXVM.BITNET