[comp.sys.ibm.pc] Answers!!! not questions

philip@amdcad.AMD.COM (Philip Freidin) (08/18/88)

Many moons ago I posed the problem of a sort package that I had, that
had suddenly stopped working, and had run out of file handles. In my
article, I gave a VERY complete description of all the things that I
had tried to resolve the problem, and it was exhaustive. The responses
were entertaining/irritating, as most of them were irrelevant, as
would have been obvious if my original description of what I had tried
had been read carefully. People seem to read up to the point that they
can form an opinion, and then post. This mode seems to permeate the
Net, so there you are. I will treat it as an epidemic of:

			Deafness of Eye-balls.

So anyway, so as not to bore you all to death with my ramblings, here
is a synopsis of the problem, and the surprising resolution. (P.S. it
turns out that I lied in my original posting, with regard to "I
didn't change Nuthin, an' now it don't work".)

Synopsis of problem:
	Bigsort is a program I wrote that does a poly-phase
	quicksort/mergesort, while pretending that the disk is
	multiple tape drives. This works well because disks, when read
	sequentially, transfer data quite fast.
	Program opens about 8 scratch files, and several others, such
	as input, index, and report. Input file is about 700K ascii.
	Program crashes now, and reports it can't open all its files.
	I tried games involving TSR's, AUTOEXEC.BAT, and CONFIG.SYS.
	The most misleading thing I did was to boot off a virgin
	distribution floppy, and even that didn't work. The only thing
	that worked was to bump up the FILES=xxx in the config.sys
	file, but why was it needed????? Everything used to work.

The solution:
	Turns out I had changed autoexec.bat to make it quieter. 
	An ECHO OFF at the beginning, and for each TSR that was loaded,
	I shut them up as well, like this:
	c:>MARK ALL >nul:
		    ^^^^^
	What was happening was each of the TSR's was holding onto the
	file handle associated with the output redirection, regardless 
	of the fact that it never used it again. The solution was to 
	either increase files=xxx, or put up with a noisy autoexec.bat.
	I chose the latter.

Discussion:
	The default files=xxx is 8.  So when I booted from the floppy,
	without any config.sys, it failed.
	My normal config.sys had it set to 20, so my redirects in the
	autoexec.bat were eating up about 6 of them. Increasing the
	value in config.sys fixed the problem, as did removing the
	redirects to nul:. The increase of files=xxx was unacceptable
	as I couldn't afford the memory.
	I tracked down the problem by writing a program that reports
	how many file handles are left. Placing this into the
	autoexec.bat file at multiple strategic places revealed the
	problem.
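
The handle-counting program itself was not posted. A minimal sketch of the
idea, assuming POSIX-style open() and close() and a null device path (on DOS
the device would be NUL and the headers would differ), might look like the
following. The names count_free_handles, NULL_DEVICE, and MAX_PROBE are made
up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define NULL_DEVICE "/dev/null"  /* assumption: POSIX; on DOS this would be "NUL" */
#define MAX_PROBE   64           /* most handles a single probe will try to grab */

/* Open the null device repeatedly until open() fails or `limit` is reached,
   then close every probe handle again. Returns how many opens succeeded. */
int count_free_handles(int limit)
{
    int fds[MAX_PROBE];
    int n = 0;
    int i;

    assert(limit >= 0 && limit <= MAX_PROBE);

    while (n < limit) {
        int fd = open(NULL_DEVICE, O_RDONLY);
        if (fd < 0)
            break;               /* out of handles (or the device is missing) */
        fds[n++] = fd;
    }
    for (i = 0; i < n; i++)
        close(fds[i]);           /* give every probe handle back */
    return n;
}
```

Printing the result of count_free_handles(MAX_PROBE) between startup steps
would show where handles go missing.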

Discovery:
	When msdos starts a program, 5 handles are
	allocated, no matter what you do. STDIN, STDERR, STDOUT,
	STDAUX:, and STDPRN:. ( I am talking about programs written in
	Microsoft C versions 4.0, 5.0, and 5.1 . I am not aware what
	happens with just run of the mill programs)
	BIG SURPRISE: although you can get at
	these handles, if you do a close on them, the handle is not 
	released for other uses. Msdos allows 20 file handles max 
	per program, and these are allocated from the pool defined 
	by files=xxx. Therefore programs are in big trouble if they 
	need more than 15 open files at one time. You can have a pool
	bigger than 20 file handles, but only 20 per program. There
	are certainly kludges around this but I am not interested.

I hope this is of some interest to someone, since I spent way too long
isolating the problem, let alone typing in this monologue.



Philip Freidin @ AMD SUNNYVALE on {favorite path!amdcad!philip)
Section Manager of Product Planning for Microprogrammable Processors
(you know.... all that 2900 stuff...)
"We Plan Products; not lunches" (a quote from a group that has been standing
				 around for an hour trying to decide where
				 to go for lunch)

dixon@control.steinmetz (walt dixon) (08/18/88)

There are really two separate issues here.  Each program segment prefix
(PSP) contains the address of a data structure known as the Job File Table
(JFT).  The default JFT is itself part of the PSP,  but it can be moved.
[DOS 3.3 function int 21h ah=67h does this.  In older DOS versions one
could alter the JFT address within the PSP;  this change causes problems
when the PSP is cloned.]  Each JFT entry is one byte long and contains
either a System File Number (SFN) or 0xff.  The SFN is an index into 
another DOS data structure known as the System File Table (SFT).  The handle
returned by open and create services is an index into the JFT;  a JFT
value of 0ffh indicates the corresponding handle is unused.

Each program has its own PSP,  but there is only one copy of the SFT.
[There is a separate SFT for FCB access.]  The SFT is the focal point for
device independent I/O.  Each SFT entry contains the file name, (the path
portion of the name is removed),  file position,  owner,  reference count,
flags,  and device driver/device control block (DCB) address. [The DCB is an
intermediate data structure for block devices.  The DCB contains the
address of the block device driver.]  Each SFT entry is 35h bytes long.

DOS will expand the SFT up to the files= value from config.sys.  SFT entries
are allocated in groups,  but basically form a linked list.  The initial
block of SFT entries can be found using the undocumented int 21h ah=52h
service.  ES:[BX+4] contains the address of the first SFT block.
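
As a rough illustration of "allocated in groups, but basically a linked
list", here is a toy model in portable C.  The struct layout, the field names,
and the 5 + 15 split in the example are assumptions made for the sketch;  real
DOS uses far pointers and its own block header format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified block header for one group of SFT entries. */
struct sft_block {
    struct sft_block *next;   /* next group, NULL at the end of the chain */
    unsigned count;           /* number of SFT entries in this group */
};

/* Walk the chain and add up the entries held by every block. */
unsigned sft_total_entries(const struct sft_block *first)
{
    unsigned total = 0;
    const struct sft_block *b;

    for (b = first; b != NULL; b = b->next)
        total += b->count;
    return total;
}

/* Example: an assumed 5-entry first block followed by a 15-entry block,
   which together would correspond to files=20. */
void sft_example(void)
{
    struct sft_block second = { NULL, 15 };
    struct sft_block first  = { &second, 5 };

    assert(sft_total_entries(&first) == 20);
}
```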

When one opens/creates a file,  DOS allocates a handle by scanning the
JFT until it finds an unused entry,  and then allocates an SFT entry.
DOS records the current PSP in the SFT owner field and initializes the
reference count to 1.  If the "no inherit" bit is set,  DOS sets the SFT
flags field appropriately.  After locating the file/device,  DOS examines 
previously used SFT entries looking for duplicate entries.  If an SFT entry 
already exists,  the newly allocated SFT entry is released and the reference 
count is incremented.

Note that only one SFT entry exists for an open file or device no matter
how many times it has been opened.  When a file/device is closed,  DOS
uses the handle to get to the SFN which in turn locates the SFT entry.
The reference count is decremented.  When this count goes to zero,  the
SFT entry is deallocated.  Only the file owner can cause the SFT entry to
be deallocated.
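
The open and close bookkeeping described above can be sketched as a toy model
in portable C.  This only follows the description in this post and is not DOS
code:  the owner field is left out,  a duplicate is shared directly instead of
allocating and then releasing a fresh entry,  and all names (toy_open,
toy_close, sft_in_use, SFT_SIZE) are invented for the example.

```c
#include <assert.h>
#include <string.h>

#define JFT_SIZE   20      /* handles per program */
#define SFT_SIZE   8       /* stands in for files=8 */
#define JFT_UNUSED 0xFF    /* marks a free JFT slot */

struct sft_entry {
    char name[13];         /* file name with the path removed */
    int  refcount;         /* 0 means the entry is free */
};

struct psp {
    unsigned char jft[JFT_SIZE];   /* handle -> SFN */
};

static struct sft_entry sft[SFT_SIZE];

void psp_init(struct psp *p) { memset(p->jft, JFT_UNUSED, JFT_SIZE); }

/* Returns a handle, or -1 if the JFT or the SFT is full. */
int toy_open(struct psp *p, const char *name)
{
    int h, sfn, free_sfn = -1;

    assert(strlen(name) < sizeof sft[0].name);

    for (h = 0; h < JFT_SIZE && p->jft[h] != JFT_UNUSED; h++)
        ;                                  /* scan for an unused handle */
    if (h == JFT_SIZE)
        return -1;

    for (sfn = 0; sfn < SFT_SIZE; sfn++) {
        if (sft[sfn].refcount > 0 && strcmp(sft[sfn].name, name) == 0) {
            sft[sfn].refcount++;           /* duplicate: share the entry */
            p->jft[h] = (unsigned char)sfn;
            return h;
        }
        if (sft[sfn].refcount == 0 && free_sfn < 0)
            free_sfn = sfn;                /* remember the first free entry */
    }
    if (free_sfn < 0)
        return -1;                         /* SFT exhausted */

    strcpy(sft[free_sfn].name, name);
    sft[free_sfn].refcount = 1;
    p->jft[h] = (unsigned char)free_sfn;
    return h;
}

/* Returns 0 on success, -1 for a bad or unused handle. */
int toy_close(struct psp *p, int h)
{
    if (h < 0 || h >= JFT_SIZE || p->jft[h] == JFT_UNUSED)
        return -1;
    sft[p->jft[h]].refcount--;             /* entry is free again at zero */
    p->jft[h] = JFT_UNUSED;
    return 0;
}

/* Number of SFT entries currently holding at least one reference. */
int sft_in_use(void)
{
    int sfn, n = 0;

    for (sfn = 0; sfn < SFT_SIZE; sfn++)
        if (sft[sfn].refcount > 0)
            n++;
    return n;
}
```

In this model,  opening the same name twice yields two handles but only one
SFT entry,  and the entry is only freed by the second close.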

When you run a program,  COMMAND.COM uses the int 21h ah=4bh load service.
The load service loads the program from disk into memory,  clones the parent
(in this case command.com) PSP,  makes the new program the current PSP,
and sets up a termination address.  Any files opened by command.com
are propagated to the child program.  Normally command.com opens stdin,
stdout, stderr,  stdaux,  and stdprn.

When a program terminates,  DOS scans the PSP and closes all its files
and deallocates any memory blocks owned by the PSP.  Closing the files
inherited from command.com merely decreases the SFT reference count.
When a TSR terminates and stays resident,  its open files are not closed.
COMMAND.COM and any resident TSRs take up SFT entries.
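
A toy model in portable C shows why a resident program keeps SFT entries
alive.  It is a sketch under assumptions,  not DOS code:  every open takes a
fresh entry (duplicate sharing is ignored),  the shell is assumed to open a
redirect target itself and close its own copy after the child has inherited
it,  and all names (psp_clone, psp_terminate, run_redirected) are invented.

```c
#include <assert.h>
#include <string.h>

#define JFT_SIZE   20
#define SFT_SIZE   20      /* stands in for files=20 */
#define JFT_UNUSED 0xFF

static int sft_refcount[SFT_SIZE];          /* 0 = free entry */

struct psp { unsigned char jft[JFT_SIZE]; };

void psp_init(struct psp *p) { memset(p->jft, JFT_UNUSED, JFT_SIZE); }

int sft_free_entries(void)
{
    int sfn, n = 0;
    for (sfn = 0; sfn < SFT_SIZE; sfn++)
        if (sft_refcount[sfn] == 0)
            n++;
    return n;
}

/* Take the first free handle and the first free SFT entry; -1 if none. */
int toy_open_new(struct psp *p)
{
    int h, sfn;
    for (h = 0; h < JFT_SIZE && p->jft[h] != JFT_UNUSED; h++)
        ;
    for (sfn = 0; sfn < SFT_SIZE && sft_refcount[sfn] != 0; sfn++)
        ;
    if (h == JFT_SIZE || sfn == SFT_SIZE)
        return -1;
    sft_refcount[sfn] = 1;
    p->jft[h] = (unsigned char)sfn;
    return h;
}

void toy_close(struct psp *p, int h)
{
    if (p->jft[h] != JFT_UNUSED) {
        sft_refcount[p->jft[h]]--;
        p->jft[h] = JFT_UNUSED;
    }
}

/* Copy the parent's JFT; every inherited handle adds one reference. */
void psp_clone(const struct psp *parent, struct psp *child)
{
    int h;
    for (h = 0; h < JFT_SIZE; h++) {
        child->jft[h] = parent->jft[h];
        if (child->jft[h] != JFT_UNUSED)
            sft_refcount[child->jft[h]]++;
    }
}

/* Normal exit: close every handle still open in this PSP. */
void psp_terminate(struct psp *p)
{
    int h;
    for (h = 0; h < JFT_SIZE; h++)
        toy_close(p, h);
}

/* Run one child with redirected output, as a normal program or as a TSR. */
void run_redirected(struct psp *shell, int stay_resident)
{
    struct psp child;
    int h = toy_open_new(shell);    /* shell opens the redirect target */

    assert(h >= 0);
    psp_clone(shell, &child);       /* child inherits it */
    toy_close(shell, h);            /* shell drops its own copy */
    if (!stay_resident)
        psp_terminate(&child);      /* normal exit releases everything */
    /* a TSR closes nothing, so its references stay counted */
}
```

With a shell holding 5 handles out of 20 entries,  a normal redirected program
leaves 15 entries free afterwards,  while six redirected TSRs leave only 9 in
this model.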

Since an SFT entry takes up a relatively small amount of memory,  it
is normally not a problem to set files= to a reasonable number.  If you
like living dangerously,  a program which needs an abnormally large
number of SFT entries can increase the SFT table size itself and
decrease it when it terminates (watch out for TSRs which open files).

A more complete description of these data structures can be found
in Chapter 4(?) of the revised MS DOS Developer's Guide which will
be published sometime soon (Howard Sams is the publisher). [Although
I am the author of this chapter,  I get no royalties.  Just citing a
good reference.]


Walt Dixon		{ARPA:		dixon@ge-crd.arpa	}
			{US Mail:	GE Corp. R&D		}
			{		PO Box 8		}
			{		Schenectady,  NY 12345	}
			{Phone:		518-387-5798		}

Standard disclaimers apply.