[comp.databases] dBase .ndx format?

usenet@cps3xx.UUCP (Usenet file owner) (01/19/89)

Can someone please either post, e-mail, or provide a reference to the
internal format for dBase III .ndx files? From looking at hex dumps,
I've made some good guesses at what the format for the info I need to
get from the file is, but there are still gray/black areas.

Advans Thanx,

John H. Lawitzke      UUCP: ...rutgers!mailrus!frith!fciiho!jhl
Michigan Farm Bureau        ...decvax!purdue!mailrus!frith!fciiho!jhl
Insurance Group             ...uunet!frith!jhl
"My other computer is an IBM RT Model 135"

awd@dbase.UUCP (Alastair Dallas) (01/24/89)

Try the "Advanced Programmer's Guide" by Castro, Hansen and Rettig, 
published by Ashton-Tate and available in stores like B. Dalton.  If
this is not enough detail, I'm afraid you'll find that it's considered
proprietary.

Hope it helps.

/alastair/

ray@spca6.UUCP (ray) (01/26/89)

In article <1558@cps3xx.UUCP> jhl@frith.egr.msu.edu (John H. Lawitzke) writes:
>Can someone please provide a reference to the
>internal format for dBase III .ndx files? 
A while back I wrote a set of programs that did various things to dBASE files:
packing, counting, appending, that sort of garbage.  One of them read dBASE III
and dBASE III Plus .ndx files and reported on the file's contents.
At the end of this article is most of the man page for that command. I have
posted the man page because it includes a conceptual description of the
.ndx file.  NOTE: this applies to dBASE III, not dBASE IV (featuring AT's new
improved memory over-loader... :-)).  If, after reading the man page, you
are interested in the Microsoft C 4.0 source for dbistat, I can email it, or
if there is enough interest I could post it to comp.sources.misc.  Either way,
let me know.
   DBISTAT			    RunWare			 DBISTAT


	    Because the	internal structure of a	dBASE .NDX file	is not
	    published, as the internals	of DBF files are, we must begin
	    by assigning nomenclature to the various parts of an .NDX
	    file.  An .NDX file	is composed of NODES.  Each node is 512
	    bytes long.  There are three types of nodes: POINTER nodes,
	    DATA nodes, and the ZERO node.
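
The fixed 512-byte node size suggests simple addressing: node n would start
at byte n * 512, with the zero node at offset 0.  A minimal Python sketch,
assuming nodes are numbered from 0 in file order (the man page implies this
but does not state it).  The helper names are hypothetical and are not part
of DBISTAT.

```python
import io

NODE_SIZE = 512  # node length stated in the man page


def node_offset(node_number):
    """Byte offset of a node, ASSUMING nodes are numbered from 0 in file order."""
    return node_number * NODE_SIZE


def read_node(f, node_number):
    """Return the raw 512 bytes of one node from a binary file object."""
    f.seek(node_offset(node_number))
    data = f.read(NODE_SIZE)
    if len(data) != NODE_SIZE:
        raise ValueError("node %d is missing or truncated" % node_number)
    return data


def node_count(f):
    """Number of whole nodes in the file, including the zero node."""
    f.seek(0, io.SEEK_END)
    return f.tell() // NODE_SIZE
```

Any object opened in binary mode works here, e.g. open("tmp.ndx", "rb").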

	    The zero node is the first node in the .NDX file.  It
	    contains information such as the expression used to generate
	    the key for this index, known as the key expression.  The
	    key for an index is the field or combination of fields that
	    was used to make the index (e.g. in the dBASE line -index
	    on lower(last) to tmp-, lower(last) is the key expression,
	    and each record's value for lower(last) is that record's
	    key.)  The zero node tells whether the key is a numeric
	    (dBASE internal numeric) or character expression.  It also
	    tells the length of each key, the maximum number of key
	    entries a node may have, and the length of each of these
	    entries.  A key entry is composed of the key for a record
	    and either a record number or a pointer to the next lower
	    node in the node chain.  Finally, the zero node contains two
	    special node numbers: the number of the next available node,
	    and the number of the node that is first in the node chain.
	    This node I call the MASTER NODE.  The master node is the
	    node at which dBASE begins its search of the index.  It may
	    be a data node or a pointer node.  However, if the master
	    node is a data node then there is only 1 node in the node
	    chain (two nodes total in the file).
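
The zero-node fields listed above could be decoded along the following
lines.  This is only a sketch: the man page gives no byte offsets, so the
offsets, field widths, and little-endian byte order below are assumptions
to be checked against a hex dump, and ZeroNode and parse_zero_node are
hypothetical names.

```python
import struct
from dataclasses import dataclass


@dataclass
class ZeroNode:
    master_node: int          # node at which the search begins
    next_available_node: int  # number of the next available node
    key_length: int           # length of each key, in bytes
    max_entries: int          # maximum key entries per node
    key_is_numeric: bool      # True for numeric, False for character keys
    entry_length: int         # length of each key entry, in bytes
    key_expression: str       # e.g. "lower(last)"


def parse_zero_node(node):
    """Decode a 512-byte zero node using an ASSUMED little-endian layout."""
    if len(node) != 512:
        raise ValueError("zero node must be exactly 512 bytes")
    # ASSUMPTION: two 32-bit node numbers, 4 unused bytes, three 16-bit
    # fields, then a 32-bit entry length.  None of this is from the post.
    master, next_free, key_len, max_entries, key_type, entry_len = \
        struct.unpack_from("<II4xHHHI", node, 0)
    # ASSUMPTION: the key expression is NUL-terminated text at offset 24.
    expression = node[24:].split(b"\x00", 1)[0].decode("ascii", "replace")
    return ZeroNode(master, next_free, key_len, max_entries,
                    key_type != 0, entry_len, expression)
```

Keeping the decoded fields in one small record makes it easy to print them
the way a no-option DBISTAT run is described as doing.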

	    A pointer node is a node that contains keys and pointers to
	    nodes whose keys are less than or equal to that key.  No
	    entry in a pointer node contains a record number.  dBASE
	    searches the list of key entries in a pointer node for the
	    key it is looking for.  When it finds the first key entry
	    whose key is greater than or equal to the key being searched
	    for, dBASE gets the node number in that entry.  That node
	    number may refer to another pointer node, in which case the
	    process repeats with the new pointer node.

	    Eventually dBASE reaches a data node.  Each key entry in a
	    data node holds a key and the record number in the DBF for
	    that key.  Depending on the size of the key, there are
	    usually multiple key entries in each data node.
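
The descent from the master node down to a data node can be sketched on an
in-memory model.  This assumes that each pointer entry's key is an upper
bound for the keys in the node it points to, and that entries are kept in
ascending key order.  Node and find_record are hypothetical names, and
nothing here reads the real on-disk format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    is_data: bool
    # (key, value) pairs in ascending key order.  The value is a record
    # number in a data node and a child node number in a pointer node.
    entries: List[Tuple[str, int]]


def find_record(nodes: Dict[int, Node], master: int, key: str) -> Optional[int]:
    """Return the record number of the first entry matching key, or None."""
    node = nodes[master]
    while not node.is_data:
        # Follow the first entry whose key is >= the search key.
        for entry_key, child in node.entries:
            if key <= entry_key:
                node = nodes[child]
                break
        else:
            return None  # key is larger than every key in the index
    # A data node: look for an exact match among its entries.
    for entry_key, recno in node.entries:
        if entry_key == key:
            return recno
    return None
```

If the master node is itself a data node, the loop body never runs and the
lookup is a single scan, which matches the two-node file case.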

	    With no options DBISTAT prints just	the information	it found
	    in the zero	node.

       OPTIONS
	    -r|-i|-s  The -r option causes DBISTAT to print each key
		      and its record number in indexed order.  The -i
		      option causes DBISTAT to report on each pointer
		      and data node it finds in the .NDX file.  The
		      report is produced in indexed order.  The -s
		      option produces a report similar to the -i report;
		      however, the node information is printed in the
		      order the nodes are found in the .NDX file.  This
		      sequential order may or may not be the same as
		      indexed order (it usually is just after
		      REINDEXing).  Only one of these options may be used
		      per invocation of DBISTAT.
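
The difference between indexed order and sequential order can be shown on a
small in-memory model.  This is a sketch, not DBISTAT's code: the function
names are hypothetical, and visiting a pointer node before the nodes it
points to is an assumption, since the man page only says "indexed order".

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    is_data: bool
    entries: List[Tuple[str, int]]  # (key, record number or child node number)


def nodes_in_index_order(nodes: Dict[int, Node], master: int) -> Iterator[int]:
    """Node numbers in the order a -i style report might visit them."""
    yield master  # ASSUMPTION: a pointer node is reported before its children
    node = nodes[master]
    if not node.is_data:
        for _key, child in node.entries:
            yield from nodes_in_index_order(nodes, child)


def nodes_in_file_order(nodes: Dict[int, Node]) -> List[int]:
    """Node numbers in the order a -s style report might visit them."""
    return sorted(nodes)


def records_in_index_order(nodes: Dict[int, Node],
                           master: int) -> Iterator[Tuple[str, int]]:
    """(key, record number) pairs in the order a -r style report lists them."""
    for number in nodes_in_index_order(nodes, master):
        if nodes[number].is_data:
            yield from nodes[number].entries
```

When the master node is not the lowest-numbered node, or data nodes were
allocated out of key order, the two node orders differ.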
=
ray   mit-eddie!uccba!spca6!ray