[comp.databases] dBase tricky INDEXing question...

markus@etn-rad.UUCP (Markus Richardson) (05/21/88)

It seems like I am using a database in a very common way, but I cannot for
the life of me figure out how to get dBase to index it correctly.  

My trans[action].dbf file has five fields.  The first two fields I wish to 
be combined together (for the index) in order to have the records in an order 
based upon the ordering of the first two fields.  Simple enough, right?  I 
guess if both fields were of the same data type it would be no problem,
but as it is the first field (account number) is a numeric type and the 
second field (date of transaction) is a date type.

acct_no = 1000
SEEK acct_no
IF .NOT. FOUND() THEN
	RETURN
ENDIF
DO WHILE (acct_no = trans->account_no)
	? "account_no="+trans->account_no+"  date="+DTOC(trans->date)
	SKIP
ENDDO
RETURN

Attempt 1:
SET EXACT OFF
USE Trans ALIAS Trans
INDEX ON LTRIM(STR(trans->account_no))+DTOC(trans->date) TO Trans.ndx
SEEK LTRIM(STR(account_no))          && Finds the correct matching records but
                                      && they are NOT in order of trans->date!

Attempt 2:
SET EXACT OFF
SET DATE ANSI
USE Trans ALIAS Trans
INDEX ON LTRIM(STR(trans->account_no))+;
	LTRIM(STR(trans->date - CTOD("80.01.01"))) TO Trans.ndx
SEEK LTRIM(STR(account_no))          && Finds the correct matching records but
                                      && they are NOT in order of trans->date!

What am I doing incorrectly?  This seems like such a common scenario for
data retrieval: ordering on a combination numeric+date index.

FYI, I am using dBase ]I[ + version 1.1 .  Please e-mail me any replies 
(at this point in time, I do not trust our newsfeeding machines :-).
Thanks for your help!

-- 

        Markus N. Richardson
        Research and Development
        Eaton Corporation IMSD
        31717 La Tienda Drive
        Westlake Village, CA 91359
        
        { voder,ihnp4,trwrb,scgvaxd,jplgodo }!wlbr!etn-rad!markus
          wlbr!etn-rad!markus@etn-wlv.eaton.com

steve@violet.berkeley.edu (Steve Goldfield) (05/24/88)

In article <529@etn-rad.UUCP> markus@etn-rad.UUCP (Markus Richardson) writes:
Describing a problem in indexing on date.

#Attempt 1:
#SET EXACT OFF
#USE Trans ALIAS Trans
#INDEX ON LTRIM(STR(trans->account_no))+DTOC(trans->date) TO Trans.ndx

I have a slightly different version of dBASE (McMax on a
Macintosh) so if the below doesn't make sense, that's why.
However, my manual says that DTOC gives you the date as
05/23/88 which is unsuitable for sorting in date order. DTOS,
on the other hand gives you YYYYMMDD, i.e., 19880523, which is
suitable for sorting or indexing. If you want reverse order,
subtract from a large number, i.e.,
INDEX ON LTRIM(STR(trans->account_no))+
STR(100000000-VAL(DTOS(trans->date)))

The simpler solution, which I've always used, is just to store
dates as character strings or numbers. That's all you could do in
the old dBASE I started with anyway.

Incidentally, I tried to send this to you by email but you didn't
list an address and our mail program couldn't handle it
automatically.

karl@ddsw1.UUCP (Karl Denninger) (05/25/88)

In article <529@etn-rad.UUCP> markus@etn-rad.UUCP (Markus Richardson) writes:
..It seems like I am using a database in a very common way, but I cannot for
..the life of me figure out how to get dBase to index it correctly.  
..
..My trans[action].dbf file has five fields.  The first two fields I wish to 
..be combined together (for the index) in order to have the records in an order 
..based upon the ordering of the first two fields.  Simple enough, right?  I 
..guess if both fields were of the same data type it would be no problem,
..but as it is the first field (account number) is a numeric type and the 
..second field (date of transaction) is a date type.
..
..acct_no = 1000
..SEEK acct_no
..IF .NOT. FOUND() THEN
..	RETURN
..ENDIF
..DO WHILE (acct_no = trans->account_no)
..	? "account_no="+trans->account_no+"  date="+DTOC(trans->date)
..	SKIP
..ENDDO
..RETURN
..
..Attempt 1:
..SET EXACT OFF
..USE Trans ALIAS Trans
..INDEX ON LTRIM(STR(trans->account_no))+DTOC(trans->date) TO Trans.ndx
..SEEK LTRIM(STR(account_no))          && Finds the correct matching records but
..                                      && they are NOT in order of trans->date!

Yep.

..What am I doing incorrectly?  This seems like such a common scenario for
..data retrieval: ordering on a combination numeric+date index.
..
..FYI, I am using dBase ]I[ + version 1.1 .

The problem is that your first component of the index expression does not
have a constant length.  That is, "LTRIM(STR(xxxx))" is not a constant
length string.

The result of this is:

12388.05.23		- Record number 123, 05/23/88
188.05.23		- Record number 1, 05/23/88

See where this might be a problem?

This "feature" can be avoided by not using LTRIM -- then the string,
is always 10 characters in length and the index will be correctly built.

Enjoy!

----
Karl Denninger                 |  Data: +1 312 566-8912
Macro Computer Solutions, Inc. | Voice: +1 312 566-8910
...ihnp4!ddsw1!karl            | "Quality solutions for work or play"

benh@pnet01.cts.com (B. Humphreys) (05/26/88)

In article <529@etn-rad.UUCP> markus@etn-rad.UUCP (Markus Richardson) writes:
>>It seems like I am using a database in a very common way, but I cannot for
>>the life of me figure out how to get dBase to index it correctly.
>>
>>My trans[action].dbf file has five fields.  The first two fields I wish to
>>be combined together (for the index) in order to have the records in an
order
>>based upon the ordering of the first two fields.  Simple enough, right?  I
>>guess if both fields were of the same data type it would be no problem,
>>but as it is the first field (account number) is a numeric type and the
>>second field (date of transaction) is a date type.
>>
>>acct_no = 1000
>>SEEK acct_no
>>IF .NOT. FOUND() THEN
>>      RETURN
>>ENDIF
>>DO WHILE (acct_no = trans->account_no)
>>      ? "account_no="+trans->account_no+"  date="+DTOC(trans->date)
>>      SKIP
>>ENDDO
>>RETURN
>>
>>Attempt 1:
>>SET EXACT OFF
>>USE Trans ALIAS Trans
>>INDEX ON LTRIM(STR(trans->account_no))+DTOC(trans->date) TO Trans.ndx
>>SEEK LTRIM(STR(account_no))          && Finds the correct matching records
but
>>                                      && they are NOT in order of
trans->date!
>>What am I doing incorrectly?  This seems like such a common scenario for
>>data retrieval: ordering on a combination numeric+date index.
>>
>>FYI, I am using dBase ]I[ + version 1.1 .
 
Karl Denninger writes:
>The problem is that your first component of the index expression does not
>have a constant length.  That is, "LTRIM(STR(xxxx))" is not a constant
>length string.
 
The problem is not only your first commponent of your index expression but
also the second component: DTOC(trans->date)
 
>
>The result of this is:
>
>12388.05.23             - Record number 123, 05/23/88
>188.05.23               - Record number 1, 05/23/88
>
>See where this might be a problem?
>
 
This example is really inaccurate.  First of all, DTOC() does not convert it's
arguement to a different format than the operand.  For example, 05/23/88 does
not become 88/05/23, let alone 88.05.23.  Secondly, it's not the record number
that you're indexing on, it's the account number, correct?
 
>>This "feature" can be avoided by not using LTRIM -- then the string,
>is always 10 characters in length and the index will be correctly built.
 
This observation, however, is correct.  You must make sure that all index keys
generated by your index expression are of the same length.  For example:
 
  Index expression: RTRIM(last_name) + ", " + RTRIM(last_name)
 
  Last Name             First Name            Index expression            Len
  --------------------  --------------------  --------------------------  ---
  McDonald              Roland                Ronald, McDonald             16
  Apathetic             Thug                  Apathetic, Thug              15
 
In this case, the keys are of different lengths.  It is possible to get away
with this for a while, as I have found, but it's not perpetual and it's
certainly not good programming practice.
 
Anyway, the bottom line here is that you want to:
 
  INDEX ON STR(trans->account_no) + DTOC(SUBSTR(trans->date, 7, 2)) + ;
    DTOC(SUBSTR(trans->date, 1, 2)) + DTOC(SUBSTR(trans->date, 4, 2))
 
This will pad account_no's with leading spaces which will force correct
indexing in the situation:
 
  1023  vs.
  205
 
because
 
  " 205" < "1023"
 
And secondly, because the date will now be in a sortable order.  There's
something you must understand about dates.  dBASE will take care of them quite
nicly, manipulatively, while they're of type DATE.  As soon as you convert
them to a string with DTOC(), they are treated exactly as a string and the
year doesn't take precedence in sorting unless listed explicitly first as in
the above index expression.  I hope this discussion has hslped.  If you have
any more problems, mail your questions to me if you'd like.

UUCP: {cbosgd hplabs!hp-sdd sdcsvax nosc}!crash!pnet01!benh
ARPA: crash!pnet01!benh@nosc.mil
INET: benh@pnet01.cts.com

karl@ddsw1.UUCP (Karl Denninger) (05/27/88)

In article <3026@crash.cts.com> benh@pnet01.cts.com (B. Humphreys) writes:
>In article <529@etn-rad.UUCP> markus@etn-rad.UUCP (Markus Richardson) writes:

>>>[Actual example deleted]

>Karl Denninger writes:
>>The problem is that your first component of the index expression does not
>>have a constant length.  That is, "LTRIM(STR(xxxx))" is not a constant
>>length string.
> 
>The problem is not only your first commponent of your index expression but
>also the second component: DTOC(trans->date)

Nope - remember, the original posted had a couple of OTHER lines that
you left out (a Set Date).

>>The result of this is:
>>
>>12388.05.23             - Record number 123, 05/23/88
>>188.05.23               - Record number 1, 05/23/88
>>
>>See where this might be a problem?
> 
>This example is really inaccurate.  First of all, DTOC() does not convert it's
>arguement to a different format than the operand.  For example, 05/23/88 does
>not become 88/05/23, let alone 88.05.23.  Secondly, it's not the record number
>that you're indexing on, it's the account number, correct?

(Account number, yes -- but the example holds regardless of the name of the
numeric field in question.  The only important thing is that the field in
question is of numeric type)

It works as posted if you had used SET DATE ANSI (as the poster specified he 
did).  I tried this out before posting on Foxbase+ (A Dbase III+ near-exact 
clone) and it works fine.  "SET DATE ANSI" is one of those little-used 
options, but it is quite useful for this sort of stuff....

>>>This "feature" can be avoided by not using LTRIM -- then the string,
>>is always 10 characters in length and the index will be correctly built.

>Anyway, the bottom line here is that you want to:
> 
>  INDEX ON STR(trans->account_no) + DTOC(SUBSTR(trans->date, 7, 2)) + ;
>    DTOC(SUBSTR(trans->date, 1, 2)) + DTOC(SUBSTR(trans->date, 4, 2))
> 
>This will pad account_no's with leading spaces which will force correct
>indexing in the situation:

This will work, the gyrations with the date are unnecessary.

>And secondly, because the date will now be in a sortable order.  There's
>something you must understand about dates.  dBASE will take care of them quite
>nicly, manipulatively, while they're of type DATE.  As soon as you convert
>them to a string with DTOC(), they are treated exactly as a string and the
>year doesn't take precedence in sorting unless listed explicitly first as in
>the above index expression.  

True as far as you went -- but the conversion is done according the the 
format you have specified with the SET DATE option.  Dbase III and clones
store dates internally as a binary pattern -- probably a 32-bit value offset
from some "base" date (or even 00/00/00).  When you display a date-format
variable (or field) it's converted on the fly according to the method you've
specified.  It just happens that the default is "US" format.

Note that the side effect of using this "set date" option is that *ALL* date
strings will be displayed and manipulated in "ANSI" format.  We never include 
format strings with our date print-outs, so I have no idea what happens if 
you try to force it to display with the "/"s instead...  If you (and your 
customers) like the ANSI format then it's great, and kills two birds with 
one stone.

Also, beware of switching modes if you have an active index built in the 
other mode!  That one I also haven't tried and it seems like the results 
could be rather scary.

---
Karl Denninger                 |  Data: +1 312 566-8912
Macro Computer Solutions, Inc. | Voice: +1 312 566-8910
...ihnp4!ddsw1!karl            | "Quality solutions for work or play"