[comp.databases] Deleting duplicate records

englandr@phoenix.Princeton.EDU (Scott Englander) (07/16/89)

I'd like to delete duplicate records in a FoxBase file.  I tried setting
the index to unique and then copying the database to a new file, but the
duplicates remained.  I ultimately did it with a simple and SLOW program
that checked for duplicates and deleted them (as i'm typing, it's
running in the background under Multifinder -- there are 4000 records!).
But there must be a better way to do this using the unique index, since
only the first record with a given value of the index field is indexed.
-- 

                                               - Scott

bote@csense.UUCP (John Boteler) (07/17/89)

From article <9426@phoenix.Princeton.EDU>, by englandr@phoenix.Princeton.EDU (Scott Englander):
> I'd like to delete duplicate records in a FoxBase file.  I tried setting
> the index to unique and then copying the database to a new file, but the
> duplicates remained.  I ultimately did it with a simple and SLOW program
> that checked for duplicates and deleted them (as i'm typing, it's
> running in the background under Multifinder -- there are 4000 records!).
> But there must be a better way to do this using the unique index, since
> only the first record with a given value of the index field is indexed.

Assuming that a 'duplicate record' in your case is one which has
the same key as another, and the other fields do not enter into
the decision, here is one suggestion:

set unique on
use &tablename index &indexname
go top
do while .not. eof()
	delete	&& marks this record to be copied later
	skip	&& advances to next unique record in index
enddo
copy to &temp for deleted()	&& copy only those marked
!mv tablename.dbf tablename.old
!mv temp.dbf tablename.dbf
go home	&& relax!

Now, before you remove the old file, are you sure that those records
which did not make it into the new file are truly bogus? rm is forever.

-- 
Bote
uunet!cyclops!csense!bote
{mimsy,sundc}!{prometheus,hqda-ai}!media!cyclops!csense!bote

timk@xenitec.uucp (Tim Kuehn) (07/17/89)

englandr@phoenix.Princeton.EDU (Scott Englander) writes:
>I'd like to delete duplicate records in a FoxBase file.  

... Previous attempt deleted ...

>But there must be a better way to do this using the unique index, since
>only the first record with a given value of the index field is indexed.
>-- 
>
>                                               - Scott

try this:

Definintions:
-------------
dborig: the original database 
dbunique: the database with duplicate records removed
ndxuniq: the index file on the key with only unique values of the key
<key>:	the key value you are indexing on

Program:
--------
use dborig
copy structure to dbuniq
select 0
use dbunique
select dborig
index on <key> to ndxuniq unique
set index to ndxuniq
go top
do while .not. eof()
	select dbuniq
	append blank
	replace dbuniq->field1 with dborig->field1,;
		dbuniq->field2 with dborig->field2,;
		dbuniq->field3 with dborig->field3,;
		dbuniq->field4 with dborig->field4

		<...etc...>
	select dborig
	skip
enddo

hope this helps!


+-----------------------------------------------------------------------------+
|Timothy D. Kuehn				       timk@xenitec	      |
|TDK Consulting Services			       !watmath!xenitec!timk  |
|871 Victoria St. North, Suite 217A					      |
|Kitchener, Ontario, Canada N2B 3S4 		       (519)-741-3623 	      |
|DOS/Xenix - SW/HW. uC, uP, DBMS. 		       Satisfaction Gauranteed|
+-----------------------------------------------------------------------------+

roger@batcomputer.tn.cornell.edu (Roger Boissonnas) (07/18/89)

In article <1989Jul17.165249.26554@xenitec.uucp> timk@egvideo.UUCP (Tim Kuehn) writes:
>englandr@phoenix.Princeton.EDU (Scott Englander) writes:
>>I'd like to delete duplicate records in a FoxBase file.  
>
>... Previous attempt deleted ...
>
>Program:
>--------
>use dborig
>copy structure to dbuniq
>select 0
>use dbunique
>select dborig
>index on <key> to ndxuniq unique
>set index to ndxuniq
>go top
>do while .not. eof()
>	select dbuniq
>	append blank
>	replace dbuniq->field1 with dborig->field1,;
>		dbuniq->field2 with dborig->field2,;
>		dbuniq->field3 with dborig->field3,;
>		dbuniq->field4 with dborig->field4
>
>		<...etc...>
>	select dborig
>	skip
>enddo

You can write a program to do the same thing in only three lines:

USE dborig
INDEX ON <key> TO ndxuniq UNIQUE
COPY TO dbuniq


Roger Boissonnas                                    roger@tcgould.tn.cornell.edu
Cornell Theory Center                                             (607) 255-8306























Roger Boissonnas                                    roger@tcgould.tn.cornell.edu
Cornell Theory Center                                             (607) 255-8306

timk@xenitec.uucp (Tim Kuehn) (07/20/89)

roger@tcgould.tn.cornell.edu (Roger Boissonnas) writes:
>In article <1989Jul17.165249.26554@xenitec.uucp> timk@egvideo.UUCP (Tim Kuehn) writes:
>>
>>Program:
>>--------


... source code deleted ...
>You can do the same thing in only three lines:
>USE dborig
>INDEX ON <key> TO ndxuniq UNIQUE
>COPY TO dbuniq
>
>
>Roger Boissonnas                              roger@tcgould.tn.cornell.edu
>Cornell Theory Center                                      (607) 255-8306

You know what - you're right! I always thought the copy command used the
entire database, irrespective of index keys. Learn something new every
day I guess....

+-----------------------------------------------------------------------------+
|Timothy D. Kuehn	       timk@egvideo (soon to be timk@xenitec)	      |
|TDK Consulting Services			       !watmath!xenitec!timk  |
|871 Victoria St. North, Suite 217A					      |
|Kitchener, Ontario, Canada N2B 3S4 		       (519)-741-3623 	      |
|DOS/Xenix - SW/HW. uC, uP, DBMS. 		       Satisfaction Gauranteed|
+-----------------------------------------------------------------------------+