[comp.os.msdos.programmer] "file cannot be copied onto itself"

cla@nyx.UUCP (Chuck Anderson) (09/15/90)

Question:

How does the DOS "copy" command know when the source and destination files are
one and the same (i.e., "file cannot be copied onto itself")?  I can't believe
that it's just a string compare. The logistics of that are pretty awkward.

I have written my own "nice" copy command (in Turbo C) to prevent accidental
overwriting (destruction) of a file. The program works fine. It allows the
option of renaming the dest (if it exists), appending to it, overwriting it,
or canceling the copy operation.

Then I hit this stumbling block. How do you know if you are going to try to
write a file onto itself?  This sounds simple, but seems rather complex to me
after trying a few things. You cannot *simply* compare the two names. That
involves a lot of analysis as to just how the files were specified on the
command line. There must be a better way.

How does one determine that two files are indeed the same physical file?

  *************************************************************************
     Chuck Anderson                uucp    :         uunet!isis!nyx!cla 
     Boulder, Co. (303) 494-6278   internet:         cla@nyx.cs.du.edu 
  *************************************************************************

ralf@b.gp.cs.cmu.edu (Ralf Brown) (09/15/90)

In article <2091@nyx.UUCP> cla@nyx.UUCP () writes:
}How does one determine that two files are indeed the same physical file?

Under DOS 3.0 and up, the easiest way is to apply the following undocumented
function to both names and then do a string compare.

INT 21 - DOS 3+ internal - RESOLVE PATH STRING TO CANONICAL PATH STRING
        AH = 60h
        DS:SI -> ASCIZ relative path string or directory name
        ES:DI -> 128-byte buffer for ASCIZ canonical fully qualified name
Return: CF set on error
            AX = error code
                02h invalid source name
                03h invalid drive or malformed path
                others???
        CF clear if successful
            AH = 00h
            AL = destroyed (00h or 5Ch or last char of current dir on drive)
            buffer filled with qualified name of form D:\PATH\FILE.EXT or
              \\MACHINE\PATH\FILE.EXT
Notes:  the input path need not actually exist
        letters are uppercased, forward slashes converted to backslashes,
          asterisks converted to appropriate number of question marks, and
          file and directory names are truncated to 8.3 if necessary.
        '.' and '..' in the path are resolved
        filespecs on local drives always start with "d:", those on network
          drives always start with "\\"
        if path string is on a JOINed drive, the returned name is the one that
          would be needed if the drive were not JOINed; similarly for a
          SUBSTed, ASSIGNed, or network drive letter.   Because of this, it is
          possible to get a qualified name that is not legal under the current
          combination of SUBSTs, ASSIGNs, JOINs, and network redirections
        functions which take pathnames require canonical paths if invoked via
          INT 21/AX=5D00h
        supported by OS/2 v1.1 compatibility box
SeeAlso: INT 2F/AX=1123h,1221h
-- 
{backbone}!cs.cmu.edu!ralf  ARPA: RALF@CS.CMU.EDU   FIDO: Ralf Brown 1:129/3.1
BITnet: RALF%CS.CMU.EDU@CMUCCVMA   AT&Tnet: (412)268-3053 (school)   FAX: ask
DISCLAIMER?  Did  | Everything is funny as long as it is happening to
I claim something?| someone else.  --Will Rogers
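
A minimal Turbo C sketch of the approach described above: run both names
through INT 21h function 60h and compare the results as strings. The names
dos_truename and same_file_by_name are invented for this illustration, and the
use of intdosx() from <dos.h> assumes a Turbo C build (detected via
__TURBOC__). On any other compiler the stub simply reports failure, so the
comparison returns -1 rather than guessing.

```c
#include <string.h>

#ifdef __TURBOC__
#include <dos.h>

/* Ask DOS (INT 21h, AH=60h) to canonicalize `path` into `out`, which must
   hold at least 128 bytes. Returns 0 on success, else the DOS error code. */
int dos_truename(const char *path, char *out)
{
    union REGS r;
    struct SREGS s;
    char far *src = (char far *)path;
    char far *dst = (char far *)out;

    segread(&s);                      /* start from the current segment regs */
    r.h.ah = 0x60;
    r.x.si = FP_OFF(src);  s.ds = FP_SEG(src);   /* DS:SI -> input name     */
    r.x.di = FP_OFF(dst);  s.es = FP_SEG(dst);   /* ES:DI -> 128-byte buffer */
    intdosx(&r, &r, &s);
    return r.x.cflag ? (int)r.x.ax : 0;          /* CF set means AX = error  */
}
#else
/* Not a DOS build: the call does not exist, so always report failure. */
int dos_truename(const char *path, char *out)
{
    (void)path; (void)out;
    return -1;
}
#endif

/* 1 = canonical names match, 0 = they differ, -1 = could not resolve. */
int same_file_by_name(const char *a, const char *b)
{
    char ca[128], cb[128];

    if (dos_truename(a, ca) != 0 || dos_truename(b, cb) != 0)
        return -1;
    return strcmp(ca, cb) == 0;
}
```

The canonical strings are used only for the comparison here; the caller would
keep passing the names as the user typed them to the actual open and copy
calls.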

woody@spock (Woody Suwalski) (09/17/90)

In article <2091@nyx.UUCP> cla@nyx.UUCP () writes:
>Question:
>
>How does the DOS "copy" command know when the source and destination files are
>one and the same (i.e., "file cannot be copied onto itself")?  I can't believe
>that it's just a string compare. The logistics of that are pretty awkward.

The trick is to change an attribute of the "source" file and then read the
attributes of the "destination". If the "destination" attributes have changed
too, you are dealing with the same file.
Of course, you should restore the original attribute afterwards.

Woody

billb@crpmks.UUCP (Bill Bochnik ) (09/18/90)

>How does the DOS "copy" command know when the source and destination files are
>one and the same (i.e., "file cannot be copied onto itself")?  I can't believe
>that it's just a string compare. The logistics of that are pretty awkward.
>

[ Stuff deleted ]

>Then I hit this stumbling block. How do you know if you are going to try to
>write a file onto itself?  This sounds simple, but seems rather complex to me
>after trying a few things. You cannot *simply* compare the two names. That
>involves a lot of analysis as to just how the files were specified on the
>command line. There must be a better way.
>
>How does one determine that two files are indeed the same physical file?
>



How about opening the source file for reading, then opening the destination
file for writing?  The second open should return an error if the two names
refer to the same physical file.


-Bill

eric@mks.mks.com (Eric Gisin) (09/18/90)

There is at least one network where DOS function 0x60 (get canonical path)
returns a garbage string and no error. I can't remember the network, but
we did encounter this problem in our "cp" program. The code to canonicalize
a pathname is not that complicated, about 60 lines of C.

And under PC-NFS, there is no way to determine if files are the
same if they are linked together on the UNIX file system.
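
For illustration only (this is not the MKS code), a do-it-yourself
canonicalizer along the lines mentioned above might look like the sketch
below. The function name canon_path and its parameters are assumptions: the
caller passes in the current drive and that drive's current directory, and a
relative path on a different drive is rejected because its current directory
is not known here. Wildcards, SUBST/JOIN/ASSIGN and network names are not
handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_sep(char c) { return c == '\\' || c == '/'; }

/* Append one component to `out` as "\NAME.EXT", uppercased and cut to 8.3. */
static int push_component(char *out, size_t outsz, const char *s, size_t n)
{
    size_t len = strlen(out), i = 0, k = 0;

    if (len + 14 > outsz) return -1;          /* '\' + 8 + '.' + 3 + NUL */
    out[len++] = '\\';
    for (; i < n && s[i] != '.'; i++)         /* name part, at most 8 chars */
        if (k < 8) { out[len++] = (char)toupper((unsigned char)s[i]); k++; }
    if (i + 1 < n && s[i] == '.' && s[i + 1] != '.') {
        out[len++] = '.';
        for (i++, k = 0; i < n && s[i] != '.' && k < 3; i++, k++)
            out[len++] = (char)toupper((unsigned char)s[i]);
    }
    out[len] = '\0';
    return 0;
}

/* Apply every component of `p` to the path being built in `out`. */
static int walk(char *out, size_t outsz, const char *p)
{
    while (*p) {
        const char *start;
        size_t n;

        while (is_sep(*p)) p++;
        start = p;
        while (*p && !is_sep(*p)) p++;
        n = (size_t)(p - start);
        if (n == 0 || (n == 1 && start[0] == '.')) continue;
        if (n == 2 && start[0] == '.' && start[1] == '.') {
            char *slash = strrchr(out, '\\');
            if (slash == NULL) return -1;     /* ".." would climb above root */
            *slash = '\0';
            continue;
        }
        if (push_component(out, outsz, start, n) != 0) return -1;
    }
    return 0;
}

/* Build "D:\DIR\FILE.EXT" in `out`. Returns 0 on success, -1 on failure. */
int canon_path(const char *path, char cur_drive, const char *cur_dir,
               char *out, size_t outsz)
{
    char drive = (char)toupper((unsigned char)cur_drive);

    if (outsz < 4) return -1;
    if (isalpha((unsigned char)path[0]) && path[1] == ':') {
        char d = (char)toupper((unsigned char)path[0]);
        path += 2;
        if (d != drive && !is_sep(path[0])) return -1;  /* cwd there unknown */
        drive = d;
    }
    out[0] = drive; out[1] = ':'; out[2] = '\0';
    if (!is_sep(path[0]) && walk(out, outsz, cur_dir) != 0) return -1;
    if (walk(out, outsz, path) != 0) return -1;
    if (out[2] == '\0') strcpy(out + 2, "\\");          /* bare root dir */
    return 0;
}
```

With the current directory given as \WORK on drive C, both "foo.txt" and
"c:/work/./sub/../Foo.TXT" come out as C:\WORK\FOO.TXT, so a plain strcmp() on
the two results is enough for the same-file test.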

david@csource.oz.au (david nugent) (09/24/90)

In <10478@pt.cs.cmu.edu> ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:

> In article <2091@nyx.UUCP> cla@nyx.UUCP () writes:
> }How does one determine that two files are indeed the same physical file?

> Under DOS 3.0 and up, the easiest way is to apply the following undocumented
> function to both names and then do a string compare.

> INT 21 - DOS 3+ internal - RESOLVE PATH STRING TO CANONICAL PATH STRING
>         AH = 60h
>         DS:SI -> ASCIZ relative path string or directory name
>         ES:DI -> 128-byte buffer for ASCIZ canonical fully qualified name



Just for your information, here is some experience gained from using this
call under a wide variety of DOS/LAN setups:

	Not supported at all under PC-MOS/386 (3.x and 4.x);
	
	Will not produce a valid and usable file name
		when done on a JOINed directory; it gets mapped
		back to the original drive/path, which is no longer
		considered valid by MS-DOS, (works fine on SUBST'd
		drives of course);

	Doesn't work well under MS-NET (fully resolved path
		names aren't valid there);

	Won't work under LAN manager, for the same reason as
		MS-NET;

So, in my experience, resolving paths is useful for the comparison described
above (to establish a file's identity), but the resolved path isn't
guaranteed to be usable in MS-DOS calls.  In other words, use it for testing
only, and use the original paths for the actual copying once the verification
is done.


david
-- 

        Fidonet: 3:632/348   SIGnet: 28:4100/1  Imex: 90:833/387
              Data:  +61-3-885-7864   Voice: +61-3-826-6711
 Internet/ACSnet: david@csource.oz.au    Uucp: ..!uunet!munnari!csource!david

broehl@watserv1.waterloo.edu (Bernie Roehl) (10/04/90)

In article <1213@crpmks.UUCP> billb@crpmks.UUCP (Bill Bochnik (Info Systems)) writes:
>
>>How does the DOS "copy" command know when the source and destination files are
>>one and the same (i.e., "file cannot be copied onto itself")?  I can't believe
>>that it's just a string compare. The logistics of that are pretty awkward.

I suspect that it first fully resolves both names, checks that they're on the
same drive, then compares their starting FAT entries.  (Sort of like seeing
if two directory entries point to the same inode on Unix.)

-- 
	Bernie Roehl, University of Waterloo Electrical Engineering Dept
	Mail: broehl@watserv1.waterloo.edu OR broehl@watserv1.UWaterloo.ca
	BangPath: {allegra,decvax,utzoo,clyde}!watmath!watserv1!broehl
	Voice:  (519) 885-1211 x 2607 [work]
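
For comparison, the Unix check alluded to above can be sketched in a few lines
of C. This is the POSIX side of the analogy only, not anything DOS provides:
it relies on stat() and on the st_dev and st_ino fields, and the function name
same_inode is made up for the example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* 1 = both names refer to the same file (same device and inode),
   0 = different files, -1 = at least one name could not be stat()ed. */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Because the comparison is made on the file's identity rather than its name, it
also catches hard links and differently spelled paths to the same file.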