[comp.sys.apollo] Bug with cp -r and ACLs

rand@HWCAE.CFSAT.HONEYWELL.COM (01/29/91)

There is a subtle bug in Domain/OS with the Unix cp -r command and the
required ACL entries. It shows itself on directories that have
explicit owners and protections (ie. no (U) umask or (P) process
ownership).

Directories copied with "cp -r" inherit any extended ACLs and the
directory ownership correctly. The problem is with the protection of
the required ACL entries. An example is in order.

Create two directories, one called src and one dest. Set the
protections as such:

# lsacl -all src
   Object ACL:
      Network-wide access allowed
      Required entries:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	-----I
	%.%.%               	-r-x-
      Extended entry mask:	-r---
      Extended entries:
	clark.%.%           	-r---
   Initial Directory ACL:
      Network-wide access allowed
      Required entries:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	-----I
	%.%.%               	-r-x-
      Extended entry mask:	-r---
      Extended entries:
	clark.%.%           	-r---
   Initial File ACL:
      Network-wide access allowed
      Required entries:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	-----I
	%.%.%               	-r-x-
      Extended entry mask:	-r---
      Extended entries:
	clark.%.%           	-r---

# lsacl -all dest
   Object ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-
   Initial Directory ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-
   Initial File ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-

(The protections don't have to be exactly this. Just make them
different. Read the rest and then generate your own test.)

Note that the two have different owners and required protections, and
that each has a different extended ACL. OK, now create a small
directory tree and a file in src, like so:

# mkdir src/dir src/dir/dir
# touch src/dir/file

Now, to double check the ACLs on these:

# lsacl src/dir src/dir/dir src/dir/file
src/dir:
   Object ACL:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	[Ignore]
	%.%.%               	-r-x-
      Extended entry mask:	-r---
	clark.%.%           	-r---
src/dir/dir:
   Object ACL:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	[Ignore]
	%.%.%               	-r-x-
      Extended entry mask:	-r---
	clark.%.%           	-r---
src/dir/file:
   Object ACL:
	rand.%.%            	prwx-
	%.none.%            	-r-x-
	%.%.cmcae           	[Ignore]
	%.%.%               	-r-x-
      Extended entry mask:	-r---
	clark.%.%           	-r---

Ok, looks good. Just like it should. Now, copy the src/dir tree into
dest like so:

# cp -r src/dir dest

Now you have a dest/dir tree. Before we look at the ACLs of the new
tree, let's review the ACLs of the dest directory:

# lsacl -all dest
   Object ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-
   Initial Directory ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-
   Initial File ACL:
      Network-wide access allowed
      Required entries:
	user.%.%            	prw--
	%.server.%          	-----
	%.%.none            	-----I
	%.%.%               	-----
      Extended entry mask:	-r-x-
      Extended entries:
	chew.%.%            	-r-x-

And now for the ACLs of the new tree:

# lsacl dest/dir dest/dir/dir dest/dir/file
dest/dir:
   Object ACL:
	user.%.%            	prwx-     <- Correct user,  wrong protection
	%.server.%          	-r-x-     <- Correct group, wrong protection
	%.%.none            	[Ignore]
	%.%.%               	-r-x-     <- Correct other, wrong protection
      Extended entry mask:	-r-x-
	chew.%.%            	-r-x-
dest/dir/dir:
   Object ACL:
	user.%.%            	prwx-     <- Correct user,  wrong protection
	%.server.%          	-r-x-     <- Correct group, wrong protection
	%.%.none            	[Ignore]
	%.%.%               	-r-x-     <- Correct other, wrong protection
      Extended entry mask:	-r-x-
	chew.%.%            	-r-x-
dest/dir/file:
   Object ACL:
	user.%.%            	prwx-
	%.server.%          	-----
	%.%.none            	[Ignore]
	%.%.%               	-----
      Extended entry mask:	-r-x-
	chew.%.%            	-r-x-

Note that the owners were inherited correctly, and the extended ACLs
were right too. But on the directories, the protections for the
required entries came from the src tree rather than from dest's
initial directory ACL.
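
For anyone who wants to rerun the test, the commands above can be
gathered into one function. This is only a sketch: the name
repro_cp_acl is made up here, it assumes the directory passed in
already holds src and dest with their ACLs set by hand as shown
earlier, and it skips the lsacl listings on systems that have no lsacl.

```shell
# Sketch only: collects the walkthrough's commands in one place.
# Assumes "$1" already contains src and dest, with ACLs set beforehand.
repro_cp_acl() {
    (
        cd "$1" || exit 1
        mkdir src/dir src/dir/dir || exit 1
        touch src/dir/file || exit 1
        cp -r src/dir dest || exit 1
        # lsacl exists only on Domain/OS; skip the listings elsewhere.
        if command -v lsacl >/dev/null 2>&1; then
            lsacl -all dest
            lsacl dest/dir dest/dir/dir dest/dir/file
        fi
    )
}
```

Comparing the two listings by eye is still needed; the function only
reproduces the copy.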

This problem seems to exist in SR10.1, 10.2, 10.3; and both System V.3
and 4.3 BSD environments. (I don't have a 10.0 to test.)

Now, before everybody jumps up on the soapbox and says that this
looks a little like JLRU, let me cite the 4.3 BSD manual entry for cp:

> By default, cp preserves the mode and owner of file2 if file2 already
> exists; otherwise it uses the mode of the source file modified by the
> current umask (2) is used.

This is incorrect. cp uses the initial file ACL, just like it should.
If the initial file ACL says to apply the umask or to inherit the
process owner, cp does that. (I like this. I think initial file and
directory ACLs are great. Much better than umask.)

So, preserving the mode of the files is sorta right. But it only
works for files. And the owner is not preserved. So I say it is a
bug. The behavior is inconsistent. Not to mention being a combination
of the two.

This was reported in APR 5B54C694. Unfortunately HP/Apollo said "Use
/com/cpt." Well, how about people who don't load Aegis? The APR was
reported 04/10/90. It was even found in SR10.3. (I don't know who
found it, but I would suspect it was somebody at HP/Apollo or a Beta
test site since 10.3 was not released then.)

The official line from HP/Apollo is that they won't fix it. Here is the
text of the HP/Apollo response to the APR:

> 1) there is already an easy way to get what you want, namely /com/cpt
> 
> 2) providing duplicate functionality in /bin/cp seems to be unnecessary
>    if we were to do this, it would cost us additional resources in terms
>    of implementation, maintenance and documentation.
>    we have provided unix command equivalents of domain_os command functions
>    where this has a big gain; there is no intention of duplicating every
>    detail of domain_os command execution
> 
> note: the basic reason that the specific case of yours is failing is that
>    /bin/cp attempts to preserve the protection on the directory when it
>    creates the new directory.  If the directory already exists, it will
>    not bother.  A limited version of what you want is thus possible using
>    only unix commands if you mkdir the destination (it will create it using
>    the defaults), and then run cp.  Obviously if the directory tree is
>    more complicated, this solution will not help.
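
The workaround in that note can in principle be stretched to a deeper
tree by making every directory first and only then copying the files
one at a time. A minimal sketch, assuming (as the note says) that cp
leaves the protection of an existing directory alone; the function name
copy_tree_premade is made up here, and this has not been tried on
Domain/OS:

```shell
# Sketch only: pre-create the directories, then copy files without -r.
# Usage: copy_tree_premade src/dir dest   (dest must already exist)
copy_tree_premade() {
    src=$1
    dest=$2
    parent=$(dirname "$src")
    name=$(basename "$src")
    # Pass 1: plain mkdir for each directory, parents before children
    # (find lists a directory before its contents), so each one is
    # created with the defaults of the directory it lands in.
    (cd "$parent" && find "$name" -type d -print) | while IFS= read -r d; do
        [ -d "$dest/$d" ] || mkdir "$dest/$d"
    done
    # Pass 2: copy regular files individually into the existing tree.
    (cd "$parent" && find "$name" -type f -print) | while IFS= read -r f; do
        cp "$parent/$f" "$dest/$f"
    done
}
```

File names containing newlines are not handled, and anything other than
directories and regular files is skipped.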

We are trying to get the status of this APR escalated so that
HP/Apollo will fix it. It's not that I want my own patch tape just for
this. But it would be nice to have HP/Apollo acknowledge that it is a
bug, and that they will fix it eventually. What they are saying here
is that it would cost HP/Apollo "additional resources" to make
Domain/OS behave either consistently or correctly. (I think that this
is a bug. Even those who disagree must admit that the behavior is
inconsistent.)

I must say that once we persuaded HP/Apollo that this actually
happens, they were pretty good about it. All of the people we talked
to at the Hot Line agreed that this is a bug. (In fact the person
who told us that there was an APR already written, asked us not to
kill the messenger.) I believe that it is somebody in Engineering that
said it won't be fixed.

(On another note, it was very difficult to persuade HP/Apollo that we
were not some dumb idiots. With the first person, it took 30 minutes
on the phone working through an example. Then somebody else got it,
and the response was "Operator error," even though we had already
demonstrated the bug to another person there. Days later, and after a
fax of a transcript, the second person agreed with us too.)

--
Douglas Keenan Rand                Honeywell -- Air Transport Systems Division
Phone: +1 602 436 2814               US Snail: P.O. Box 21111 Phoenix AZ 85036
Internet: @cim-vax.honeywell.com:rand@hwcae.cfsat.honeywell.com
UUCP: ...!uunet!asuvax!apciphx!hwcae!rand

pbrooks@mentorg.com (Phil Brooks) (02/03/91)

In article <9101281715.AA16081@hwcae.cfsat.honeywell.com> rand@HWCAE.CFSAT.HONEYWELL.COM writes:
>There is a subtle bug in Domain/OS with the Unix cp -r command and the
>required ACL entries. It shows itself on directories that have
>explicit owners and protections (ie. no (U) umask or (P) process
>ownership).

We also reported this bug and got the same response.  We have elected
to use tar instead of cp -r.  It goes something like

tar -cf - foo | (cd /user/joe; tar -xf - )

I am writing this from memory and didn't try it, so if it doesn't work,
look at the man page.  The ACLs work out correctly when you use tar.
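
Wrapped up as a function, the piped tar might look like the sketch
below. The name copy_tree_tar is made up here, and the claim that the
ACLs come out right is the one reported above for Domain/OS, not
something this sketch checks.

```shell
# Sketch only: copy the tree rooted at $1 into the existing directory $2
# by piping one tar into another.
copy_tree_tar() {
    src=$1
    dest=$2
    parent=$(dirname "$src")
    name=$(basename "$src")
    # Archive relative to the parent so only "name/..." is stored, then
    # unpack inside dest. Using && keeps tar from running in the wrong
    # directory if either cd fails.
    (cd "$parent" && tar -cf - "$name") | (cd "$dest" && tar -xf -)
}
```

For example, copy_tree_tar src/dir dest produces dest/dir, the same
layout as cp -r src/dir dest.
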
-- 
Phil Brooks, Mentor Graphics Corporation
8005 SW Boeckman Road
Wilsonville, OR   97070-7777

goldfish@CONCOUR.CS.CONCORDIA.CA (02/07/91)

| From:    Phil Brooks <pbrooks%mntgfx%sequent%tektronix%zephyr.ens.tek.com.uucp@beaver.cs.washington.edu>
| 
| In article <9101281715.AA16081@hwcae.cfsat.honeywell.com> rand@HWCAE.CFSAT.HONEYWELL.COM writes:
| >There is a subtle bug in Domain/OS with the Unix cp -r command and
| >the required ACL entries. It shows itself on directories that have
| >explicit owners and protections (ie. no (U) umask or (P) process
| >ownership).
| 
| We also reported this bug and got the same response.  We have
| elected to use tar instead of cp -r.  It goes something like tar
| -cf - foo | (cd /user/joe; tar -xf - ) I am just trying to remember
| it and didn't try that, so if it doesn't work, look at the man page.
| The acls work out correctly when you use tar.

There is also an Aegis command named "/com/cpt" which is very good for
"cloning" a directory tree to another place.  While I generally use
the BSD command for a given function, the AEGIS commands are
frequently better suited to system-administrative functions (no
surprise, they were designed to work together).

It is probably worth keeping some Aegis commands online for system
level things ... 

#ifdef soapbox
	Grumbling about inconsistencies between BSD and Domain when
	there is a perfectly good way to do it with an AEGIS tool is
	misplaced religious zeal ... What I would really like is
	crossreferencing in the online BSD man pages to the
	corresponding AEGIS commands to help me locate these methods.

	There are bigger HP cannons pointed at tender Apollo toes than
	this. :-(
#endif

--	  Paul Goldsmith
<goldfish@concour.cs.concordia.ca>				 (514) 848-3031
	(Shirley Maclaine told me there would be LIFETIMES like this)

system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (02/08/91)

In article <9102061706.aa14313@concour.cs.concordia.ca> goldfish@CONCOUR.CS.CONCORDIA.CA writes:
>It is probably worth keeping some Aegis commands online for system
>level things ... 

Unfortunately I have to agree with this statement - there is however the
fact that HP/Apollo advertises that any/all environments can be used, where
there are cases where this is simply not the case. A prime example is
'/com/sigp', which is needed to kill processes that won't die with 'kill
-9' (which is the untrappable UNIX kill signal).

>  Grumbling about inconsistencies between BSD and Domain when
>  there is a perfectly good way to do it with an AEGIS tool is
>  misplaced religious zeal ... What I would really like is
>  crossreferencing in the online BSD man pages to the
>  corresponding AEGIS commands to help me locate these methods.

There is the problem, though, that if you don't have Aegis loaded in your
AA in the first place, or don't have the disk space for it, you are stuck.
We are a multi-vendor UNIX site, and the users have enough trouble with
UNIX without being told/expected to learn Aegis too.
In my case, such cross-referencing is the only way I'll ever find out
which Aegis commands do what - I would never have found the 'cpt' (or
whatever it was) that is able to copy a directory subtree properly.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775

rees@pisa.ifs.umich.edu (Jim Rees) (02/08/91)

In article <1991Feb7.183550.8329@alchemy.chem.utoronto.ca>, system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:

  Unfortunately I have to agree with this statement - there is however the
  fact that HP/Apollo advertises that any/all environments can be used, where
  there are cases where this is simply not the case. A prime example is
  '/com/sigp', which is needed to kill processes that won't die with 'kill
  -9' (which is the untrappable UNIX kill signal).

Not a good example.  You should never blast a process unless you plan to
shut down your node.  And if you're going to shut down, then you don't need
to blast.

The 'cp -r' problem is just a bug.  And there is a bsd workaround (piped
tar, which is the traditional way to copy a tree in Unix anyway).  Do you
have other examples?

system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (02/08/91)

In article <4fae2d9d.1bc5b@pisa.ifs.umich.edu> rees@citi.umich.edu (Jim Rees) writes:
>In article <1991Feb7.183550.8329@alchemy.chem.utoronto.ca>, system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>
>  Unfortunately I have to agree with this statement - there is however the
>  fact that HP/Apollo advertises that any/all environments can be used, where
>  there are cases where this is simply not the case. A prime example is
>  '/com/sigp', which is needed to kill processes that won't die with 'kill
>  -9' (which is the untrappable UNIX kill signal).
>
>Not a good example.  You should never blast a process unless you plan to
>shut down your node.  And if you're going to shut down, then you don't need
>to blast.

I agree that after blasting, you should shut down, but we get stuck processes
(looping in the cpu) at times when I am not able to reboot immediately (like
nights/weekends), and the user process must be killed so that they and other
users can do useful work. There is no way that a cpu-bound process should be
able to ignore a 'kill -9' (a process hung on some non-returning system call
might be able to, but that is not my experience with other UNIXs).

>The 'cp -r' problem is just a bug.  And there is a bsd workaround (piped
>tar, which is the traditional way to copy a tree in Unix anyway).  Do you
>have other examples?

I will be posting my complete list of Apollo bugs after I figure out
which of them were fixed by SR10.3 (not many, since even some of the
ones I was told were fixed, are not, like vi in a pad).
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775